Python: Introduction to Natural Language Processing (NLP)

As a general-purpose programming language, Python is used in a wide variety of application domains, including data analysis. It is a particularly powerful tool for analyzing textual data, especially in the interdisciplinary field of Natural Language Processing (NLP).

NLP lies at the intersection of computational linguistics and artificial intelligence. Its use is growing because it enables computers to process human language and extract meaning from it. Applications of NLP include machine translation, sentiment analysis, chatbots, intelligent systems, spell checking, predictive typing, and grammar correction.

General information

Duration: 9 hours
  • Writing and running Python in iPython/Anaconda
  • Tokenization
  • NLTK corpora
  • Noise removal (digits, hyperlinks, contractions, punctuation marks, special characters, emoticons, extra whitespace, spelling errors)
  • Text Normalization (stop words, lower case, stemming, lemmatization)
  • Information extraction (POS tagging, chunking, n-grams, named entities)
  • TF-IDF (with scikit-learn)
  • Semantic and sentiment analysis (lexical relations, synsets, semantic similarity)
APPB - Python Basics or equivalent knowledge is required. You should feel comfortable working with control structures, simple functions and different data types in Python.
This introductory course is aimed at beginners and is suitable for anyone who wishes to analyze text in Python and gain a basic understanding of Natural Language Processing (NLP).
By the end of the introductory course, students will be able to
  • work with different file types in Python.
  • apply text pre-processing techniques for cleaning and preparing textual data.
  • extract information from textual data.
  • perform semantic and sentiment analysis.
In this introductory course, students will explore the basics of text analytics and NLP with the Python package Natural Language Toolkit (NLTK) and, in parts, with scikit-learn. The course content is delivered over 9 hours through slides, live coding by the instructor, and in-class exercises done individually and in pairs.
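To give a rough flavor of the topics listed for this course, here is a minimal sketch of noise removal, text normalization, n-grams and TF-IDF weighting. It is an illustration only, not course material: it uses the Python standard library instead of NLTK or scikit-learn so that it runs without any downloads, and the stop word list, the function names (preprocess, ngrams, tf_idf) and the two sample sentences are assumptions made up for this example.

```python
import math
import re
from collections import Counter

# Tiny hand-picked stop word list (an assumption for this sketch;
# NLTK provides much larger, language-specific lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to", "in", "on", "it"}


def preprocess(text):
    """Lower-case the text, remove noise, tokenize, and drop stop words."""
    text = text.lower()                        # normalization: lower case
    text = re.sub(r"https?://\S+", " ", text)  # noise removal: hyperlinks
    text = re.sub(r"[^a-z\s]", " ", text)      # noise removal: digits, punctuation
    tokens = text.split()                      # simple whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n=2):
    """Return all contiguous n-token tuples (bigrams by default)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Unsmoothed tf * log(N / df) weights, one dict per tokenized document.

    scikit-learn's TfidfVectorizer applies smoothing and normalization by
    default, so its numbers will differ from this textbook variant.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


corpus = [
    "The cat sat on the mat in 2024.",
    "The dog chased the cat! See https://example.org",
]
docs = [preprocess(d) for d in corpus]
bigrams = ngrams(docs[0])
weights = tf_idf(docs)
```

In this toy corpus, "cat" occurs in both documents and therefore receives a weight of zero, while terms unique to one document receive a positive weight. The regular expression keeps only the letters a to z, so the sketch is suitable for plain English text only.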


Code: HS24-ANLP1
Instructor: Tsilimos Maria
Dates: 14.10.2024 - 28.10.2024 (17:00 - 20:00)
Place: Online Course

Course registration begins on 1 February for the spring semester and on 1 September for the autumn semester.