Course 11: Natural Language Processing
Rico Sennrich · University of Zurich
Lecturer

Rico Sennrich is Associate Professor of Natural Language Processing at the Department of Computational Linguistics, University of Zurich, an Honorary Fellow at the University of Edinburgh, and an ELLIS Fellow. His research focuses on machine translation, multilingual NLP, tokenization, neural architectures, and data augmentation; these contributions have influenced language models including ChatGPT. He is widely recognized for applying byte pair encoding (BPE) to subword segmentation in neural machine translation and for advancing low-resource and efficient NLP methods. Sennrich has supported the DARIAH-CH consortium and contributes to bridging computational linguistics and digital humanities research in Switzerland.
🤖 Bio generated by AI from public academic profile. Homepage · ORCID
Lecture Overview
Overview
Sennrich presents natural language processing as a flexible toolkit for making text collections more accessible and analyzable. He focuses especially on language modeling and machine translation, but keeps returning to the broader question of how NLP can support digital humanities workflows.
Main Points
- NLP can assist with OCR post-processing, normalization, translation, named entity recognition, sentiment analysis, and other large-scale text tasks.
- The lecture uses machine translation to explain the broader logic of modern NLP models and why many of these methods generalize across tasks.
- Sennrich contrasts rule-based systems with later statistical and neural approaches, showing why data-driven methods became dominant.
- Parallel corpora are central for training translation systems, and the lecture highlights how large such datasets can be.
- A recurring theme is practical application: digital humanities researchers need to understand both what NLP makes possible and what preparation historical text collections require.
Examples Mentioned
- Historical machine translation
- Language modeling as a general technique
- Europarl, OpenSubtitles, and web-crawled parallel corpora
- OCR, normalization, translation, and annotation pipelines
Source transcript: transcripts/Course 11_Sennrich_NLP.txt
Further Reading
See Zotero collection for 5 selected publications by this lecturer.