Course 8: Introduction to Corpus Linguistics
Noah Bubenhofer · University of Zurich
Lecturer

Noah Bubenhofer is Professor of Linguistics at the Department of German Studies at the University of Zurich, specializing in corpus linguistic methods, corpus pragmatics, and visual linguistics. He is co-director of the Digital Society Initiative’s Community Libraries and co-director of the Linguistic Research Infrastructure (LiRI), where he promotes open research data practices across the Swiss linguistics community. His research connects empirical observation of language use with questions of social and cultural meaning, examining how linguistic patterns function as traces of discourse, ideology, and cultural change. He previously held a professorship in Digital Linguistics at the ZHAW and has contributed to the development of corpus platforms and visualization tools for large-scale language analysis.
🤖 Bio generated by AI from public academic profile.
Lecture Overview
Overview
Bubenhofer introduces corpus linguistics as an empirical approach to language based on actual usage rather than intuition alone. He presents corpora as large, machine-readable collections of texts that allow researchers to detect patterns, study meaning in context, and connect linguistic data to social questions.
Main Points
- Corpus linguistics asks questions about language through observation of real language use, not only through introspection.
- Concordances and collocation profiles help researchers move from individual examples to typical contexts and recurring patterns.
- The lecture links corpus linguistics to contextualism and distributionalism: meaning emerges from how words are used in context.
- Bubenhofer also connects corpus thinking to large language models, describing them as systems that learn typical language patterns.
- His notion of corpus pragmatics emphasizes that linguistic patterns are traces of social action, discourse, and cultural change.
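
The step from single examples to typical contexts can be illustrated with a small sketch. It is not taken from the lecture: the helper names (`tokenize`, `concordance`, `collocates`), the window size, and the toy sentences are assumptions chosen for illustration, and raw co-occurrence counts stand in for the statistical association measures that corpus tools usually apply.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def concordance(tokens, node, window=3):
    """Return keyword-in-context lines as (left context, node, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

def collocates(tokens, node, window=3):
    """Count the words that occur within `window` tokens of the node word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts

# Invented toy sentences (an assumption, not data from the lecture).
text = (
    "Solidarity with workers is growing. "
    "People showed solidarity with refugees. "
    "The union called for solidarity with workers."
)
tokens = tokenize(text)

# Concordance view: every hit of the search term with its immediate context.
for left, node, right in concordance(tokens, "solidarity"):
    print(f"{left:>25} | {node} | {right}")

# Collocation view: which words recur around the search term.
print(collocates(tokens, "solidarity").most_common(3))
```

In this toy corpus the most frequent neighbour of "solidarity" is the function word "with". That is one reason collocation profiles on real corpora typically rank candidates with an association measure rather than raw frequency.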
Examples Mentioned
- “Solidarity” as a search term in corpora
- Corona vocabulary and language change
- Political language and gendered language use
- Metadata and linguistic annotation in corpora
Source transcript: transcripts/Course 8_Bubenhofer_CorpusLinguistics.txt
Further Reading
See the Zotero collection for five selected publications by this lecturer.