Course 14: Early Modern History after the Machine Learning Turn

Tobias Hodel · University of Bern / DARIAH

Lecturer

Tobias Hodel is Associate Professor of Digital Humanities at the University of Bern and a key contributor to DARIAH-CH, with expertise in machine learning for historical documents, handwritten text recognition (HTR), and the digital processing of medieval and early modern archival sources. He holds a doctorate in history from the University of Zurich (2016) and has been instrumental in projects such as READ (Recognition and Enrichment of Archival Documents) and the digital edition of the Königsfelden monastery records. His research combines source-critical historical methodology with transformer-based text recognition, named entity recognition, and annotation workflows for large-scale manuscript corpora. He also leads e-learning initiatives including Ad Fontes, a platform for teaching paleography and historical document analysis.

🤖 Bio generated by AI from public academic profile.

Lecture Overview

Overview

Hodel introduces machine learning from the perspective of historical source work. The lecture balances enthusiasm for new possibilities in handwritten text recognition and annotation with a strong insistence on methodological reflection, source criticism, and awareness of the assumptions built into models.

Main Points

Machine learning opens new possibilities for working with medieval and early modern documents, especially in text recognition and downstream annotation.
Humanities scholars must remain critical about what counts as text, what is being modeled, and what kinds of questions should be delegated to algorithms.
Hodel treats text as a documentary object tied to images and material witnesses, not just as an abstract sequence of characters.
The lecture explains supervised learning, neural networks, training data, validation, and error rates in accessible terms.
Recognized text can become the basis for further tasks such as named entity recognition, linking, contextualization, and historical analysis, but only if researchers understand the biases and limitations of the pipeline.

Examples Mentioned

Medieval and early modern corpora
Handwritten text recognition engines and transformer models
Character error rates for historical scripts
Workflows from image to recognized and annotated text
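
The character error rate (CER) listed above is conventionally computed as the edit distance between a recognized line and its ground-truth transcription, divided by the length of the ground truth. The sketch below illustrates that common definition; it is not taken from the lecture, and HTR platforms may normalize whitespace, abbreviations, or special characters differently before counting. The function names and the sample strings are invented for illustration.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the ground-truth line."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented example: the model reads "m" as "rn", a typical confusion
# in scripts built from repeated minim strokes.
ground_truth = "anno domini"
recognized = "anno dornini"
print(f"CER: {character_error_rate(ground_truth, recognized):.3f}")
```

Because the denominator is the ground-truth length, a CER above 1.0 is possible when the recognized text is much longer than the reference. A single averaged figure also hides which characters are misread, which is one reason the lecture's emphasis on source criticism applies to evaluation metrics as well.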

Source transcript: transcripts/Course 14_Hodel_EarlyModernHist.txt

Reuse

CC BY-SA 4.0