Decoding Inequality 2025

Session 8

Application (ChatGPT)

Author: Moritz Mähr
Affiliations: University of Bern, University of Basel

Published: April 11, 2025
Modified: May 31, 2025

TLDR

  • Recap
  • Review the reading assignment together
  • Discuss applications

Recap

Discussed orally in class.

Reading assignment: “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”

This influential 2021 paper by Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell critically examines the rapid development of increasingly large language models (LMs) such as GPT-3 and Google’s Switch-C. It questions the assumption that “bigger is always better” in natural language processing (NLP) and highlights key risks that should concern technologists and humanists alike.

1. What Are Language Models Doing?

Large LMs are trained to predict and generate text based on statistical patterns in massive datasets. While they can produce impressively fluent text, the authors argue that this does not mean they understand language. Instead, they are “stochastic parrots”: systems that generate plausible-sounding output without actual comprehension or intent.
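
To see what “statistical patterns” means in practice, here is a minimal, purely illustrative sketch (a toy bigram model, nothing like the scale or architecture of GPT-3): it counts which words follow which in a tiny corpus and then samples continuations from those counts. The output can look fluent while the program has no representation of meaning or intent.

```python
import random
from collections import defaultdict

# Toy corpus; a real LM is trained on billions of words, but the principle is the same:
# count which tokens tend to follow which, then sample from those statistics.
corpus = "the model predicts the next word the model repeats patterns in the data".split()

# Count bigram transitions: for each word, which words follow it and how often.
transitions = defaultdict(lambda: defaultdict(int))
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word][next_word] += 1

def generate(start: str, length: int = 8) -> str:
    """Generate text by repeatedly sampling the next word from observed frequencies."""
    word, output = start, [start]
    for _ in range(length):
        followers = transitions.get(word)
        if not followers:  # dead end: this word never appeared mid-corpus
            break
        candidates = list(followers.keys())
        weights = list(followers.values())
        word = random.choices(candidates, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # fluent-looking output, but no understanding or intent behind it
```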

2. Environmental and Financial Costs

Training large LMs consumes enormous amounts of energy, contributing significantly to CO₂ emissions. This is a justice issue: the environmental impact disproportionately affects marginalized communities, while the benefits of these technologies accrue mostly to wealthy, English-speaking users and corporations.
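
For a sense of how such footprints are estimated, here is a back-of-envelope sketch. All numbers are illustrative assumptions for the example, not figures from the paper: emissions grow with the number of accelerators, their power draw, training time, data-centre overhead, and the carbon intensity of the local grid.

```python
# Back-of-envelope CO2 estimate for a hypothetical training run.
# Every value below is an illustrative assumption, not a measurement from the paper.
gpu_count = 512                  # accelerators running in parallel
power_per_gpu_kw = 0.4           # average draw per accelerator, in kW
training_hours = 24 * 14         # two weeks of continuous training
pue = 1.5                        # data-centre overhead (cooling, networking)
grid_intensity_kg_per_kwh = 0.4  # kg CO2 per kWh; varies widely by region

energy_kwh = gpu_count * power_per_gpu_kw * training_hours * pue
co2_tonnes = energy_kwh * grid_intensity_kg_per_kwh / 1000

print(f"Estimated energy: {energy_kwh:,.0f} kWh")
print(f"Estimated emissions: {co2_tonnes:,.1f} t CO2")
```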

3. Bias and Harm in Training Data

Most LMs are trained on huge, uncurated datasets scraped from the internet. These datasets overrepresent hegemonic, often discriminatory viewpoints and exclude marginalized voices. The models inherit and reproduce these biases, including racism, sexism, ableism, and more—creating risks of psychological harm and systemic discrimination.

4. Illusions of Understanding

Because these systems can produce text that appears coherent, people may falsely assume the content is meaningful, factual, or generated by a human. This creates dangers of automation bias, misinformation, and manipulation (e.g. through fake news, extremist content, or abusive language).

5. Accountability and Documentation

A core critique is that datasets are often undocumented or under-documented, making it impossible to understand or audit how LMs behave. The authors argue for a “documentation budget” and recommend curating smaller, well-understood datasets over massive opaque ones.
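
As a lightweight exercise in this spirit, a dataset can ship with a short structured record answering the questions the authors raise. The fields below are a hypothetical sketch loosely inspired by the data statement and datasheet proposals the paper builds on; they are not a standard schema.

```python
# Hypothetical sketch of a minimal dataset documentation record.
# Field names are illustrative, loosely inspired by data statements / datasheets;
# they do not reproduce a schema from the paper.
dataset_documentation = {
    "name": "course-forum-corpus",  # hypothetical dataset
    "curation_rationale": "Why was this data collected, and by whom?",
    "language_varieties": ["German (Swiss Standard)", "English"],
    "speaker_demographics": "Who is represented, and who is missing?",
    "collection_process": "Scraped? Donated? Collected with consent?",
    "known_biases": "Overrepresented viewpoints, offensive content, gaps",
    "intended_uses": ["classroom analysis"],
    "prohibited_uses": ["individual profiling"],
}

for field, value in dataset_documentation.items():
    print(f"{field}: {value}")
```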

6. Displacement of Research Goals

Focusing on ever-larger LMs draws attention and resources away from alternative, potentially more equitable paths in language technology, such as:

  • Smaller, task-specific models
  • Multilingual or low-resource language research
  • Approaches centered on human linguistic meaning, not just surface-level form

7. Recommendations for Responsible Development

The authors propose shifting toward:

  • Pre-mortem analysis: anticipate harms before development begins
  • Value-sensitive design: include affected stakeholders in the design process
  • Environmental benchmarking: consider carbon and energy efficiency as research metrics
  • Research redirection: focus less on leaderboard metrics and more on social impact and inclusivity

Why It Matters for Digital Humanities

This paper bridges critical perspectives from linguistics, ethics, and STS (science and technology studies). For DH scholars, it encourages skepticism toward “black-box” AI systems and stresses the importance of:

  • Interrogating data sources
  • Understanding power and representation in digital systems
  • Advocating for inclusive, sustainable, and human-centered computational research

Discuss applications

Discussed orally in class.
