
Session 7: Injustice & Biases in NLP

🎓 Course Materials

📑 Slides

Download Session 7 Slides (PDF)

📓 Notebooks


⚖️ Session 7: Injustice and Biases in NLP

In this session, we investigate one of the most pressing ethical issues in NLP: biases in language models and the broader implications of deploying LLMs in socially sensitive contexts.

We study where these biases come from, how they manifest, and what we can do to detect, mitigate, and monitor them — with a particular focus on Large Language Models like BERT and GPT.

We also examine the environmental costs of modern NLP, arguing for fairness not only in model outputs but also in who bears the cost of progress.


🎯 Learning Objectives

  1. Understand the different types of biases present in NLP systems.
  2. Analyze real-world harms caused by bias in language technologies.
  3. Explore how biases arise during training and deployment of LLMs.
  4. Learn how to detect bias using statistical, adversarial, and prompt-based techniques.
  5. Implement practical mitigation strategies: pre-, mid-, and post-training.
  6. Understand the ecological footprint of LLMs and low-resource alternatives.

📚 Topics Covered

🧠 Foundations of Bias in NLP

  • Historical and societal roots of bias in AI.
  • Linguistic and cultural overrepresentation.
  • Gender, racial, and socioeconomic stereotyping in LLMs.
  • The "Stochastic Parrot" critique (Bender et al., 2021).

🔍 Detection Strategies

  • Statistical Fairness Criteria: Independence and separation metrics.
  • Prompt-based Bias Testing: e.g., Sheng et al. (2019) templates; see the sketch after this list.
  • Sentiment Disparities: Analyzing polarity across demographic descriptors.
  • Occupation Prediction Bias: Kirk et al. (2021) methodology.
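
To make the prompt-based and sentiment-disparity probes above concrete, here is a minimal sketch in the spirit of Sheng et al. (2019). It is an illustration under assumptions rather than the authors' exact protocol: the Hugging Face `transformers` pipelines, GPT-2, the default sentiment model, and the template wording are stand-ins chosen for brevity.

```python
# Minimal prompt-based bias probe: generate continuations for a controlled template
# and compare average sentiment across demographic descriptors.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")      # illustrative model choice
sentiment = pipeline("sentiment-analysis")                  # default sentiment model

TEMPLATE = "The {group} worked as"                          # Sheng et al. (2019)-style template
GROUPS = ["woman", "man", "Black person", "White person"]   # descriptors to compare

for group in GROUPS:
    outputs = generator(
        TEMPLATE.format(group=group),
        max_new_tokens=15,
        num_return_sequences=5,
        do_sample=True,
    )
    texts = [o["generated_text"] for o in outputs]
    scores = sentiment(texts)
    # Signed polarity in [-1, 1]; large gaps between groups hint at disparate "regard".
    polarity = sum(s["score"] if s["label"] == "POSITIVE" else -s["score"] for s in scores) / len(scores)
    print(f"{group:15s} mean polarity = {polarity:+.2f}")
```

In practice one would use far more templates and samples, and a regard classifier rather than generic sentiment, but the pattern of controlled prompts plus a per-group statistic is the same.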

🛠️ Mitigation Approaches

  • Pre-training: Balanced datasets, multilingual corpora (e.g., BLOOM).
  • During Training: Fairness-aware loss functions (Chuang et al., 2021); see the sketch after this list.
  • Post-training:
      • Self-debiasing (Schick et al., 2021).
      • Neural editing (Suau et al., 2022).
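
The fairness-aware loss item above can be read as "task loss plus a penalty for treating counterfactual inputs differently". Below is a loose, minimal sketch of that idea; it is not the exact objective of Chuang et al. (2021), and the counterfactual-pair setup and the `lam` weight are assumptions for illustration.

```python
# Sketch of a fairness-aware training objective: standard cross-entropy plus a term
# that pulls the representations of an input and its counterfactual version
# (e.g. a gender-swapped sentence) closer together.
import torch.nn.functional as F

def fairness_aware_loss(logits, labels, emb_original, emb_counterfactual, lam=0.1):
    task_loss = F.cross_entropy(logits, labels)                       # usual task objective
    fairness_penalty = F.mse_loss(emb_original, emb_counterfactual)   # representation gap
    return task_loss + lam * fairness_penalty                         # lam trades task fit vs. fairness
```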

🌍 Environmental Impacts

  • Carbon footprint of LLMs (Strubell et al., 2019; Luccioni et al., 2023)
  • Model compression techniques:
      • Distillation (Hinton et al., 2015); see the sketch after this list
      • Quantization
      • Pruning
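
As a concrete anchor for the distillation entry above, here is the standard knowledge-distillation objective of Hinton et al. (2015) written as a PyTorch loss: the student matches a temperature-softened copy of the teacher's distribution in addition to the hard labels. The temperature and mixing weight are illustrative defaults.

```python
# Knowledge distillation (Hinton et al., 2015): soft-target KL term + hard-label CE.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),   # student's softened log-probabilities
        F.softmax(teacher_logits / T, dim=-1),       # teacher's softened probabilities
        reduction="batchmean",
    ) * (T * T)                                      # rescale so gradients are comparable across T
    hard = F.cross_entropy(student_logits, labels)   # ordinary supervised loss
    return alpha * soft + (1.0 - alpha) * hard
```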

🧠 Key Takeaways

Topic                       | Risk/Concern                         | Mitigation Strategy
Gender/Racial Bias          | Reinforces stereotypes               | Prompt analysis, fairness-aware training
Linguistic Inequality       | Language exclusion                   | Multilingual training, inclusive benchmarks
Coherence vs. Understanding | Fluent but biased/misleading output  | Self-diagnosis and auditing tools
Ecological Impact           | High energy & emissions              | Distillation, quantization, pruning

📖 References & Further Reading

  • The Social Dilemma (documentary)
  • Bender et al. (2021): On the Dangers of Stochastic Parrots (paper)
  • Blodgett et al. (2020): Language (Technology) is Power (paper)
  • Sheng et al. (2019): The Woman Worked as a Babysitter (paper)
  • Kirk et al. (2021): Bias in GPT Occupational Predictions (paper)
  • Chuang et al. (2021): Fairness Constraints in Loss (paper)
  • Schick et al. (2021): Self-Diagnosis and Debiasing (paper)
  • Suau et al. (2022): Neuron-Level Bias Mitigation (paper)
  • Strubell et al. (2019): Energy and Policy Considerations for Deep NLP (paper)
  • Luccioni et al. (2023): Carbon Footprint of BLOOM (paper)

💻 Practical Components

  • Prompt-Based Bias Detection: Use controlled sentence templates to assess gender and racial stereotypes in text generation.
  • Cross-Language Model Evaluation: Compare model predictions across languages to quantify linguistic fairness.
  • BERT Model Compression: Use distillation, quantization, and pruning to reduce the size of a BERT model; a starting point for the quantization step is sketched below.
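
For the compression exercise, a reasonable first step is post-training dynamic quantization, which requires no retraining. The sketch below assumes PyTorch and the Hugging Face `transformers` library, quantizes only the linear layers of `bert-base-uncased`, and uses serialized size as a rough metric; exact savings will vary.

```python
# Dynamic quantization of BERT: weights of nn.Linear layers are stored as int8
# and dequantized on the fly at inference time.
import os
import torch
from transformers import AutoModel

def size_on_disk_mb(model, path="tmp_model.pt"):
    """Serialize the state dict and report its size as a rough compression metric."""
    torch.save(model.state_dict(), path)
    mb = os.path.getsize(path) / 1e6
    os.remove(path)
    return mb

model = AutoModel.from_pretrained("bert-base-uncased")
quantized = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

print(f"fp32 model:      {size_on_disk_mb(model):.0f} MB")
print(f"int8 (dynamic):  {size_on_disk_mb(quantized):.0f} MB")
```

Distillation and pruning can then be layered on top, with the same size-on-disk check used to track the trade-offs.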