Session 7: Injustice & Biases in NLP
🎓 Course Materials
📑 Slides
Download Session 7 Slides (PDF)
📓 Notebooks
- Detecting Gender Bias in LLMs with Prompting
- Evaluating Cross-Linguistic Fairness in Classification
- Reducing the Size of a BERT Model
⚖️ Session 7: Injustice and Biases in NLP
In this session, we investigate one of the most pressing ethical issues in NLP: biases in language models and the broader implications of deploying LLMs in socially sensitive contexts.
We study where these biases come from, how they manifest, and what we can do to detect, mitigate, and monitor them — with a particular focus on Large Language Models like BERT and GPT.
We also explore the environmental costs of modern NLP, promoting not just fairness in output, but fairness in who pays the cost of progress.
🎯 Learning Objectives
- Understand the different types of biases present in NLP systems.
- Analyze real-world harms caused by bias in language technologies.
- Explore how biases arise during training and deployment of LLMs.
- Learn how to detect bias using statistical, adversarial, and prompt-based techniques.
- Implement practical mitigation strategies: pre-, mid-, and post-training.
- Understand the ecological footprint of LLMs and low-resource alternatives.
📚 Topics Covered
🧠 Foundations of Bias in NLP
- Historical and societal roots of bias in AI.
- Linguistic and cultural overrepresentation.
- Gender, racial, and socioeconomic stereotyping in LLMs.
- The "Stochastic Parrot" critique (Bender et al., 2021).
🔍 Detection Strategies
- Statistical Fairness Criteria: Independence and separation metrics (illustrated in the first sketch after this list).
- Prompt-based Bias Testing: e.g., Sheng et al. (2019) templates (the second sketch after this list pairs these templates with sentiment scoring).
- Sentiment Disparities: Analyzing polarity across demographic descriptors.
- Occupation Prediction Bias: Kirk et al. (2021) methodology.
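To make the independence and separation criteria concrete, here is a minimal sketch that computes a demographic-parity gap (independence) and an equal-opportunity gap (a relaxation of separation) from binary predictions. The toy arrays and group labels are invented for illustration, not data from the session.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Independence: P(Y_hat = 1 | A = a) should be similar across groups."""
    groups = np.unique(group)
    rates = [y_pred[group == g].mean() for g in groups]
    return max(rates) - min(rates)

def equal_opportunity_gap(y_true, y_pred, group):
    """Relaxed separation: P(Y_hat = 1 | Y = 1, A = a) should be similar across groups."""
    groups = np.unique(group)
    tprs = [y_pred[(group == g) & (y_true == 1)].mean() for g in groups]
    return max(tprs) - min(tprs)

# Toy predictions and a binary demographic attribute (purely illustrative).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

print("Demographic parity gap:", demographic_parity_gap(y_pred, group))
print("Equal opportunity gap:", equal_opportunity_gap(y_true, y_pred, group))
```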
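The second sketch illustrates prompt-based testing in the spirit of Sheng et al. (2019): templates that differ only in a demographic descriptor are completed by a generator, and the completions are scored for sentiment, connecting to the sentiment-disparity analysis above. The GPT-2 checkpoint, the default sentiment pipeline, and the two templates are illustrative choices, not necessarily what the session notebook uses.

```python
from transformers import pipeline

# Illustrative model choices; any causal LM and sentiment classifier could be swapped in.
generator = pipeline("text-generation", model="gpt2")
sentiment = pipeline("sentiment-analysis")  # defaults to an English SST-2 classifier

# Controlled templates that differ only in the demographic descriptor.
templates = ["The woman worked as", "The man worked as"]

for prompt in templates:
    completions = generator(prompt, max_new_tokens=15, num_return_sequences=5,
                            do_sample=True, pad_token_id=50256)
    texts = [c["generated_text"] for c in completions]
    scores = sentiment(texts)
    # Map labels to a signed polarity and average over the sampled completions.
    polarity = [s["score"] if s["label"] == "POSITIVE" else -s["score"] for s in scores]
    print(f"{prompt!r}: mean sentiment {sum(polarity) / len(polarity):+.2f}")
```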
🛠️ Mitigation Approaches
- Pre-training: Balanced datasets, multilingual corpora (e.g., BLOOM).
- During Training: Fairness-aware loss functions (Chuang et al., 2021); a sketch follows this list.
- Post-training:
  - Self-debiasing (Schick et al., 2021).
  - Neural editing (Suau et al., 2022).
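As a rough illustration of a fairness-aware objective (the actual formulation in Chuang et al., 2021 differs), the sketch below adds a penalty on the gap between groupwise positive-prediction rates to a standard cross-entropy loss. The tensor names and the toy batch are hypothetical.

```python
import torch
import torch.nn.functional as F

def fairness_regularized_loss(logits, labels, group, lam=0.1):
    """Cross-entropy plus a penalty on the gap between groupwise positive rates.

    Generic illustration of a fairness-aware objective, not the exact loss from
    Chuang et al. (2021). `group` holds a 0/1 protected-attribute value per example.
    """
    ce = F.cross_entropy(logits, labels)
    p_pos = torch.softmax(logits, dim=-1)[:, 1]          # P(y = 1 | x)
    gap = (p_pos[group == 0].mean() - p_pos[group == 1].mean()).abs()
    return ce + lam * gap

# Toy batch: random logits for a binary classifier.
logits = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
group = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
print(fairness_regularized_loss(logits, labels, group))
```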
🌍 Environmental Impacts
- Carbon footprint of LLMs (Strubell et al., 2019; Luccioni et al., 2023); a measurement sketch follows this list.
- Model compression techniques (a compression sketch also follows this list):
  - Distillation (Hinton et al., 2015)
  - Quantization
  - Pruning
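One way to make the carbon-footprint discussion concrete is to measure the emissions of a training or inference run with the codecarbon library. The tool choice, the project name, and the placeholder workload below are assumptions for illustration, not prescribed by the session.

```python
from codecarbon import EmissionsTracker

def dummy_workload():
    # Placeholder standing in for a fine-tuning or inference loop.
    return sum(i * i for i in range(10_000_000))

# EmissionsTracker estimates energy use and CO2-equivalent for the wrapped code.
tracker = EmissionsTracker(project_name="session7-bias-demo")  # hypothetical project name
tracker.start()
try:
    dummy_workload()
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```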
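The three compression techniques can be sketched with standard PyTorch and Hugging Face utilities, as below. The bert-base-uncased and distilbert-base-uncased checkpoints are illustrative choices; the session notebook may use different models.

```python
import torch
from torch.nn.utils import prune
from transformers import AutoModelForSequenceClassification

def size_mb(model):
    """Approximate size of a model's parameters and buffers in megabytes."""
    n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    n_bytes += sum(b.numel() * b.element_size() for b in model.buffers())
    return n_bytes / 1e6

teacher = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
print(f"BERT-base: {size_mb(teacher):.0f} MB")

# Distillation: in practice a smaller student is trained to mimic the teacher's
# soft targets (Hinton et al., 2015); here we simply load a pre-distilled checkpoint.
student = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
print(f"DistilBERT: {size_mb(student):.0f} MB")

# Quantization: store Linear weights in INT8 and dequantize on the fly at inference,
# roughly quartering the memory those weights occupy.
quantized = torch.quantization.quantize_dynamic(teacher, {torch.nn.Linear}, dtype=torch.qint8)

# Pruning: zero out the 30% smallest-magnitude weights in each Linear layer.
# This creates sparsity but does not shrink the dense tensors by itself.
for module in teacher.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor
```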
🧠 Key Takeaways
| Topic | Risk/Concern | Mitigation Strategy |
|---|---|---|
| Gender/Racial Bias | Reinforces stereotypes | Prompt analysis, fairness-aware training |
| Linguistic Inequality | Language exclusion | Multilingual training, inclusive benchmarks |
| Coherence vs. Understanding | Fluent but biased/misleading output | Self-diagnosis and auditing tools |
| Ecological Impact | High energy & emissions | Distillation, quantization, pruning |
📖 Bibliography & Recommended Reading
- The Social Dilemma – Documentary
- Bender et al. (2021): On the Dangers of Stochastic Parrots – Paper
- Blodgett et al. (2020): Language (Technology) is Power – Paper
- Sheng et al. (2019): The Woman Worked as a Babysitter – Paper
- Kirk et al. (2021): Bias in GPT Occupational Predictions – Paper
- Chuang et al. (2021): Fairness Constraints in Loss – Paper
- Schick et al. (2021): Self-Diagnosis and Debiasing – Paper
- Suau et al. (2022): Neuron-Level Bias Mitigation – Paper
- Strubell et al. (2019): Energy and Policy Considerations for Deep Learning in NLP – Paper
- Luccioni et al. (2023): Carbon Footprint of BLOOM – Paper
💻 Practical Components
- Prompt-Based Bias Detection: Use controlled sentence templates to assess gender and racial stereotypes in text generation.
- Cross-Language Model Evaluation: Compare model predictions across languages to quantify linguistic fairness (a sketch follows this list).
- Reducing the Size of a BERT Model: Apply distillation, quantization, and pruning to compress a BERT model.
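A minimal sketch of the cross-language evaluation idea: run the same multilingual classifier over parallel sentences and compare its predictions. The multilingual sentiment checkpoint and the toy English/French pairs are illustrative assumptions, not the notebook's actual data.

```python
from transformers import pipeline

# Illustrative multilingual sentiment model; the notebook may use a different checkpoint.
clf = pipeline("sentiment-analysis", model="nlptown/bert-base-multilingual-uncased-sentiment")

# Toy parallel sentences (English / French) with the same intended meaning.
parallel = [
    ("The service was excellent.", "Le service était excellent."),
    ("The product broke after one day.", "Le produit s'est cassé après un jour."),
]

agreements = []
for en, fr in parallel:
    pred_en = clf(en)[0]["label"]
    pred_fr = clf(fr)[0]["label"]
    agreements.append(pred_en == pred_fr)
    print(f"EN={pred_en:<8} FR={pred_fr:<8} {'match' if pred_en == pred_fr else 'MISMATCH'}")

# A low agreement rate on meaning-preserving translations is one signal of linguistic unfairness.
print(f"Cross-lingual agreement: {sum(agreements) / len(agreements):.0%}")
```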