
Session 10: Agents

🎓 Course Materials

📑 Slides

Download Session 10 Slides (PDF)

📓 Notebooks


🚀 Session 10: Hallucinations and Agents in Large Language Models

In this session, we explore the challenge of hallucinations in LLMs and how to address it. We dive into function calling and multi-LLM evaluation as techniques for reducing errors, then introduce agent-based frameworks (e.g., ReAct) as a next-level strategy for reasoning, planning, and tool use.

We connect theory to practice with real-world examples and Python code for building reliable, dynamic LLM agents.

🎯 Learning Objectives

  1. Understand the nature of hallucinations and outdated knowledge in LLMs.
  2. Explore mitigation strategies: prompt engineering, retrieval-augmented generation (RAG), function calling, and multi-LLM approaches.
  3. Learn how LLMs as judges can validate or compare outputs, increasing reliability.
  4. Discover how agents (ReAct framework) enable iterative planning, reasoning, and tool use.
  5. Identify failure modes (planning and tool execution errors) in agent-based systems.

📚 Topics Covered

🌟 Hallucinations and Errors

  • Intrinsic vs. Extrinsic Causes: From data mismatch to model limitations.
  • Examples: Factual inaccuracies, outdated knowledge, and misinformation.
  • Consequences: Spread of misinformation, erosion of user trust, and practical failures in downstream applications.

βš™οΈ Mitigation Techniques

  • Prompt Engineering: Crafting effective prompts to reduce ambiguity.
  • RAG: Grounding responses with external data.
  • Function Calling: Using APIs and tools for accurate, real-time answers (see the sketch after this list).
  • Multi-LLM Evaluation: Using LLMs as judges for quality control.
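
To make the function-calling loop concrete, here is a minimal sketch using the OpenAI Python SDK (v1+). The `get_current_weather` tool, its schema, and the model name are illustrative assumptions, not part of the course materials; the pattern (the model requests a tool call, we execute it locally, and feed the result back) is the same for any real API.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative local tool the model may ask us to call (stubbed data).
def get_current_weather(city: str) -> str:
    return json.dumps({"city": city, "temp_c": 21, "conditions": "clear"})

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Lisbon right now?"}]
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = response.choices[0].message

if msg.tool_calls:  # the model chose to call our tool instead of guessing
    call = msg.tool_calls[0]
    result = get_current_weather(**json.loads(call.function.arguments))
    # Feed the tool result back so the model can compose a grounded answer.
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```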

🤖 Agents and Advanced Use Cases

  • Agent Limitations: Planning, tool execution, and efficiency challenges.
  • ReAct Framework: Combining reasoning and acting for iterative solutions.
  • ReAct Steps: Think → Act → Observe → Repeat (a minimal loop is sketched after this list).
  • Implementation: LangChain/LlamaIndex agents in Python for multi-step problem solving.
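
The following is a minimal, framework-free sketch of the ReAct cycle, assuming the OpenAI SDK for the model call; the prompt format, `calculator` tool, and model name are illustrative placeholders rather than the course's reference implementation.

```python
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def llm(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# One toy tool; a real agent would register search, code execution, etc.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}  # demo only: eval is unsafe

PROMPT = """Answer the question. You may use the tool calculator[<expression>].
Reply with either:
Thought: <reasoning>
Action: <tool>[<input>]
or:
Final Answer: <answer>

Question: {question}
{history}"""

def react(question: str, max_steps: int = 5) -> str:
    history = ""
    for _ in range(max_steps):
        out = llm(PROMPT.format(question=question, history=history))  # Think
        if "Final Answer:" in out:
            return out.split("Final Answer:", 1)[1].strip()
        m = re.search(r"Action:\s*(\w+)\[(.+?)\]", out)
        if m is None:
            return out  # planning failure mode: the model broke the format
        tool, arg = m.groups()                                        # Act
        obs = TOOLS.get(tool, lambda _: "unknown tool")(arg)          # tool-execution errors surface here
        history += f"{out}\nObservation: {obs}\n"                     # Observe, then Repeat
    return "Stopped: step limit reached without a final answer."

print(react("What is 17 * 24, minus 100?"))
```

LangChain's `create_react_agent` and LlamaIndex's `ReActAgent` wrap this same loop with more robust output parsing and tool registries; the hand-rolled version above makes the failure modes (format drift, bad tool input) easy to see.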

🧠 Key Takeaways

| Concept/Technique | Purpose | Benefit |
| --- | --- | --- |
| Hallucination Analysis | Identify sources of errors | Better LLM reliability and trustworthiness |
| Prompt Engineering | Clear instructions for the LLM | More precise and accurate outputs |
| RAG & Function Calling | External knowledge integration | Reduces hallucinations and outdated answers |
| LLM-as-a-Judge | Output validation and ranking | Automated quality assurance |
| Agent Frameworks (ReAct) | Iterative tool-based reasoning | Handles complex, multi-step tasks effectively |


💻 Practical Components

  • LLM Tools Integration: Function calling examples for real-time data.
  • LLM-as-a-Judge: Code snippets for output ranking and quality control (a minimal sketch follows below).
  • ReAct Agent Implementation: LangChain-based examples with external tool usage.
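
As a taste of the LLM-as-a-judge component, here is a minimal sketch of pairwise output comparison, again assuming the OpenAI SDK; the rubric, example question, and model name are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = """You are an impartial judge comparing two answers to the same question.
Question: {question}
Answer A: {a}
Answer B: {b}
Judge on factual accuracy and completeness. Reply with exactly one word: A, B, or TIE."""

def judge(question: str, answer_a: str, answer_b: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative judge model, ideally different from the models being judged
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question, a=answer_a, b=answer_b)}],
        temperature=0,  # keep verdicts as deterministic as possible
    )
    return resp.choices[0].message.content.strip()

verdict = judge(
    "When did the James Webb Space Telescope launch?",
    "It launched on 25 December 2021.",
    "It launched in 2018.",
)
print(verdict)  # expected: A
```

In practice, you would run each comparison twice with the answer order swapped to counter position bias, and aggregate several judge calls before trusting a ranking.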