1 Introduction to AI Agents

```mermaid
flowchart LR
    A[🌍 Environment<br/>Files, APIs, Databases] -->|Perceive| B[🧠 Agent Brain<br/>LLM]
    B -->|Plan & Decide| C[🛠️ Tools<br/>Search, Code, APIs]
    C -->|Act| A
    B -->|Respond| D[👤 Human]
    D -->|Feedback| B
```
“An AI Agent is not a chatbot that answers questions. It’s a system that perceives its environment, reasons about it, and takes actions to achieve goals.”
1.1 What Is an AI Agent?
When most people encounter AI, they think of chatbots: you type a message, the AI responds. That’s conversational AI — useful, but limited. An AI Agent is fundamentally different.
An AI Agent is a system that:
- Perceives its environment (reads files, queries databases, browses the web)
- Reasons using a large language model as its “brain”
- Plans a sequence of actions to reach a goal
- Acts by calling tools, writing files, or triggering APIs
- Learns from feedback to improve over time
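These capabilities form a cycle: perceive, reason, act, repeat. The cycle can be sketched as a plain loop; every function name here (`perceive`, `reason`, `act`, `respond`) is an illustrative placeholder with a toy stand-in, not a real framework API:

```python
# A minimal sketch of the perceive → reason → act loop.
# All function names are illustrative placeholders.

def run_agent(goal: str, max_steps: int = 5) -> str:
    history: list[str] = []                      # what the agent has seen and done
    for _ in range(max_steps):
        observation = perceive(history)          # read environment state
        action = reason(goal, observation)       # LLM picks the next action
        if action == "finish":
            return respond(goal, history)        # final answer to the human
        history.append(act(action))              # execute a tool, record the result
    return respond(goal, history)                # step limit reached

# Toy stand-ins so the loop runs without an LLM:
def perceive(history): return history[-1] if history else "start"
def reason(goal, obs): return "finish" if obs == "done" else "work"
def act(action): return "done"
def respond(goal, history): return f"goal '{goal}' finished after {len(history)} steps"
```

In a real agent, `reason` is an LLM call and `act` dispatches to tools; the control flow stays this simple.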
1.2 The Spectrum of AI Systems
Not all AI is the same. Here’s the spectrum from simple to agentic:
| Type | Description | Example | Autonomy |
|---|---|---|---|
| Chatbot | Responds to single prompts | ChatGPT Q&A | None |
| RAG System | Retrieves context before answering | Document Q&A | Low |
| Tool-Using LLM | Calls functions when needed | Calculator, search | Medium |
| Autonomous Agent | Plans and executes multi-step tasks | Researcher, coder | High |
| Multi-Agent System | Multiple agents collaborating | Research team | Very High |
1.3 The Four Essential Components of an AI Agent

```mermaid
graph TD
    A[AI Agent] --> B[🧠 Brain<br/>LLM for Reasoning]
    A --> C[💾 Memory<br/>Context & History]
    A --> D[🛠️ Tools<br/>Capabilities]
    A --> E[🎯 Goals<br/>What to Achieve]
    C --> F[Short-term<br/>Conversation context]
    C --> G[Long-term<br/>Vector store memory]
    D --> H[Search the web]
    D --> I[Write & run code]
    D --> J[Read/write files]
    D --> K[Call APIs]
```
1.3.1 The Brain (LLM)
The LLM serves as the reasoning core. It reads the current situation, decides what action to take next, and generates outputs. Modern agents use Chain-of-Thought reasoning — the LLM literally “thinks out loud” before acting.
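At the prompt level, Chain-of-Thought simply means asking the model to write its reasoning before committing to an action. A minimal sketch; the prompt wording is illustrative, not taken from any particular framework:

```python
# Illustrative Chain-of-Thought prompt: the model is asked to reason
# step by step ("Thought:") before naming its next action ("Action:").
COT_PROMPT = """You are an agent working toward a goal.
Before acting, think step by step:
Thought: <your reasoning about the current situation>
Action: <the single next action to take>

Goal: {goal}
Observation: {observation}
"""

prompt = COT_PROMPT.format(goal="summarise AI trends",
                           observation="no data gathered yet")
```

The filled-in `prompt` is what would be sent to the LLM on each turn of the loop.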
1.3.2 Memory
Without memory, every agent interaction starts from scratch. Agents use:
- Working memory: The current conversation context
- Episodic memory: Past interactions retrieved from vector stores
- Semantic memory: Factual knowledge about the domain
- Procedural memory: How to perform specific tasks
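Two of these layers can be sketched in a few lines: a bounded working memory plus a searchable long-term store. Keyword overlap stands in for the embedding similarity a real vector store would use, and the class and method names are illustrative:

```python
# Toy sketch of short-term (windowed) and long-term (searchable) memory.
# A production agent would back `long_term` with a vector store.

class AgentMemory:
    def __init__(self, window: int = 4):
        self.window = window
        self.working: list[str] = []      # short-term: recent turns only
        self.long_term: list[str] = []    # episodic: everything, searchable

    def remember(self, event: str) -> None:
        self.working.append(event)
        self.working = self.working[-self.window:]   # keep the context small
        self.long_term.append(event)

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Rank past events by word overlap with the query (a stand-in
        # for cosine similarity over embeddings).
        words = set(query.lower().split())
        ranked = sorted(self.long_term,
                        key=lambda e: len(words & set(e.lower().split())),
                        reverse=True)
        return ranked[:k]
```

Working memory caps what goes into the prompt; `recall` pulls older events back in only when they are relevant.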
1.3.3 Tools
Tools extend what an agent can do beyond generating text:
```python
from pathlib import Path

from langchain_core.tools import tool

# A simple tool definition
@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    # Uses Tavily, Serper, or a similar search API
    return tavily_client.search(query)

@tool
def execute_python(code: str) -> str:
    """Execute Python code and return the result."""
    # Safe sandboxed execution
    ...

@tool
def read_file(filepath: str) -> str:
    """Read the contents of a file."""
    return Path(filepath).read_text()
```

1.3.4 Goals
An agent needs a clear objective. This is specified in the system prompt and can range from a single task (“Find the best AI papers from 2025”) to an ongoing role (“You are a senior analyst who monitors market trends”).
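Concretely, the goal is just text placed in the system prompt. A minimal illustration; the exact prompt wording here is an assumption, not a fixed convention:

```python
# Two illustrative goal styles: a one-off task and an ongoing role.
SINGLE_TASK = "Find the best AI papers from 2025 and summarise them."
ONGOING_ROLE = ("You are a senior analyst who monitors market trends. "
                "Report notable changes and cite your sources.")

def system_prompt(goal: str) -> str:
    """Embed the goal in a system prompt (wording is illustrative)."""
    return f"You are an AI agent.\nYour goal: {goal}\nUse your tools; stop when done."
```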
1.4 How Agents “Think”: The ReAct Pattern
The most common reasoning pattern for agents is ReAct (Reasoning + Acting):
```
Thought: I need to find current AI trends. Let me search the web.
Action: search_web("AI trends 2025 2026")
Observation: [Results: LLMs, multimodal AI, agentic systems, ...]
Thought: I have the information. Let me also check the academic literature.
Action: search_web("top AI research papers 2025 site:arxiv.org")
Observation: [Results: ...]
Thought: I now have enough to write a comprehensive summary.
Action: write_final_answer("Here are the key AI trends...")
Final Answer: [Comprehensive summary]
```
This think → act → observe → think loop continues until the agent reaches its goal.
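The loop can be driven by surprisingly little code. This sketch replaces the LLM with a scripted stand-in (`fake_llm`) and stubs `search_web` so it runs offline; both stand-ins are illustrative, not part of any real framework:

```python
import re

# Minimal ReAct loop: parse "Action: tool(args)" from the model's reply,
# execute it, append the observation, and ask the model again.

SCRIPT = [
    "Thought: I need current AI trends.\nAction: search_web(AI trends 2025)",
    "Thought: I have enough.\nFinal Answer: LLMs and agentic systems dominate.",
]

def fake_llm(transcript: str) -> str:
    # Scripted stand-in: reply based on how many observations exist so far.
    return SCRIPT[transcript.count("Observation:")]

def search_web(query: str) -> str:
    return f"[results for: {query}]"           # stubbed tool

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = fake_llm(transcript)           # think
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        match = re.search(r"Action: (\w+)\((.*)\)", reply)
        if match:
            obs = search_web(match.group(2))            # act
            transcript += f"Observation: {obs}\n"       # observe
    return "step limit reached"
```

Swapping `fake_llm` for a real model call turns this into a working agent; frameworks like LangChain package exactly this parse-act-observe plumbing.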
1.5 Real-World AI Agent Applications
| Domain | Application | What the Agent Does |
|---|---|---|
| Business | Market Research Agent | Searches web, reads reports, synthesises findings |
| Finance | Portfolio Analysis | Queries databases, calculates metrics, flags anomalies |
| Legal | Contract Review | Reads documents, flags risks, summarises clauses |
| HR | Resume Screening | Parses CVs, scores against criteria, ranks candidates |
| Dev | Code Review Agent | Reads code, checks standards, suggests improvements |
| Customer Service | Support Agent | Checks order status, processes refunds, escalates issues |
1.6 The Limits and Risks of Agents
| Risk | Description | Mitigation |
|---|---|---|
| Hallucination | Agent acts on incorrect information | Use RAG + verification steps |
| Runaway costs | Too many LLM calls | Set token/step limits |
| Unintended actions | Agent does the wrong thing | Human-in-the-loop approval |
| Prompt injection | Malicious input hijacks agent | Input sanitisation |
| Infinite loops | Agent gets stuck | Max iteration limits |
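The last two mitigations, token limits and max iterations, are straightforward to enforce in code. A minimal sketch; the class and method names are illustrative:

```python
# Illustrative guardrails: hard caps on loop iterations and token spend.

class BudgetExceeded(Exception):
    pass

class Guardrails:
    def __init__(self, max_steps: int = 10, max_tokens: int = 50_000):
        self.max_steps, self.max_tokens = max_steps, max_tokens
        self.steps = self.tokens = 0

    def charge(self, tokens_used: int) -> None:
        """Call once per agent step; raises when a budget is exhausted."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} hit")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} hit")
```

Calling `charge()` at the top of each loop iteration guarantees the agent halts even if its reasoning never converges.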
1.7 Your First Agent (Preview)
Here’s a taste of what you’ll build in later chapters:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.tools import tool
from langchain import hub

@tool
def get_company_info(company: str) -> str:
    """Get basic information about a company."""
    # This would query a real API
    return f"{company}: Tech company, founded 2010, 500 employees"

@tool
def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression safely."""
    # Walk the parsed AST instead of calling eval() on raw input
    import ast, operator
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return str(_eval(ast.parse(expression, mode="eval").body))

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
tools = [get_company_info, calculate]

# Load the ReAct prompt template
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({
    "input": "Look up Anthropic and calculate their founding year plus 10."
})
print(result["output"])
```

1.8 Interactive Simulation: Agent Decision-Making
Chapter Summary
In this chapter you learned:
- An AI agent perceives, reasons, plans, and acts — unlike a simple chatbot
- Agents have four components: Brain (LLM), Memory, Tools, and Goals
- The ReAct pattern (think → act → observe) is the most common reasoning loop
- Agents have real limitations: hallucination, cost, unintended actions
- The field spans from simple RAG to complex multi-agent systems
What’s Next
Chapter 2 dives deep into the engine powering every AI agent: Large Language Models. We’ll demystify transformers, attention mechanisms, and token prediction — the science behind the intelligence.