1 Introduction to AI Agents

```mermaid
flowchart LR
    A[🌍 Environment<br/>Files, APIs, Databases] -->|Perceive| B[🧠 Agent Brain<br/>LLM]
    B -->|Plan & Decide| C[🛠️ Tools<br/>Search, Code, APIs]
    C -->|Act| A
    B -->|Respond| D[👤 Human]
    D -->|Feedback| B
```
“An AI Agent is not a chatbot that answers questions. It’s a system that perceives its environment, reasons about it, and takes actions to achieve goals.”
1.1 What Is an AI Agent?
When most people encounter AI, they think of chatbots: you type a message, the AI responds. That’s conversational AI — useful, but limited. An AI Agent is fundamentally different.
An AI Agent is a system that:
- Perceives its environment (reads files, queries databases, browses the web)
- Reasons using a large language model as its “brain”
- Plans a sequence of actions to reach a goal
- Acts by calling tools, writing files, or triggering APIs
- Learns from feedback to improve over time
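These capabilities form a cycle: perceive, reason, act, repeat. The cycle can be sketched as a plain loop; every function name here (`perceive`, `reason`, `act`, `respond`) is an illustrative placeholder with a toy stand-in, not a real framework API:

```python
# A minimal sketch of the perceive → reason → act loop.
# All function names are illustrative placeholders.

def run_agent(goal: str, max_steps: int = 5) -> str:
    history: list[str] = []                      # what the agent has seen and done
    for _ in range(max_steps):
        observation = perceive(history)          # read environment state
        action = reason(goal, observation)       # LLM picks the next action
        if action == "finish":
            return respond(goal, history)        # final answer to the human
        history.append(act(action))              # execute a tool, record the result
    return respond(goal, history)                # step limit reached

# Toy stand-ins so the loop runs without an LLM:
def perceive(history): return history[-1] if history else "start"
def reason(goal, obs): return "finish" if obs == "done" else "work"
def act(action): return "done"
def respond(goal, history): return f"goal '{goal}' finished after {len(history)} steps"
```

In a real agent, `reason` is an LLM call and `act` dispatches to tools; the control flow stays this simple.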
1.2 The Spectrum of AI Systems
Not all AI is the same. Here’s the spectrum from simple to agentic:
| Type | Description | Example | Autonomy |
|---|---|---|---|
| Chatbot | Responds to single prompts | ChatGPT Q&A | None |
| RAG System | Retrieves context before answering | Document Q&A | Low |
| Tool-Using LLM | Calls functions when needed | Calculator, search | Medium |
| Autonomous Agent | Plans and executes multi-step tasks | Researcher, coder | High |
| Multi-Agent System | Multiple agents collaborating | Research team | Very High |
1.3 The Four Essential Components of an AI Agent

```mermaid
graph TD
    A[AI Agent] --> B[🧠 Brain<br/>LLM for Reasoning]
    A --> C[💾 Memory<br/>Context & History]
    A --> D[🛠️ Tools<br/>Capabilities]
    A --> E[🎯 Goals<br/>What to Achieve]
    C --> F[Short-term<br/>Conversation context]
    C --> G[Long-term<br/>Vector store memory]
    D --> H[Search the web]
    D --> I[Write & run code]
    D --> J[Read/write files]
    D --> K[Call APIs]
```
1.3.1 The Brain (LLM)
The LLM serves as the reasoning core. It reads the current situation, decides what action to take next, and generates outputs. Modern agents use Chain-of-Thought reasoning — the LLM literally “thinks out loud” before acting.
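At the prompt level, Chain-of-Thought simply means asking the model to write its reasoning before committing to an action. A minimal sketch; the prompt wording is illustrative, not taken from any particular framework:

```python
# Illustrative Chain-of-Thought prompt: the model is asked to reason
# step by step ("Thought:") before naming its next action ("Action:").
COT_PROMPT = """You are an agent working toward a goal.
Before acting, think step by step:
Thought: <your reasoning about the current situation>
Action: <the single next action to take>

Goal: {goal}
Observation: {observation}
"""

prompt = COT_PROMPT.format(goal="summarise AI trends",
                           observation="no data gathered yet")
```

The filled-in `prompt` is what would be sent to the LLM on each turn of the loop.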
1.3.2 Memory
Without memory, every agent interaction starts from scratch. Agents use:
- Working memory: The current conversation context
- Episodic memory: Past interactions retrieved from vector stores
- Semantic memory: Factual knowledge about the domain
- Procedural memory: How to perform specific tasks
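Two of these layers can be sketched in a few lines: a bounded working memory plus a searchable long-term store. Keyword overlap stands in for the embedding similarity a real vector store would use, and the class and method names are illustrative:

```python
# Toy sketch of short-term (windowed) and long-term (searchable) memory.
# A production agent would back `long_term` with a vector store.

class AgentMemory:
    def __init__(self, window: int = 4):
        self.window = window
        self.working: list[str] = []      # short-term: recent turns only
        self.long_term: list[str] = []    # episodic: everything, searchable

    def remember(self, event: str) -> None:
        self.working.append(event)
        self.working = self.working[-self.window:]   # keep the context small
        self.long_term.append(event)

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Rank past events by word overlap with the query (a stand-in
        # for cosine similarity over embeddings).
        words = set(query.lower().split())
        ranked = sorted(self.long_term,
                        key=lambda e: len(words & set(e.lower().split())),
                        reverse=True)
        return ranked[:k]
```

Working memory caps what goes into the prompt; `recall` pulls older events back in only when they are relevant.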
1.3.3 Tools
Tools extend what an agent can do beyond generating text:
```python
from pathlib import Path

from langchain_core.tools import tool

# A simple tool definition
@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    # Uses Tavily, Serper, or a similar search API
    return tavily_client.search(query)

@tool
def execute_python(code: str) -> str:
    """Execute Python code and return the result."""
    # Safe sandboxed execution
    ...

@tool
def read_file(filepath: str) -> str:
    """Read the contents of a file."""
    return Path(filepath).read_text()
```

1.3.4 Goals
An agent needs a clear objective. This is specified in the system prompt and can range from a single task (“Find the best AI papers from 2025”) to an ongoing role (“You are a senior analyst who monitors market trends”).
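Concretely, the goal is just text placed in the system prompt. A minimal illustration; the exact prompt wording here is an assumption, not a fixed convention:

```python
# Two illustrative goal styles: a one-off task and an ongoing role.
SINGLE_TASK = "Find the best AI papers from 2025 and summarise them."
ONGOING_ROLE = ("You are a senior analyst who monitors market trends. "
                "Report notable changes and cite your sources.")

def system_prompt(goal: str) -> str:
    """Embed the goal in a system prompt (wording is illustrative)."""
    return f"You are an AI agent.\nYour goal: {goal}\nUse your tools; stop when done."
```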
1.4 How Agents “Think”: The ReAct Pattern
The most common reasoning pattern for agents is ReAct (Reasoning + Acting):
```
Thought: I need to find current AI trends. Let me search the web.
Action: search_web("AI trends 2025 2026")
Observation: [Results: LLMs, multimodal AI, agentic systems, ...]
Thought: I have the information. Let me also check the academic literature.
Action: search_web("top AI research papers 2025 site:arxiv.org")
Observation: [Results: ...]
Thought: I now have enough to write a comprehensive summary.
Action: write_final_answer("Here are the key AI trends...")
Final Answer: [Comprehensive summary]
```
This think → act → observe → think loop continues until the agent reaches its goal.
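The loop can be driven by surprisingly little code. This sketch replaces the LLM with a scripted stand-in (`fake_llm`) and stubs `search_web` so it runs offline; both stand-ins are illustrative, not part of any real framework:

```python
import re

# Minimal ReAct loop: parse "Action: tool(args)" from the model's reply,
# execute it, append the observation, and ask the model again.

SCRIPT = [
    "Thought: I need current AI trends.\nAction: search_web(AI trends 2025)",
    "Thought: I have enough.\nFinal Answer: LLMs and agentic systems dominate.",
]

def fake_llm(transcript: str) -> str:
    # Scripted stand-in: reply based on how many observations exist so far.
    return SCRIPT[transcript.count("Observation:")]

def search_web(query: str) -> str:
    return f"[results for: {query}]"           # stubbed tool

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = fake_llm(transcript)           # think
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        match = re.search(r"Action: (\w+)\((.*)\)", reply)
        if match:
            obs = search_web(match.group(2))            # act
            transcript += f"Observation: {obs}\n"       # observe
    return "step limit reached"
```

Swapping `fake_llm` for a real model call turns this into a working agent; frameworks like LangChain package exactly this parse-act-observe plumbing.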
1.5 Real-World AI Agent Applications
| Domain | Application | What the Agent Does |
|---|---|---|
| Business | Market Research Agent | Searches web, reads reports, synthesises findings |
| Finance | Portfolio Analysis | Queries databases, calculates metrics, flags anomalies |
| Legal | Contract Review | Reads documents, flags risks, summarises clauses |
| HR | Resume Screening | Parses CVs, scores against criteria, ranks candidates |
| Dev | Code Review Agent | Reads code, checks standards, suggests improvements |
| Customer Service | Support Agent | Checks order status, processes refunds, escalates issues |
1.6 The Limits and Risks of Agents
| Risk | Description | Mitigation |
|---|---|---|
| Hallucination | Agent acts on incorrect information | Use RAG + verification steps |
| Runaway costs | Too many LLM calls | Set token/step limits |
| Unintended actions | Agent does the wrong thing | Human-in-the-loop approval |
| Prompt injection | Malicious input hijacks agent | Input sanitisation |
| Infinite loops | Agent gets stuck | Max iteration limits |
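The last two mitigations, token limits and max iterations, are straightforward to enforce in code. A minimal sketch; the class and method names are illustrative:

```python
# Illustrative guardrails: hard caps on loop iterations and token spend.

class BudgetExceeded(Exception):
    pass

class Guardrails:
    def __init__(self, max_steps: int = 10, max_tokens: int = 50_000):
        self.max_steps, self.max_tokens = max_steps, max_tokens
        self.steps = self.tokens = 0

    def charge(self, tokens_used: int) -> None:
        """Call once per agent step; raises when a budget is exhausted."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} hit")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} hit")
```

Calling `charge()` at the top of each loop iteration guarantees the agent halts even if its reasoning never converges.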
1.7 Your First Agent (Preview)
Here’s a taste of what you’ll build in later chapters:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.tools import tool
from langchain import hub

@tool
def get_company_info(company: str) -> str:
    """Get basic information about a company."""
    # This would query a real API
    return f"{company}: Tech company, founded 2010, 500 employees"

@tool
def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression safely."""
    # Walk the parsed AST instead of calling eval() on raw input
    import ast, operator
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return str(_eval(ast.parse(expression, mode="eval").body))

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
tools = [get_company_info, calculate]

# Load the ReAct prompt template
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({
    "input": "Look up Anthropic and calculate their founding year plus 10."
})
print(result["output"])
```

1.8 Interactive Simulation: Agent Decision-Making
Chapter Summary
In this chapter you learned:
- An AI agent perceives, reasons, plans, and acts — unlike a simple chatbot
- Agents have four components: Brain (LLM), Memory, Tools, and Goals
- The ReAct pattern (think → act → observe) is the most common reasoning loop
- Agents have real limitations: hallucination, cost, unintended actions
- The field spans from simple RAG to complex multi-agent systems
What’s Next
Chapter 2 dives deep into the engine powering every AI agent: Large Language Models. We’ll demystify transformers, attention mechanisms, and token prediction — the science behind the intelligence.