Chapter 1: What is an Agent?
What is an Agent? in Building Agentic AI Systems.
Learning Objectives
By the end of this chapter, you will be able to:
- Explain what an agent is and how it differs from a single LLM call.
- Apply the agent concept to design reliable, production-grade agent systems.
- Recognize operational trade-offs in tool use, orchestration, safety, and cost.
From prompt-response to goal-directed autonomous systems
What is an Agent?
Definition
An agent is a system that perceives its environment, reasons about a goal, chooses actions from a set of tools, executes them, and uses the results to continue until the goal is met — without human intervention at each step.
Every LLM you have used is stateless by default: you send a prompt, you get a response. The LLM does not remember the exchange tomorrow, cannot search the web, cannot write a file, and cannot retry when something fails. Agents break all four of those constraints.
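To make "stateless by default" concrete, here is a toy sketch. `stateless_model` is a hypothetical stand-in for an LLM call, not a real API: its output depends only on the messages passed in. Memory appears only when the caller keeps a history and resends it, which is the kind of bookkeeping an agent takes over.

```python
def stateless_model(messages):
    """Toy stand-in for an LLM call: it sees only the messages it is given."""
    known_name = None
    for m in messages:
        text = m["content"]
        if m["role"] == "user" and text.startswith("My name is "):
            known_name = text[len("My name is "):].rstrip(".")
    if messages[-1]["content"] == "What is my name?":
        return f"Your name is {known_name}." if known_name else "I don't know your name."
    return "Noted."


# Two independent calls: the second call has no access to the first.
stateless_model([{"role": "user", "content": "My name is Ada."}])
forgot = stateless_model([{"role": "user", "content": "What is my name?"}])

# Caller-managed memory: the full history is resent on every call.
history = []


def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = stateless_model(history)
    history.append({"role": "assistant", "content": reply})
    return reply


chat("My name is Ada.")
remembered = chat("What is my name?")
```

Here `forgot` holds the "I don't know" reply, while `remembered` contains the name, because only the second conversation carried its own history.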
Four Defining Properties
Why this matters now
Function calling has been available in GPT-4 since 2023. What changed in 2024–2026 is reliability: reasoning models such as o1, o3, and DeepSeek-R1 lowered the per-step error rate enough that long multi-step tasks, where errors compound from step to step, became practical in production.
The Agent Loop
Every agent, regardless of its sophistication, runs some variation of this loop. Understanding it deeply is the foundation for reasoning about every design decision in later chapters.
High-level task → Read environment → Reason, decide action → Execute tool / output → back to Read environment
The loop runs until the agent emits a FINISH signal or reaches a step/time limit.
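The loop can be sketched without any LLM at all. In the minimal version below, `Step`, `AgentState`, `run_agent`, and the toy policy are illustrative names chosen for this sketch, not part of any library; the `policy` callable stands in for the model's reasoning step.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str    # a tool name, or "FINISH"
    argument: str  # the tool input, or the final answer when action == "FINISH"


@dataclass
class AgentState:
    goal: str
    observations: list = field(default_factory=list)


def run_agent(goal, policy, tools, max_steps=10):
    """Run reason -> act -> observe until FINISH or the step limit."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        step = policy(state)                        # reason: decide the next action
        if step.action == "FINISH":
            return step.argument                    # goal met: return the answer
        result = tools[step.action](step.argument)  # act: execute the chosen tool
        state.observations.append(result)           # observe: record the result
    raise RuntimeError(f"no FINISH signal after {max_steps} steps")


def toy_policy(state):
    """Hypothetical stand-in for an LLM: look something up once, then finish."""
    if not state.observations:
        return Step("lookup", "capital of France")
    return Step("FINISH", state.observations[-1])


toy_tools = {"lookup": lambda q: "Paris" if "France" in q else "unknown"}
answer = run_agent("What is the capital of France?", toy_policy, toy_tools)
```

Raising on the step limit, rather than returning a partial answer, is one possible design choice; it makes a non-converging run impossible to mistake for a finished one.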
What can go wrong in the loop?
| Failure Mode | Typical Cause | Mitigation |
|---|---|---|
| Infinite loop | No convergence signal, LLM repeats same action | Max iterations + state change check |
| Hallucinated tool result | LLM infers result instead of calling the tool | Force tool-use mode; don't allow free-form text as "observations" |
| Context overflow | History grows unbounded across iterations | Summarize or evict old observations (Chapter 9) |
| Irreversible action | Agent deletes or sends without confirmation | Human-in-the-loop gate for write operations (Chapter 16) |
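Two of the mitigations in the table can be sketched in a few lines. The names here (`LoopGuard`, `gated_call`, `WRITE_TOOLS`, and the tool names inside it) are hypothetical, and the thresholds are arbitrary defaults rather than recommendations.

```python
import json

# Hypothetical names of tools that have irreversible side effects.
WRITE_TOOLS = {"delete_file", "send_email"}


class LoopGuard:
    """Stop on a step budget, or when the same call keeps returning the same result."""

    def __init__(self, max_steps=10, max_identical=3):
        self.max_steps = max_steps
        self.max_identical = max_identical
        self.steps = 0
        self.streak = 0
        self.last_signature = None

    def check(self, tool, args, result):
        """Record one iteration; return True if the loop may continue."""
        self.steps += 1
        signature = json.dumps([tool, args, result], sort_keys=True, default=str)
        if signature == self.last_signature:
            self.streak += 1          # no state change since the previous step
        else:
            self.streak = 1
            self.last_signature = signature
        return self.steps < self.max_steps and self.streak < self.max_identical


def gated_call(tool, args, registry, approve):
    """Run a tool, asking a human-supplied `approve` callback before any write."""
    if tool in WRITE_TOOLS and not approve(tool, args):
        return "DENIED: human approval required"
    return registry[tool](**args)
```

A caller would invoke `guard.check(...)` once per iteration and leave the loop when it returns False. In production, `approve` would typically block on a real confirmation step rather than a callback.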
Agent vs Chatbot
The distinction is not about model size or intelligence — it is about who executes side effects.
Chatbot / LLM
- Single-turn input → output
- No persistent memory across sessions
- No tool execution (unless explicitly added)
- Goal is to generate good text
- User drives every step
- Deterministic control flow
Agent
- Multi-step, goal-directed loop
- Persistent memory (episodic, semantic)
- Calls tools, APIs, databases
- Goal is to complete a task
- Agent drives its own steps
- Emergent, conditional control flow
When an agent is the wrong choice
Not every task benefits from an agent loop. If the problem can be solved in a single, well-structured prompt, adding an agent loop only introduces latency, cost, and failure surfaces. Use an agent when the task genuinely requires multiple steps, external data, or decisions based on intermediate results.
Two Foundational Paradigms
Modern agentic architectures fall into two lineages that differ in how they represent knowledge and make decisions. Most production systems blend both.
Symbolic / Classical
- Explicit rules and logic
- Planning via search (A*, STRIPS, HTN)
- Deterministic, auditable
- Brittle under distribution shift
- Dominates safety-critical domains (healthcare, aviation)
Neural / Generative
- LLM-driven reasoning and generation
- Planning via prompt instructions
- Flexible, adaptive, natural language
- Hallucination and non-determinism
- Dominates data-rich, adaptive domains
Neuro-Symbolic (Hybrid)
- LLM for understanding + symbolic planner for execution
- Combines flexibility with verifiability
- Emerging research area (2025–)
- Best of both paradigms
- Future direction for high-stakes agents
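As a rough illustration of the hybrid split, the sketch below pairs a stubbed "neural" step with a small symbolic planner. Everything in it is an assumption made for illustration: `parse_goal` stands in for an LLM that maps free text to goal facts, and the action set is a made-up STRIPS-style domain searched breadth-first.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset


# Hypothetical domain: each action lists the facts it needs and the facts it changes.
ACTIONS = [
    Action("fetch_record", frozenset({"authenticated"}), frozenset({"record_loaded"}), frozenset()),
    Action("authenticate", frozenset(), frozenset({"authenticated"}), frozenset()),
    Action("summarize", frozenset({"record_loaded"}), frozenset({"summary_ready"}), frozenset()),
]


def parse_goal(request):
    """Stand-in for the neural step: an LLM would map free text to goal facts."""
    if "summar" in request.lower():
        return frozenset({"summary_ready"})
    raise ValueError("unrecognized request")


def plan(initial, goal, actions):
    """Breadth-first search over fact sets; returns a shortest plan or None."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:                      # every goal fact holds
            return path
        for action in actions:
            if action.preconditions <= state:  # action is applicable here
                nxt = (state - action.deletes) | action.adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [action.name]))
    return None


steps = plan(set(), parse_goal("Summarize the record"), ACTIONS)
# steps == ["authenticate", "fetch_record", "summarize"]
```

The plan is auditable because each step is justified by explicit preconditions, while the free-text request is handled by the flexible part of the system.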
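# Minimal agent loop using OpenAI function calling
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Schema of the one tool the model is allowed to call
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }
]


def execute_tool(name: str, arguments: str) -> str:
    """Placeholder dispatcher (not part of the OpenAI SDK); plug real tools in here."""
    args = json.loads(arguments)  # the model sends arguments as a JSON string
    if name == "search_web":
        return f"[stub] no search backend configured for: {args['query']}"
    return f"Error: unknown tool {name!r}"


messages = [{"role": "user", "content": "What is the latest GPT-5 benchmark result?"}]

MAX_STEPS = 10  # step limit so the loop cannot run forever

for _ in range(MAX_STEPS):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    choice = response.choices[0]

    # Agent decided to call a tool
    if choice.finish_reason == "tool_calls":
        # The assistant turn that requested the calls must precede the tool results
        messages.append(choice.message)
        for call in choice.message.tool_calls:
            result = execute_tool(call.function.name, call.function.arguments)
            messages.append({"role": "tool", "content": result, "tool_call_id": call.id})

    # Agent decided it has enough information
    elif choice.finish_reason == "stop":
        print(choice.message.content)
        break  # Exit the loop: task complete

    else:
        # For example "length" or "content_filter"
        raise RuntimeError(f"Unexpected finish_reason: {choice.finish_reason}")
else:
    print(f"Stopped after {MAX_STEPS} steps without a final answer.")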
Chapter 1 takeaway
An agent is a loop. The LLM is the brain, tools are the hands, and memory is what makes it coherent across steps. Everything in the remaining 21 chapters is about making that loop reliable, safe, and scalable.
Chapter 1 Quiz
1. Which property distinguishes an agent from a standard LLM call?
2. In the agent loop, what is an "observation"?
3. A neuro-symbolic architecture combines: