Chapter 2: Anatomy of an Agent

Inside Every Agent

The agent loop from Chapter 1 is the high-level view. Zooming in, every agent is built from a specific set of components. Knowing what each component does — and what it does not do — is what separates a working agent from a fragile demo.

Interface

User / Operator Channel: incoming goals, system prompts, constraints

Reasoning

LLM Core: GPT-4o / Claude / Gemini / local model

Scratchpad / CoT: internal reasoning trace

State

Working Memory: in-context conversation state

Long-term Memory: vector DB, knowledge graph, session store

Execution

Tool Registry: available tools, schemas, permissions

Tool Executor: dispatches calls, validates outputs

Planner: task decomposition strategy

Component Deep-Dive

Orchestrator vs Executor

This distinction is critical in multi-agent systems (Chapter 10). The orchestrator decides what to do next; the executor carries out one specific action. A single agent often acts as both, but in scaled systems they are separate processes with separate permissions.
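The split can be sketched in a few lines. This is a minimal illustration, not a reference design: the names `Executor`, `Orchestrator`, and `allowed_actions` are hypothetical, and the two roles run in one process here, whereas a scaled system would place them in separate processes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Executor:
    """Carries out one specific action; holds only the permissions it needs."""
    name: str
    allowed_actions: dict[str, Callable[[str], str]]

    def execute(self, action: str, payload: str) -> str:
        if action not in self.allowed_actions:
            raise PermissionError(f"{self.name} may not perform '{action}'")
        return self.allowed_actions[action](payload)


@dataclass
class Orchestrator:
    """Decides what to do next and delegates; performs no actions itself."""
    executors: dict[str, Executor] = field(default_factory=dict)

    def dispatch(self, executor_name: str, action: str, payload: str) -> str:
        return self.executors[executor_name].execute(action, payload)


# A read-only executor: it has no "write" action to misuse.
reader = Executor("reader", {"read": lambda path: f"contents of {path}"})
orchestrator = Orchestrator({"reader": reader})
result = orchestrator.dispatch("reader", "read", "notes.txt")
```

Because permissions live on the executor, a faulty orchestrator decision such as asking `reader` to write fails with `PermissionError` rather than causing a side effect.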

1 — LLM Core (Backbone)

The LLM is the only component that cannot be replaced with deterministic code. Its job is to map (context + instructions + tool list) → (next action decision). Model choice matters:

Model Family	Strength in Agent Context	Weakness
GPT-4o / GPT-4.1	Strong function calling, broad knowledge	Cost at scale
o3 / o4-mini	Long-horizon reasoning, complex multi-step	Higher latency per step
Claude 3.5 / 4	Instruction following, long context	Different tool calling syntax
Gemini 2.0 / 2.5	Multimodal, natively long context	Newer, less agent tooling maturity
Qwen / DeepSeek	Cost-effective, open weights for fine-tuning	Smaller ecosystem

2 — Memory Subsystems (Preview)

Memory is covered in depth in Chapter 7. Here, the key insight is that memory is not the context window — it is a set of retrieval mechanisms that surface relevant information into the context window on demand.

1. Working Memory: the active context window; cheapest, fastest, smallest (32K–2M tokens)
2. Episodic Memory: timestamped logs of past interactions ("what happened before?")
3. Semantic Memory: vector embeddings of knowledge ("what do I know about X?")
4. Procedural Memory: stored tool schemas, prompt templates, learned skills
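To make "retrieval into the context window" concrete, here is a toy sketch. It rests on a loud assumption: keyword overlap stands in for embedding similarity, and `ToyMemory` is a hypothetical class, not the backend Chapter 7 describes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Long-term store; keyword overlap stands in for embedding similarity."""
    entries: list[str] = field(default_factory=list)

    def store(self, text: str) -> None:
        self.entries.append(text)

    def search(self, query: str, top_k: int = 2) -> list[str]:
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(entry.lower().split())), entry)
            for entry in self.entries
        ]
        # Keep only entries sharing at least one word, best match first.
        hits = sorted((s for s in scored if s[0] > 0),
                      key=lambda s: s[0], reverse=True)
        return [entry for _, entry in hits[:top_k]]


memory = ToyMemory()
memory.store("user prefers metric units")
memory.store("deploy script lives in the ops repo")
memory.store("user timezone is UTC+2")

# Only matching entries are surfaced into working memory (the prompt).
context = memory.search("preferred units for user")
```

The store holds three entries, but only the two that share words with the query reach the prompt; the unrelated deploy note stays out of the context window.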

3 — Tool Interface

Tools are functions the agent can invoke. Each tool has a JSON schema describing its name, description, and parameter types. The LLM reads these schemas at inference time and generates a structured call when it decides to use one. Chapter 5 covers this fully.
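As a preview, a schema and the structured call that answers it might look like the following. The `get_weather` tool and its stub body are hypothetical; the schema layout follows the OpenAI-style function-calling format that the skeleton later in this chapter also emits.

```python
import json

# Hypothetical tool schema in the OpenAI-style function-calling format.
get_weather_schema = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def get_weather(city: str, unit: str = "celsius") -> str:
    # Stub: a real tool would call a weather API here.
    return f"Weather for {city} in {unit}: (stub)"


# What a model's structured call might look like: a tool name plus
# JSON-encoded arguments (illustrative values, not real model output).
model_call = {"name": "get_weather", "arguments": '{"city": "Oslo"}'}

args = json.loads(model_call["arguments"])
required = get_weather_schema["function"]["parameters"]["required"]
missing = [p for p in required if p not in args]
output = get_weather(**args) if not missing else f"Error: missing {missing}"
```

Checking required parameters before dispatch means a malformed call is returned to the model as an error string it can correct, instead of raising inside the tool.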

4 — Planner

The planner translates a goal into a sequence of steps. In simple agents, the LLM itself is the planner: planning happens implicitly in its chain-of-thought (CoT). In advanced agents, the planner is a separate system prompt or even a separate LLM call dedicated to decomposition before execution begins.
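A dedicated planner call could be sketched as follows. The prompt wording, the `plan` function, and the `fake_llm` stand-in are all assumptions for illustration; a real agent would pass in a function that calls its model.

```python
import re
from typing import Callable

PLANNER_PROMPT = (
    "Decompose the goal into short numbered steps, one per line.\n"
    "Goal: {goal}"
)


def plan(goal: str, llm: Callable[[str], str]) -> list[str]:
    """Dedicated decomposition call that runs before any tool is executed."""
    raw = llm(PLANNER_PROMPT.format(goal=goal))
    steps = []
    for line in raw.splitlines():
        # Accept "1. step" or "1) step"; ignore any other lines.
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns a canned plan.
    return "1. Search for recent papers\n2. Read the top three\n3. Summarize findings"


steps = plan("Write a literature summary", fake_llm)
```

Keeping the planner behind a plain `Callable[[str], str]` makes it easy to swap in a different model for decomposition than the one that executes the steps.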

5 — Action Space

Everything the agent is allowed to do. This is a designed constraint, not an emergent property. The action space should be the minimum set of capabilities needed to accomplish the goal — broader action spaces mean larger attack surfaces.

Read-only actions

Web search
Database query
File read
API GET request
Memory retrieval

Write / side-effect actions

Send email / Slack
Create / modify files
API POST / PATCH / DELETE
Execute code
Spawn sub-agents

Principle of Least Privilege

Never give an agent a write action it does not need for the current task. A research agent that can only search and read is far safer than one that also has email access — even if it never uses email.
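One way to enforce this is to build the tool registry per task, so that write tools are absent unless named explicitly. The sketch below uses hypothetical names (`ToolSpec`, `tools_for_task`, `allow_writes`) and stub tool bodies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolSpec:
    name: str
    fn: Callable[..., str]
    side_effects: bool  # True for write actions such as email or file changes


ALL_TOOLS = [
    ToolSpec("web_search", lambda q: f"results for {q}", side_effects=False),
    ToolSpec("file_read", lambda p: f"contents of {p}", side_effects=False),
    ToolSpec("send_email", lambda to, body: f"sent to {to}", side_effects=True),
]


def tools_for_task(allow_writes: frozenset[str] = frozenset()) -> dict[str, ToolSpec]:
    """Expose read-only tools by default; write tools only when named explicitly."""
    return {
        t.name: t
        for t in ALL_TOOLS
        if not t.side_effects or t.name in allow_writes
    }


# A research agent gets search and read only.
research_tools = tools_for_task()
# A notifier agent is granted the one write action it needs.
notifier_tools = tools_for_task(allow_writes=frozenset({"send_email"}))
```

The research agent never sees the `send_email` schema, so the model cannot choose it even if a prompt injection asks for it.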

Selecting the Right Backbone

The backbone model determines the ceiling on your agent's reasoning quality. But it is not the only factor — prompt structure, tool schemas, and memory retrieval quality often matter more than model choice for a given task.

System Prompt (role, goals, constraints)
+ Tool Schemas (available actions)
+ Memory Retrieval (relevant past context)
+ Conversation (turns + observations)
→ LLM Decision (next action)

The LLM sees all of this assembled into a single prompt. Context assembly quality is as important as model quality. If the assembled context is noisy, contradictory, or overflowing, even the best model will produce poor decisions.
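A minimal sketch of context assembly under a budget is shown here. It makes two simplifying assumptions: tokens are approximated by whitespace-separated words, and overflow is handled by dropping the oldest conversation turns first. The names `rough_tokens` and `assemble_context` are hypothetical, and tool schemas are left out because most APIs take them as a separate parameter.

```python
def rough_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(
    system_prompt: str,
    memory_snippets: list[str],
    conversation: list[dict[str, str]],
    budget: int,
) -> list[dict[str, str]]:
    """Build the message list, dropping the oldest turns when over budget."""
    messages = [{"role": "system", "content": system_prompt}]
    if memory_snippets:
        messages.append({
            "role": "system",
            "content": "Relevant past context:\n" + "\n".join(memory_snippets),
        })
    used = sum(rough_tokens(m["content"]) for m in messages)

    kept: list[dict[str, str]] = []
    for turn in reversed(conversation):      # walk newest first
        cost = rough_tokens(turn["content"])
        if used + cost > budget:
            break                            # everything older is dropped
        kept.append(turn)
        used += cost
    return messages + list(reversed(kept))   # restore chronological order


history = [
    {"role": "user", "content": "one two three four five"},
    {"role": "assistant", "content": "six seven"},
    {"role": "user", "content": "eight nine ten"},
]
context = assemble_context("You are helpful", ["likes tea"], history, budget=14)
```

With a budget of 14 the two system messages and the two newest turns fit, and the oldest user turn is dropped. A production agent would use the model's real tokenizer and would likely summarize dropped turns instead of discarding them.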

Agent Class Skeleton

Here is a minimal but well-structured Python class that expresses the component map above. Chapters 5–12 will expand each method significantly.

python — agent_core.py

from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]
    fn: Callable[..., str]

    def to_schema(self) -> dict[str, Any]:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


class AgentCore:
    """
    Single-agent loop with a pluggable LLM client, tool registry,
    and swappable memory backend.
    """

    def __init__(
        self,
        llm_client: Any,
        model: str,
        system_prompt: str,
        tools: list[Tool] | None = None,
        memory_backend: Any | None = None,
        max_iterations: int = 25,
    ) -> None:
        self.llm = llm_client
        self.model = model
        self.system_prompt = system_prompt
        self.tools: dict[str, Tool] = {t.name: t for t in (tools or [])}
        self.memory = memory_backend
        self.max_iterations = max_iterations
        self._messages: list[dict[str, Any]] = [
            {"role": "system", "content": system_prompt}
        ]

    # ── Public API ──────────────────────────────────────────────────────────

    def run(self, goal: str) -> str:
        self._messages.append({"role": "user", "content": goal})
        self._inject_memory(goal)          # retrieve relevant past context

        for _ in range(self.max_iterations):
            response = self._call_llm()
            choice = response.choices[0]

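            if choice.finish_reason == "tool_calls":
                # Record the assistant turn that requested the calls; the
                # tool results below must follow it in the message history.
                self._messages.append({
                    "role": "assistant",
                    "content": choice.message.content,
                    "tool_calls": [
                        {"id": c.id, "type": "function",
                         "function": {"name": c.function.name,
                                      "arguments": c.function.arguments}}
                        for c in choice.message.tool_calls
                    ],
                })
                for call in choice.message.tool_calls:
                    result = self._execute_tool(call)
                    self._messages.append({
                        "role": "tool",
                        "content": result,
                        "tool_call_id": call.id,
                    })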

            elif choice.finish_reason == "stop":
                answer = choice.message.content
                self._persist_memory(goal, answer)   # store to long-term memory
                return answer

        raise RuntimeError("Max iterations reached without FINISH signal")

    # ── Private helpers ──────────────────────────────────────────────────────

    def _call_llm(self) -> Any:
        return self.llm.chat.completions.create(
            model=self.model,
            messages=self._messages,
            tools=[t.to_schema() for t in self.tools.values()] or None,
        )

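    def _execute_tool(self, call: Any) -> str:
        import json

        tool = self.tools.get(call.function.name)
        if tool is None:
            return f"Error: unknown tool '{call.function.name}'"
        try:
            # Arguments arrive as a JSON string and may be empty or malformed.
            args = json.loads(call.function.arguments or "{}")
            return str(tool.fn(**args))
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop.
            return f"Error: tool '{tool.name}' failed: {exc}"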

    def _inject_memory(self, query: str) -> None:
        if self.memory:
            relevant = self.memory.search(query, top_k=5)
            if relevant:
                self._messages.append({
                    "role": "system",
                    "content": "Relevant past context:\n" + "\n".join(relevant),
                })

    def _persist_memory(self, query: str, answer: str) -> None:
        if self.memory:
            self.memory.store(query=query, answer=answer)
