Chapter 3: Agent Taxonomy
Agent Taxonomy in Building Agentic AI Systems.
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the eight agent architecture types and how they differ.
- Apply the taxonomy and its decision matrix to design reliable, production-grade agent systems.
- Recognize operational trade-offs in tool use, orchestration, safety, and cost.
Eight architecture types — and a decision matrix for choosing between them
Picking the Right Architecture
Most engineering mistakes in agentic AI come from applying a complex architecture to a problem that did not need it — or a simple one to a task that demanded more. This chapter gives you the vocabulary and the criteria to match architecture to problem.
The eight types below are not mutually exclusive. A production system often combines several of them: for example, a planner that coordinates a team of tool-using workers, each of which uses reflection to verify its output.
- Reactive: Direct stimulus → action, no planning
- Tool-Using: Single-step function calls
- ReAct: Reason + Act loop
- Planning: Decompose then execute
- Reflection: Self-critique + revise
- Critic: Separate judge model
- Computer-Use: Vision + mouse/keyboard
- Code-Execution: Write & run code
Architecture Types in Depth
1. Reactive Agent
No memory, no planning. Maps the current observation directly to an action using a lookup or a single LLM call. Fast and cheap. Use for narrow, well-defined, stateless tasks — e.g., classifying an incoming support ticket.
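A minimal sketch of the lookup variant, assuming a keyword table stands in for the classifier. The `ROUTES` table and queue names are hypothetical and chosen only for illustration; a single LLM call could replace the loop without changing the shape of the function.

```python
# Hypothetical routing table (keyword -> support queue); not from the chapter.
ROUTES = {
    "refund": "billing",
    "invoice": "billing",
    "password": "account",
    "crash": "technical",
}


def reactive_agent(ticket_text: str) -> str:
    """Map the current observation straight to an action; no state is kept."""
    text = ticket_text.lower()
    for keyword, queue in ROUTES.items():
        if keyword in text:
            return queue
    return "general"  # fallback when no keyword matches


print(reactive_agent("I need a refund for my last order"))
```

Because the function keeps no state between calls, two identical tickets always get the same answer, which is what makes this architecture cheap to test and to scale.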
2. Tool-Using Agent (Single-Step)
Adds a tool call layer. The LLM decides which tool to invoke based on the user request, executes once, and returns the result. Still single-turn from the user's perspective. Example: a flight search that calls one API.
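The single-step pattern can be sketched as follows, under the assumption that `choose_tool` stands in for the LLM's tool-selection call. `search_flights` returns canned data rather than calling a real API, and all names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def search_flights(origin: str, destination: str) -> List[str]:
    """Stand-in for a real flight API; returns canned data."""
    return [f"{origin}->{destination} 09:00", f"{origin}->{destination} 17:30"]


# Registry of callable tools, keyed by the name the model would emit.
TOOLS: Dict[str, Callable[..., List[str]]] = {"search_flights": search_flights}


def choose_tool(request: str) -> Tuple[str, Dict[str, str]]:
    """Placeholder for the LLM's decision: which tool, with which arguments."""
    words = request.split()
    origin = words[words.index("from") + 1]
    destination = words[words.index("to") + 1]
    return "search_flights", {"origin": origin, "destination": destination}


def tool_using_agent(request: str) -> List[str]:
    name, args = choose_tool(request)  # decide once
    return TOOLS[name](**args)         # execute once and return the result


print(tool_using_agent("flights from SFO to JFK"))
```

There is no loop: if the one call fails or returns nothing useful, this agent has no way to try something else.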
3. ReAct Agent (Reason + Act)
Interleaves reasoning (Thought) with tool calls (Action) and reads back results (Observation). Loops until the goal is met. This is the most common architecture for general-purpose agents. Introduced by Yao et al. (2022).
Thought (reason about next step) → Action (call tool) → Observation (tool result)
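A toy version of the Thought, Action, Observation loop, assuming a scripted policy in place of the LLM. The `calculator` tool, the `scripted_policy` function, and the step budget are illustrative assumptions, not part of the ReAct paper.

```python
def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a op b' where op is + or *."""
    a, op, b = expression.split()
    return str(int(a) + int(b) if op == "+" else int(a) * int(b))


TOOLS = {"calculator": calculator}


def scripted_policy(goal, history):
    """Stand-in for the LLM: returns a thought plus an action or a final answer."""
    if not history:
        return {"thought": "Multiply first.", "tool": "calculator", "input": "6 * 7"}
    if len(history) == 1:
        last_observation = history[-1][2]
        return {"thought": "Now add 8.", "tool": "calculator",
                "input": f"{last_observation} + 8"}
    return {"thought": "Done.", "final": history[-1][2]}


def react_agent(goal, policy=scripted_policy, max_steps=5):
    history = []  # (thought, action input, observation) triples
    for _ in range(max_steps):
        step = policy(goal, history)                        # Thought
        if "final" in step:
            return step["final"], history
        observation = TOOLS[step["tool"]](step["input"])    # Action
        history.append((step["thought"], step["input"], observation))  # Observation
    raise RuntimeError("step budget exhausted")


answer, trace = react_agent("compute 6 * 7 + 8")
print(answer, len(trace))
```

The `max_steps` cap is a design choice worth keeping in any real loop, since nothing else guarantees the model will ever emit a final answer.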
4. Planning Agent
Before any tool call, generates a full plan. The plan is then executed step by step by an executor. Better for long-horizon tasks where upfront decomposition reduces errors. The planner can also replan if an execution step fails unexpectedly.
Planner (decompose goal) → Plan (ordered subtasks) → Executor (run each step)
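A sketch of the plan-then-execute flow with one replanning path. The `plan`, `replan`, and `execute` functions are hypothetical stand-ins for LLM and tool calls; the executor is rigged to fail once so the replanning branch is exercised.

```python
from typing import List


def plan(goal: str) -> List[str]:
    """Stand-in planner: a real system would ask an LLM to decompose the goal."""
    return ["fetch data", "clean data", "summarize data"]


def replan(goal: str, failed_step: str) -> List[str]:
    """Stand-in replanner: substitutes a fallback for the failed step."""
    return [f"retry {failed_step} from cache"]


def execute(step: str) -> str:
    """Stand-in executor; 'clean data' always fails to trigger replanning."""
    if step == "clean data":
        raise RuntimeError("schema mismatch")
    return f"done: {step}"


def planning_agent(goal: str, max_replans: int = 2) -> List[str]:
    queue = plan(goal)  # the full plan exists before any step runs
    results, replans = [], 0
    while queue:
        step = queue.pop(0)
        try:
            results.append(execute(step))
        except RuntimeError:
            if replans >= max_replans:
                raise
            replans += 1
            queue = replan(goal, step) + queue  # patch the plan, keep the rest
    return results


print(planning_agent("weekly report"))
```

Here the replanner patches only the failed step. Regenerating the entire remaining plan is an equally valid choice and may suit tasks where one failure invalidates later steps.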
5. Reflection Agent
After producing an output, the same LLM (or a second call) critiques it and decides whether to revise. Key insight: a model is often better at evaluating an output than at generating it correctly on the first attempt. Self-consistency and Best-of-N sampling are related, simpler techniques: they sample several candidates and select one instead of critiquing and revising a single draft.
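The generate, critique, revise cycle might look like the sketch below. Both `generate` and `critique` are hypothetical stand-ins for calls to the same model, and the example fact is chosen only so the first draft has something to fix.

```python
from typing import Optional


def generate(task: str, feedback: Optional[str] = None) -> str:
    """Stand-in for the draft/revise call to the LLM."""
    if feedback is None:
        return "The capital of Australia is Sydney."
    return "The capital of Australia is Canberra."


def critique(task: str, draft: str) -> Optional[str]:
    """Stand-in for the self-critique call; returns None when satisfied."""
    if "Sydney" in draft:
        return "Sydney is the largest city, not the capital."
    return None


def reflection_agent(task: str, max_rounds: int = 3) -> str:
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:  # the critique found nothing to fix
            break
        draft = generate(task, feedback)
    return draft


print(reflection_agent("What is the capital of Australia?"))
```

The round limit matters in practice: each extra round is at least two more model calls, and a critique step that never returns "satisfied" would otherwise loop indefinitely.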
6. Critic Agent
A separate model (judge) evaluates the primary agent's output and provides structured feedback. The primary agent uses that feedback to revise. This costs more than self-reflection but often yields higher quality, because the critic can be prompted to focus on specific failure modes.
Primary agent output → Critic (score + feedback) → Primary agent revises
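A sketch of the judge pattern with structured feedback, assuming the critic targets one failure mode (a missing citation). `Verdict`, `primary`, `critic`, and the 0.8 threshold are all hypothetical; in a real system `primary` and `critic` would call two different models or prompts.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    score: float   # 0.0 (reject) to 1.0 (accept)
    feedback: str


def primary(task: str, feedback: str = "") -> str:
    """Stand-in for the primary agent's model."""
    summary = "Revenue rose 12% in Q3."
    if "source" in feedback:
        summary += " (Source: Q3 report, p. 4)"
    return summary


def critic(output: str) -> Verdict:
    """Stand-in for a separate judge focused on one failure mode."""
    if "Source:" not in output:
        return Verdict(0.4, "Add a source for the figure.")
    return Verdict(0.9, "OK")


def run_with_critic(task: str, threshold: float = 0.8, max_rounds: int = 3):
    output = primary(task)
    verdict = critic(output)
    for _ in range(max_rounds):
        if verdict.score >= threshold:
            break
        output = primary(task, verdict.feedback)  # revise using the feedback
        verdict = critic(output)
    return output, verdict


print(run_with_critic("Summarize Q3 results"))
```

Returning the final `Verdict` alongside the output lets the caller decide what to do when the threshold is never reached.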
7. Computer-Use Agent
Takes screenshots as observations, decides on keyboard/mouse actions, and interacts with any application without a dedicated API. Required when no programmatic interface exists. Anthropic's "computer use" capability for Claude is a well-known example. Slow, costly, and harder to debug, so use it only when necessary.
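The observe, decide, act cycle can be illustrated without a real screen. In this sketch `FakeDesktop` is a hypothetical stand-in that returns a dictionary where a real agent would receive pixels, and `decide` stands in for the vision model; none of this reflects any vendor's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeDesktop:
    """Hypothetical environment: one text field and one submit button."""
    typed: str = ""
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture an image here.
        return {"field_text": self.typed, "submitted": self.submitted}

    def act(self, action: dict) -> None:
        self.log.append(action["type"])
        if action["type"] == "type":
            self.typed += action["text"]
        elif action["type"] == "click" and action["target"] == "submit":
            self.submitted = True


def decide(goal_text: str, screen: dict) -> Optional[dict]:
    """Stand-in for the vision model: choose the next keyboard/mouse action."""
    if screen["submitted"]:
        return None  # goal reached
    if screen["field_text"] != goal_text:
        return {"type": "type", "text": goal_text}
    return {"type": "click", "target": "submit"}


def computer_use_agent(desktop: FakeDesktop, goal_text: str, max_steps: int = 10) -> bool:
    for _ in range(max_steps):
        action = decide(goal_text, desktop.screenshot())  # observe, then decide
        if action is None:
            return True
        desktop.act(action)                               # act
    return False


desktop = FakeDesktop()
print(computer_use_agent(desktop, "hello"), desktop.log)
```

Every iteration needs a fresh observation before the next action, which is a large part of why this architecture is slow and costly compared with a direct API call.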
8. Code-Execution Agent
Writes Python (or another language) to solve problems, runs it in a sandboxed interpreter, reads stdout/stderr as observations, and iterates until the output is correct. Extremely powerful for data analysis, mathematical computation, and automated testing. Always run code in an isolated environment (e.g., E2B, Modal, Docker).
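A sketch of the write, run, read, retry loop. It uses a child Python process purely to keep the example self-contained; that is an assumption for illustration and is not a sandbox, so real deployments should substitute an isolated environment as recommended above. `write_code` is a hypothetical stand-in for the LLM and deliberately produces a bug on its first attempt.

```python
import subprocess
import sys
from typing import Tuple


def run_python(source: str, timeout: float = 10.0) -> Tuple[int, str, str]:
    """Run source in a child interpreter. NOT a sandbox; illustration only."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def write_code(task: str, last_error: str = "") -> str:
    """Stand-in for the LLM: first attempt is buggy, the retry fixes it."""
    if "NameError" in last_error:
        return "total = sum(range(1, 11))\nprint(total)"
    return "print(total)"  # bug: 'total' is never defined


def code_execution_agent(task: str, max_attempts: int = 3) -> str:
    error = ""
    for _ in range(max_attempts):
        code, out, err = run_python(write_code(task, error))
        if code == 0:
            return out.strip()  # stdout is the observation on success
        error = err             # stderr feeds the next attempt
    raise RuntimeError(f"no working program after {max_attempts} attempts")


print(code_execution_agent("Sum the integers 1 through 10"))
```

A zero exit code only shows that the program ran; checking the output against a test or an expected value is what tells the agent the result is actually correct.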
Decision Matrix
Use this table to select a starting architecture for a new task. Start with the simplest option whose row satisfies the task's requirements.
| Architecture | Task Horizon | Needs Memory? | External Tools? | Verification? | Cost/Step |
|---|---|---|---|---|---|
| Reactive | Single turn | No | No | No | Lowest |
| Tool-Using | Single turn | No | Yes (1 call) | No | Low |
| ReAct | Multi-step | Short-term | Yes (multi) | Implicit | Medium |
| Planning | Long-horizon | Yes | Yes | At plan level | Medium |
| Reflection | Any | Optional | Optional | Self-critique | Medium+ |
| Critic | High-quality output | Optional | Optional | Separate judge | High |
| Computer-Use | Multi-step | Yes | Any app (visual) | Screenshot diff | Very High |
| Code-Execution | Multi-step | Session state | Via code | stdout/test | High |
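One way to read the table is as a rule cascade. The sketch below encodes a possible ordering; the `Task` fields and the precedence of the checks are assumptions made for illustration, not rules stated in the chapter, and real selection will involve more judgment than six booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    """Hypothetical task profile; field names are illustrative."""
    multi_step: bool = False
    needs_tools: bool = False
    long_horizon: bool = False
    no_api_available: bool = False
    computation_heavy: bool = False
    quality_critical: bool = False


def pick_architecture(task: Task) -> List[str]:
    """Return a starting architecture plus an optional verification add-on."""
    if task.no_api_available:
        base = "Computer-Use"      # only when no programmatic interface exists
    elif task.computation_heavy:
        base = "Code-Execution"
    elif task.long_horizon:
        base = "Planning"
    elif task.multi_step:
        base = "ReAct"
    elif task.needs_tools:
        base = "Tool-Using"
    else:
        base = "Reactive"          # simplest option that fits
    return [base, "Critic"] if task.quality_critical else [base]


print(pick_architecture(Task(multi_step=True, needs_tools=True, quality_critical=True)))
```

Returning a list reflects the point that the types combine: a base architecture handles the task, and a verification layer is added only when quality demands it.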
Practical guidance
Start with ReAct for most general-purpose agent tasks (roughly 80%, as a rule of thumb). Add a separate Planner when you observe the agent making poor mid-task decisions. Add a Critic when output quality, not task completion, is the bottleneck. Use Code-Execution when the task is computation-heavy. Treat Computer-Use as the last resort.
Chapter 3 Quiz
1. A ReAct agent interleaves Thought, Action, and Observation. What is the purpose of the "Thought" step?
2. When should you prefer a Critic Agent over self-reflection?
3. A Computer-Use Agent's "observation" is: