Learning Objectives

By the end of this chapter, you will be able to:

Explain the agentic AI concept behind Multi-Agent Systems.
Apply Multi-Agent Systems to design reliable, production-grade agent systems.
Recognize operational trade-offs in tool use, orchestration, safety, and cost.

Section 3 — Multi-Agent Systems & Orchestration

Chapter 10: Multi-Agent Systems

Why go multi-agent, communication patterns, trust models, and handoffs

From One Agent to Many

A single agent with all capabilities in one context is the simplest design — and the right starting point. Multi-agent systems are justified only when single-agent limits actually matter for the task at hand. This chapter explains those limits and when crossing them is worth the added complexity.

Single-Agent Limits

Context window caps — can't hold everything for complex, multi-document tasks
Specialization trade-offs — a general agent is mediocre at all domains
Sequential execution — one agent does one thing at a time
Single point of failure — one model, one reasoning path
No adversarial verification — no second opinion

Multi-Agent Gains

Parallelism — multiple agents work simultaneously on different subtasks
Specialization — each agent optimized for its domain
Scale — tasks that exceed one context window are split across agents
Verification — a second agent checks the first's work
Resilience — one agent failing doesn't stop the whole pipeline

Multi-agent complexity cost

Each agent boundary introduces: (1) latency (inter-agent communication), (2) context loss (agents only know what they're told), (3) debugging difficulty (tracing across agents is harder than tracing within one), and (4) coordination cost. Over 40% of agentic AI projects fail by end of 2027 — and most failures come from premature over-engineering, not weak models.

When Multi-Agent is Worth It

Signal	Single Agent	Multi-Agent
Task fits in 1 context window	✓	—
Task requires domain expertise (legal + code + finance)	Mediocre on all	✓ Specialists
Subtasks are independent and parallelizable	Sequential, slow	✓ Parallel
Output quality needs adversarial verification	Self-critique only	✓ Separate critic
Pipeline has clear handoff points (research → write → edit)	Role confusion	✓ Staged roles
Simple, single-turn task	✓	Overkill

Multi-agent system: Orchestrator + Specialists

Orchestrator Agent

↓ delegates tasks

Research Agent

Code Agent

Writer Agent

↓ use tools

search()

execute_code()

write_file()

read_doc()

Agent Communication Patterns

Shared State

All agents read from and write to a common state store (dictionary, database). Simple but requires careful concurrency control.

Use when: agents work on different parts of a shared document

Message Passing

Agents send structured messages to each other via a message queue. Decoupled, scalable, auditable.

Use when: agents run in separate processes or asynchronously

Event-Driven

Agents subscribe to event topics; an event from one agent triggers another. Loosely coupled.

Use when: pipelines are reactive (trigger on completion, not scheduled)

Direct Function Call (Agents-as-Tools)

The orchestrator calls a sub-agent exactly like a tool — synchronous, in-process. Simplest integration pattern.

Use when: sub-agent work is synchronous and short-lived

Trust Between Agents & Handoffs

The trust problem

When Agent A sends a message to Agent B, should Agent B treat that message as trusted (like a developer instruction) or untrusted (like user input)? This is not a hypothetical — a compromised sub-agent could inject malicious instructions into its output that the orchestrator passes to other agents. The general rule: output from any agent that processed user input should be treated as untrusted until validated.

Handoff Protocol

A handoff transfers task context from one agent to another. A good handoff message includes:

1
Task summaryWhat has been done so far and why
2
Current stateAll relevant data the receiving agent needs (don't rely on it having access to the full history)
3
Remaining objectiveExactly what the receiving agent needs to accomplish
4
Constraints and contextUser preferences, deadlines, quality standards that must be preserved

python — agents-as-tools pattern

from openai import OpenAI

client = OpenAI()

def research_agent(topic: str) -> str:
    """Specialist: searches and synthesizes information about a topic."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a research specialist. Search the web and synthesize findings."},
            {"role": "user", "content": f"Research this topic thoroughly: {topic}"},
        ],
        tools=SEARCH_TOOLS,
    )
    # ... execute tool calls, return final answer
    return final_answer


# Register the specialist as a tool for the orchestrator
ORCHESTRATOR_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "research_agent",
            "description": "Delegate deep research on a specific topic to a specialist agent. Returns a comprehensive summary.",
            "parameters": {
                "type": "object",
                "properties": {"topic": {"type": "string", "description": "Topic to research"}},
                "required": ["topic"],
            },
        },
    }
]

# The orchestrator uses research_agent exactly like any other tool
def orchestrator(user_goal: str) -> str:
    messages = [
        {"role": "system", "content": "You are an orchestrator. Delegate research tasks to the research_agent tool."},
        {"role": "user", "content": user_goal},
    ]
    # ... standard agent loop with ORCHESTRATOR_TOOLS