Course: Building Agentic AI Systems | Chapter: 14 | Difficulty: Advanced | Estimated time: 600 min

Chapter 14: Advanced Multi-Agent Patterns

Advanced Multi-Agent Patterns in Building Agentic AI Systems.


Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the agentic AI concept behind Advanced Multi-Agent Patterns.
  • Apply Advanced Multi-Agent Patterns to design reliable, production-grade agent systems.
  • Recognize operational trade-offs in tool use, orchestration, safety, and cost.

Role-aware memory, meta-agents, adversarial pairs, and consensus mechanisms

Beyond Basic Coordination

Chapters 10–13 covered how to build and run multi-agent systems. This chapter covers the patterns that appear when you push multi-agent systems toward higher autonomy, quality, or scale — patterns that require deliberate design rather than falling out naturally from the frameworks.

Role-Aware Memory

Different agents need different memories — the orchestrator needs task-level context; workers need execution-level detail.

Meta-Agents

Agents that spawn, configure, and manage other agents dynamically based on task requirements.

Adversarial Pairs

A red-team agent tries to break or manipulate outputs; a blue-team agent validates and defends.

Consensus Mechanisms

Multiple agents independently produce answers; majority vote or confidence aggregation selects the final output.

Role-Aware Memory

Naive multi-agent systems give all agents access to the same memory store. This creates two problems: (1) agents see irrelevant context that wastes tokens, and (2) agents that process sensitive data can leak it to less-trusted agents. Role-aware memory restricts and tailors what each agent can access.

LatentMem (2026)

LatentMem addresses multi-agent memory through role-aware customization and token-efficient latent memory synthesis. Instead of giving all agents the same memory, it combines two components: (1) an experience bank that stores interaction trajectories per role, and (2) a memory composer that synthesizes compact, role-specific memories from that bank. Reported results: up to 19.36% performance gains over vanilla shared memory, with significant token savings.

LEGOMem: orchestrator vs executor memory separation

LEGOMem's key finding: orchestrator memory is critical for task decomposition (knowing what sub-tasks to assign and in what order), while fine-grained agent memory improves execution accuracy (knowing the details of past tool call results for similar tasks). Mixing these memory types in a single shared store causes both to degrade.

Role-scoped memory access

  • Orchestrator → reads task decomposition memory
  • Worker Agent A → reads Role A execution history
  • Worker Agent B → reads Role B execution history

No cross-agent memory access unless explicitly granted

Meta-Agents: Agents That Build Agents

A meta-agent dynamically spawns, configures, and routes to specialized sub-agents based on task analysis. Rather than hardcoding the team composition, the meta-agent decides at runtime what agents are needed.

python — meta-agent spawning specialists dynamically
import json
from dataclasses import dataclass

from openai import OpenAI

# Shape of one specialist entry in the planner's JSON output
@dataclass
class AgentSpec:
    name: str
    role: str
    tools: list[str]
    model: str = "gpt-4o"

def meta_agent(task: str, available_tool_registry: dict) -> str:
    """
    Analyzes the task and dynamically spawns the right specialist agents.
    Returns the final assembled result.

    Assumes run_specialist(role, task, tools, model) is defined elsewhere
    and returns a JSON-serializable result.
    """
    client = OpenAI()

    # Phase 1: meta-agent decides what agents are needed.
    # JSON mode returns an object, so the prompt asks for the list under an "agents" key.
    planning_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a team architect. Given a task, decide what specialist agents are needed. "
                f"Available tools: {list(available_tool_registry.keys())}. "
                'Return a JSON object: {"agents": [{"name": ..., "role": ..., "tools": [...], "reason": ...}]}'
            )},
            {"role": "user", "content": f"Task: {task}"},
        ],
        response_format={"type": "json_object"},
    )

    team_specs = json.loads(planning_response.choices[0].message.content)["agents"]

    # Phase 2: spawn and run each specialist
    results = {}
    for spec in team_specs:
        agent_tools = [available_tool_registry[t] for t in spec["tools"] if t in available_tool_registry]
        result = run_specialist(
            role=spec["role"],
            task=f"Your part of the task: {spec['reason']}. Full context: {task}",
            tools=agent_tools,
            model=spec.get("model", "gpt-4o"),
        )
        results[spec["name"]] = result

    # Phase 3: meta-agent assembles final output
    final = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Synthesize the specialist outputs into a single coherent result."},
            {"role": "user", "content": f"Original task: {task}\n\nSpecialist results:\n{json.dumps(results, indent=2)}"},
        ],
    )
    return final.choices[0].message.content
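
Because the team composition comes from model output, it is worth validating before any agent is spawned. The sketch below is one possible guard, not part of the listing above: `parse_team_specs` and its `max_agents` cap are hypothetical, and the `AgentSpec` dataclass is repeated (with a default for `tools`) only so the block runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    name: str
    role: str
    tools: list[str] = field(default_factory=list)
    model: str = "gpt-4o"


def parse_team_specs(raw: dict, tool_registry: dict, max_agents: int = 5) -> list[AgentSpec]:
    """Turn the planner's parsed JSON into validated AgentSpec objects.

    Hypothetical helper: skips malformed entries, drops tools that are not
    in the registry, and considers at most `max_agents` entries.
    """
    agents = raw.get("agents")
    if not isinstance(agents, list):
        return []

    specs = []
    for item in agents[:max_agents]:
        # Skip anything that is not an object with the required keys.
        if not isinstance(item, dict) or "name" not in item or "role" not in item:
            continue
        requested = item.get("tools")
        if not isinstance(requested, list):
            requested = []
        # Keep only tools the registry actually provides.
        tools = [t for t in requested if t in tool_registry]
        specs.append(AgentSpec(
            name=str(item["name"]),
            role=str(item["role"]),
            tools=tools,
            model=str(item.get("model", "gpt-4o")),
        ))
    return specs


planner_output = {
    "agents": [
        {"name": "researcher", "role": "research", "tools": ["search", "unknown_tool"]},
        {"role": "entry without a name"},
    ]
}
team = parse_team_specs(planner_output, tool_registry={"search": object()})
```

Capping the team size and filtering tools bounds both the cost and the privileges of whatever the planner proposes at runtime.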

Adversarial Agent Pairs & Consensus

Red-Team / Blue-Team Agents

Inspired by security red-teaming, this pattern pairs a generator agent with a dedicated adversarial critic whose sole job is to find flaws, contradictions, security vulnerabilities, or unsafe content in the generator's output.

  • Generator: produces output.
  • Red-Team Agent: adversarially attacks the output, asking "How could this fail?"
  • Blue-Team Agent: reviews the attack and decides whether the issue is real.
  • Revise: triggered only if the issue is confirmed.
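
The generate, attack, adjudicate, revise cycle can be sketched as a bounded loop. This is a minimal sketch under stated assumptions: the function `adversarial_review`, its four callable parameters, and the `max_rounds` cap are all hypothetical. In a real system each callable would wrap an LLM call; here they are plain functions so the control flow is runnable as written.

```python
from typing import Callable, Optional


def adversarial_review(
    task: str,
    generate: Callable[[str], str],
    attack: Callable[[str], Optional[str]],   # red team: an issue, or None if none found
    adjudicate: Callable[[str, str], bool],   # blue team: is the reported issue real?
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> tuple[str, list[tuple[str, bool]]]:
    """Run a generator through up to `max_rounds` red-team/blue-team rounds."""
    draft = generate(task)
    history: list[tuple[str, bool]] = []
    for _ in range(max_rounds):
        issue = attack(draft)
        if issue is None:
            break  # red team found nothing left to attack
        confirmed = adjudicate(draft, issue)
        history.append((issue, confirmed))
        if confirmed:
            draft = revise(draft, issue)  # revise only on confirmed issues
    return draft, history


# Stub agents standing in for LLM calls.
final_draft, review_log = adversarial_review(
    task="write a summary",
    generate=lambda task: "summary with bug",
    attack=lambda draft: "contains bug" if "bug" in draft else None,
    adjudicate=lambda draft, issue: True,
    revise=lambda draft, issue: draft.replace("bug", "fix"),
)
```

The round cap matters: without it, a red-team agent that always finds something can keep the loop running indefinitely.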

Consensus Mechanisms

When the cost of a single agent being wrong is high, run the same task with N independent agents and aggregate their outputs.

| Mechanism | How it works | Best for |
|---|---|---|
| Majority Vote | N agents answer independently; most common answer wins | Discrete classification, fact checking |
| Confidence Aggregation | Each agent provides a confidence score; highest confidence (or weighted average) wins | When agents can estimate their own uncertainty |
| Synthesis Judge | A separate "judge" agent reads all N responses and synthesizes the final answer, reconciling contradictions | Open-ended generation where exact agreement is unlikely |
| Best-of-N | A scoring function (LLM-as-judge or automated test) selects the best of N candidates | Code generation, where tests can verify correctness |
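
Three of these mechanisms reduce to a few lines once the agents' answers are collected. The sketch below assumes the answers are already available as plain Python values. The function names `majority_vote`, `confidence_weighted`, and `best_of_n` are illustrative, and the confidence variant sums scores per answer, which is one simple form of aggregation.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(answers: Sequence[Hashable]) -> tuple[Hashable, float]:
    """Return the most common answer and the fraction of agents that gave it.

    Ties go to the answer that appeared first in `answers`.
    """
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


def confidence_weighted(answers: Sequence[tuple[Hashable, float]]) -> Hashable:
    """Sum self-reported confidence per distinct answer; highest total wins."""
    totals: dict[Hashable, float] = {}
    for answer, confidence in answers:
        totals[answer] = totals.get(answer, 0.0) + confidence
    return max(totals, key=totals.get)


def best_of_n(candidates: Sequence[T], score: Callable[[T], float]) -> T:
    """Return the candidate with the highest score (e.g. number of tests passed)."""
    return max(candidates, key=score)


label, agreement = majority_vote(["Paris", "Lyon", "Paris"])
weighted = confidence_weighted([("A", 0.6), ("B", 0.9), ("A", 0.5)])
longest = best_of_n(["a", "abc", "ab"], score=len)
```

The agreement fraction returned by `majority_vote` is useful as an escalation signal: a narrow win can be routed to a judge agent or a human instead of being accepted outright.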

Cost-quality trade-off in consensus

Running N=3 agents with majority vote costs roughly 3× the inference per task, but it can substantially reduce per-task error rates on knowledge-intensive queries, provided the agents' errors are not strongly correlated. The break-even depends on your error cost: if a single wrong answer costs $1000 in downstream damage, spending $0.30 extra on 3× inference is clearly worthwhile. For low-stakes tasks, a single agent is fine.
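
The break-even reasoning can be made concrete with a back-of-the-envelope calculation. The sketch below rests on assumptions not stated in the chapter: each agent errs independently with the same probability `p`, the vote fails whenever at least two of three agents are wrong (a pessimistic bound when wrong answers can disagree), and the 10% error rate is purely illustrative.

```python
def majority_error_rate(p: float) -> float:
    """P(at least 2 of 3 independent agents are wrong), each wrong with probability p."""
    return 3 * p**2 * (1 - p) + p**3


def expected_saving(p: float, error_cost: float, extra_inference_cost: float) -> float:
    """Expected cost per task of a single agent minus that of a 3-agent vote.

    Positive means the vote pays for itself on average.
    """
    single = p * error_cost
    voted = majority_error_rate(p) * error_cost + extra_inference_cost
    return single - voted


# Illustrative numbers: 10% per-agent error rate (assumed), $0.30 extra inference.
high_stakes = expected_saving(p=0.10, error_cost=1000.0, extra_inference_cost=0.30)
low_stakes = expected_saving(p=0.10, error_cost=1.0, extra_inference_cost=0.30)
```

Under these assumptions the vote cuts the error rate from 10% to 2.8%, so it saves about $71.70 per task at a $1000 error cost and loses about $0.23 per task at a $1 error cost. Correlated errors, which are common when all agents share a model and prompt, shrink the benefit.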

Chapter 14 Quiz

1. According to LEGOMem research, which type of memory is most critical for an orchestrator agent?

2. What makes a "meta-agent" different from a standard orchestrator?

3. The "Best-of-N" consensus mechanism is particularly well-suited for code generation. Why?