Building Agentic AI Systems
Production Handbook: 22 chapters across 5 sections covering architectures, tool use, MCP, memory, multi-agent orchestration, LangGraph, safety, evaluation, fine-tuning, and frontier research. Prerequisite: Agentic AI Foundations.
Course Overview
What You Will Build Toward
- Navigate the Building Agentic AI Systems learning path across 22 chapters.
- Choose the right chapter based on your current goal and prerequisites.
- Move from overview material into the canonical chapter experience.
Chapter Path
Start With Any Chapter
Before You Start
Recommended Background
- Working knowledge of agentic AI fundamentals, as covered in the Agentic AI Foundations course.
- Willingness to work through examples and short checks.
A comprehensive advanced course for AI/ML practitioners. Go from understanding agent theory to building production-grade multi-agent systems with LangGraph, MCP, memory, safety, and fine-tuning. New to agentic AI? Start with the Agentic AI Foundations course first.
Foundations & Mental Models
Ch 1 — What is an Agent?
Prompt-response vs goal-directed. The agent loop: observe → think → plan → act. Dual-paradigm: symbolic vs neural vs hybrid.
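The observe → think → plan → act loop named above can be sketched with a toy agent. This is a minimal illustration, not the chapter's implementation: the counter environment, the `LoopState` class, and every function name here are hypothetical, and a real agent would replace `think` and `plan` with LLM calls.

```python
from dataclasses import dataclass, field


@dataclass
class LoopState:
    goal: int                      # target value the toy agent must reach
    value: int = 0                 # current state of the toy environment
    history: list = field(default_factory=list)  # actions taken so far


def observe(state: LoopState) -> int:
    # Observe: read the environment (here, just the current counter).
    return state.value


def think(observation: int, goal: int) -> int:
    # Think: work out how far the observation is from the goal.
    return goal - observation


def plan(gap: int) -> str:
    # Plan: choose the next action from the remaining gap.
    if gap == 0:
        return "stop"
    return "increment" if gap > 0 else "decrement"


def act(state: LoopState, action: str) -> None:
    # Act: apply the chosen action to the environment and record it.
    if action == "increment":
        state.value += 1
    elif action == "decrement":
        state.value -= 1
    state.history.append(action)


def run_agent(goal: int, max_steps: int = 20) -> LoopState:
    # Repeat the loop until the goal is met or the step budget runs out.
    state = LoopState(goal=goal)
    for _ in range(max_steps):
        action = plan(think(observe(state), state.goal))
        if action == "stop":
            break
        act(state, action)
    return state
```

The `max_steps` budget is the design choice worth noting: a goal-directed loop needs an explicit stopping condition in addition to goal completion, otherwise an unreachable goal runs forever.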
Ch 2 — Anatomy of an Agent
Every component explained: LLM core, memory subsystems, tool interface, planning engine, action space, orchestrator vs executor.
Ch 3 — Agent Taxonomy
Reactive, ReAct, planning, reflection, critic, computer-use, code-execution, deep research. Decision matrix: which architecture for which problem.
Ch 4 — Reasoning Deep-Dive
Chain-of-thought, scratchpad reasoning, o1/o3/DeepSeek-R1, SMTL pattern, Mind-Map agents, Tree-of-Thoughts, Self-Consistency.
Core Building Blocks
Ch 5 — Tool Use
Tool anatomy, schemas, parallel vs sequential calls, failure handling, retry logic, tool result integration.
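As a rough illustration of tool schemas, failure handling, and retry logic, the sketch below pairs a JSON-Schema-style tool description with a retry wrapper. The `get_weather` tool, the `call_with_retry` helper, and the result envelope are all assumptions made for this example; they are not taken from the chapter or from any specific framework.

```python
import time
from typing import Any, Callable

# Hypothetical tool description in a JSON-Schema-like shape.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def call_with_retry(
    tool: Callable[..., Any],
    args: dict,
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> dict:
    """Call a tool, retrying on any exception with exponential backoff.

    Returns a structured envelope so the caller (or the model) can see
    whether the call succeeded and how many attempts it took.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            result = tool(**args)
            return {"ok": True, "result": result, "attempts": attempt + 1}
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
                time.sleep(base_delay * (2 ** attempt))
    return {"ok": False, "error": str(last_error), "attempts": max_attempts}
```

Returning an error envelope instead of raising lets the failure be fed back to the model as a tool result, which is one common way to integrate tool failures into the conversation.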
Ch 6 — Model Context Protocol
MCP architecture: hosts, clients, servers. JSON-RPC 2.0. Building an MCP server from scratch. Production gaps: CABP, ATBA, SERF.
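To make the JSON-RPC 2.0 layer concrete, here is a minimal sketch of a request and a server-side dispatcher. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) and the `-32601` "Method not found" code come from the JSON-RPC 2.0 specification. The `tools/call` method name and its `name`/`arguments` parameters are used on the assumption that they match the MCP specification; the `echo` handler and helper functions are hypothetical, and a real MCP server also handles initialization, transports, and capability negotiation.

```python
import json
from typing import Any, Callable


def make_request(request_id: int, method: str, params: dict) -> str:
    # Build a JSON-RPC 2.0 request envelope.
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def handle(raw: str, handlers: dict) -> str:
    # Dispatch a single request to a registered handler and wrap the reply.
    request = json.loads(raw)
    handler: Callable[[dict], Any] = handlers.get(request["method"])
    if handler is None:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "error": error})
    result = handler(request.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


def echo_tool(params: dict) -> dict:
    # Hypothetical handler: returns the text argument it was given.
    return {"content": [{"type": "text", "text": params["arguments"]["text"]}]}


HANDLERS = {"tools/call": echo_tool}

request = make_request(
    1, "tools/call", {"name": "echo", "arguments": {"text": "hello"}}
)
response = json.loads(handle(request, HANDLERS))
```

The response carries the same `id` as the request, which is how a client matches replies to calls when several are in flight.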
Ch 7 — Memory Systems
Working, episodic, semantic, procedural memory. Temporal knowledge graphs (Zep). Delta compression. Hybrid search with BM25 + reranking.
Ch 8 — Planning
ReAct loop (Thought-Action-Observation), Plan-and-Execute, dynamic replanning, hierarchical task networks, MCTS with LLM guidance.
Ch 9 — Context Management
Context window budgeting, summarization strategies, system prompt design for agents, structured output, instruction hierarchy.
Multi-Agent Systems & Orchestration
Ch 10 — Multi-Agent Systems
Why and when to go multi-agent. Communication protocols, trust models, agent handoffs, agents-as-tools pattern.
Ch 11 — Orchestration Patterns
Supervisor, Swarm, Pipeline, Fan-out + Aggregator, Critic loop. Decision matrix and state management strategies.
Ch 12 — Building with LangGraph
Graph-based state machines, checkpointing, human-in-the-loop, subgraphs. Full project: research assistant with supervisor + specialists.
Ch 13 — CrewAI & OpenAI SDK
Roles, backstories, crews vs flows. OpenAI Agents SDK: agents, runners, guardrails, handoffs. Triage agent pattern.
Ch 14 — Advanced Multi-Agent
Role-aware memory (LatentMem, LEGOMem), meta-agents, adversarial red/blue pairs, consensus mechanisms.
Production Engineering
Ch 15 — Evaluation
Trajectory-level vs endpoint evaluation. GAIA, AgentBench, SWE-bench, ATBench. LLM-as-judge pitfalls. Building your own eval harness.
Ch 16 — Safety & Guardrails
Prompt injection (direct/indirect), ART benchmark findings, input/output guardrails, least-privilege tools, audit logs.
Ch 17 — Observability & Debugging
Distributed tracing across agents, spans and events, LangSmith/LangFuse/Arize Phoenix, cost attribution, debugging failed trajectories.
Ch 18 — Deployment & Scaling
Stateless vs stateful deployment, async queues, model routing, caching strategies, cost breakdown, auto-scaling.
Ch 19 — CI/CD for Agents
Unit and integration testing, golden trajectory regression, canary deployments, prompt version control, rollback procedures.
Advanced Topics & Frontiers
Ch 20 — Fine-Tuning for Agentic Behavior
SFT on tool-use demonstrations, RLVR (VerlTool, Trinity-RFT), ATLAS for small models, Tool-R0 self-play, ToolPO credit assignment.
Ch 21 — Domain-Specific Agents
Coding agents (SWE-bench), research agents (Deep Research), computer-use, data analysis, customer support, healthcare/legal constraints.
Ch 22 — The Frontier
Embodied AI, world models, agent economies, persistent personal agents, regulation (EU AI Act), open research problems.