Category: Deep Learning | Chapters: 22 | Difficulty: Advanced | Estimated Time: 600 min

Building Agentic AI Systems

Production Handbook: 22 chapters across 5 sections covering architectures, tool use, MCP, memory, multi-agent orchestration, LangGraph, safety, evaluation, fine-tuning, and frontier research. Prerequisite: Agentic AI Foundations.

Course Overview

What You Will Build Toward

  • Navigate the Building Agentic AI Systems learning path across 22 chapters.
  • Choose the right chapter based on your current goal and prerequisites.
  • Move from overview material into the canonical chapter experience.

Before You Start

Recommended Background

  • Working knowledge of deep learning, plus the Agentic AI Foundations course or equivalent experience.
  • Willingness to work through examples and short checks.

A comprehensive advanced course for AI/ML practitioners. Go from understanding agent theory to building production-grade multi-agent systems with LangGraph, MCP, memory, safety, and fine-tuning. New to agentic AI? Start with the Agentic AI Foundations course first.

22 Chapters | 5 Sections | 10h+ Content | Advanced Level

Section 1: Foundations & Mental Models

Ch 1 — What is an Agent?

Prompt-response vs goal-directed systems. The agent loop: observe → think → plan → act. Two paradigms, symbolic and neural, and their hybrid.

Tags: Foundation, Architecture
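
As a rough illustration of the observe → think → plan → act loop named in Chapter 1, the sketch below runs the four stages against a toy counter environment. Everything here (`CounterEnv`, `think`, `plan`, `run_agent`) is a hypothetical stand-in; in a real agent the think and plan stages would typically be model calls rather than arithmetic.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment (hypothetical): the goal is to raise `value` to `target`."""
    value: int = 0
    target: int = 3

    def observe(self) -> dict:
        return {"value": self.value, "target": self.target}

    def apply(self, action: str) -> None:
        if action == "increment":
            self.value += 1


def think(observation: dict) -> int:
    """Interpret the observation: how far is the goal?"""
    return observation["target"] - observation["value"]


def plan(gap: int) -> str:
    """Pick the next action from the interpretation."""
    return "increment" if gap > 0 else "stop"


def run_agent(env: CounterEnv, max_steps: int = 10) -> list[str]:
    trace = []
    for _ in range(max_steps):        # hard step limit so the loop always ends
        observation = env.observe()   # observe
        gap = think(observation)      # think
        action = plan(gap)            # plan
        trace.append(action)
        if action == "stop":
            break
        env.apply(action)             # act
    return trace


trace = run_agent(CounterEnv())
```

The step limit is a deliberate design choice: a goal-directed loop needs an explicit stopping condition besides "goal reached".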

Ch 2 — Anatomy of an Agent

Every component explained: LLM core, memory subsystems, tool interface, planning engine, action space, orchestrator vs executor.

Tags: Architecture, Components

Ch 3 — Agent Taxonomy

Reactive, ReAct, planning, reflection, critic, computer-use, code-execution, deep research. Decision matrix: which architecture for which problem.

Tags: Taxonomy, Decision Matrix

Ch 4 — Reasoning Deep-Dive

Chain-of-thought, scratchpad reasoning, o1/o3/DeepSeek-R1, SMTL pattern, Mind-Map agents, Tree-of-Thoughts, Self-Consistency.

Tags: Reasoning, CoT, o1/o3

Section 2: Core Building Blocks

Ch 5 — Tool Use

Tool anatomy, schemas, parallel vs sequential calls, failure handling, retry logic, tool result integration.

Tags: Tools, Function Calling
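
To make the Chapter 5 topics of tool schemas, failure handling, and retry logic concrete, here is a minimal sketch. The tool name `get_weather`, the `TransientToolError` class, and the `call_with_retry` helper are all hypothetical; the schema follows the JSON-Schema style commonly used for function calling, not any one vendor's exact format.

```python
import time

# Hypothetical tool description in a JSON-Schema style.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


class TransientToolError(Exception):
    """Raised by a tool when a retry might succeed (for example, a timeout)."""


def call_with_retry(tool, args, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `tool(**args)`, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return {"ok": True, "result": tool(**args), "attempts": attempt}
        except TransientToolError as exc:
            if attempt == max_attempts:
                # Give the failure back as data so the agent can react to it.
                return {"ok": False, "error": str(exc), "attempts": attempt}
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_weather(failures: int):
    """Build a fake tool that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def get_weather(city: str) -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientToolError("upstream timeout")
        return f"18C in {city}"

    return get_weather


# `sleep` is injected so the demo does not actually wait.
outcome = call_with_retry(make_flaky_weather(2), {"city": "Oslo"}, sleep=lambda s: None)
```

Returning a structured result on final failure, rather than raising, lets the error be fed back into the conversation as a tool result.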

Ch 6 — Model Context Protocol

MCP architecture: hosts, clients, servers. JSON-RPC 2.0. Building an MCP server from scratch. Production gaps: CABP, ATBA, SERF.

Tags: MCP, Protocol
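
The JSON-RPC 2.0 framing mentioned in Chapter 6 can be sketched with a tiny in-process dispatcher. This is not a complete MCP server: it assumes the `tools/list` and `tools/call` method names and result shapes as I understand them from the MCP specification, and it omits the initialization handshake, transports, and argument validation. The `add` tool and the `handle` function are hypothetical.

```python
import json

# Hypothetical tool registry; `handler` is what actually runs.
TOOLS = {
    "add": {
        "description": "Add two integers.",
        "inputSchema": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
        "handler": lambda a, b: a + b,
    }
}


def handle(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request string with one response string."""
    request = json.loads(raw)
    method, request_id = request.get("method"), request.get("id")
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request["params"]
        # Unknown tool names and bad arguments are not handled in this sketch.
        value = TOOLS[params["name"]]["handler"](**params["arguments"])
        result = {"content": [{"type": "text", "text": str(value)}]}
    else:
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        return json.dumps({"jsonrpc": "2.0", "id": request_id,
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "add", "arguments": {"a": 2, "b": 3}}}
response = json.loads(handle(json.dumps(call)))
```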

Ch 7 — Memory Systems

Working, episodic, semantic, procedural memory. Temporal knowledge graphs (Zep). Delta compression. Hybrid search with BM25 + reranking.

Tags: Memory, RAG, Vector DB

Ch 8 — Planning

ReAct loop (Thought-Action-Observation), Plan-and-Execute, dynamic replanning, hierarchical task networks, MCTS with LLM guidance.

Tags: Planning, ReAct, MCTS
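
The Thought-Action-Observation cycle from Chapter 8 can be sketched as follows. The model is replaced by a scripted function so the example runs offline; `scripted_llm`, the `multiply` tool, and the `Action: name[args]` text format are illustrative assumptions, not a fixed standard.

```python
import re

# Hypothetical tool registry for the demo.
TOOLS = {"multiply": lambda a, b: a * b}


def scripted_llm(history: list[str]) -> str:
    """Stand-in for a model call: returns the next Thought/Action text."""
    observations = [line for line in history if line.startswith("Observation:")]
    if not observations:
        return "Thought: I need the product first.\nAction: multiply[6, 7]"
    answer = observations[-1].split(": ")[1]
    return f"Thought: I have the result.\nAction: finish[{answer}]"


def react(question: str, llm, max_turns: int = 5):
    history = [f"Question: {question}"]
    for _ in range(max_turns):
        step = llm(history)                      # Thought + Action
        history.append(step)
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match is None:
            return None                          # unparseable step: give up
        name, raw_args = match.groups()
        if name == "finish":
            return raw_args                      # final answer
        args = [int(x) for x in raw_args.split(",")]
        history.append(f"Observation: {TOOLS[name](*args)}")  # Observation
    return None                                  # turn budget exhausted


answer = react("What is 6 times 7?", scripted_llm)
```

Each observation is appended to the history before the next model call, which is what lets the next thought depend on the tool result.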

Ch 9 — Context Management

Context window budgeting, summarization strategies, system prompt design for agents, structured output, instruction hierarchy.

Tags: Prompt Engineering, Context
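
One simple form of the context window budgeting listed in Chapter 9 is to always keep the system prompt and then keep as many of the most recent messages as fit. The sketch below assumes a crude whitespace token count; a real system would use the model's tokenizer, and `count_tokens` and `fit_to_budget` are hypothetical names.

```python
def count_tokens(text: str) -> int:
    """Crude approximation (assumption): one token per whitespace-separated word."""
    return len(text.split())


def fit_to_budget(system_prompt: str, messages: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the newest messages that fit in `budget` tokens."""
    remaining = budget - count_tokens(system_prompt)
    kept = []
    for message in reversed(messages):      # walk from newest to oldest
        cost = count_tokens(message)
        if cost > remaining:
            break                           # older messages are dropped
        kept.append(message)
        remaining -= cost
    return [system_prompt] + kept[::-1]     # restore chronological order


context = fit_to_budget(
    "You are helpful",
    ["a b c d", "e f", "g h i"],
    budget=9,
)
```

Dropping the oldest turns first is only one policy; summarizing them instead trades extra model calls for retained information.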

Section 3: Multi-Agent Systems & Orchestration

Ch 10 — Multi-Agent Systems

Why and when to go multi-agent. Communication protocols, trust models, agent handoffs, agents-as-tools pattern.

Tags: Multi-Agent, Orchestration

Ch 11 — Orchestration Patterns

Supervisor, Swarm, Pipeline, Fan-out + Aggregator, Critic loop. Decision matrix and state management strategies.

Tags: Patterns, Design

Ch 12 — Building with LangGraph

Graph-based state machines, checkpointing, human-in-the-loop, subgraphs. Full project: research assistant with supervisor + specialists.

Tags: LangGraph, Stateful

Ch 13 — CrewAI & OpenAI SDK

Roles, backstories, crews vs flows. OpenAI Agents SDK: agents, runners, guardrails, handoffs. Triage agent pattern.

Tags: CrewAI, OpenAI SDK

Ch 14 — Advanced Multi-Agent

Role-aware memory (LatentMem, LEGOMem), meta-agents, adversarial red/blue pairs, consensus mechanisms.

Tags: Advanced, Meta-Agents

Section 4: Production Engineering

Ch 15 — Evaluation

Trajectory-level vs endpoint evaluation. GAIA, AgentBench, SWE-bench, ATBench. LLM-as-judge pitfalls. Building your own eval harness.

Tags: Evaluation, Benchmarks

Ch 16 — Safety & Guardrails

Prompt injection (direct/indirect), ART benchmark findings, input/output guardrails, least-privilege tools, audit logs.

Tags: Safety, Security

Ch 17 — Observability & Debugging

Distributed tracing across agents, spans and events, LangSmith/LangFuse/Arize Phoenix, cost attribution, debugging failed trajectories.

Tags: Observability, Tracing

Ch 18 — Deployment & Scaling

Stateless vs stateful deployment, async queues, model routing, caching strategies, cost breakdown, auto-scaling.

Tags: Deployment, Scaling, Cost

Ch 19 — CI/CD for Agents

Unit and integration testing, golden trajectory regression, canary deployments, prompt version control, rollback procedures.

Tags: CI/CD, Testing

Section 5: Advanced Topics & Frontiers

Ch 20 — Fine-Tuning for Agentic Behavior

SFT on tool-use demonstrations, RLVR (VerlTool, Trinity-RFT), ATLAS for small models, Tool-R0 self-play, ToolPO credit assignment.

Tags: Fine-Tuning, RLVR, RL

Ch 21 — Domain-Specific Agents

Coding agents (SWE-bench), research agents (Deep Research), computer-use, data analysis, customer support, healthcare/legal constraints.

Tags: Domains, Applied AI

Ch 22 — The Frontier

Embodied AI, world models, agent economies, persistent personal agents, regulation (EU AI Act), open research problems.

Tags: Research, Future