Course Building Agentic AI Systems Chapter 6 Difficulty advanced Estimated Time 600 min

Chapter 6: Model Context Protocol (MCP)

Model Context Protocol (MCP) in Building Agentic AI Systems.

27% complete

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the agentic AI concept behind Model Context Protocol (MCP).
  • Apply Model Context Protocol (MCP) to design reliable, production-grade agent systems.
  • Recognize operational trade-offs in tool use, orchestration, safety, and cost.

Chapter 6: Model Context Protocol (MCP)

The universal standard for agent-to-tool communication

What is MCP?

The Model Context Protocol (MCP) is an open standard — introduced by Anthropic in November 2024 and donated to the Linux Foundation in December 2025 — that defines how AI agents discover and invoke external tools. Think of it as USB-C for AI tools: any agent that speaks MCP can talk to any MCP-compliant server without custom integration code.

Scale of adoption (early 2026)

  • 10,000+ active MCP servers in production
  • 500+ MCP client integrations (Claude, ChatGPT, Cursor, VS Code, Replit)
  • 97 million monthly SDK downloads
  • All major agent frameworks support MCP natively: LangGraph, CrewAI, OpenAI Agents SDK, Google ADK

Before MCP, every agent needed custom integration code for every tool. With MCP, you write a tool server once, and any MCP-compatible agent can use it.

MCP Architecture

MCP uses a three-tier model: the host (the application running the agent), the client (the MCP client embedded in the host), and the server (a process exposing tools).

Host
AI Application (Claude, Cursor, your agent) Manages the LLM; decides which MCP servers to connect to
Client
MCP Client (1:1 with each server) JSON-RPC 2.0 connection; handles capability negotiation and message routing
Transport
stdio Local subprocess — lowest latency
HTTP Streaming (SSE) Remote server — production web deployments
Streamable HTTP Bidirectional HTTP — newest transport
Server
Tools Callable functions (search, DB, APIs)
Resources Read-only data sources (files, DB rows)
Prompts Templated workflows the client can use

The Three Primitives

PrimitivePurposeAgent interaction
ToolsFunctions the model can invokeModel calls tools/call; receives result
ResourcesContext data the model can readModel requests resources/read; URI identifies the resource
PromptsPre-built workflow templatesUser or system selects prompt; rendered into conversation

Building an MCP Server from Scratch

python — minimal MCP server (using mcp SDK)
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp import types
import httpx

app = Server("my-tools")

# ── Declare available tools ──────────────────────────────────────────────────

@app.list_tools()
async def list_tools() -> list[types.Tool]:
    """MCP clients call this at startup to discover what the server offers."""
    return [
        types.Tool(
            name="get_weather",
            description="Get current weather for a city. Returns temperature, conditions, and humidity.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. 'London'"},
                    "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
                },
                "required": ["city"],
            },
        )
    ]

# ── Handle tool calls ────────────────────────────────────────────────────────

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """MCP clients call this to invoke a tool; return content blocks."""
    if name == "get_weather":
        city = arguments["city"]
        units = arguments.get("units", "celsius")

        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.openweathermap.org/data/2.5/weather",
                params={"q": city, "units": "metric" if units == "celsius" else "imperial", "appid": WEATHER_API_KEY},
            )
        data = response.json()
        return [types.TextContent(
            type="text",
            text=f"Weather in {city}: {data['weather'][0]['description']}, "
                 f"Temp: {data['main']['temp']}°, Humidity: {data['main']['humidity']}%",
        )]

    raise ValueError(f"Unknown tool: {name}")

# ── Run the server ────────────────────────────────────────────────────────────

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

MCP server in 3 steps

1. pip install mcp. 2. Declare tools with @app.list_tools(). 3. Handle calls with @app.call_tool(). Clients discover your tools automatically via the initialization handshake — no further registration needed.

Production Gaps & Solutions

MCP standardizes discovery and invocation, but does not yet cover identity, budgeting, or structured errors. For production systems, three additional patterns are needed.

CABP — Context-Aware Broker Protocol

Attaches a user/tenant identity to every tool request, enabling per-user rate limiting, audit logs, and permission enforcement at the server level.

Problem solved: "Which user triggered this tool call?"

ATBA — Adaptive Timeout Budget Allocation

Tracks cumulative tool call latency per turn. If total time is exceeding budget, deprioritizes or cancels slow non-critical calls.

Problem solved: "This agent is taking 45 seconds per turn."

SERF — Structured Error Recovery Framework

Returns machine-readable error codes (TOOL_RATE_LIMITED, TOOL_PERMISSION_DENIED) instead of unstructured error strings, enabling the agent to take different recovery actions per error type.

Problem solved: "The agent doesn't know why the tool failed."

Context window overload

When an agent connects to thousands of MCP tools, hundreds of thousands of tokens can be consumed by tool schemas before the first user message. Solution: use semantic tool routing — embed the user query, retrieve the top-K most relevant tool descriptions, and inject only those into the current context. This is the same principle as RAG but applied to tool registries.

Chapter 6 Quiz

1. In MCP architecture, what is the "host"?

2. What is an MCP "Resource"?

3. What problem does the SERF pattern solve?