Chapter 6: Model Context Protocol (MCP)
Model Context Protocol (MCP) in Building Agentic AI Systems.
Learning Objectives
By the end of this chapter, you will be able to:
- Explain what the Model Context Protocol (MCP) standardizes and how hosts, clients, and servers fit together.
- Apply MCP to design reliable, production-grade agent systems.
- Recognize operational trade-offs in tool use, orchestration, safety, and cost.
The universal standard for agent-to-tool communication
What is MCP?
The Model Context Protocol (MCP) is an open standard — introduced by Anthropic in November 2024 and donated to the Linux Foundation in December 2025 — that defines how AI agents discover and invoke external tools. Think of it as USB-C for AI tools: any agent that speaks MCP can talk to any MCP-compliant server without custom integration code.
Scale of adoption (early 2026)
- 10,000+ active MCP servers in production
- 500+ MCP client integrations (Claude, ChatGPT, Cursor, VS Code, Replit)
- 97 million monthly SDK downloads
- All major agent frameworks support MCP natively: LangGraph, CrewAI, OpenAI Agents SDK, Google ADK
Before MCP, every agent needed custom integration code for every tool. With MCP, you write a tool server once, and any MCP-compatible agent can use it.
MCP Architecture
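MCP uses a three-tier model: the host (the application running the agent), the client (a connector inside the host that maintains a one-to-one connection with a single server), and the server (a process exposing tools, resources, and prompts). A host that talks to several servers runs one client per server.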
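To make the three roles concrete, here is a minimal in-process sketch. The class names (ToyHost, ToyClient, ToyServer) and the two example servers are hypothetical, and the sketch skips the transport entirely: a real MCP client and server exchange JSON-RPC messages rather than calling each other directly.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Stands in for an MCP server: a process that exposes named tools."""
    name: str
    tools: dict = field(default_factory=dict)  # tool name -> callable


@dataclass
class ToyClient:
    """Stands in for an MCP client: one client talks to exactly one server."""
    server: ToyServer

    def list_tools(self) -> list:
        return sorted(self.server.tools)

    def call_tool(self, name: str, **arguments):
        return self.server.tools[name](**arguments)


@dataclass
class ToyHost:
    """Stands in for the host application: it owns one client per server."""
    clients: dict = field(default_factory=dict)  # server name -> ToyClient

    def connect(self, server: ToyServer) -> None:
        self.clients[server.name] = ToyClient(server)

    def all_tools(self) -> dict:
        """Map each server name to the tools its client can see."""
        return {name: client.list_tools() for name, client in self.clients.items()}


host = ToyHost()
host.connect(ToyServer("weather", {"get_weather": lambda city: f"Sunny in {city}"}))
host.connect(ToyServer("math", {"add": lambda a, b: a + b}))

print(host.all_tools())
print(host.clients["math"].call_tool("add", a=2, b=3))
```

The point of the one-client-per-server layout is isolation: the host decides which servers the agent may reach, and each connection can be opened, closed, or sandboxed independently.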
The Three Primitives
| Primitive | Purpose | Agent interaction |
|---|---|---|
| Tools | Functions the model can invoke | Model calls tools/call; receives result |
| Resources | Context data the model can read | Model requests resources/read; URI identifies the resource |
| Prompts | Pre-built workflow templates | User or system selects prompt; rendered into conversation |
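On the wire, each primitive maps to a JSON-RPC 2.0 request. The sketch below builds one request per primitive using only the standard library. The helper make_request, the resource URI, and the prompt name "summarize" are illustrative assumptions, not part of any particular server.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched to them


def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# Tools: invoke a function by name with arguments that match its input schema.
tool_call = make_request(
    "tools/call", {"name": "get_weather", "arguments": {"city": "London"}}
)

# Resources: read context data identified by a URI (this URI is made up).
resource_read = make_request("resources/read", {"uri": "file:///docs/readme.md"})

# Prompts: fetch a named template, optionally filling in arguments.
prompt_get = make_request(
    "prompts/get", {"name": "summarize", "arguments": {"topic": "weather"}}
)

for message in (tool_call, resource_read, prompt_get):
    print(json.dumps(message))
```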
Building an MCP Server from Scratch
import asyncio
import os

import httpx
from mcp import types
from mcp.server import Server
from mcp.server.stdio import stdio_server

# The API key is read from the environment; the variable name is an assumption.
WEATHER_API_KEY = os.environ["WEATHER_API_KEY"]

app = Server("my-tools")


# ── Declare available tools ──────────────────────────────────────────────────
@app.list_tools()
async def list_tools() -> list[types.Tool]:
    """MCP clients call this to discover what the server offers."""
    return [
        types.Tool(
            name="get_weather",
            description="Get current weather for a city. Returns temperature, conditions, and humidity.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. 'London'"},
                    "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"},
                },
                "required": ["city"],
            },
        )
    ]


# ── Handle tool calls ────────────────────────────────────────────────────────
@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """MCP clients call this to invoke a tool; return content blocks."""
    if name == "get_weather":
        city = arguments["city"]
        units = arguments.get("units", "celsius")
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.openweathermap.org/data/2.5/weather",
                params={
                    "q": city,
                    "units": "metric" if units == "celsius" else "imperial",
                    "appid": WEATHER_API_KEY,
                },
            )
            # Fail loudly on HTTP errors instead of parsing an error payload.
            response.raise_for_status()
            data = response.json()
        unit_symbol = "C" if units == "celsius" else "F"
        return [types.TextContent(
            type="text",
            text=f"Weather in {city}: {data['weather'][0]['description']}, "
                 f"Temp: {data['main']['temp']}°{unit_symbol}, Humidity: {data['main']['humidity']}%",
        )]
    raise ValueError(f"Unknown tool: {name}")


# ── Run the server ────────────────────────────────────────────────────────────
async def main():
    # Serve over stdin/stdout so a host can launch this script as a subprocess.
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())


if __name__ == "__main__":
    asyncio.run(main())
MCP server in 3 steps
1. Install the SDK: pip install mcp.
2. Declare tools with @app.list_tools().
3. Handle calls with @app.call_tool().
Clients discover your tools by calling tools/list once the initialization handshake completes; no further registration is needed.
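The discovery flow can be sketched as the three messages a client sends when it first connects. This is a standard-library illustration of the message order only. The function name handshake_messages, the client name, and the version strings are assumptions, and a real client waits for the server's reply to initialize before sending the next message.

```python
import json


def handshake_messages(client_name: str, protocol_version: str = "2025-06-18") -> list:
    """Return the client-side messages for connect-then-discover, in order."""
    initialize = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,  # example value; use what your SDK negotiates
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.1.0"},
        },
    }
    # A notification has no "id": the client does not expect a response to it.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    # Only now does the client ask which tools the server offers.
    list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [initialize, initialized, list_tools]


messages = handshake_messages("demo-host")
for message in messages:
    print(json.dumps(message))
```

The server's reply to tools/list is where the names, descriptions, and input schemas declared in list_tools() reach the client.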
Production Gaps & Solutions
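MCP standardizes discovery and invocation, but it leaves several production concerns to the implementer: propagating per-user identity down to individual tool calls, budgeting latency across a turn, and reporting errors in a form the agent can act on. Three patterns address these gaps.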
CABP — Context-Aware Broker Protocol
Attaches a user/tenant identity to every tool request, enabling per-user rate limiting, audit logs, and permission enforcement at the server level.
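A minimal sketch of the idea, assuming a broker that sits in front of the tool handlers. Everything here (the Identity fields, the IdentityAwareBroker class, the sliding-window limit, the decision strings) is a hypothetical illustration rather than a published CABP interface.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    user_id: str
    tenant_id: str


class IdentityAwareBroker:
    """Checks permissions and a per-user rate limit, and records every decision."""

    def __init__(self, permissions, max_calls, window_seconds=60.0, clock=time.monotonic):
        self.permissions = permissions          # user_id -> set of allowed tool names
        self.max_calls = max_calls              # calls allowed per user per window
        self.window_seconds = window_seconds
        self.clock = clock
        self.recent_calls = defaultdict(deque)  # user_id -> timestamps of allowed calls
        self.audit_log = []                     # (tenant, user, tool, decision)

    def authorize(self, identity: Identity, tool_name: str) -> str:
        now = self.clock()
        calls = self.recent_calls[identity.user_id]
        # Drop timestamps that have aged out of the sliding window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if tool_name not in self.permissions.get(identity.user_id, set()):
            decision = "denied"
        elif len(calls) >= self.max_calls:
            decision = "rate_limited"
        else:
            calls.append(now)
            decision = "allowed"
        self.audit_log.append((identity.tenant_id, identity.user_id, tool_name, decision))
        return decision


broker = IdentityAwareBroker({"alice": {"get_weather"}}, max_calls=2)
alice = Identity(user_id="alice", tenant_id="acme")

decisions = [broker.authorize(alice, "get_weather") for _ in range(3)]
denied = broker.authorize(alice, "delete_database")
print(decisions, denied)
```

Because the identity travels with each request, the same server process can serve many users while still answering "who called what, and was it allowed?" from the audit log.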
ATBA — Adaptive Timeout Budget Allocation
Tracks cumulative tool-call latency per turn; when the running total approaches the budget, deprioritizes or cancels slow, non-critical calls.
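One way to sketch this with asyncio: give the whole turn a deadline, let critical calls run to completion, and cap each non-critical call at whatever budget remains. The function run_with_budget, the critical flag, and the min_slice_seconds threshold are assumptions made for illustration. A production version would also need to handle concurrent calls and adapt the budget over time.

```python
import asyncio
import time


async def run_with_budget(calls, budget_seconds, min_slice_seconds=0.01):
    """Run (name, coroutine_function, critical) calls in order under one shared budget."""
    deadline = time.monotonic() + budget_seconds
    results = {}
    for name, make_call, critical in calls:
        remaining = deadline - time.monotonic()
        if not critical and remaining < min_slice_seconds:
            # Not enough budget left to be useful: skip rather than start the call.
            results[name] = "skipped: budget exhausted"
            continue
        try:
            # Critical calls always run; non-critical ones get only the remaining budget.
            timeout = None if critical else remaining
            results[name] = await asyncio.wait_for(make_call(), timeout)
        except asyncio.TimeoutError:
            results[name] = "cancelled: exceeded remaining budget"
    return results


async def fast_tool():
    await asyncio.sleep(0.01)
    return "fast ok"


async def slow_tool():
    await asyncio.sleep(1.0)
    return "slow ok"


calls = [
    ("lookup", fast_tool, True),    # critical: must finish
    ("enrich", slow_tool, False),   # non-critical and slow: gets cancelled
    ("extras", fast_tool, False),   # non-critical: no budget left by now
]
results = asyncio.run(run_with_budget(calls, budget_seconds=0.3))
print(results)
```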
SERF — Structured Error Recovery Framework
Returns machine-readable error codes (TOOL_RATE_LIMITED, TOOL_PERMISSION_DENIED) instead of unstructured error strings, enabling the agent to take different recovery actions per error type.
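A short sketch of how an agent might branch on such codes. The two code values come from the text above. The ToolError shape, the retry_after_seconds field, and the recovery strings are hypothetical choices for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ToolErrorCode(str, Enum):
    RATE_LIMITED = "TOOL_RATE_LIMITED"
    PERMISSION_DENIED = "TOOL_PERMISSION_DENIED"


@dataclass
class ToolError:
    code: ToolErrorCode
    message: str
    retry_after_seconds: Optional[float] = None  # hint from the server, if any


def plan_recovery(error: ToolError) -> str:
    """Pick a recovery action from the error code, not from the message text."""
    if error.code is ToolErrorCode.RATE_LIMITED:
        wait = error.retry_after_seconds or 1.0
        return f"retry after {wait:.0f}s"
    if error.code is ToolErrorCode.PERMISSION_DENIED:
        return "do not retry; ask the user for access"
    return "report failure"


rate_limited = ToolError(ToolErrorCode.RATE_LIMITED, "Too many requests", retry_after_seconds=30)
forbidden = ToolError(ToolErrorCode.PERMISSION_DENIED, "Caller lacks scope")

print(plan_recovery(rate_limited))
print(plan_recovery(forbidden))
```

With free-text errors, the agent has to guess from the wording whether a retry makes sense. With codes, retrying a rate limit and escalating a permission failure become ordinary branches.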
Context window overload
When an agent connects to thousands of MCP tools, hundreds of thousands of tokens can be consumed by tool schemas before the first user message. Solution: use semantic tool routing — embed the user query, retrieve the top-K most relevant tool descriptions, and inject only those into the current context. This is the same principle as RAG but applied to tool registries.
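The routing step can be sketched as follows. To keep the example self-contained, the embed function is a toy stand-in that counts words. A real system would call an embedding model and store vectors in an index. The tool registry and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route_tools(query: str, tools: dict, k: int = 2) -> list:
    """Return the names of the k tools whose descriptions best match the query."""
    query_vec = embed(query)
    ranked = sorted(tools, key=lambda name: cosine(query_vec, embed(tools[name])), reverse=True)
    return ranked[:k]


TOOLS = {
    "get_weather": "Get current weather for a city: temperature, conditions, humidity.",
    "send_email": "Send an email message to a recipient.",
    "query_database": "Run a SQL query against the sales database.",
    "convert_currency": "Convert an amount of money between two currencies.",
}

selected = route_tools("what is the weather and temperature in this city", TOOLS, k=1)
print(selected)
```

Only the schemas of the selected tools are then placed in the prompt, so context cost grows with K rather than with the size of the registry.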
Chapter 6 Quiz
1. In MCP architecture, what is the "host"?
2. What is an MCP "Resource"?
3. What problem does the SERF pattern solve?