How Starlette Powers MCP Servers: The Invisible Infrastructure Behind AI's USB Protocol
Problem
Last month I started building my first MCP server. I hit a weird middleware issue - my CORS headers weren’t being passed correctly, and the browser client couldn’t read the Mcp-Session-Id header.
When I dug into the stack trace, I saw Starlette everywhere. I thought I was just using the MCP Python SDK, but it turned out Starlette was doing all the heavy lifting.
“Wild how much invisible infrastructure this thing powers,” as someone on Reddit put it.
If you’re building AI tools with Claude, GitHub Copilot, or custom LLM apps, you’re probably using Starlette without realizing it. The MCP Python SDK (21.9k GitHub stars) is built on top of it.
This post explains why MCP chose Starlette, how the architecture works, and how to debug middleware issues like the one I hit.
What is MCP?
MCP (Model Context Protocol) is Anthropic’s open standard for connecting LLMs to external tools and data. Think of it as a “USB interface for AI.”
Before MCP, powerful models were trapped - they could generate text but couldn’t access files, APIs, or terminals. They were like super-smart assistants locked in a room with no phone or internet.
MCP solves this with three core abstractions:
- Tools: Functions the model can call (API calls, database queries, file operations)
- Resources: Data sources the app controls (file contents, API responses)
- Prompts: User-controlled templates (slash commands, reusable interactions)
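To make the "Tools" abstraction concrete, here is roughly what a tool call looks like on the wire. This is a minimal sketch, not the SDK's implementation: the `TOOLS` registry, the `add` tool, and `handle_message` are hypothetical names I made up for illustration. The `tools/call` method with `name` and `arguments` params, and the `content` list in the result, follow the MCP specification.

```python
import json

# Hypothetical in-memory tool registry (the SDK builds its own from @mcp.tool()).
TOOLS = {"add": lambda a, b: a + b}


def handle_message(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request and return the serialized response."""
    request = json.loads(raw)
    request_id = request.get("id")

    if request.get("method") != "tools/call":
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})

    params = request["params"]
    result = TOOLS[params["name"]](**params["arguments"])
    # MCP tool results carry a list of content items.
    body = {"content": [{"type": "text", "text": str(result)}]}
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": body})


call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
}
response = json.loads(handle_message(json.dumps(call)))
print(response["result"]["content"][0]["text"])
```

The real SDK adds schema validation, capability negotiation, and error handling on top, but the request/response shape is the part worth remembering when you read logs.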
The architecture follows a Client-Host-Server pattern:
```
┌─────────────────┐
│  AI Application │  (Claude, Copilot, etc.)
└────────┬────────┘
         │ JSON-RPC 2.0
┌────────▼────────┐
│   MCP Client    │
└────────┬────────┘
         │ HTTP / SSE / Streamable HTTP
┌────────▼────────┐
│   MCP Server    │  ← This is where Starlette lives
│   (Starlette)   │
└────────┬────────┘
         │
┌────────▼────────┐
│ Tools/Resources │
└─────────────────┘
```
Why does this matter in 2026? MCP isn’t just Anthropic anymore. OpenAI adopted it in March 2025, Google and Microsoft followed, and it was donated to the Agentic AI Foundation in December 2025 for community governance. It’s becoming the cross-vendor standard for AI agents.
Why Starlette?
When the MCP team built the Python SDK, they needed an ASGI framework. They chose Starlette over Flask and Django. Here’s why.
The ASGI Advantage
Starlette is ASGI-native. Flask is WSGI-based: it accepts async views, but each request still occupies a worker for its full duration. Django is too heavy.
Starlette is also minimal - ~1.5K lines of core code vs Django’s 100K+. It gives you routing, middleware, and WebSocket support without the baggage.
The request lifecycle looks like this:
```
HTTP Request → Uvicorn (ASGI Server) → Starlette (Routing & Middleware) → MCP Handler → JSON-RPC Response
```
Uvicorn handles the raw HTTP socket. Starlette routes requests to the right handler and applies middleware. The MCP handler translates JSON-RPC into Python function calls.
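The whole lifecycle rests on one small interface: an ASGI app is just an async callable taking `scope`, `receive`, and `send`. Below is a minimal sketch of that contract using only the standard library. The `call_app` helper plays the role Uvicorn normally plays, and the "MCP handler" is a hypothetical stand-in that echoes the method name back; neither is how the SDK actually does it.

```python
import asyncio
import json


async def app(scope, receive, send):
    """A bare ASGI application: the interface Uvicorn calls and Starlette implements."""
    assert scope["type"] == "http"

    # Read the request body, which may arrive in several chunks.
    body = b""
    while True:
        message = await receive()
        body += message.get("body", b"")
        if not message.get("more_body", False):
            break

    request = json.loads(body)
    # Stand-in for the MCP handler: echo the method name back as the result.
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": request["id"], "result": {"method": request["method"]}}
    ).encode()

    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": payload})


async def call_app(request_body: bytes) -> list[dict]:
    """Play the role of the ASGI server for a single POST request."""
    sent: list[dict] = []
    chunks = [{"type": "http.request", "body": request_body, "more_body": False}]

    async def receive():
        return chunks.pop(0)

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "POST", "path": "/mcp", "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(
    call_app(b'{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}')
)
print(messages[0]["status"], messages[1]["body"].decode())
```

Everything Starlette adds (routing, middleware, request objects) is layered on top of these three arguments, which is why its footprint shows up in every stack trace.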
What Starlette Enables for MCP
1. HTTP/WebSocket Transport Layers
MCP needs to move JSON-RPC 2.0 messages between client and server. Starlette provides:
- HTTP endpoints for standard requests
- Server-Sent Events (SSE) for streaming
- Streamable HTTP (the new 2025 transport)
- WebSocket support for community implementations
2. Middleware System
This is where I hit my issue. Starlette’s middleware intercepts requests/responses for:
- CORS handling (critical for browser-based clients)
- Authentication (OAuth 2.1 resource servers)
- Debugging hooks
- Custom request processing
3. Routing & Mounting
You can run multiple MCP servers on one app:
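```python
import contextlib

from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

echo_mcp = FastMCP("EchoServer")
math_mcp = FastMCP("MathServer")


# Mounted apps don't run their own lifespan, so start each session manager here.
@contextlib.asynccontextmanager
async def lifespan(app: Starlette):
    async with contextlib.AsyncExitStack() as stack:
        await stack.enter_async_context(echo_mcp.session_manager.run())
        await stack.enter_async_context(math_mcp.session_manager.run())
        yield


routes = [
    Mount("/echo", app=echo_mcp.streamable_http_app()),  # served at /echo/mcp
    Mount("/math", app=math_mcp.streamable_http_app()),  # served at /math/mcp
]
app = Starlette(routes=routes, lifespan=lifespan)
```
4. Async/Await Native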
AI workloads are I/O-heavy. The model might call 10 tools concurrently. With Starlette’s async support, those calls don’t block each other.
Flask can’t do this efficiently. Django added async support later, but Starlette was designed for it from day one.
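Here is a small sketch of what "don't block each other" means in practice. `call_tool` is a hypothetical tool, and `asyncio.sleep` stands in for the network or disk wait a real tool would do. Ten calls that each wait 0.1 s finish in roughly the time of one, not the sum of all ten.

```python
import asyncio
import time


async def call_tool(name: str, delay: float) -> str:
    """Hypothetical tool call; asyncio.sleep stands in for network or disk I/O."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Ten calls that each wait 0.1 s run concurrently on one event loop.
    results = await asyncio.gather(
        *(call_tool(f"tool-{i}", 0.1) for i in range(10))
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(len(results), f"{elapsed:.2f}s")
```

Run sequentially, the same calls would take about a second. The benefit only holds while the tools actually await; a blocking call inside one of them stalls the whole loop.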
Transport Layers: SSE vs Streamable HTTP
MCP has evolved. Originally it used HTTP + SSE (Server-Sent Events). Now it recommends Streamable HTTP.
SSE (Legacy)
```python
from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

mcp = FastMCP("MyServer")

# Mount SSE transport
app = Starlette(routes=[Mount("/", app=mcp.sse_app())])
```
SSE keeps a connection open and pushes events. The problem: it’s stateful, so you can’t scale horizontally without session affinity.
Streamable HTTP (Recommended)
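```python
from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

# Stateless MCP server
mcp = FastMCP("MyServer", stateless_http=True, json_response=True)

# Mount Streamable HTTP; the endpoint is served at the app's default path, /mcp
app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    lifespan=lambda app: mcp.session_manager.run(),  # start the session manager
)
```
Streamable HTTP:
- Can run stateless (scale across multiple nodes)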
- Supports event stores for resumability
- Returns JSON or SSE formats
- Is the official recommendation as of 2025
The key difference is architectural. SSE ties a client to one server instance. Streamable HTTP lets any instance handle any request.
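A toy simulation shows why this matters behind a load balancer. Everything here is hypothetical and deliberately simplified: `StatefulInstance` keeps sessions in process memory the way a long-lived SSE connection does, `StatelessInstance` remembers nothing, and `round_robin` is a load balancer with no session affinity.

```python
import itertools


class StatefulInstance:
    """Keeps session data in process memory, like a long-lived SSE connection."""

    def __init__(self) -> None:
        self.sessions: dict[str, list[str]] = {}

    def open_session(self, session_id: str) -> None:
        self.sessions[session_id] = []

    def handle(self, session_id: str, message: str) -> str:
        if session_id not in self.sessions:
            return "error: unknown session"
        self.sessions[session_id].append(message)
        return f"ok: {message}"


class StatelessInstance:
    """Every request carries what it needs; nothing is remembered between calls."""

    def handle(self, session_id: str, message: str) -> str:
        return f"ok: {message}"


def round_robin(instances, session_id, messages):
    """Hypothetical load balancer without session affinity."""
    targets = itertools.cycle(instances)
    return [next(targets).handle(session_id, m) for m in messages]


stateful = [StatefulInstance(), StatefulInstance()]
stateful[0].open_session("abc")  # the session lives only on the first instance
stateful_results = round_robin(stateful, "abc", ["ping", "list", "call"])
print(stateful_results)

stateless = [StatelessInstance(), StatelessInstance()]
stateless_results = round_robin(stateless, "abc", ["ping", "list", "call"])
print(stateless_results)
```

The second request in the stateful run lands on an instance that has never seen the session and fails. With sticky sessions you can paper over that, but the stateless setup simply doesn't have the problem.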
The Middleware Issue I Hit
Here’s the error I was getting:
```
Browser Error: Cannot read header 'Mcp-Session-Id'
CORS violation: Response header not exposed
```
My Starlette app had CORS middleware:
```python
from starlette.middleware.cors import CORSMiddleware

app = CORSMiddleware(
    app,
    allow_origins=["*"],
    allow_methods=["GET", "POST", "DELETE"],
)
```
But I was missing the critical piece:
```python
app = CORSMiddleware(
    app,
    allow_origins=["*"],
    allow_methods=["GET", "POST", "DELETE"],
    expose_headers=["Mcp-Session-Id"],  # ← This was missing!
)
```
The browser client needs Mcp-Session-Id for session management. Without expose_headers, the CORS spec blocks JavaScript from reading it.
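The browser-side rule is easy to model. The sketch below is my own simplification, not browser code: `readable_headers` is a hypothetical helper, and it ignores the `*` wildcard form of `Access-Control-Expose-Headers`. The safelist is the set of CORS-safelisted response header names from the Fetch standard; anything else is hidden from scripts unless the server exposes it.

```python
# Response headers that scripts may always read on a cross-origin response
# (the CORS-safelisted response header names from the Fetch standard).
SAFELISTED = {
    "cache-control",
    "content-language",
    "content-length",
    "content-type",
    "expires",
    "last-modified",
    "pragma",
}


def readable_headers(response_headers: dict[str, str]) -> set[str]:
    """Return the header names a cross-origin script would be allowed to read."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    exposed = {
        name.strip().lower()
        for name in lowered.get("access-control-expose-headers", "").split(",")
        if name.strip()
    }
    return {name for name in lowered if name in SAFELISTED or name in exposed}


without_expose = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "*",
    "Mcp-Session-Id": "abc123",
}
with_expose = {**without_expose, "Access-Control-Expose-Headers": "Mcp-Session-Id"}

print("mcp-session-id" in readable_headers(without_expose))
print("mcp-session-id" in readable_headers(with_expose))
```

The header is on the wire in both cases, which is why it shows up in the network tab and still can't be read from JavaScript. That mismatch is what made the bug confusing.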
Common Middleware Pitfalls
1. Reading the Request Body
```python
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request


# WRONG: This breaks MCP
class LoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        body = await request.body()  # Consumes the stream!
        print(f"Body: {body}")
        return await call_next(request)


# CORRECT: Log request metadata and leave the body alone
class LoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        print(f"Path: {request.url.path}")
        response = await call_next(request)
        return response
```
When you consume request.body(), the MCP handler can’t read it anymore unless something buffers and replays the body, because ASGI request streams are one-pass.
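The one-pass behaviour is easy to reproduce without any framework. In this sketch, `make_receive`, `read_body`, and `replay` are hypothetical helpers built on the plain ASGI message format; the replay wrapper shows the general idea of buffering a body for a downstream reader, and is not a description of what Starlette does internally.

```python
import asyncio


def make_receive(body: bytes):
    """Build an ASGI-style receive() that delivers the body exactly once."""
    pending = [{"type": "http.request", "body": body, "more_body": False}]

    async def receive():
        if pending:
            return pending.pop(0)
        # Simplification: a real server would wait and then report a disconnect.
        return {"type": "http.disconnect"}

    return receive


async def read_body(receive) -> bytes:
    """Drain the request body the way a handler or middleware would."""
    body = b""
    while True:
        message = await receive()
        if message["type"] != "http.request":
            return body
        body += message.get("body", b"")
        if not message.get("more_body", False):
            return body


def replay(body: bytes):
    """Wrap an already-read body so a downstream consumer can read it again."""
    return make_receive(body)


async def main():
    receive = make_receive(b'{"method": "tools/list"}')
    first = await read_body(receive)  # middleware reads the body
    second = await read_body(receive)  # handler finds nothing left
    third = await read_body(replay(first))  # handler reads a replayed copy
    return first, second, third


first, second, third = asyncio.run(main())
print(first, second, third)
```

If you really need the body in middleware, the safe pattern is to read it once and hand a replaying `receive` to the inner app; otherwise, log metadata only.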
2. Not Using Async
```python
import asyncio
import time

from starlette.requests import Request


# WRONG: Blocks the event loop
def sync_middleware(request: Request, call_next):
    time.sleep(1)  # Blocks all requests!
    return call_next(request)


# CORRECT: Async middleware
async def async_middleware(request: Request, call_next):
    await asyncio.sleep(1)  # Only this request waits
    return await call_next(request)
```
3. Middleware Order
```python
app = Starlette(
    routes=[...],
    middleware=[
        # Runs first (outermost)
        Middleware(CORSMiddleware, ...),
        # Runs second
        Middleware(AuthMiddleware, ...),
        # Runs last (innermost)
        Middleware(LoggingMiddleware, ...),
    ],
)
```
Request flows inward, response flows outward. CORS should be outermost so it applies to everything.
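The inward/outward flow can be checked with a few lines of plain ASGI. This is a sketch under simplifying assumptions: `make_middleware` and `build_stack` are hypothetical helpers, and the wrapping loop mimics how a list of middleware is applied in reverse so the first entry ends up outermost. It leaves out the error-handling layers Starlette adds around your list.

```python
import asyncio

events: list[str] = []


def make_middleware(name: str):
    """Build a hypothetical pure-ASGI middleware class that records when it runs."""

    class Recorder:
        def __init__(self, app):
            self.app = app

        async def __call__(self, scope, receive, send):
            events.append(f"{name}: request")
            await self.app(scope, receive, send)
            events.append(f"{name}: response")

    return Recorder


async def handler(scope, receive, send):
    events.append("handler")


def build_stack(app, middleware: list):
    """Wrap in reverse so the first middleware in the list ends up outermost."""
    for cls in reversed(middleware):
        app = cls(app)
    return app


app = build_stack(
    handler,
    [make_middleware("cors"), make_middleware("auth"), make_middleware("logging")],
)
asyncio.run(app({"type": "http"}, None, None))
print(events)
```

The recorded order goes cors, auth, logging on the way in and logging, auth, cors on the way out, so a rejection produced by the auth layer still passes back through CORS and gets its headers.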
Building a Real MCP Server
Here’s a complete example with database connections:
```python
"""mcp_database_server.py"""
from contextlib import asynccontextmanager

import asyncpg
from mcp.server.fastmcp import Context, FastMCP
from mcp.server.session import ServerSession
from starlette.applications import Starlette
from starlette.routing import Mount


@asynccontextmanager
async def db_lifespan(server: FastMCP):
    # Startup: Connect to database
    conn = await asyncpg.connect("postgres://user:pass@localhost/db")
    try:
        yield {"conn": conn}
    finally:
        # Shutdown: Cleanup, even if the server exits with an error
        await conn.close()


mcp = FastMCP("PostgresServer", lifespan=db_lifespan, stateless_http=True)


@mcp.tool()
async def query(sql: str, ctx: Context[ServerSession, dict]) -> list[dict]:
    """Execute a SQL query"""
    conn = ctx.request_context.lifespan_context["conn"]
    rows = await conn.fetch(sql)
    return [dict(r) for r in rows]


@mcp.resource("config://settings")
def get_config() -> str:
    """Get server configuration"""
    return '{"version": "1.0", "max_rows": 1000}'


# Mount to Starlette; the endpoint is served at the app's default path, /mcp
app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    lifespan=lambda app: mcp.session_manager.run(),  # start the session manager
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```
Key points:
- lifespan manages database connections (startup/shutdown)
- ctx.request_context.lifespan_context accesses the connection
- @mcp.resource defines data sources
- @mcp.tool defines callable functions
Structured Output
You can return Pydantic models for automatic JSON schema generation:
```python
from pydantic import BaseModel


class WeatherData(BaseModel):
    temperature: float
    humidity: float
    condition: str


@mcp.tool()
def get_weather(city: str) -> WeatherData:
    """Returns structured weather data"""
    return WeatherData(temperature=22.5, humidity=65.0, condition="sunny")
```
The client receives both text and structured data:
```json
{
  "content": [{"type": "text", "text": "Weather: sunny, 22.5°C"}],
  "structuredContent": {
    "temperature": 22.5,
    "humidity": 65.0,
    "condition": "sunny"
  }
}
```
This lets the AI read the summary but also parse the structured data.
Debugging MCP Servers
Enable Debug Logging
```python
import logging

logging.basicConfig(level=logging.DEBUG)
mcp = FastMCP("DebugServer", debug=True)
```
Trace Request Flow
```python
from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.routing import Mount


class TracingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        print(f"→ {request.method} {request.url.path}")
        print(f"  Headers: {dict(request.headers)}")
        response = await call_next(request)
        print(f"← {response.status_code}")
        print(f"  Headers: {dict(response.headers)}")
        return response


app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],  # endpoint served at /mcp
    middleware=[Middleware(TracingMiddleware)],
    lifespan=lambda app: mcp.session_manager.run(),  # start the session manager
)
```
Common Stack Traces
Issue: ASGI Not Callable
```
TypeError: 'ASGIMiddleware' object is not callable
```
Fix: Ensure middleware is wrapped correctly:
```python
app = Starlette(middleware=[Middleware(YourMiddleware)])
```
Issue: Route Not Found
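```
404 Not Found: /mcp/
```
Fix: Check the mount path. The MCP app serves its endpoint at streamable_http_path (default /mcp) relative to wherever you mount it, so mounting at /api exposes /api/mcp. To serve the endpoint at the mount path itself, set the inner path to the root:
```python
mcp.settings.streamable_http_path = "/"  # set before calling streamable_http_app()
```
Issue: CORS Session ID Missing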
```
Browser client cannot read Mcp-Session-Id header
```
Fix: Add expose_headers:
```python
app = CORSMiddleware(
    app,
    allow_origins=["*"],
    expose_headers=["Mcp-Session-Id"],
)
```
Test with MCP Inspector
```bash
# Start server
uv run server.py

# Test with inspector (separate terminal)
npx -y @modelcontextprotocol/inspector
# Connect to http://localhost:8000/mcp
```
The Inspector lets you call tools and see responses without writing client code.
Client-Side Testing
```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client


async def test():
    async with streamable_http_client("http://localhost:8000/mcp") as (r, w, _):
        async with ClientSession(r, w) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(f"Tools: {[t.name for t in tools.tools]}")


asyncio.run(test())
```
Production Best Practices
1. Use Stateless Configuration
```python
mcp = FastMCP(
    "ProductionServer",
    stateless_http=True,  # Horizontal scaling
    json_response=True,  # Better parsing
)
```
2. Secure CORS
```python
CORSMiddleware(
    app,
    allow_origins=["https://yourdomain.com"],  # Not "*"
    allow_methods=["GET", "POST", "DELETE"],
    expose_headers=["Mcp-Session-Id"],
    allow_headers=["Authorization", "Content-Type"],
)
```
3. Add Health Checks
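```python
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route


# Starlette has no @app.get decorator; register the handler as a Route instead.
async def health_check(request: Request) -> JSONResponse:
    return JSONResponse({"status": "healthy", "mcp": "ready"})


health_route = Route("/health", health_check)  # add to Starlette(routes=[...])
```
4. Deployment Options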
```bash
# Option 1: Uvicorn (recommended)
uvicorn server:app --host 0.0.0.0 --port 8000

# Option 2: Gunicorn + Uvicorn workers
gunicorn server:app -w 4 -k uvicorn.workers.UvicornWorker
```
```dockerfile
# Option 3: Docker
FROM python:3.12
WORKDIR /app
RUN pip install "mcp[cli]"
# Copy the server module so uvicorn can import server:app
COPY server.py .
CMD ["uvicorn", "server:app", "--host", "0.0.0.0"]
```
Real-World Examples
GitHub MCP Server (Official)
Repository: anthropics/mcp-server-git
Tools: read_file, write_file, list_directory
This is what Claude Code uses for file operations. It mounts via Starlette’s Streamable HTTP transport.
Browser MCP Server
Repository: modelcontextprotocol/mcp-server-browser
Tools: browse (web scraping), screenshot
Uses Starlette middleware for custom session management across browser instances.
Database MCP Server
```python
@asynccontextmanager
async def db_lifespan(server):
    conn = await asyncpg.connect("postgres://...")
    yield {"conn": conn}
    await conn.close()


@mcp.tool()
async def query(sql: str, ctx: Context) -> list[dict]:
    conn = ctx.request_context.lifespan_context["conn"]
    rows = await conn.fetch(sql)
    return [dict(r) for r in rows]
```
Pattern: Use lifespan for connection pooling, access via context.
File System MCP Server
```python
import aiofiles


@mcp.tool()
async def read_file(path: str) -> str:
    """Read file contents"""
    async with aiofiles.open(path, "r") as f:
        return await f.read()


@mcp.resource("file://{path}")
async def file_resource(path: str) -> str:
    return await read_file(path)
```
Pattern: Use aiofiles for async file I/O, resources for data access.
Why Starlette Will Keep Dominating
Looking at 2026 trends:
- MCP everywhere - Claude Desktop, VS Code, Cursor, Windsurf all use it
- Standardization - OpenAI, Google, Microsoft adopted MCP
- Ecosystem growth - 1000+ community MCP servers
Starlette is positioned perfectly:
- Async-first: Matches AI’s concurrent tool call pattern
- Minimal: Low overhead for microservices
- Battle-tested: 10M+ downloads/day, production proven
- Community: FastAPI (50K+ GitHub stars) is built on it
The MCP protocol will keep evolving (OAuth 2.1 auth, structured output, Tasks), but the ASGI foundation stays the same.
What I Learned
- Starlette is everywhere in the AI/LLM stack, even when you don’t see it
- Middleware order matters - CORS goes outermost
- Stateless is better - Streamable HTTP over SSE for production
- Context propagation - Use lifespan_context for database connections
- Debug with Inspector - Don’t write custom clients for testing
The middleware issue I hit? Fixed in 5 minutes once I understood the request flow. The invisible infrastructure wasn’t so invisible after all.
Build Your First MCP Server
```bash
pip install mcp[cli]
uv run mcp create my-server
cd my-server
uv run dev.py
```
Then connect with npx @modelcontextprotocol/inspector and start building tools.
Final Words + More Resources
My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to contact me, you can reach me by email: Email me
Here are the most important links from this article, along with some further resources on this topic:
- 👨💻 MCP Python SDK
- 👨💻 Model Context Protocol Documentation
- 👨💻 Starlette Framework
- 👨💻 FastAPI Architecture Core
- 👨💻 MCP Complete Beginner’s Guide (in Chinese)
- 👨💻 MCP Protocol Updates Explained (in Chinese)
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!