How Starlette Powers MCP Servers: The Invisible Infrastructure Behind AI's USB Protocol

Feb 27, 2026

Problem

Last month I started building my first MCP server. I hit a weird middleware issue - my CORS headers weren’t being passed correctly, and the browser client couldn’t read the Mcp-Session-Id header.

When I dug into the stack trace, I saw Starlette everywhere. I thought I was just using the MCP Python SDK, but it turned out Starlette was doing all the heavy lifting.

“Wild how much invisible infrastructure this thing powers,” as someone on Reddit put it.

If you’re building AI tools with Claude, GitHub Copilot, or custom LLM apps, you’re probably using Starlette without realizing it. The MCP Python SDK (21.9k GitHub stars) is built on top of it.

This post explains why MCP chose Starlette, how the architecture works, and how to debug middleware issues like the one I hit.

What is MCP?

MCP (Model Context Protocol) is Anthropic’s open standard for connecting LLMs to external tools and data. Think of it as a “USB interface for AI.”

Before MCP, powerful models were trapped - they could generate text but couldn’t access files, APIs, or terminals. They were like super-smart assistants locked in a room with no phone or internet.

MCP solves this with three core abstractions:

Tools: Functions the model can call (API calls, database queries, file operations)
Resources: Data sources the app controls (file contents, API responses)
Prompts: User-controlled templates (slash commands, reusable interactions)

The architecture follows a Client-Host-Server pattern:

┌─────────────────┐
│  AI Application │ (Claude, Copilot, etc.)
└────────┬────────┘
         │ host embeds one client per server
┌────────▼────────┐
│  MCP Client     │
└────────┬────────┘
         │ JSON-RPC 2.0 over HTTP+SSE / Streamable HTTP
┌────────▼────────┐
│  MCP Server     │ ← This is where Starlette lives
│  (Starlette)    │
└────────┬────────┘
         │
┌────────▼────────┐
│  Tools/Resources│
└─────────────────┘

Why does this matter in 2026? MCP isn’t just Anthropic anymore. OpenAI adopted it in March 2025, Google and Microsoft followed, and it was donated to the Agentic AI Foundation (a Linux Foundation fund) in December 2025 for community governance. It’s becoming the cross-vendor standard for AI agents.

Why Starlette?

When the MCP team built the Python SDK, they needed an ASGI framework. They chose Starlette over Flask and Django. Here’s why.

The ASGI Advantage

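Starlette is ASGI-native. Flask is WSGI: it can run async views, but every request still occupies a worker, so long-lived streaming connections fit it poorly. Django speaks ASGI too, but it brings an ORM, an admin, and a template layer that an MCP server doesn’t need.

Starlette is also minimal - a small core next to Django’s far larger codebase. It gives you routing, middleware, and WebSocket support without the baggage.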

The request lifecycle looks like this:

HTTP Request → Uvicorn (ASGI Server) → Starlette (Routing & Middleware)
→ MCP Handler → JSON-RPC Response

Uvicorn handles the raw HTTP socket. Starlette routes requests to the right handler and applies middleware. The MCP handler translates JSON-RPC into Python function calls.
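To make that last step concrete, here is a minimal stdlib-only sketch of turning a JSON-RPC 2.0 tools/call request into a Python function call. The TOOLS registry and handle_jsonrpc are hypothetical names for illustration; the real SDK does far more (validation, sessions, notifications).

```python
import json

# Hypothetical tool registry; the SDK builds its own from @mcp.tool() functions.
TOOLS = {
    "add": lambda a, b: a + b,
}

def handle_jsonrpc(raw: str) -> str:
    """Turn one JSON-RPC 2.0 request into a Python call and a JSON-RPC response."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        error = {"code": -32602, "message": f"Unknown tool: {params['name']}"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})
    # The "arguments" object maps directly onto keyword arguments.
    value = tool(**params.get("arguments", {}))
    result = {"content": [{"type": "text", "text": str(value)}]}
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})

request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
})
response = json.loads(handle_jsonrpc(request))
print(response["result"]["content"][0]["text"])
```

The id is echoed back so the client can match a response to its request, which is what lets several calls be in flight at once.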

What Starlette Enables for MCP

1. HTTP/WebSocket Transport Layers

MCP needs to move JSON-RPC 2.0 messages between client and server. Starlette provides:

HTTP endpoints for standard requests
Server-Sent Events (SSE) for streaming
Streamable HTTP (the new 2025 transport)
WebSocket support for community implementations

2. Middleware System

This is where I hit my issue. Starlette’s middleware intercepts requests/responses for:

CORS handling (critical for browser-based clients)
Authentication (OAuth 2.1 resource servers)
Debugging hooks
Custom request processing

3. Routing & Mounting

You can run multiple MCP servers on one app:

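import contextlib
from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

echo_mcp = FastMCP("EchoServer")
math_mcp = FastMCP("MathServer")

# Mounted sub-apps don't run their own lifespan, so start both session managers here
@contextlib.asynccontextmanager
async def lifespan(app: Starlette):
    async with contextlib.AsyncExitStack() as stack:
        await stack.enter_async_context(echo_mcp.session_manager.run())
        await stack.enter_async_context(math_mcp.session_manager.run())
        yield

# Endpoints end up at /echo/mcp and /math/mcp (mount prefix + default /mcp path)
app = Starlette(routes=[
    Mount("/echo", app=echo_mcp.streamable_http_app()),
    Mount("/math", app=math_mcp.streamable_http_app()),
], lifespan=lifespan)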

4. Async/Await Native

AI workloads are I/O-heavy. The model might call 10 tools concurrently. With Starlette’s async support, those calls don’t block each other.

Flask can’t do this efficiently. Django added async support later, but Starlette was designed for it from day one.
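A quick way to see the difference is to time ten simulated tool calls run concurrently. This is a stdlib-only sketch: fake_tool is a hypothetical stand-in that sleeps instead of doing real I/O.

```python
import asyncio
import time

async def fake_tool(n: int) -> int:
    """Stand-in for an I/O-bound tool call (hypothetical; sleeps instead of doing I/O)."""
    await asyncio.sleep(0.1)
    return n * 2

async def run_tools() -> tuple[list[int], float]:
    start = time.perf_counter()
    # All ten calls wait at the same time instead of one after another.
    results = await asyncio.gather(*(fake_tool(n) for n in range(10)))
    return list(results), time.perf_counter() - start

results, elapsed = asyncio.run(run_tools())
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Run sequentially, the same ten calls would take about a second; gathered, they finish in roughly the time of one.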

Transport Layers: SSE vs Streamable HTTP

MCP has evolved. Originally it used HTTP + SSE (Server-Sent Events). Now it recommends Streamable HTTP.

SSE (Legacy)

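from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

mcp = FastMCP("MyServer")

# Mount the legacy SSE transport at the root
app = Starlette(routes=[Mount("/", app=mcp.sse_app())])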

SSE keeps a connection open and pushes events. The problem is that it’s stateful: you can’t scale horizontally without session affinity (sticky sessions).

Streamable HTTP (Recommended)

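from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

# Stateless MCP server
mcp = FastMCP("MyServer", stateless_http=True, json_response=True)

# Mount at the root: the app already serves at /mcp by default, so Mount("/mcp")
# would put the endpoint at /mcp/mcp. The parent app also has to run the
# session manager, because mounted apps don't run their own lifespan.
app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    lifespan=lambda app: mcp.session_manager.run(),
)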

Streamable HTTP:

Can run stateless (scale across multiple nodes)
Supports event stores for resumability
Returns JSON or SSE formats
Is the official recommendation as of 2025

The key difference is architectural. SSE ties a client to one server instance. Streamable HTTP lets any instance handle any request.
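Here is a toy model of that difference, assuming a round-robin load balancer in front of two instances that each keep sessions in memory. The Instance class and its methods are hypothetical and only mimic the behavior; they are not SDK code.

```python
from itertools import cycle

class Instance:
    """One server process with its own in-memory session table (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions = set()

    def open_session(self, session_id: str) -> None:
        self.sessions.add(session_id)

    def handle_stateful(self, session_id: str) -> str:
        # A stateful transport only works on the instance that created the session.
        if session_id not in self.sessions:
            return f"{self.name}: 404 unknown session"
        return f"{self.name}: ok"

    def handle_stateless(self, payload: str) -> str:
        # A stateless request carries everything the server needs.
        return f"{self.name}: ok"

instances = [Instance("a"), Instance("b")]
instances[0].open_session("s1")  # the session lives only on instance "a"

round_robin = cycle(instances)
stateful = [next(round_robin).handle_stateful("s1") for _ in range(2)]
round_robin = cycle(instances)
stateless = [next(round_robin).handle_stateless("ping") for _ in range(2)]
print(stateful)
print(stateless)
```

The second stateful request lands on instance "b" and fails, which is exactly what sticky sessions paper over.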

The Middleware Issue I Hit

Here’s the error I was getting:

Browser Error: Cannot read header 'Mcp-Session-Id'
CORS violation: Response header not exposed

My Starlette app had CORS middleware:

from starlette.middleware.cors import CORSMiddleware

app = CORSMiddleware(
    app,
    allow_origins=["*"],
    allow_methods=["GET", "POST", "DELETE"],
)

But I was missing the critical piece:

app = CORSMiddleware(
    app,
    allow_origins=["*"],
    allow_methods=["GET", "POST", "DELETE"],
    expose_headers=["Mcp-Session-Id"],  # ← This was missing!
)

The browser client needs Mcp-Session-Id for session management. Without expose_headers, the CORS spec blocks JavaScript from reading it.
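The rule is easy to model. This stdlib-only sketch is a simplified version of what the browser does (it ignores the "*" wildcard and credentials); readable_headers is a hypothetical helper, not a Starlette API.

```python
# CORS-safelisted response headers that cross-origin scripts may always read.
SAFELISTED = {
    "cache-control", "content-language", "content-length",
    "content-type", "expires", "last-modified", "pragma",
}

def readable_headers(response_headers: dict) -> set:
    """Return the header names a cross-origin script could read (simplified model)."""
    exposed_raw = ""
    for name, value in response_headers.items():
        if name.lower() == "access-control-expose-headers":
            exposed_raw = value
    exposed = {h.strip().lower() for h in exposed_raw.split(",") if h.strip()}
    return {
        name.lower() for name in response_headers
        if name.lower() in SAFELISTED or name.lower() in exposed
    }

before = {"Content-Type": "application/json", "Mcp-Session-Id": "abc123"}
after = {**before, "Access-Control-Expose-Headers": "Mcp-Session-Id"}

print("mcp-session-id" in readable_headers(before))
print("mcp-session-id" in readable_headers(after))
```

The header is on the wire in both cases; the only thing that changes is whether JavaScript is allowed to see it.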

Common Middleware Pitfalls

1. Reading the Request Body

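from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request

# WRONG: Reads the body before the MCP handler gets to it
class LoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        body = await request.body()  # Consumes the stream!
        print(f"Body: {body}")
        return await call_next(request)

# CORRECT: Log request metadata and leave the body alone
class LoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        print(f"Path: {request.url.path}")
        response = await call_next(request)
        return response

The ASGI request body is a one-pass stream: once middleware drains it, the MCP handler has nothing left to read unless something replays it. Recent Starlette releases cache the body inside BaseHTTPMiddleware and pass it on, but older releases and pure ASGI middleware don’t, so the safe default is to leave the body alone in middleware.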
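If you really do need the body in middleware, one option is to buffer it and replay it to the downstream app. Below is a stdlib-only sketch of that idea as pure ASGI middleware. BodyLoggingMiddleware and echo_app are hypothetical names, and the seen list exists only to make the sketch easy to check.

```python
import asyncio

class BodyLoggingMiddleware:
    """Pure ASGI middleware that captures the request body, then replays it downstream."""

    def __init__(self, app):
        self.app = app
        self.seen = []  # captured bodies, kept only so the sketch is testable

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Drain the one-pass stream into memory.
        chunks = []
        more_body = True
        while more_body:
            message = await receive()
            if message["type"] != "http.request":
                break  # e.g. http.disconnect
            chunks.append(message.get("body", b""))
            more_body = message.get("more_body", False)
        body = b"".join(chunks)
        self.seen.append(body)

        replayed = False

        async def replay_receive():
            # Hand the buffered body to the downstream app exactly once.
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": body, "more_body": False}
            return await receive()

        await self.app(scope, replay_receive, send)

async def echo_app(scope, receive, send):
    """Hypothetical downstream handler standing in for the MCP endpoint."""
    message = await receive()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": message["body"]})

async def demo():
    incoming = [
        {"type": "http.request", "body": b'{"jsonrpc":', "more_body": True},
        {"type": "http.request", "body": b' "2.0"}', "more_body": False},
    ]
    sent = []

    async def receive():
        return incoming.pop(0)

    async def send(message):
        sent.append(message)

    middleware = BodyLoggingMiddleware(echo_app)
    await middleware({"type": "http"}, receive, send)
    return middleware.seen, sent

seen, sent = asyncio.run(demo())
print(sent[-1]["body"])
```

The trade-off is memory: buffering gives up streaming for the request, which is usually fine for small JSON-RPC payloads and a bad idea for large uploads.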

2. Not Using Async

import asyncio
import time

from starlette.requests import Request

# WRONG: Blocks the event loop
def sync_middleware(request: Request, call_next):
    time.sleep(1)  # Blocks all requests!
    return call_next(request)  # Also returns a coroutine nobody awaited

# CORRECT: Async middleware
async def async_middleware(request: Request, call_next):
    await asyncio.sleep(1)  # Only this request waits
    return await call_next(request)

3. Middleware Order

app = Starlette(
    routes=[...],
    middleware=[
        # Runs first (outermost)
        Middleware(CORSMiddleware, ...),
        # Runs second
        Middleware(AuthMiddleware, ...),
        # Runs last (innermost)
        Middleware(LoggingMiddleware, ...),
    ]
)

Request flows inward, response flows outward. CORS should be outermost so it applies to everything.
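The onion is easier to trust once you see it run. This stdlib-only sketch wraps a dummy endpoint with three hypothetical layers in the same order as the middleware list above and records who runs when.

```python
import asyncio

calls = []

def make_layer(name):
    """Build a hypothetical ASGI-style wrapper that records when it runs."""
    def wrap(app):
        async def layer(scope, receive, send):
            calls.append(f"{name}: request")
            await app(scope, receive, send)
            calls.append(f"{name}: response")
        return layer
    return wrap

async def endpoint(scope, receive, send):
    calls.append("endpoint")

# Same order as the middleware=[...] list: the first entry is outermost.
layers = [make_layer("cors"), make_layer("auth"), make_layer("logging")]

app = endpoint
for wrap in reversed(layers):  # wrap from the inside out
    app = wrap(app)

asyncio.run(app({"type": "http"}, None, None))
print(calls)
```

Because CORS sees the request first and the response last, it can still add headers to an error response produced by the auth layer.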

Building a Real MCP Server

Here’s a complete example with database connections:

"""mcp_database_server.py"""
import asyncpg
from contextlib import asynccontextmanager
from mcp.server.fastmcp import FastMCP, Context
from mcp.server.session import ServerSession
from starlette.applications import Starlette
from starlette.routing import Mount

@asynccontextmanager
async def db_lifespan(server: FastMCP):
    # Startup: Connect to database
    conn = await asyncpg.connect("postgres://user:pass@localhost/db")
    try:
        yield {"conn": conn}
    finally:
        # Shutdown: Close the connection even if the server exits with an error
        await conn.close()

mcp = FastMCP("PostgresServer", lifespan=db_lifespan, stateless_http=True)

@mcp.tool()
async def query(sql: str, ctx: Context[ServerSession, dict]) -> list[dict]:
    """Execute a SQL query"""
    conn = ctx.request_context.lifespan_context["conn"]
    rows = await conn.fetch(sql)
    return [dict(r) for r in rows]

@mcp.resource("config://settings")
def get_config() -> str:
    """Get server configuration"""
    return '{"version": "1.0", "max_rows": 1000}'

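# Mount at the root: the MCP app already serves at /mcp, and the parent app
# has to run the session manager because mounted apps skip their own lifespan
app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    lifespan=lambda app: mcp.session_manager.run(),
)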

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

Key points:

lifespan manages database connections (startup/shutdown)
ctx.request_context.lifespan_context accesses the connection
@mcp.resource defines data sources
@mcp.tool defines callable functions
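The lifespan pattern itself is plain Python and can be tried without a database. In this stdlib-only sketch, FakeConnection is a hypothetical stand-in for a real connection, and the yielded dict plays the role of lifespan_context.

```python
import asyncio
from contextlib import asynccontextmanager

events = []

class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    async def fetch(self, sql: str) -> list:
        events.append(f"query: {sql}")
        return [{"answer": 42}]

    async def close(self) -> None:
        events.append("closed")

@asynccontextmanager
async def lifespan(server):
    events.append("connected")  # startup: runs before the yield
    conn = FakeConnection()
    try:
        yield {"conn": conn}  # handlers read the connection from this dict
    finally:
        await conn.close()  # shutdown: runs even if a handler raised

async def main():
    async with lifespan(server=None) as context:
        return await context["conn"].fetch("SELECT 42 AS answer")

rows = asyncio.run(main())
print(events)
```

Wrapping the yield in try/finally is the part that is easy to forget: without it, an exception during the server's lifetime would skip the cleanup.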

Structured Output

You can return Pydantic models for automatic JSON schema generation:

from pydantic import BaseModel

class WeatherData(BaseModel):
    temperature: float
    humidity: float
    condition: str

@mcp.tool()
def get_weather(city: str) -> WeatherData:
    """Returns structured weather data"""
    return WeatherData(
        temperature=22.5,
        humidity=65.0,
        condition="sunny"
    )

The client receives both text and structured data:

{
  "content": [{"type": "text", "text": "Weather: sunny, 22.5°C"}],
  "structuredContent": {
    "temperature": 22.5,
    "humidity": 65.0,
    "condition": "sunny"
  }
}

This lets the AI read the summary but also parse the structured data.

Debugging MCP Servers

Enable Debug Logging

import logging

from mcp.server.fastmcp import FastMCP

logging.basicConfig(level=logging.DEBUG)
mcp = FastMCP("DebugServer", debug=True)

Trace Request Flow

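from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.routing import Mount

class TracingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # Log the request on the way in
        print(f"→ {request.method} {request.url.path}")
        print(f"  Headers: {dict(request.headers)}")
        response = await call_next(request)
        # Log the response on the way out
        print(f"← {response.status_code}")
        print(f"  Headers: {dict(response.headers)}")
        return response

# Mounted at the root so the endpoint stays at /mcp; the parent app runs
# the session manager because mounted apps skip their own lifespan
app = Starlette(
    routes=[Mount("/", app=mcp.streamable_http_app())],
    middleware=[Middleware(TracingMiddleware)],
    lifespan=lambda app: mcp.session_manager.run(),
)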
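Printing every header is handy locally, but it also prints bearer tokens and session IDs. A small helper like the one below can mask them before they hit the logs. redact_headers and the SENSITIVE set are hypothetical; adjust the list to whatever your deployment treats as secret.

```python
# Header names whose values should never end up in logs (an assumed list).
SENSITIVE = {"authorization", "cookie", "set-cookie", "mcp-session-id"}

def redact_headers(headers: dict) -> dict:
    """Return a copy of the headers that is safe to print."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE else value
        for name, value in headers.items()
    }

print(redact_headers({"Authorization": "Bearer secret", "Accept": "application/json"}))
```

In the tracing middleware you would print redact_headers(dict(request.headers)) instead of the raw dict.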

Common Stack Traces

Issue: ASGI Not Callable

TypeError: 'ASGIMiddleware' object is not callable

Fix: Ensure middleware is wrapped correctly:

app = Starlette(middleware=[Middleware(YourMiddleware)])

Issue: Route Not Found

404 Not Found: /mcp/

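Fix: Check how the mount path and the server’s own path combine. streamable_http_app() serves at /mcp by default, so mounting it under /api puts the endpoint at /api/mcp, and mounting it under /mcp puts it at /mcp/mcp. To serve exactly at the mount path, set the inner path to the root before calling streamable_http_app():

mcp.settings.streamable_http_path = "/"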

Issue: CORS Session ID Missing

Browser client cannot read Mcp-Session-Id header

Fix: Add expose_headers:

app = CORSMiddleware(
    app,  # the ASGI app to wrap
    allow_origins=["*"],
    expose_headers=["Mcp-Session-Id"],
)

Test with MCP Inspector

# Start server
uv run server.py

# Test with inspector (separate terminal)
npx -y @modelcontextprotocol/inspector
# Connect to http://localhost:8000/mcp

The Inspector lets you call tools and see responses without writing client code.

Client-Side Testing

import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client

async def test():
    async with streamable_http_client("http://localhost:8000/mcp") as (r, w, _):
        async with ClientSession(r, w) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(f"Tools: {[t.name for t in tools.tools]}")

asyncio.run(test())

Production Best Practices

1. Use Stateless Configuration

mcp = FastMCP(
    "ProductionServer",
    stateless_http=True,      # Horizontal scaling
    json_response=True         # Better parsing
)

2. Secure CORS

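CORSMiddleware(
    app,
    allow_origins=["https://yourdomain.com"],  # Not "*"
    allow_methods=["GET", "POST", "DELETE"],
    expose_headers=["Mcp-Session-Id"],
    # Browser clients send the MCP headers too, so allow them in preflight
    allow_headers=["Authorization", "Content-Type", "Mcp-Session-Id", "Mcp-Protocol-Version"],
)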

3. Add Health Checks

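from starlette.responses import JSONResponse
from starlette.routing import Route

async def health_check(request):
    return JSONResponse({"status": "healthy", "mcp": "ready"})

# Plain Starlette has no @app.get decorator (that is FastAPI's API).
# Add this to routes=[...], before any Mount("/"), so it matches first.
health_route = Route("/health", health_check, methods=["GET"])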

4. Deployment Options

# Option 1: Uvicorn (recommended)
uvicorn server:app --host 0.0.0.0 --port 8000

# Option 2: Gunicorn + Uvicorn workers
gunicorn server:app -w 4 -k uvicorn.workers.UvicornWorker

# Option 3: Docker (contents of the Dockerfile)
FROM python:3.12
WORKDIR /app
RUN pip install "mcp[cli]"
COPY server.py .
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]

Real-World Examples

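Git MCP Server (Reference)

Repository: modelcontextprotocol/servers (src/git, published as mcp-server-git)

Tools: git_status, git_diff, git_commit, git_log, and other Git operations

This is a Python reference server for working with Git repositories. It runs over stdio by default, so Starlette only enters the picture when you expose a server like it over HTTP.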

Browser MCP Server

Repository: modelcontextprotocol/mcp-server-browser

Tools: browse (web scraping), screenshot

Uses Starlette middleware for custom session management across browser instances.

Database MCP Server

@asynccontextmanager
async def db_lifespan(server):
    conn = await asyncpg.connect("postgres://...")
    yield {"conn": conn}
    await conn.close()

@mcp.tool()
async def query(sql: str, ctx: Context) -> list[dict]:
    conn = ctx.request_context.lifespan_context["conn"]
    rows = await conn.fetch(sql)
    return [dict(r) for r in rows]

Pattern: Use lifespan to own the database connection, access it via context. For real pooling, swap asyncpg.connect for asyncpg.create_pool.

File System MCP Server

import aiofiles

@mcp.tool()
async def read_file(path: str) -> str:
    """Read file contents"""
    async with aiofiles.open(path, "r") as f:
        return await f.read()

@mcp.resource("file://{path}")
async def file_resource(path: str) -> str:
    return await read_file(path)

Pattern: Use aiofiles for async file I/O, resources for data access.
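One caveat with a read_file tool: the path comes from the model, so it is worth pinning reads to a base directory. The sketch below uses only the stdlib; resolve_inside is a hypothetical helper, and the base directory is an assumption you would set for your own server.

```python
import tempfile
from pathlib import Path

def resolve_inside(base_dir: Path, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, rejecting paths that escape it."""
    base = base_dir.resolve()
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise PermissionError(f"{requested!r} is outside {base}")
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("hello")
    print(resolve_inside(root, "notes.txt").read_text())
    try:
        resolve_inside(root, "../../etc/passwd")
    except PermissionError:
        print("blocked")
```

Inside the tool you would call resolve_inside(BASE_DIR, path) first and open only the path it returns.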

Why Starlette Will Keep Dominating

Looking at 2026 trends:

MCP everywhere - Claude Desktop, VS Code, Cursor, Windsurf all use it
Standardization - OpenAI, Google, Microsoft adopted MCP
Ecosystem growth - 1000+ community MCP servers

Starlette is positioned perfectly:

Async-first: Matches AI’s concurrent tool call pattern
Minimal: Low overhead for microservices
Battle-tested: 10M+ downloads/day, production proven
Community: FastAPI (50K+ GitHub stars) is built on it

The MCP protocol will keep evolving (OAuth 2.1 auth, structured output, Tasks), but the ASGI foundation stays the same.

What I Learned

Starlette is everywhere in the AI/LLM stack, even when you don’t see it
Middleware order matters - CORS goes outermost
Stateless is better - Streamable HTTP over SSE for production
Context propagation - Use lifespan_context for database connections
Debug with Inspector - Don’t write custom clients for testing

The middleware issue I hit? Fixed in 5 minutes once I understood the request flow. The invisible infrastructure wasn’t so invisible after all.

Build Your First MCP Server

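uv init my-server
cd my-server
uv add "mcp[cli]"
# Write your tools in server.py, then:
uv run mcp dev server.py

mcp dev starts the server together with the MCP Inspector; you can also launch the Inspector yourself with npx @modelcontextprotocol/inspector and start building tools.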

Final Words + More Resources

I wrote this article to share what I learned and, hopefully, to save you some debugging time. If you want to get in touch, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 MCP Python SDK
👨‍💻 Model Context Protocol Documentation
👨‍💻 Starlette Framework
👨‍💻 FastAPI Architecture Core
👨‍💻 The Complete Beginner’s Guide to MCP (in Chinese)
👨‍💻 MCP Protocol Updates Explained (in Chinese)

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!