Skip to content

How to Stop MCP Servers From Eating Your Claude Context Tokens?

Why is my fresh context size at 100k tokens before I’ve even asked a question?

I stared at my Claude Code setup, confused. I’d connected 15 MCP servers—everything from file system access to web search, database connectors to git tools. Each one seemed useful. Each one felt necessary. But together? They were eating my context window alive.

Here’s what I learned about MCP server token consumption and how I fixed it.

The Problem: Tools in Your System Prompt

Every MCP server you connect loads all its tool definitions into Claude’s system prompt. Not when you use them—before you even start working.

Let me show you what this looks like. Here’s a simplified version of what gets injected:

System Prompt Injection
You have access to the following tools:
mcp_filesystem__read_file - Read the complete contents of a file...
mcp_filesystem__write_file - Write content to a file...
mcp_filesystem__list_directory - List all files in a directory...
mcp_filesystem__create_directory - Create a new directory...
mcp_filesystem__search_files - Search for files matching a pattern...
mcp_github__create_issue - Create a new GitHub issue...
mcp_github__create_pull_request - Create a pull request...
mcp_github__list_repositories - List all repositories...
mcp_postgres__query - Execute a SQL query...
mcp_postgres__list_tables - List all tables in database...
[...and so on for every tool in every server]

Each tool definition includes:

  • Tool name
  • Description
  • Parameter schema (often verbose JSON)
  • Usage examples in some cases

When I connected 15 servers, I was loading hundreds of tool definitions before typing a single word.

My Token Audit

I decided to measure the actual impact. Here’s what I found:

ConfigurationServersEst. Token CostUsable Context
Maximum Load15~100k tokensSeverely limited
Moderate8~50k tokensReduced
Lean6~30k tokensHealthy
Minimal3~15k tokensOptimal

The math is brutal. If you have a 200k context window and burn 100k on tool definitions, you’ve already lost half your capacity.

As one Reddit user put it:

“MCP eats tons of tokens for things that should be bash scripts.”

They were right. I had MCP servers for operations that could be done with simple shell commands.

Solution 1: The 3-6 Server Rule

I applied a simple constraint: only keep servers that provide genuine value beyond what bash can do.

Server Selection Criteria
KEEP if:
- Requires complex authentication (GitHub API)
- Needs persistent state (database connections)
- Provides structured data (web search APIs)
- Has non-trivial logic (file watchers, linters)
REMOVE if:
- Simple file operations (use bash)
- Basic HTTP calls (use curl)
- One-off operations (write a script)
- Redundant with other tools

My final list:

  1. context7 - Official documentation search (unique value)
  2. github - PR/issue management (complex auth)
  3. postgres - Database queries (persistent connection)
  4. filesystem - Only because I need frequent file access

Four servers. That’s it. Everything else? Shell scripts.

Solution 2: Shell Script Wrappers

For operations that don’t need MCP complexity, I created simple bash scripts.

~/bin/web-lookup
#!/bin/bash
# Simple web search using Serper API
query="$1"
curl -s -X POST "https://google.serper.dev/search" \
-H "X-API-KEY: $SERPER_API_KEY" \
-H "Content-Type: application/json" \
-d "{\"q\": \"$query\"}" | jq '.organic[0:5]'
~/bin/quick-curl
#!/bin/bash
# Wrapper for common API calls
url="$1"
curl -s "$url" | jq .

These scripts exist in a directory structure that Claude can discover:

Progressive Tool Disclosure
~/bin/
├── web-lookup # Search the web
├── quick-curl # Simple HTTP GET
├── json-extract # jq wrapper for JSON
└── date-utils # Date formatting helpers

The beauty? Claude discovers these tools through the filesystem MCP or bash exploration, not through system prompt injection. Progressive disclosure instead of upfront loading.

Solution 3: Selective Enable/Disable

Sometimes I need specific tools for specific tasks. Instead of keeping everything loaded, I toggle servers on and off.

MCP Server Toggle Script
#!/bin/bash
# mcp-toggle.sh - Enable/disable MCP servers
config_file="$HOME/.claude/config.json"
enable_server() {
jq ".mcpServers.$1.disabled = false" "$config_file" > tmp.json
mv tmp.json "$config_file"
echo "Enabled: $1"
}
disable_server() {
jq ".mcpServers.$1.disabled = true" "$config_file" > tmp.json
mv tmp.json "$config_file"
echo "Disabled: $1"
}
case "$1" in
enable) enable_server "$2" ;;
disable) disable_server "$2" ;;
list) jq '.mcpServers | keys[]' "$config_file" ;;
*) echo "Usage: mcp-toggle {enable|disable|list} [server-name]" ;;
esac

This approach mirrors what desktop users do:

“In Claude Desktop, I’ll use one or two MCP servers and selectively turn them on and off.”

The key insight: load tools when you need them, not because you might need them.

The Result

After implementing these changes:

MetricBeforeAfter
MCP Servers154
Estimated Token Burn~100k~25k
Shell Scripts Added08
Context AvailableLimitedAbundant

More importantly, Claude works better. With fewer tools to choose from, it makes faster decisions. The signal-to-noise ratio improved dramatically.

When to Keep an MCP Server

Not all MCP servers are wasteful. Keep servers that:

  1. Manage state - Database connections, file watchers
  2. Handle auth - OAuth flows, API keys you don’t want in shell history
  3. Transform data - Complex parsing, format conversion
  4. Provide structure - Type-safe tool calls with validation

Remove servers that:

  1. Wrap simple commands - File reads, directory listings
  2. Duplicate functionality - Multiple ways to do the same thing
  3. You rarely use - “Just in case” tools are context parasites

A Mental Model

Think of MCP servers like browser tabs. Each one has overhead. Keeping 50 tabs open “just in case” slows everything down. Better to open what you need, close what you don’t.

The same applies to your context window. Every token spent on tool definitions is a token not spent on actual reasoning.

Summary

The solution to MCP token consumption isn’t to avoid MCP—it’s to be intentional about what you load:

  1. Limit to 3-6 essential servers - Be ruthless about necessity
  2. Use shell scripts for simple operations - Progressive disclosure over upfront loading
  3. Toggle servers based on task - Load what you need, when you need it

Less is more honestly. Every MCP you add is more tools the agent has to choose from. More context burned before you start. More noise in the signal.

Your context window is precious. Protect it.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments