How Do I Optimize MCP Tool Schemas to Reduce Token Costs?
Problem
I was burning through tokens. My MCP server registered 29 tools. Every time a session started, the LLM loaded all 29 tool schemas into context. That’s about 4,250 tokens before any real work even began.
Here’s what my token usage looked like:
    Session start: 4,250 tokens (tool schemas)
    Per conversation: ~3,500 tokens wasted
    Monthly cost: Way too much

When I profiled the issue, I found the real culprit: schema bloat.
The Root Cause
My tool schemas were bloated in four ways:
- Over-documentation: Every parameter had a verbose description
- Tool proliferation: 29 separate tools when I needed far fewer
- Eager loading: All tools loaded at session start
- Redundant schemas: Same sub-schemas repeated across tools
A typical tool in my server looked like this:
    {
      "name": "sdl.symbol.search",
      "description": "Searches for symbols in the codebase using fuzzy matching and returns results with metadata including file location, type, and documentation",
      "inputSchema": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "The search query string to match against symbol names, supports wildcards and fuzzy matching"
          },
          "scope": {
            "type": "string",
            "enum": ["local", "global", "workspace"],
            "description": "The scope of the search - local for current file, global for entire project, workspace for all open folders"
          },
          "maxResults": {
            "type": "number",
            "description": "Maximum number of results to return, defaults to 50 if not specified"
          }
        },
        "required": ["query"]
      }
    }

That’s one tool. Times 29. You can see why my token costs were exploding.
Solution: Four Compression Strategies
I applied four strategies and reduced 4,250 tokens to 725 tokens. Here’s how.
Strategy 1: Gateway Pattern (70-85% Savings)
Instead of 29 flat tools, I consolidated them into 4 namespace-scoped gateway tools:
Before: 29 individual tools
- sdl.repo.register
- sdl.repo.clone
- sdl.repo.list
- sdl.symbol.search
- sdl.symbol.resolve
- sdl.code.needWindow
- sdl.code.applyEdit
- sdl.agent.orchestrate
- ... (21 more)

After: 4 gateway tools
- sdl.query (9 read-only actions)
- sdl.code (3 gated code actions)
- sdl.repo (6 repository actions)
- sdl.agent (11 agentic operations)

The gateway uses a discriminated union pattern:
    const gatewayTools = [
      {
        name: "sdl.query",
        description: "Query actions: search|resolve|list. Args: {action, query?, scope?}",
        inputSchema: {
          type: "object",
          properties: {
            action: {
              type: "string",
              enum: ["search", "resolve", "list", "get", "find", "inspect"]
            },
            query: { type: "string" },
            scope: { type: "string" }
          },
          required: ["action"],
          additionalProperties: true
        }
      }
    ];

The router dispatches based on the action parameter:
    function handleQuery(args: { action: string; query?: string; scope?: string }) {
      // Typed as a string-keyed map so the lookup by args.action type-checks
      const handlers: Record<string, () => unknown> = {
        search: () => searchSymbols(args.query, args.scope),
        resolve: () => resolveSymbol(args.query),
        list: () => listSymbols(args.scope),
        get: () => getSymbolDetails(args.query),
        find: () => findReferences(args.query),
        inspect: () => inspectSymbol(args.query)
      };

      const handler = handlers[args.action];
      if (!handler) {
        throw new Error(`Unknown action: ${args.action}`);
      }
      return handler();
    }

Strategy 2: Thin Wire Schemas (50-70% Savings)
I replaced full schemas with minimal envelopes using additionalProperties: true:
    // Before: Full schema (150+ tokens per tool)
    const bloatedSchema = {
      type: "object",
      properties: {
        path: { type: "string", description: "The file path..." },
        content: { type: "string", description: "The content to write..." },
        encoding: { type: "string", description: "Character encoding..." },
        overwrite: { type: "boolean", description: "Whether to overwrite..." }
      },
      required: ["path", "content"]
    };

    // After: Thin envelope (30 tokens)
    const thinSchema = {
      type: "object",
      properties: {
        action: { type: "string" },
        path: { type: "string" }
      },
      required: ["action", "path"],
      additionalProperties: true
    };

The additionalProperties: true lets me pass extra parameters without defining them in the schema. Validation happens server-side where it belongs.
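Since the thin envelope no longer describes `content`, `encoding`, or `overwrite`, the server has to check them itself. Here is a minimal sketch of such a check, assuming the four fields from the full schema above. The `WriteArgs` type, the `parseWriteArgs` name, and the default values are hypothetical, and a schema validation library could replace the hand-written checks.

```typescript
// Hypothetical argument shape for a "write" action. Field names come from
// the full schema shown earlier; the defaults are assumptions.
interface WriteArgs {
  path: string;
  content: string;
  encoding: string;
  overwrite: boolean;
}

// Validate untyped arguments that arrived through a thin envelope.
// Throws with a readable message so the caller can correct the request.
function parseWriteArgs(raw: Record<string, unknown>): WriteArgs {
  const { path, content, encoding = "utf-8", overwrite = false } = raw;
  if (typeof path !== "string" || path.length === 0) {
    throw new Error("path must be a non-empty string");
  }
  if (typeof content !== "string") {
    throw new Error("content must be a string");
  }
  if (typeof encoding !== "string") {
    throw new Error("encoding must be a string");
  }
  if (typeof overwrite !== "boolean") {
    throw new Error("overwrite must be a boolean");
  }
  return { path, content, encoding, overwrite };
}
```

With descriptions gone from the schema, specific error messages like these are the main feedback the model gets when it guesses a parameter wrong.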
Strategy 3: Description Stripping (20-40% Savings)
I removed redundant description fields. The LLM can infer from context:
    // Before: Verbose descriptions
    {
      properties: {
        query: {
          type: "string",
          description: "The search query string to match against symbol names"
        },
        scope: {
          type: "string",
          description: "The scope of the search operation"
        }
      }
    }

    // After: Minimal or no descriptions
    {
      properties: {
        query: { type: "string" },
        scope: { type: "string" }
      }
    }

The tool name and context provide enough information. I kept one-line descriptions on the tool itself, not on every parameter.
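Stripping descriptions by hand across many tools is tedious, so it can be done at registration time instead. The following is a rough sketch, not the code used in this server: `stripDescriptions` and `slimTool` are hypothetical helpers that remove `description` keys inside the input schema while leaving the tool-level description in place.

```typescript
type Schema = { [key: string]: unknown };

// Recursively drop "description" keys from a parameter schema.
// Keys directly under "properties" are parameter names, so a parameter
// that happens to be called "description" is preserved.
function stripDescriptions(schema: unknown, insideProperties = false): unknown {
  if (Array.isArray(schema)) {
    return schema.map((item) => stripDescriptions(item));
  }
  if (schema === null || typeof schema !== "object") {
    return schema;
  }
  const out: Schema = {};
  for (const [key, value] of Object.entries(schema as Schema)) {
    if (key === "description" && !insideProperties) continue;
    out[key] = stripDescriptions(value, !insideProperties && key === "properties");
  }
  return out;
}

// Keep the one-line tool description; strip only inside inputSchema.
function slimTool<T extends { inputSchema: unknown }>(tool: T): T {
  return { ...tool, inputSchema: stripDescriptions(tool.inputSchema) };
}
```

Keeping the verbose schemas in source and slimming them on the way out means the documentation is still there for humans reading the code.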
Strategy 4: Lazy Loading (50-90% Per-Session Savings)
Not all tools are needed for every conversation. I split tools into packs:
    // Core tools always loaded
    const coreTools = ["sdl.query", "sdl.code"];

    // On-demand tool packs (typed so lookups by pack name type-check)
    const toolPacks: Record<string, string[]> = {
      git: ["git.commit", "git.push", "git.pull", "git.branch"],
      database: ["db.query", "db.migrate", "db.seed"],
      cloud: ["cloud.deploy", "cloud.logs", "cloud.scale"]
    };

    const loadedPacks = new Set<string>();

    async function loadToolPack(packName: string) {
      if (loadedPacks.has(packName)) {
        return; // Already loaded
      }

      const tools = toolPacks[packName];
      if (!tools) {
        throw new Error(`Unknown pack: ${packName}`);
      }

      // Each tool lives in its own module and is imported on first use
      for (const toolName of tools) {
        const module = await import(`./tools/${packName}/${toolName}.ts`);
        mcpServer.addTool(module.default);
      }

      loadedPacks.add(packName);
    }

    // Trigger loading when needed
    async function handleRequest(action: string, args: any) {
      if (action.startsWith("git.")) {
        await loadToolPack("git");
      } else if (action.startsWith("db.")) {
        await loadToolPack("database");
      }
      // ... dispatch to handler
    }

Results
Here’s what I achieved:
| Metric | Before | After | Savings |
|---|---|---|---|
| Tools registered | 29 | 4 | 86% |
| Characters | ~17,000 | ~2,900 | 83% |
| Estimated tokens | ~4,250 | ~725 | 83% |
| Per-conversation | ~3,525 tokens wasted | Minimal | - |
A Reddit user reported similar results:
“I am collapsing 25 tools down to 4. Originally 3,742 tokens that is now 713 tokens.”
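The character and token rows in the table are consistent with the common rule of thumb of roughly four characters per token (17,000 / 4 = 4,250 and 2,900 / 4 = 725). Below is a rough sketch for measuring a server's schema footprint the same way, assuming the tool list is serialized as JSON. The ratio is only a heuristic, real counts depend on the model's tokenizer, and `estimateTokens` and `savingsPercent` are hypothetical helper names.

```typescript
// Rough heuristic: about 4 characters per token. Real tokenizers differ.
const CHARS_PER_TOKEN = 4;

// Estimate how many tokens a list of tool definitions costs in context.
function estimateTokens(tools: unknown[]): number {
  const chars = JSON.stringify(tools).length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Percentage saved when going from `before` to `after`, rounded.
function savingsPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}
```

For example, `savingsPercent(4250, 725)` returns 83, which matches the table.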
Complete Example: File Operations Gateway
Here’s a complete gateway for file operations:
    import { Server } from '@modelcontextprotocol/sdk/server/index.js';
    import {
      CallToolRequestSchema,
      ListToolsRequestSchema
    } from '@modelcontextprotocol/sdk/types.js';
    import * as fs from 'fs/promises';

    // Server name and version are placeholders
    const server = new Server(
      { name: "file-ops-server", version: "1.0.0" },
      { capabilities: { tools: {} } }
    );

    // Single gateway tool replaces 5 file tools
    const fileGateway = {
      name: "file_ops",
      description: "File operations: create|read|update|delete|list",
      inputSchema: {
        type: "object" as const,
        properties: {
          action: {
            type: "string",
            enum: ["create", "read", "update", "delete", "list"]
          },
          path: { type: "string" },
          content: { type: "string" }
        },
        required: ["action", "path"],
        additionalProperties: true
      }
    };

    async function handleFileOps(args: {
      action: string;
      path: string;
      content?: string;
      encoding?: BufferEncoding;
    }) {
      const { action, path: filePath, content, encoding = 'utf-8' } = args;

      switch (action) {
        case "create":
          await fs.writeFile(filePath, content ?? '', encoding);
          return { success: true, message: `Created ${filePath}` };

        case "read": {
          const data = await fs.readFile(filePath, encoding);
          return { success: true, content: data };
        }

        case "update":
          // Note: "update" appends to the file rather than replacing it
          await fs.appendFile(filePath, content ?? '', encoding);
          return { success: true, message: `Updated ${filePath}` };

        case "delete":
          await fs.unlink(filePath);
          return { success: true, message: `Deleted ${filePath}` };

        case "list": {
          const entries = await fs.readdir(filePath, { withFileTypes: true });
          return {
            success: true,
            entries: entries.map(e => ({
              name: e.name,
              type: e.isDirectory() ? 'directory' : 'file'
            }))
          };
        }

        default:
          throw new Error(`Unknown action: ${action}`);
      }
    }

    // Register with MCP server
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [fileGateway]
    }));

    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      const { name, arguments: args } = request.params;

      if (name === "file_ops") {
        const result = await handleFileOps(args as any);
        // MCP tool results are returned as content blocks
        return { content: [{ type: "text", text: JSON.stringify(result) }] };
      }

      throw new Error(`Unknown tool: ${name}`);
    });

Token Savings Summary
| Strategy | Token Savings | Effort |
|---|---|---|
| Gateway consolidation | 70-85% | Medium |
| Thin wire schemas | 50-70% | Low |
| Description stripping | 20-40% | Low |
| $defs deduplication | 10-30% | Low |
| Lazy loading | 50-90% per-session | Medium |
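The `$defs` row is not one of the four strategies walked through above; it targets the "redundant schemas" cause from the root-cause list. `$defs` and `$ref` are standard JSON Schema keywords, and a `$ref` like the one below resolves within a single schema document, so this helps when one tool's input schema repeats the same sub-schema. The `location` sub-schema and the `from`/`to` fields are hypothetical, and whether a given MCP client resolves `$ref` in tool schemas is an assumption worth testing before relying on it.

```typescript
// Hypothetical sub-schema that appears more than once in one tool schema.
const location = {
  type: "object",
  properties: { file: { type: "string" }, line: { type: "integer" } },
  required: ["file"],
};

// Before: the same sub-schema is inlined twice.
const inlined = {
  type: "object",
  properties: { from: location, to: location },
  required: ["from", "to"],
};

// After: define it once under $defs and point to it with $ref.
const deduplicated = {
  type: "object",
  properties: {
    from: { $ref: "#/$defs/location" },
    to: { $ref: "#/$defs/location" },
  },
  required: ["from", "to"],
  $defs: { location },
};

// Compare serialized sizes to see what the change buys.
const inlinedChars = JSON.stringify(inlined).length;
const dedupChars = JSON.stringify(deduplicated).length;
```

The saving grows with the size of the shared sub-schema and the number of places it is used; for a small sub-schema used only twice, the gain is modest.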
Common Mistakes
Over-documenting parameters
    // WRONG: Token bloat
    properties: {
      path: {
        type: "string",
        description: "The absolute or relative path to the file that you want to read from the filesystem"
      }
    }

    // RIGHT: Minimal
    properties: {
      path: { type: "string" }
    }

Creating separate tools for minor variants
    // WRONG: Tool proliferation
    tools = [
      { name: "user_create", ... },
      { name: "user_create_admin", ... },
      { name: "user_create_guest", ... }
    ]

    // RIGHT: Single tool with action param
    tools = [
      {
        name: "user_ops",
        inputSchema: {
          type: "object",
          properties: {
            action: { type: "string", enum: ["create", "create_admin", "create_guest"] }
          },
          required: ["action"]
        }
      }
    ]

Loading all tools eagerly
    // WRONG: Everything at startup
    for (const tool of allTools) {
      server.addTool(tool);
    }

    // RIGHT: Load on demand
    loadToolPack("core"); // Only core at startup
    // Load git, database, cloud packs when first accessed

Summary
In this post, I showed how to optimize MCP tool schemas to reduce token costs. The key strategies are: gateway pattern consolidation (collapse many tools into few), thin wire schemas with additionalProperties: true, description stripping, and lazy loading. I reduced 4,250 tokens to 725 tokens—an 83% reduction.
The gateway pattern is the biggest win. Instead of registering 29 individual tools, I register 4 gateways that dispatch to handlers based on an action parameter. The LLM still gets the same capabilities, but the schema footprint is dramatically smaller.
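On the server side, the action parameter can also be modeled as a TypeScript discriminated union, so each handler sees only the fields that belong to its action. This is a minimal sketch using three of the `sdl.query` actions; the `QueryArgs` type and `describeQuery` function are hypothetical, and the returned strings stand in for the real handlers.

```typescript
// Hypothetical argument types for three sdl.query actions.
type QueryArgs =
  | { action: "search"; query: string; scope?: string }
  | { action: "resolve"; query: string }
  | { action: "list"; scope?: string };

// The switch narrows `args` per action, so each branch is type-checked
// against its own fields.
function describeQuery(args: QueryArgs): string {
  switch (args.action) {
    case "search":
      return `search "${args.query}" in ${args.scope ?? "workspace"}`;
    case "resolve":
      return `resolve "${args.query}"`;
    case "list":
      return `list symbols in ${args.scope ?? "workspace"}`;
    default: {
      // Exhaustiveness check: stops compiling if a new action is unhandled.
      const unreachable: never = args;
      throw new Error(`Unknown action: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The wire schema stays thin, while the compiler still catches a handler that reads a field its action does not define.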
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 SDL-MCP Tool Gateway Documentation
- 👨💻 Model Context Protocol Specification
- 👨💻 Reddit Discussion: MCP Token Optimization
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!