How Do I Optimize MCP Tool Schemas to Reduce Token Costs?

Problem

I was burning through tokens. My MCP server registered 29 tools. Every time a session started, the LLM loaded all 29 tool schemas into context. That’s about 4,250 tokens before any real work even began.

Here’s what my token usage looked like:

token-usage.txt
Session start: 4,250 tokens (tool schemas)
Per conversation: ~3,525 tokens wasted
Monthly cost: Way too much

When I profiled the issue, I found the real culprit: schema bloat.

The Root Cause

My tool schemas were bloated in four ways:

  1. Over-documentation: Every parameter had a verbose description
  2. Tool proliferation: 29 separate tools when I needed far fewer
  3. Eager loading: All tools loaded at session start
  4. Redundant schemas: Same sub-schemas repeated across tools

A typical tool in my server looked like this:

bloated-schema.json
{
  "name": "sdl.symbol.search",
  "description": "Searches for symbols in the codebase using fuzzy matching and returns results with metadata including file location, type, and documentation",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query string to match against symbol names, supports wildcards and fuzzy matching"
      },
      "scope": {
        "type": "string",
        "enum": ["local", "global", "workspace"],
        "description": "The scope of the search - local for current file, global for entire project, workspace for all open folders"
      },
      "maxResults": {
        "type": "number",
        "description": "Maximum number of results to return, defaults to 50 if not specified"
      }
    },
    "required": ["query"]
  }
}

That’s one tool. Times 29. You can see why my token costs were exploding.

Solution: Four Compression Strategies

I applied four strategies and reduced 4,250 tokens to 725 tokens. Here’s how.

Strategy 1: Gateway Pattern (70-85% Savings)

Instead of 29 flat tools, I consolidated them into 4 namespace-scoped gateway tools:

consolidation.txt
Before: 29 individual tools
- sdl.repo.register
- sdl.repo.clone
- sdl.repo.list
- sdl.symbol.search
- sdl.symbol.resolve
- sdl.code.needWindow
- sdl.code.applyEdit
- sdl.agent.orchestrate
- ... (21 more)
After: 4 gateway tools
- sdl.query (9 read-only actions)
- sdl.code (3 gated code actions)
- sdl.repo (6 repository actions)
- sdl.agent (11 agentic operations)

The gateway uses a discriminated union pattern:

gateway-tool.ts
const gatewayTools = [
  {
    name: "sdl.query",
    // The description lists every action in the enum below, since it is the
    // only documentation the model sees for this tool.
    description: "Query actions: search|resolve|list|get|find|inspect. Args: {action, query?, scope?}",
    inputSchema: {
      type: "object",
      properties: {
        action: {
          type: "string",
          enum: ["search", "resolve", "list", "get", "find", "inspect"]
        },
        query: { type: "string" },
        scope: { type: "string" }
      },
      required: ["action"],
      additionalProperties: true
    }
  }
];

The router dispatches based on the action parameter:

router.ts
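// searchSymbols, resolveSymbol, etc. are assumed to be defined elsewhere in the server.
function handleQuery(args: { action: string; query?: string; scope?: string }) {
  // A Map keeps the lookup type-safe and avoids matching inherited
  // object keys such as "toString" or "constructor".
  const handlers = new Map<string, () => unknown>([
    ["search", () => searchSymbols(args.query, args.scope)],
    ["resolve", () => resolveSymbol(args.query)],
    ["list", () => listSymbols(args.scope)],
    ["get", () => getSymbolDetails(args.query)],
    ["find", () => findReferences(args.query)],
    ["inspect", () => inspectSymbol(args.query)]
  ]);
  const handler = handlers.get(args.action);
  if (!handler) {
    throw new Error(`Unknown action: ${args.action}`);
  }
  return handler();
}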
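
The router above takes the action as a plain string. As a rough sketch of what the discriminated union could look like at the type level, the block below uses a hypothetical QueryRequest type covering three of the actions, with stub return values in place of the real handlers. None of these names come from the original server; they only illustrate how the compiler can flag an action that has no handler.

```typescript
// Hypothetical request shapes: each variant is tagged by its `action` field.
type QueryRequest =
  | { action: "search"; query: string; scope?: string }
  | { action: "resolve"; query: string }
  | { action: "list"; scope?: string };

// Stub dispatcher: the returned strings are placeholders for real handlers.
function dispatchQuery(req: QueryRequest): string {
  switch (req.action) {
    case "search":
      return `search:${req.query}:${req.scope ?? "any"}`;
    case "resolve":
      return `resolve:${req.query}`;
    case "list":
      return `list:${req.scope ?? "any"}`;
    default: {
      // Exhaustiveness check: adding a variant without a case fails to compile.
      const unreachable: never = req;
      throw new Error(`Unknown action: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Narrow untrusted tool arguments into a QueryRequest before dispatching.
function parseQueryRequest(args: Record<string, unknown>): QueryRequest {
  const { action, query, scope } = args;
  const optScope = typeof scope === "string" ? scope : undefined;
  if (action === "list") {
    return { action: "list", scope: optScope };
  }
  if ((action === "search" || action === "resolve") && typeof query === "string") {
    return action === "search"
      ? { action: "search", query, scope: optScope }
      : { action: "resolve", query };
  }
  throw new Error(`Invalid query request: ${JSON.stringify(args)}`);
}
```

The wire schema stays thin either way; the union only exists inside the server, where it costs no tokens.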

Strategy 2: Thin Wire Schemas (50-70% Savings)

I replaced full schemas with minimal envelopes using additionalProperties: true:

thin-wire.ts
// Before: Full schema (150+ tokens per tool)
const bloatedSchema = {
  type: "object",
  properties: {
    path: { type: "string", description: "The file path..." },
    content: { type: "string", description: "The content to write..." },
    encoding: { type: "string", description: "Character encoding..." },
    overwrite: { type: "boolean", description: "Whether to overwrite..." }
  },
  required: ["path", "content"]
};
// After: Thin envelope (30 tokens)
const thinSchema = {
  type: "object",
  properties: {
    action: { type: "string" },
    path: { type: "string" }
  },
  required: ["action", "path"],
  additionalProperties: true
};

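Setting additionalProperties: true lets the model pass extra parameters that the schema does not declare. The trade-off is that the model only learns about those parameters from the tool description, which is why the sdl.query description above carries a compact Args hint. Validation happens server-side where it belongs.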
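
Since the thin envelope no longer constrains most parameters, the server has to check them itself. Here is a minimal sketch of that check for a write-style action, reusing the four parameters from the bloated schema above. The WriteArgs type, the helper names, and the default values are assumptions for illustration, not the original server's code.

```typescript
// Hypothetical argument shape for a write-style action behind the thin envelope.
interface WriteArgs {
  path: string;
  content: string;
  encoding: string;
  overwrite: boolean;
}

function requireString(value: unknown, name: string): string {
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`${name}: expected a non-empty string`);
  }
  return value;
}

function optionalString(value: unknown, name: string, fallback: string): string {
  if (value === undefined) return fallback;
  if (typeof value !== "string") throw new Error(`${name}: expected a string`);
  return value;
}

function optionalBoolean(value: unknown, name: string, fallback: boolean): boolean {
  if (value === undefined) return fallback;
  if (typeof value !== "boolean") throw new Error(`${name}: expected a boolean`);
  return value;
}

// Validate untrusted tool arguments and apply defaults; throws on bad input.
function validateWriteArgs(raw: unknown): WriteArgs {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("arguments must be an object");
  }
  const args = raw as Record<string, unknown>;
  return {
    path: requireString(args["path"], "path"),
    content: requireString(args["content"], "content"),
    // Defaults below are assumptions for this sketch.
    encoding: optionalString(args["encoding"], "encoding", "utf-8"),
    overwrite: optionalBoolean(args["overwrite"], "overwrite", false),
  };
}
```

Returning a precise error message matters more with thin schemas, because the error text is how the model finds out what it got wrong.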

Strategy 3: Description Stripping (20-40% Savings)

I removed redundant description fields. The LLM can infer from context:

descriptions.ts
// Before: Verbose descriptions
const verboseSchema = {
  properties: {
    query: {
      type: "string",
      description: "The search query string to match against symbol names"
    },
    scope: {
      type: "string",
      description: "The scope of the search operation"
    }
  }
};
// After: Minimal or no descriptions
const minimalSchema = {
  properties: {
    query: { type: "string" },
    scope: { type: "string" }
  }
};

The tool name and context provide enough information. I kept one-line descriptions on the tool itself, not on every parameter.
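
Stripping descriptions by hand across many tools gets tedious, so it can be automated. A minimal sketch, assuming the schemas are plain JSON objects: the stripDescriptions helper and the Json type below are hypothetical names, not part of any MCP SDK. It is meant to be applied to inputSchema only, so the one-line description on the tool itself is left alone.

```typescript
// Minimal JSON value type for the schemas handled by this sketch.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively drop string-valued `description` keys from a schema.
function stripDescriptions(node: Json): Json {
  if (Array.isArray(node)) {
    return node.map((item) => stripDescriptions(item));
  }
  if (node === null || typeof node !== "object") {
    return node;
  }
  const out: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(node)) {
    // Only string values are documentation. A parameter that happens to be
    // named "description" has a schema object as its value and is kept.
    if (key === "description" && typeof value === "string") continue;
    out[key] = stripDescriptions(value);
  }
  return out;
}

const verbose = {
  type: "object",
  properties: {
    query: { type: "string", description: "The search query string" },
    description: { type: "string", description: "Free-form note" }
  }
};

const slim = stripDescriptions(verbose);
console.log(JSON.stringify(slim));
```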

Strategy 4: Lazy Loading (50-90% Per-Session Savings)

Not all tools are needed for every conversation. I split tools into packs:

lazy-loading.ts
// Core tools always loaded
const coreTools = ["sdl.query", "sdl.code"];
// On-demand tool packs
const toolPacks: Record<string, string[]> = {
  git: ["git.commit", "git.push", "git.pull", "git.branch"],
  database: ["db.query", "db.migrate", "db.seed"],
  cloud: ["cloud.deploy", "cloud.logs", "cloud.scale"]
};
const loadedPacks = new Set<string>();

// mcpServer is the server instance created elsewhere; addTool stands in for
// whatever registration call that server exposes.
async function loadToolPack(packName: string) {
  if (loadedPacks.has(packName)) {
    return; // Already loaded
  }
  const tools = toolPacks[packName];
  if (!tools) {
    throw new Error(`Unknown pack: ${packName}`);
  }
  for (const toolName of tools) {
    const module = await import(`./tools/${packName}/${toolName}.ts`);
    mcpServer.addTool(module.default);
  }
  loadedPacks.add(packName);
  // Clients only pick up the new tools after a tools/list_changed
  // notification prompts them to fetch the tool list again.
}
// Trigger loading when needed
async function handleRequest(action: string, args: unknown) {
  if (action.startsWith("git.")) {
    await loadToolPack("git");
  } else if (action.startsWith("db.")) {
    await loadToolPack("database");
  } else if (action.startsWith("cloud.")) {
    await loadToolPack("cloud");
  }
  // ... dispatch to handler
}

Results

Here’s what I achieved:

| Metric | Before | After | Savings |
|---|---|---|---|
| Tools registered | 29 | 4 | 86% |
| Characters | ~17,000 | ~2,900 | 83% |
| Estimated tokens | ~4,250 | ~725 | 83% |
| Per-conversation | ~3,525 tokens wasted | Minimal | - |
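
The character and token rows above are consistent with a rough estimate of four characters per token. Here is a small sketch of that arithmetic, assuming that heuristic; real tokenizers vary by model, so treat the output as a ballpark. The function names are made up for this example.

```typescript
// Assumed heuristic: roughly 4 characters per token. Real tokenizers differ.
const CHARS_PER_TOKEN = 4;

function estimateTokens(charCount: number): number {
  return Math.round(charCount / CHARS_PER_TOKEN);
}

// Percentage saved going from `before` to `after`, rounded to a whole number.
function savingsPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// Character count of a tool list as it would be serialized for the model.
function schemaChars(tools: unknown[]): number {
  return JSON.stringify(tools).length;
}

const beforeTokens = estimateTokens(17000);
const afterTokens = estimateTokens(2900);

console.log(beforeTokens, afterTokens);                // 4250 725
console.log(savingsPercent(beforeTokens, afterTokens)); // 83
console.log(savingsPercent(29, 4));                     // 86
```

To measure a real server, pass its tool definitions to schemaChars and feed the result to estimateTokens before and after each change.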

A Reddit user reported similar results:

“I am collapsing 25 tools down to 4. Originally 3,742 tokens that is now 713 tokens.”

Complete Example: File Operations Gateway

Here’s a complete gateway for file operations:

file-gateway.ts
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema
} from '@modelcontextprotocol/sdk/types.js';
import * as fs from 'fs/promises';

// Server name and version are placeholders
const server = new Server(
  { name: "file-gateway", version: "0.1.0" },
  { capabilities: { tools: {} } }
);

// Single gateway tool replaces 5 file tools
const fileGateway = {
  name: "file_ops",
  description: "File operations: create|read|update|delete|list",
  inputSchema: {
    type: "object" as const,
    properties: {
      action: {
        type: "string",
        enum: ["create", "read", "update", "delete", "list"]
      },
      path: { type: "string" },
      content: { type: "string" }
    },
    required: ["action", "path"],
    additionalProperties: true
  }
};

async function handleFileOps(args: {
  action: string;
  path: string;
  content?: string;
  encoding?: BufferEncoding;
}) {
  const { action, path: filePath, content, encoding = 'utf-8' } = args;
  switch (action) {
    case "create":
      await fs.writeFile(filePath, content ?? '', encoding);
      return { success: true, message: `Created ${filePath}` };
    case "read": {
      const data = await fs.readFile(filePath, encoding);
      return { success: true, content: data };
    }
    case "update":
      // Note: appendFile adds to the end of the file; it does not replace it
      await fs.appendFile(filePath, content ?? '', encoding);
      return { success: true, message: `Updated ${filePath}` };
    case "delete":
      await fs.unlink(filePath);
      return { success: true, message: `Deleted ${filePath}` };
    case "list": {
      const entries = await fs.readdir(filePath, { withFileTypes: true });
      return {
        success: true,
        entries: entries.map(e => ({
          name: e.name,
          type: e.isDirectory() ? 'directory' : 'file'
        }))
      };
    }
    default:
      throw new Error(`Unknown action: ${action}`);
  }
}

// Register with MCP server
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [fileGateway]
}));
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  if (name === "file_ops") {
    const result = await handleFileOps(args as any);
    // MCP tool results are returned as content blocks
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }
  throw new Error(`Unknown tool: ${name}`);
});
// Connect a transport (for example stdio) to start serving requests

Token Savings Summary

| Strategy | Token Savings | Effort |
|---|---|---|
| Gateway consolidation | 70-85% | Medium |
| Thin wire schemas | 50-70% | Low |
| Description stripping | 20-40% | Low |
| $defs deduplication | 10-30% | Low |
| Lazy loading | 50-90% per-session | Medium |
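
$defs deduplication appears in the table but has no section of its own in this post. As a rough sketch of the idea, assuming a tool whose parameters share a hypothetical line-range sub-schema: define the sub-schema once under $defs and point to it with $ref instead of repeating it. The schema contents are invented for illustration, and whether a given MCP client or model provider resolves $ref inside tool schemas is worth verifying before relying on this.

```typescript
// Hypothetical shared sub-schema: a line range inside a file.
const rangeSchema = {
  type: "object",
  properties: {
    startLine: { type: "integer" },
    endLine: { type: "integer" }
  },
  required: ["startLine", "endLine"]
};

// Inlined: the same sub-schema is repeated for every property that uses it.
const inlined = {
  type: "object",
  properties: {
    source: rangeSchema,
    target: rangeSchema,
    context: rangeSchema
  }
};

// Deduplicated: defined once under $defs and referenced with $ref.
const deduped = {
  type: "object",
  properties: {
    source: { $ref: "#/$defs/range" },
    target: { $ref: "#/$defs/range" },
    context: { $ref: "#/$defs/range" }
  },
  $defs: { range: rangeSchema }
};

const inlinedChars = JSON.stringify(inlined).length;
const dedupedChars = JSON.stringify(deduped).length;
console.log({ inlinedChars, dedupedChars });
```

The saving grows with the number of references and the size of the shared sub-schema; for a sub-schema used only once, $defs adds overhead instead.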

Common Mistakes

Over-documenting parameters

// WRONG: Token bloat
properties: {
  path: {
    type: "string",
    description: "The absolute or relative path to the file that you want to read from the filesystem"
  }
}
// RIGHT: Minimal
properties: {
  path: { type: "string" }
}

Creating separate tools for minor variants

// WRONG: Tool proliferation
tools = [
  { name: "user_create", ... },
  { name: "user_create_admin", ... },
  { name: "user_create_guest", ... }
]
// RIGHT: Single tool with action param
tools = [
  {
    name: "user_ops",
    inputSchema: {
      type: "object",
      properties: {
        action: { type: "string", enum: ["create", "create_admin", "create_guest"] }
      },
      required: ["action"]
    }
  }
]

Loading all tools eagerly

// WRONG: Everything at startup
for (const tool of allTools) {
  server.addTool(tool);
}
// RIGHT: Load on demand
await loadToolPack("core"); // Only core at startup (assumes toolPacks has a "core" entry)
// Load git, database, cloud packs when first accessed

Summary

In this post, I showed how to optimize MCP tool schemas to reduce token costs. The key strategies are: gateway pattern consolidation (collapse many tools into few), thin wire schemas with additionalProperties: true, description stripping, and lazy loading. I reduced 4,250 tokens to 725 tokens—an 83% reduction.

The gateway pattern is the biggest win. Instead of registering 29 individual tools, I register 4 gateways that dispatch to handlers based on an action parameter. The LLM still gets the same capabilities, but the schema footprint is dramatically smaller.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
