How Do I Optimize MCP Tool Schemas to Reduce Token Costs?

Mar 17, 2026

Problem

I was burning through tokens. My MCP server registered 29 tools. Every time a session started, the LLM loaded all 29 tool schemas into context. That’s about 4,250 tokens before any real work even began.

Here’s what my token usage looked like:

Session start:     4,250 tokens (tool schemas)
Per conversation:  ~3,525 tokens wasted
Monthly cost:      Way too much

When I profiled the issue, I found the real culprit: schema bloat.

The Root Cause

My tool schemas were bloated in four ways:

Over-documentation: Every parameter had a verbose description
Tool proliferation: 29 separate tools when I needed far fewer
Eager loading: All tools loaded at session start
Redundant schemas: Same sub-schemas repeated across tools

A typical tool in my server looked like this:

{
  "name": "sdl.symbol.search",
  "description": "Searches for symbols in the codebase using fuzzy matching and returns results with metadata including file location, type, and documentation",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query string to match against symbol names, supports wildcards and fuzzy matching"
      },
      "scope": {
        "type": "string",
        "enum": ["local", "global", "workspace"],
        "description": "The scope of the search - local for current file, global for entire project, workspace for all open folders"
      },
      "maxResults": {
        "type": "number",
        "description": "Maximum number of results to return, defaults to 50 if not specified"
      }
    },
    "required": ["query"]
  }
}

That’s one tool. Times 29. You can see why my token costs were exploding.

Solution: Four Compression Strategies

I applied four strategies and reduced 4,250 tokens to 725 tokens. Here’s how.

Strategy 1: Gateway Pattern (70-85% Savings)

Instead of 29 flat tools, I consolidated them into 4 namespace-scoped gateway tools:

Before: 29 individual tools
- sdl.repo.register
- sdl.repo.clone
- sdl.repo.list
- sdl.symbol.search
- sdl.symbol.resolve
- sdl.code.needWindow
- sdl.code.applyEdit
- sdl.agent.orchestrate
- ... (21 more)

After: 4 gateway tools
- sdl.query    (9 read-only actions)
- sdl.code     (3 gated code actions)
- sdl.repo     (6 repository actions)
- sdl.agent    (11 agentic operations)

The gateway uses a discriminated union pattern:

const gatewayTools = [
  {
    name: "sdl.query",
    description: "Query actions: search|resolve|list|get|find|inspect. Args: {action, query?, scope?}",
    inputSchema: {
      type: "object",
      properties: {
        action: {
          type: "string",
          enum: ["search", "resolve", "list", "get", "find", "inspect"]
        },
        query: { type: "string" },
        scope: { type: "string" }
      },
      required: ["action"],
      additionalProperties: true
    }
  }
];

The router dispatches based on the action parameter:

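function handleQuery(args: { action: string; query?: string; scope?: string }) {
  // Typed as a string-keyed record so the lookup below type-checks
  const handlers: Record<string, () => unknown> = {
    search: () => searchSymbols(args.query, args.scope),
    resolve: () => resolveSymbol(args.query),
    list: () => listSymbols(args.scope),
    get: () => getSymbolDetails(args.query),
    find: () => findReferences(args.query),
    inspect: () => inspectSymbol(args.query)
  };

  // Own-property check: inherited keys such as "constructor" must not match
  const handler = Object.prototype.hasOwnProperty.call(handlers, args.action)
    ? handlers[args.action]
    : undefined;
  if (!handler) {
    throw new Error(`Unknown action: ${args.action}`);
  }
  return handler();
}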
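If the router should be a discriminated union at the type level too, one option is to model each action as its own variant and let the compiler check that every action is handled. This is a minimal sketch, not the actual SDL-MCP router: the argument shapes in `QueryArgs` are assumptions, and `routeQuery` returns descriptive strings as stand-ins for real handlers.

```typescript
// Hypothetical argument shapes; the real sdl.query actions may differ.
type QueryArgs =
  | { action: "search"; query: string; scope?: string }
  | { action: "resolve"; query: string }
  | { action: "list"; scope?: string };

// Stand-in for real handlers: each branch only describes what it would do.
function routeQuery(args: QueryArgs): string {
  switch (args.action) {
    case "search":
      // `query` is guaranteed by the type here; `scope` stays optional.
      return `search "${args.query}" in ${args.scope ?? "default scope"}`;
    case "resolve":
      return `resolve "${args.query}"`;
    case "list":
      return `list symbols in ${args.scope ?? "default scope"}`;
    default: {
      // Exhaustiveness check: this stops compiling if a variant is unhandled.
      const unhandled: never = args;
      throw new Error(`Unknown action: ${JSON.stringify(unhandled)}`);
    }
  }
}

console.log(routeQuery({ action: "search", query: "parse*", scope: "workspace" }));
```

The types only help once the arguments have been validated, since tool arguments arrive from the client as untyped JSON.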

Strategy 2: Thin Wire Schemas (50-70% Savings)

I replaced full schemas with minimal envelopes using additionalProperties: true:

// Before: Full schema (150+ tokens per tool)
const bloatedSchema = {
  type: "object",
  properties: {
    path: { type: "string", description: "The file path..." },
    content: { type: "string", description: "The content to write..." },
    encoding: { type: "string", description: "Character encoding..." },
    overwrite: { type: "boolean", description: "Whether to overwrite..." }
  },
  required: ["path", "content"]
};

// After: Thin envelope (30 tokens)
const thinSchema = {
  type: "object",
  properties: {
    action: { type: "string" },
    path: { type: "string" }
  },
  required: ["action", "path"],
  additionalProperties: true
};

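Setting additionalProperties: true lets me pass extra parameters without defining them in the schema. Validation happens server-side where it belongs. The trade-off is that the model no longer sees those parameters in the schema, so the one-line tool description has to name any it is expected to send.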
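Since the thin envelope no longer constrains the extra parameters, the server has to check them itself. Here is a minimal sketch of what that could look like, using no validation library. The `requiredExtras` table and the `read`/`write` action names are assumptions for illustration, not part of my server.

```typescript
// Hypothetical rules: the extra string parameters each action requires
// beyond the `action` and `path` declared in the thin schema.
const requiredExtras: Record<string, string[]> = {
  read: [],
  write: ["content"],
};

function validateEnvelope(raw: unknown): Record<string, unknown> {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("Arguments must be an object");
  }
  const args = raw as Record<string, unknown>;
  const { action, path } = args;
  if (typeof action !== "string" || typeof path !== "string") {
    throw new Error('"action" and "path" must be strings');
  }
  if (!Object.prototype.hasOwnProperty.call(requiredExtras, action)) {
    const known = Object.keys(requiredExtras).join(", ");
    throw new Error(`Unknown action "${action}". Expected one of: ${known}`);
  }
  for (const key of requiredExtras[action]) {
    if (typeof args[key] !== "string") {
      throw new Error(`Action "${action}" requires a string "${key}"`);
    }
  }
  // Optional extras are type-checked only when present.
  if (args.overwrite !== undefined && typeof args.overwrite !== "boolean") {
    throw new Error('"overwrite" must be a boolean');
  }
  return args;
}
```

The error messages name the missing parameter on purpose: with the schema stripped down, the error text is the main way the model learns how to correct a bad call.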

Strategy 3: Description Stripping (20-40% Savings)

I removed redundant description fields. The LLM can usually infer what a parameter means from its name and the tool's own description:

// Before: Verbose descriptions
{
  properties: {
    query: {
      type: "string",
      description: "The search query string to match against symbol names"
    },
    scope: {
      type: "string",
      description: "The scope of the search operation"
    }
  }
}

// After: Minimal or no descriptions
{
  properties: {
    query: { type: "string" },
    scope: { type: "string" }
  }
}

The tool name and context provide enough information. I kept one-line descriptions on the tool itself, not on every parameter.
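Stripping descriptions by hand gets tedious across many tools, so it can be automated. The helper below is a sketch under my own assumptions (the name `stripDescriptions` and the `Json` type are not from any library). It walks a schema and drops the `description` keyword while keeping a parameter that happens to be named "description".

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Keywords whose children are keyed by user-chosen names, not schema keywords.
const NAME_MAPS = new Set(["properties", "$defs", "definitions", "patternProperties"]);

// Recursively drop `description` keywords from a JSON Schema.
// Keys directly inside a name map are kept even if they are called "description".
function stripDescriptions(schema: Json, insideNameMap = false): Json {
  if (Array.isArray(schema)) {
    return schema.map((item) => stripDescriptions(item));
  }
  if (schema === null || typeof schema !== "object") {
    return schema;
  }
  const result: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(schema)) {
    if (key === "description" && !insideNameMap) {
      continue; // this is the keyword, not a parameter name
    }
    result[key] = stripDescriptions(value, NAME_MAPS.has(key) && !insideNameMap);
  }
  return result;
}

const slim = stripDescriptions({
  type: "object",
  properties: {
    query: { type: "string", description: "The search query string" },
    scope: { type: "string", description: "The scope of the search operation" },
  },
});
console.log(JSON.stringify(slim));
```

One limitation: object values under `default`, `const`, or `examples` are walked like schemas, so a `description` key inside them would be dropped as well.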

Strategy 4: Lazy Loading (50-90% Per-Session Savings)

Not all tools are needed for every conversation. I split tools into packs:

// Core tools always loaded
const coreTools = ["sdl.query", "sdl.code"];

// On-demand tool packs
const toolPacks: Record<string, string[]> = {
  git: ["git.commit", "git.push", "git.pull", "git.branch"],
  database: ["db.query", "db.migrate", "db.seed"],
  cloud: ["cloud.deploy", "cloud.logs", "cloud.scale"]
};

const loadedPacks = new Set<string>();

async function loadToolPack(packName: string) {
  if (loadedPacks.has(packName)) {
    return; // Already loaded
  }

  const tools = toolPacks[packName];
  if (!tools) {
    throw new Error(`Unknown pack: ${packName}`);
  }

  for (const toolName of tools) {
    const module = await import(`./tools/${packName}/${toolName}.ts`);
    mcpServer.addTool(module.default);
  }

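  // Clients only see the new tools after they re-list them; per the MCP spec,
  // a server that supports it should send notifications/tools/list_changed here
  loadedPacks.add(packName);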
}

// Trigger loading when needed
async function handleRequest(action: string, args: any) {
  if (action.startsWith("git.")) {
    await loadToolPack("git");
  } else if (action.startsWith("db.")) {
    await loadToolPack("database");
  } else if (action.startsWith("cloud.")) {
    await loadToolPack("cloud");
  }
  // ... dispatch to handler
}
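One thing to watch with lazy loading is concurrency. If two requests ask for the same pack before the first load finishes, a check against a set of completed packs lets both through and the tools get registered twice. A possible guard is to remember the in-flight promise instead. This is a sketch: `loadOnce` and `PackLoader` are names I made up for illustration.

```typescript
// Hypothetical loader type: resolves once every tool in the pack is registered.
type PackLoader = (packName: string) => Promise<void>;

// Wrap a loader so each pack is loaded at most once, even when several
// requests ask for it at the same time.
function loadOnce(load: PackLoader): PackLoader {
  const inFlight = new Map<string, Promise<void>>();
  return (packName) => {
    let pending = inFlight.get(packName);
    if (!pending) {
      pending = load(packName).catch((error) => {
        // Forget failed loads so a later request can retry.
        inFlight.delete(packName);
        throw error;
      });
      inFlight.set(packName, pending);
    }
    return pending;
  };
}

const loadPack = loadOnce(async (packName) => {
  console.log(`loading ${packName}`); // stand-in for the dynamic imports
});

// Both calls share one promise, so "loading git" is logged once.
void Promise.all([loadPack("git"), loadPack("git")]);
```

Wrapping the existing loader this way keeps the retry behavior explicit: a failed import does not leave the pack permanently marked as loaded.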

Results

Here’s what I achieved:

Metric	Before	After	Savings
Tools registered	29	4	86%
Characters	~17,000	~2,900	83%
Estimated tokens	~4,250	~725	83%
Per-conversation	~3,525 tokens wasted	Minimal	-
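The token figures in the table are estimates derived from character counts, at roughly 4 characters per token (17,000 / 4 = 4,250). Here is a sketch of that arithmetic. The `ToolDefinition` shape and function names are assumptions, and a real tokenizer, or the way a client actually serializes schemas into the prompt, will give somewhat different numbers.

```typescript
// Minimal tool shape for measuring; real definitions may carry more fields.
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: object;
}

// Heuristic matching the table above: about 4 characters per token.
const CHARS_PER_TOKEN = 4;

// Estimate the token footprint of a tool list from its serialized length.
function estimateTokens(tools: ToolDefinition[]): number {
  const chars = JSON.stringify(tools).length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Percentage saved when going from `before` to `after`, rounded.
function savingsPercent(before: number, after: number): number {
  return Math.round((1 - after / before) * 100);
}

console.log(savingsPercent(17000, 2900)); // 83
console.log(savingsPercent(29, 4)); // 86
```

Running `estimateTokens` on the registered tool list before and after each change is a cheap way to see which strategy actually moves the number.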

A Reddit user reported similar results:

“I am collapsing 25 tools down to 4. Originally 3,742 tokens that is now 713 tokens.”

Complete Example: File Operations Gateway

Here’s a complete gateway for file operations:

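import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema
} from '@modelcontextprotocol/sdk/types.js';
import * as fs from 'fs/promises';

// Server name and version are placeholders
const server = new Server(
  { name: "file-gateway", version: "1.0.0" },
  { capabilities: { tools: {} } }
);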

// Single gateway tool replaces 5 file tools
const fileGateway = {
  name: "file_ops",
  description: "File operations: create|read|update|delete|list",
  inputSchema: {
    type: "object",
    properties: {
      action: {
        type: "string",
        enum: ["create", "read", "update", "delete", "list"]
      },
      path: { type: "string" },
      content: { type: "string" }
    },
    required: ["action", "path"],
    additionalProperties: true
  }
};

async function handleFileOps(args: {
  action: string;
  path: string;
  content?: string;
  encoding?: BufferEncoding; // Node's encoding type, so the fs calls type-check
}) {
  const { action, path: filePath, content, encoding = 'utf-8' } = args;

  switch (action) {
    case "create":
      await fs.writeFile(filePath, content ?? '', encoding);
      return { success: true, message: `Created ${filePath}` };

    case "read": {
      // Braces scope `data` to this case
      const data = await fs.readFile(filePath, encoding);
      return { success: true, content: data };
    }

    case "update":
      // Note: "update" appends to the file rather than replacing its contents
      await fs.appendFile(filePath, content ?? '', encoding);
      return { success: true, message: `Updated ${filePath}` };

    case "delete":
      await fs.unlink(filePath);
      return { success: true, message: `Deleted ${filePath}` };

    case "list": {
      const entries = await fs.readdir(filePath, { withFileTypes: true });
      return {
        success: true,
        entries: entries.map(e => ({
          name: e.name,
          type: e.isDirectory() ? 'directory' : 'file'
        }))
      };
    }

    default:
      throw new Error(`Unknown action: ${action}`);
  }
}

// Register with MCP server
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [fileGateway]
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === "file_ops") {
    // MCP tool results carry a content array, so serialize the handler's result
    const result = await handleFileOps(args as any);
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }

  throw new Error(`Unknown tool: ${name}`);
});

Token Savings Summary

Strategy	Token Savings	Effort
Gateway consolidation	70-85%	Medium
Thin wire schemas	50-70%	Low
Description stripping	20-40%	Low
$defs deduplication	10-30%	Low
Lazy loading	50-90% per-session	Medium
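The table lists $defs deduplication, which targets the "redundant schemas" problem but isn't walked through in the strategies above. Here is a sketch of the idea with a made-up gateway schema (the `move`/`copy` actions and the `location` shape are hypothetical). A shared sub-schema is declared once under `$defs` and referenced with `$ref`. A `#/$defs/...` reference resolves within the schema that contains it, so this helps inside one gateway's schema, and it is worth verifying that your client and model handle `$ref` before relying on it.

```typescript
// Hypothetical gateway schema: the `location` shape is declared once under
// $defs and referenced twice instead of being repeated inline.
const schemaWithDefs = {
  type: "object",
  properties: {
    action: { type: "string", enum: ["move", "copy"] },
    source: { $ref: "#/$defs/location" },
    target: { $ref: "#/$defs/location" },
  },
  required: ["action", "source", "target"],
  $defs: {
    location: {
      type: "object",
      properties: {
        path: { type: "string" },
        line: { type: "number" },
      },
      required: ["path"],
    },
  },
};

// The same schema with the sub-schema repeated inline, for a size comparison.
const { $defs, ...rest } = schemaWithDefs;
const inlinedSchema = {
  ...rest,
  properties: {
    ...rest.properties,
    source: $defs.location,
    target: $defs.location,
  },
};

console.log("with $defs:", JSON.stringify(schemaWithDefs).length, "chars");
console.log("inlined:   ", JSON.stringify(inlinedSchema).length, "chars");
```

With only two references the gain is small. It grows with the size of the shared sub-schema and the number of places that use it.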

Common Mistakes

Over-documenting parameters

// WRONG: Token bloat
properties: {
  path: {
    type: "string",
    description: "The absolute or relative path to the file that you want to read from the filesystem"
  }
}

// RIGHT: Minimal
properties: {
  path: { type: "string" }
}

Creating separate tools for minor variants

// WRONG: Tool proliferation
tools = [
  { name: "user_create", ... },
  { name: "user_create_admin", ... },
  { name: "user_create_guest", ... }
]

// RIGHT: Single tool with action param
tools = [
  {
    name: "user_ops",
    inputSchema: {
      type: "object",
      properties: {
        action: { type: "string", enum: ["create", "create_admin", "create_guest"] }
      },
      required: ["action"]
    }
  }
]

Loading all tools eagerly

// WRONG: Everything at startup
for (const tool of allTools) {
  server.addTool(tool);
}

// RIGHT: Register only the core tools at startup
for (const tool of coreTools) {
  server.addTool(tool);
}
// Load git, database, cloud packs when first accessed, e.g. await loadToolPack("git")

Summary

In this post, I showed how to optimize MCP tool schemas to reduce token costs. The key strategies are: gateway pattern consolidation (collapse many tools into few), thin wire schemas with additionalProperties: true, description stripping, and lazy loading. I reduced 4,250 tokens to 725 tokens, an 83% reduction.

The gateway pattern is the biggest win. Instead of registering 29 individual tools, I register 4 gateways that dispatch to handlers based on an action parameter. The LLM still gets the same capabilities, but the schema footprint is dramatically smaller.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 SDL-MCP Tool Gateway Documentation
👨‍💻 Model Context Protocol Specification
👨‍💻 Reddit Discussion: MCP Token Optimization

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!