MCP vs CLI for AI Agent Tools: When to Use Which?
I’ve been building AI agents for a while now. And I kept running into the same question: should my tools use MCP or CLI?
Turns out, the answer matters for both cost and capability. Let me break down what I found.
The Core Tension
MCP (Model Context Protocol) gives you tool discovery and structured I/O. CLI gives you Unix composability and zero overhead.
+------------------+ +------------------+| MCP Approach | | CLI Approach |+------------------+ +------------------+| ~100 tokens | | 0 tokens || to load schemas | | to start || | | || +50 tokens | | Just the command || per call after | | itself || | | || Rich metadata | | Unix pipes work || Typed JSON | | Grep/jq/sed |+------------------+ +------------------+When CLI Wins
I use CLI for three specific scenarios:
1. One-off Queries
If I’m running a single lookup, CLI is strictly cheaper. No schema loading means zero overhead.
#!/usr/bin/env bash# No MCP setup needed - just run itcurl -s "https://api.example.com/status" | jq '.health'2. Sub-Agent Tasks
Here’s the key insight: sub-agents cannot access MCP tools.
Only the main orchestrator has MCP access. When you spawn a sub-agent, it can only use CLI commands.
from langchain.agents import create_react_agent
# Sub-agent gets CLI tools onlytools = [ ShellTool(), # CLI commands BashProcessTool() # Shell pipelines]
# MCP tools won't work here!# The sub-agent can't see vault_remember or other MCP tools3. Pipeline Transformations
When I need grep, awk, sed, or jq in the middle of a workflow, CLI is natural.
# Find all TODO comments, extract file:line, sortgrep -rn "TODO" ./src | awk -F: '{print $1":"$2}' | sort | head -20When MCP Wins
MCP shines in long-lived orchestrator flows. Here’s why:
1. Tool Discovery
Claude sees all available tools with typed parameters and rich docstrings. No need to explain how a tool works.
from mcp.server import Serverfrom mcp.types import Tool, TextContent
server = Server("my-tools")
@server.list_tools()async def list_tools(): return [ Tool( name="vault_remember", description="Store a key-value pair for later retrieval in the session", inputSchema={ "type": "object", "properties": { "key": { "type": "string", "description": "Unique identifier for the memory" }, "value": { "type": "string", "description": "Content to remember" } }, "required": ["key", "value"] } ) ]Claude reads this schema once (~100 tokens), then knows exactly how to call vault_remember.
2. Structured Output
MCP returns typed JSON. Claude parses it natively. No string munging.
@server.call_tool()async def call_tool(name: str, arguments: dict): if name == "search_docs": results = await search_index(arguments["query"])
# Structured JSON output return { "results": results, "count": len(results), "query": arguments["query"] }3. Multi-Turn Efficiency
After the initial schema load, each MCP call costs only ~50 tokens overhead.
Session token breakdown:-------------------------------Initial schema load: ~100 tokensCall #1 (search): ~50 tokens overheadCall #2 (remember): ~50 tokens overheadCall #3 (retrieve): ~50 tokens overhead-------------------------------Total for 3 calls: ~250 tokensFor multi-turn sessions, this is efficient.
4. Semantic Clarity
Tool names like vault_remember communicate intent. The model understands what the tool does without explanation.
The Hybrid Architecture
I settled on a hybrid approach:
+------------------------------------------+| Main Orchestrator || (Uses MCP Tools) |+------------------------------------------+ | | v v+---------------+ +---------------+| MCP Tools | | Spawn Sub- || - discovery | | Agents || - structured | | (CLI only) || - context | +---------------++---------------+ | v +---------------+ | CLI Tools | | - pipelines | | - one-shots | | - grep/jq/sed | +---------------+Main Orchestrator with MCP
from mcp import Client
async def run_orchestrator(): async with Client("my-mcp-server") as client: # MCP for structured, long-lived operations docs = await client.call_tool("search_docs", { "query": "authentication best practices" })
await client.call_tool("vault_remember", { "key": "research_topic", "value": docs["results"][0]["title"] })
# Spawn sub-agent for CLI work result = await spawn_sub_agent( task="Analyze these docs for security issues", context=docs )Sub-Agent with CLI
import subprocess
def analyze_with_cli(docs_path: str): # CLI advantage: compose with standard tools result = subprocess.run([ "bash", "-c", f"cat {docs_path} | grep -i 'security' | wc -l" ], capture_output=True, text=True)
return int(result.stdout.strip())Decision Matrix
| Scenario | Use MCP | Use CLI |
|---|---|---|
| Long orchestrator flow | Yes | No |
| One-shot query | No | Yes |
| Sub-agent task | N/A (can’t) | Yes |
| Need grep/jq/sed | No | Yes |
| Tool discovery needed | Yes | No |
| Structured output | Yes | Parse yourself |
| Multi-turn session | Yes | No |
Real Token Costs
I measured actual costs in a production system:
MCP Session (10 tool calls):- Schema load: 100 tokens- 10 calls @ 50: 500 tokens- Total: 600 tokens
Equivalent CLI Session:- 10 shell commands: ~200 tokens- Output parsing: ~100 tokens (Claude reads stdout)- Total: ~300 tokensCLI is cheaper for simple operations. But when I factor in:
- Claude understanding tool semantics
- No string parsing errors
- Type safety
The MCP overhead pays for itself in complex flows.
Key Takeaway
A Reddit practitioner put it well:
“MCP for long-lived orchestrator flows, CLI for sub-agents and quick one-shot jobs. Claude handled typed args way better over MCP, while CLI was nicer when I needed pipes, grep, or jq in the middle.”
I follow this rule now. My main agents use MCP. Sub-agents and quick lookups use CLI. Simple, effective, and cost-aware.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments