How to Reduce Token Usage in Claude Code: Practical Tips
After tracking 100 million tokens in Claude Code, I discovered something critical: 99.4% of those tokens were input. Claude was reading 166 times more than it was writing.
This revelation changed my entire approach to using Claude Code. The bottleneck isn’t inference speed or code generation - it’s the re-reading loop. Every action requires re-reading the entire relevant context.
The good news? Once you understand this, you can reduce token usage by 30-50%. Not by writing less code, but by being smarter about what gets read.
The Real Opportunity
Here’s the insight that matters:
Wrong approach: "Write less code"
Right approach: "Read less context"
99.4% of tokens = input (reading)
0.6% of tokens = output (writing)
Optimize the 99.4%, not the 0.6%
Let me show you exactly how I reduced my token usage after realizing this.
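The ratio quoted above follows directly from the input and output counts. Here is a minimal sketch of that arithmetic; the function name `io_breakdown` is made up for illustration, and the token counts are simply the article's 100 million total split 99.4% / 0.6%.

```python
def io_breakdown(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (input share of all tokens, input-to-output ratio)."""
    total = input_tokens + output_tokens
    input_share = input_tokens / total
    read_write_ratio = input_tokens / output_tokens
    return input_share, read_write_ratio


# Figures from the article: 100M tokens tracked, 99.4% of them input.
share, ratio = io_breakdown(input_tokens=99_400_000, output_tokens=600_000)
print(f"input share: {share:.1%}")       # input share: 99.4%
print(f"read/write ratio: {ratio:.0f}")  # read/write ratio: 166
```

Running the same calculation on your own usage numbers is a quick way to check whether reading dominates in your workflow too.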
Strategy 1: Optimize Your CLAUDE.md
Your CLAUDE.md file is read on every single turn. Keep it under 200 lines.
The Problem
I had a bloated CLAUDE.md with everything I thought Claude might need:
# Project Overview
This is a Next.js 14 project with TypeScript, Tailwind CSS, and PostgreSQL...
## Full Architecture
[100 lines of detailed architecture documentation]
## All Components
[80 lines listing every component]
## Every API Endpoint
[150 lines documenting all endpoints]
## Coding Standards
[50 lines of generic coding guidelines]
## Git Workflow
[30 lines of git commands]
## Testing Requirements
[40 lines of testing philosophy]
Every turn, Claude read all 450 lines. That’s wasted tokens on information it rarely needed.
The Solution
I trimmed it to essentials only:
# Project Overview
Next.js 14 + TypeScript + PostgreSQL + Tailwind
## Key Patterns
- Server Components by default
- Repository pattern for data access
- Zod for all validation
## Critical Files
- `src/lib/db.ts` - Database connection
- `src/lib/auth.ts` - Authentication
- `src/middleware.ts` - Route protection
## Coding Standards
- Functional components only
- No mutation - always return new objects
- Error handling required on all async operations
## Testing
- 80% coverage minimum
- Jest + Testing Library
- Run `pnpm test` before commits
Token savings: ~60% on CLAUDE.md reads
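To keep the file from creeping back up, a small check can report its size against the 200-line target. This is only a sketch: the function name `claude_md_report` is hypothetical, and the 4-characters-per-token figure is a rough assumption for English text, not an exact tokenizer count.

```python
def claude_md_report(text: str, max_lines: int = 200, chars_per_token: int = 4) -> dict:
    """Summarize the size of a CLAUDE.md file against a line budget."""
    lines = text.splitlines()
    return {
        "total_lines": len(lines),
        "non_blank_lines": sum(1 for line in lines if line.strip()),
        # Rough estimate only: assumes about 4 characters per token.
        "approx_tokens": len(text) // chars_per_token,
        "within_budget": len(lines) <= max_lines,
    }


sample = "# Project Overview\nNext.js 14 + TypeScript + PostgreSQL + Tailwind\n"
report = claude_md_report(sample)
print(report["total_lines"], report["within_budget"])  # 2 True
```

To run it on a real file, pass in the file's contents, for example `claude_md_report(open("CLAUDE.md").read())`.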
The Rule
✅ Include: Project-specific patterns, critical file locations
✅ Include: Non-obvious conventions unique to this project
✅ Include: Common pitfalls and how to avoid them
❌ Remove: Generic coding standards (Claude knows these)
❌ Remove: Exhaustive file listings (use .claudeignore instead)
❌ Remove: Documentation that can be inferred from code
❌ Remove: Git commands (Claude knows git)
Strategy 2: Use .claudeignore Aggressively
Claude Code reads files to understand your project. But not all files deserve to be read.
What Claude Doesn’t Need to Read
Create a .claudeignore file in your project root:
# Dependencies
node_modules/
.pnpm-store/
# Build outputs
.next/
dist/
build/
out/
# Cache directories
.cache/
.turbo/
.eslintcache/
# Test coverage
coverage/
.nyc_output/
# Generated files
*.generated.ts
*.generated.js
# Lock files (Claude doesn't need to read dependency versions)
package-lock.json
pnpm-lock.yaml
yarn.lock
# Environment files (for security AND token savings)
.env.local
.env.*.local
# Large static assets
public/images/
public/videos/
*.svg
*.png
*.jpg
# Documentation (read manually when needed)
docs/
*.md
!CLAUDE.md
# Config files Claude rarely needs
.vscode/
.idea/
*.config.js
Token Impact
Before .claudeignore:
Project scan: Read 1,247 files
Average context per request: 45,000 tokens
After .claudeignore:
Project scan: Read 312 files
Average context per request: 18,000 tokens
Savings: 60% reduction in context reading
The Heuristic
Ask: "Does Claude need to read this to write better code?"
YES: Keep it readable
- Source code (.ts, .tsx, .js, .jsx)
- Type definitions (.d.ts)
- Test files (.test.ts, .spec.ts)
- Key configuration (tsconfig.json, tailwind.config.ts)
NO: Add to .claudeignore
- Dependencies (node_modules)
- Build outputs (dist, .next)
- Lock files (package-lock.json)
- Assets (images, fonts)
- Generated code
- Documentation (except CLAUDE.md)
Strategy 3: Strategic Session Management
Every session starts fresh. Use this to your advantage.
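One way to see why fresh starts help is a toy model of the re-reading loop. The sketch below assumes every turn re-reads a fixed base (system prompt, CLAUDE.md) plus everything accumulated so far. The function name and all numbers are hypothetical, and the model ignores caching, context compaction, and the cost of re-establishing context in each new session.

```python
def session_input_tokens(turns: int, base: int, growth_per_turn: int) -> int:
    """Total input tokens for one session under a toy model.

    Turn i (0-indexed) re-reads `base` tokens plus the history
    accumulated so far, modeled as i * growth_per_turn.
    """
    return sum(base + i * growth_per_turn for i in range(turns))


# Hypothetical numbers, not measurements.
one_long = session_input_tokens(turns=40, base=5_000, growth_per_turn=2_000)
four_short = 4 * session_input_tokens(turns=10, base=5_000, growth_per_turn=2_000)
print(one_long)    # 1760000
print(four_short)  # 560000
```

Because the history term grows with every turn, total input rises roughly with the square of session length in this model. That is why the same 40 turns cost far less when split into four sessions.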
The Wrong Approach
One long session for everything:
Session start: 9 AM
- Feature A implementation (50k tokens)
- Bug fix in Feature A (45k tokens - re-reading Feature A context)
- Feature B implementation (60k tokens - reading Features A + B)
- Refactor shared utility (70k tokens - re-reading everything)
- More bugs in Feature A (55k tokens - re-reading again)
Session end: 6 PM
Total: 280k tokens
The Right Approach
Fresh sessions for distinct tasks:
Session 1 (9-11 AM): Feature A implementation
  - Start fresh
  - Focus on Feature A only
  - End session
  Tokens: 40k
Session 2 (11 AM-1 PM): Feature B implementation
  - Start fresh
  - Focus on Feature B only
  - End session
  Tokens: 35k
Session 3 (2-4 PM): Shared refactor
  - Start fresh
  - Focus on shared utilities
  - End session
  Tokens: 25k
Session 4 (4-5 PM): Bug fixes
  - Start fresh
  - Fix bugs in isolation
  Tokens: 20k
Total: 120k tokens (57% savings)
The Principle
✅ Start fresh session for:
  - New feature development
  - Refactoring tasks
  - Bug investigation
  - Documentation updates
❌ Don't drag on sessions:
  - Context accumulates
  - Each turn re-reads more
  - Efficiency degrades
Strategy 4: Maximize Prompt Caching
Claude supports prompt caching for repeated content. Use it.
What Gets Cached
High cache value:
  - CLAUDE.md (read every turn)
  - Large source files (read frequently)
  - System prompts (always present)
Low cache value:
  - User messages (unique each time)
  - Small files (minimal savings)
  - Generated output (never re-read)
Implementation
Claude Code automatically caches:
- System prompts
- CLAUDE.md content
- Frequently accessed files
You can optimize this by:
1. Keep CLAUDE.md stable (don't edit mid-session)
2. Avoid reformatting entire files
3. Make targeted edits, not wholesale rewrites
4. Use consistent file structure
Cache Hit Rate
Without optimization:
  - Cache hit rate: ~40%
  - Effective token cost: $2.50/M input
With optimization:
  - Cache hit rate: ~75%
  - Effective token cost: $1.25/M input
Savings: 50% reduction in input costs
Strategy 5: Context Scope Control
Don’t let Claude read everything when it only needs one module.
Explicit Scope
When asking for changes:
❌ Bad: "Fix the authentication bug"
   → Claude reads entire codebase
✅ Good: "In src/lib/auth.ts, the validateToken function returns true for expired tokens. Fix the expiration check."
   → Claude focuses on auth.ts
File References
❌ Bad: "Add a new API endpoint"
   → Claude scans all files looking for API patterns
✅ Good: "Add a new endpoint to src/app/api/users/route.ts following the pattern in src/app/api/posts/route.ts"
   → Claude reads two specific files
The Impact
Unscoped request:
  Files read: 45
  Tokens consumed: 35,000
Scoped request:
  Files read: 3
  Tokens consumed: 4,200
Savings: 88% reduction for that request
Practical Checklist
Here’s my daily checklist for token efficiency:
Before starting:
□ CLAUDE.md under 200 lines?
□ .claudeignore configured?
□ Session has clear, narrow scope?
During session:
□ Scope requests to specific files/modules?
□ Ending session at natural breakpoints?
□ Avoiding context-heavy tangents?
After session:
□ Review token usage (Claude shows this)
□ Identify what Claude read unnecessarily
□ Update .claudeignore if needed
Expected Token Savings
Based on my experiments:
Strategy 1: Optimized CLAUDE.md → 15-25% savings
Strategy 2: .claudeignore → 20-40% savings
Strategy 3: Session management → 10-30% savings
Strategy 4: Prompt caching → 10-20% savings
Strategy 5: Context scoping → 15-50% savings
Combined potential: → 30-50% total reduction
The exact savings depend on your project size and workflow. Larger projects see bigger gains from .claudeignore and scoping. Smaller projects benefit more from session management.
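The per-example percentages quoted in the strategy sections all come from the same before/after arithmetic. A minimal sketch (the helper name `percent_saved` is made up here) reproduces them from the token counts given earlier and works for your own measurements too:

```python
def percent_saved(before: float, after: float) -> float:
    """Percentage reduction going from `before` tokens to `after` tokens."""
    return 100 * (before - after) / before


# Token counts quoted in the strategy sections of this article.
print(round(percent_saved(45_000, 18_000)))    # 60  (.claudeignore example)
print(round(percent_saved(280_000, 120_000)))  # 57  (session example)
print(round(percent_saved(35_000, 4_200)))     # 88  (scoped request example)
```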
Common Mistakes
Mistake 1: Removing Useful Context
❌ Too aggressive:
   .claudeignore includes src/tests/
   → Claude can't understand test patterns
   → Writes untestable code
✅ Balanced:
   .claudeignore includes coverage/
   → Still reads test files
   → Understands testing patterns
Mistake 2: Session Fragmentation
❌ New session for every tiny change
   → Lost context between related fixes
   → Claude re-reads same files repeatedly
✅ Natural session boundaries:
   → One session per logical task
   → Related changes stay together
Mistake 3: Generic CLAUDE.md
❌ Generic content:
   "Always write clean code"
   "Use meaningful variable names"
   → Claude already knows this
   → Wastes tokens on every read
✅ Project-specific content:
   "All API routes must check rate limits before auth"
   "Database queries use the repository pattern in src/repositories/"
   → Claude might not infer this
   → Actually useful
Frequently Asked Questions
Should I optimize CLAUDE.md for token usage or helpfulness?
Helpfulness first. An extra 100 lines that prevent mistakes are worth more than the tokens they cost. But remove redundancy and generic advice.
How often should I start new sessions?
At natural breakpoints: after completing a feature, before starting a new task, when switching contexts. Not mid-task.
Does .claudeignore affect all Claude Code features?
Yes. Claude won’t read ignored files for any operation. Make sure you’re not hiding files it genuinely needs.
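A quick sanity check can flag source files that a pattern list would hide. The sketch below is an approximation under stated assumptions: it treats patterns with simplified gitignore-like rules (no `!` negation, no `**`), which may not match how the tool actually interprets them, and the helper names and suffix list are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumption: these suffixes are what counts as "source" in this project.
SOURCE_SUFFIXES = (".ts", ".tsx", ".js", ".jsx")


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Simplified gitignore-style match: no negation (!) and no ** support."""
    name = PurePosixPath(path).name
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: must appear as whole path segments.
            if f"/{pattern}" in f"/{path}":
                return True
        elif fnmatch(name, pattern):
            # File pattern: compared against the file name only.
            return True
    return False


def hidden_sources(paths: list[str], patterns: list[str]) -> list[str]:
    """Source files that the given patterns would hide."""
    return [p for p in paths if p.endswith(SOURCE_SUFFIXES) and is_ignored(p, patterns)]


paths = ["src/lib/auth.ts", "src/tests/auth.test.ts", "coverage/lcov.info"]
print(hidden_sources(paths, ["coverage/", "src/tests/"]))  # ['src/tests/auth.test.ts']
print(hidden_sources(paths, ["coverage/"]))                # []
```

The first call reproduces the "too aggressive" case from Mistake 1, where ignoring `src/tests/` hides a test file. The second shows the balanced version hiding no source files.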
What’s the optimal CLAUDE.md size?
Under 200 lines is a good target. If you need more, consider whether the extra content is truly project-specific.
Can prompt caching compensate for larger CLAUDE.md?
Partially. Caching reduces the cost of repeated reads, but not the context window usage. Keep CLAUDE.md focused regardless.
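The cost side of that answer can be sketched as a simple blend of cached and uncached prices. Everything here is an assumption for illustration: the function name is made up, the prices are placeholders in arbitrary units rather than real pricing, and any surcharge for writing to the cache is ignored.

```python
def effective_input_price(hit_rate: float, base_price: float, cached_price: float) -> float:
    """Blended price per million input tokens for a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1 - hit_rate) * base_price + hit_rate * cached_price


# Placeholder prices in arbitrary units, not real pricing.
for rate in (0.40, 0.75):
    print(rate, round(effective_input_price(rate, base_price=10.0, cached_price=1.0), 2))
# 0.4 6.4
# 0.75 3.25
```

The token count never appears in the formula. A higher hit rate lowers what each token costs, but the same number of tokens still occupies the context window.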
Summary
Reducing token usage in Claude Code isn’t about writing less code. It’s about reading less context.
The key insights:
- 99.4% input ratio means your optimization target is reading, not writing
- CLAUDE.md should be under 200 lines of project-specific, non-obvious information
- .claudeignore prevents unnecessary file reads - use it aggressively
- Session management leverages fresh starts to avoid context accumulation
- Prompt caching and scoping reduce repeated reads of the same content
The strategies compound. A 60-line CLAUDE.md, well-configured .claudeignore, strategic sessions, and scoped requests together can reduce your token usage by 30-50%.
Start with .claudeignore - it’s the easiest win. Then audit your CLAUDE.md. Finally, be mindful of session boundaries and request scoping.
Final Words
My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can contact me by email.