Saying 'Hey' Cost Me 22% of My Usage Limits: Claude Code Token Management
Problem
I opened Claude Code one morning and typed a simple greeting: “hey”
The response came back instantly. But then I ran /cost to check my session, and my jaw dropped:
    Input tokens: 127,432
    Output tokens: 847
    Total: 128,279 tokens
    Percentage of daily limit used: 22%

Twenty-two percent. For saying “hey.”
I had been working on a long conversation the day before. The context had accumulated thousands of lines of code, multiple file reads, and extensive discussion. When I came back the next morning and said “hey,” Claude had to reload the entire context window just to respond to my greeting.
This was a painful lesson in how Claude Code manages tokens. Here’s what I learned about checking usage and keeping it under control.
What happened?
I searched for an explanation and found a Reddit thread titled “Saying hey cost me 22% of my usage limits” with 500+ upvotes. The comments revealed I wasn’t alone.
The issue: when you return to an old conversation and send any message, Claude must “resurrect” the entire context. Every file you read, every code block, every discussion is sent back to the model as input, and all of it counts toward your usage.
    ┌─────────────────────────────────────────────────────────┐
    │ Long conversation from yesterday                        │
    │ ├── 50 file reads (each 500-2000 lines)                 │
    │ ├── 30 code suggestions                                 │
    │ ├── Multiple debugging sessions                         │
    │ └── Extensive architecture discussion                   │
    │                                                         │
    │ You type: "hey"                                         │
    │                                                         │
    │ Claude must reload ALL of this just to respond          │
    │ Result: 100K+ tokens consumed for a greeting            │
    └─────────────────────────────────────────────────────────┘

One commenter explained: “Stop reviving old, long conversations. It’s cheaper to start a new chat.”
Another added: “Use /compact command to summarize chat history before walking away.”
This was my first mistake. But there were more.
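To make the mechanism concrete, here is a minimal sketch of why the last message in a long conversation is the expensive one. It assumes a simplified model in which every turn resends the whole history as input and ignores caching and model replies; the turn sizes are hypothetical numbers chosen to land near my 127K figure, not measurements.

```python
def cumulative_input_tokens(turn_sizes):
    """Return the input tokens sent on each turn when every turn
    resends all earlier turns plus the new message."""
    sent_per_turn = []
    history = 0
    for size in turn_sizes:
        history += size                # the new message joins the history
        sent_per_turn.append(history)  # the whole history goes out as input
    return sent_per_turn


# Hypothetical session: three heavy working turns, then a 5-token greeting.
turns = [40_000, 50_000, 37_000, 5]
per_turn = cumulative_input_tokens(turns)
print(per_turn)      # [40000, 90000, 127000, 127005]
print(per_turn[-1])  # the greeting alone sends 127005 input tokens
```

The greeting adds only 5 tokens of its own, but it carries everything that came before it.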
How to check token usage
Claude Code has three commands for monitoring usage:
/cost - Current session stats
    Session Statistics:
    ───────────────────────
    Input tokens: 45,231
    Output tokens: 3,128
    Cache creation: 12,450
    Cache read: 8,320

    Total tokens: 48,359
    Session cost: $0.42

This shows the immediate cost of your current session. I now run this before and after major tasks.
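If you paste the report into a file before and after a task, the difference is easy to compute. The sketch below assumes the report looks like the sample above (one `Label: 12,345` pair per line); the real output may be laid out differently, and `parse_token_counts` and `usage_delta` are hypothetical helper names, not part of Claude Code. The "before" numbers are made up for illustration.

```python
def parse_token_counts(report):
    """Collect 'Label: 12,345' lines into {label: int}, skipping anything else."""
    counts = {}
    for line in report.splitlines():
        label, sep, value = line.partition(":")
        digits = value.strip().replace(",", "")
        if sep and digits.isdigit():  # ignores headers, rules, and "$0.42"
            counts[label.strip().lower()] = int(digits)
    return counts


def usage_delta(before, after):
    """Tokens added between two parsed reports."""
    return {key: after[key] - before.get(key, 0) for key in after}


before_text = """Input tokens: 30,000
Output tokens: 2,000
Total tokens: 32,000"""

after_text = """Session Statistics:
Input tokens: 45,231
Output tokens: 3,128
Cache creation: 12,450
Cache read: 8,320
Total tokens: 48,359
Session cost: $0.42"""

delta = usage_delta(parse_token_counts(before_text), parse_token_counts(after_text))
print(delta["input tokens"])  # 15231
print(delta["total tokens"])  # 16359
```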
/stats - Daily usage visualization
    Daily Usage (Last 7 Days):
    ─────────────────────────────────────────────────
    Mon ████████████████████░░░░ 78%
    Tue ██████████████░░░░░░░░░░ 52%
    Wed ████████████████████████ 95% ← Limit hit
    Thu ████████░░░░░░░░░░░░░░░░ 28%
    Fri ████████████████░░░░░░░░ 62%
    Sat ░░░░░░░░░░░░░░░░░░░░░░░░ 5%
    Sun ░░░░░░░░░░░░░░░░░░░░░░░░ 0%

    Rate limit window: 5-hour rolling

This revealed I had hit my limit on Wednesday. The 5-hour rolling window means spikes affect availability for hours.
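A rolling window is easier to reason about with a small model. This is only a sketch of the general idea, assuming usage is the sum of tokens spent in the last five hours; I don't know how the real limit is computed, and the `RollingUsage` class and the sample numbers are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingUsage:
    """Sum of tokens spent inside a sliding time window.
    Assumes events are recorded in chronological order."""

    def __init__(self, window=timedelta(hours=5)):
        self.window = window
        self.events = deque()  # (timestamp, tokens), oldest first
        self.total = 0

    def record(self, when, tokens):
        self.events.append((when, tokens))
        self.total += tokens

    def used(self, now):
        # Drop events that have aged out of the window, then report the rest.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.total -= tokens
        return self.total


usage = RollingUsage()
start = datetime(2025, 1, 6, 9, 0)
usage.record(start, 128_279)                      # the "hey"
usage.record(start + timedelta(hours=2), 45_000)  # a normal task

at_4h = usage.used(start + timedelta(hours=4))  # 173279: the spike still counts
at_5h = usage.used(start + timedelta(hours=5))  # 45000: the spike has aged out
at_7h = usage.used(start + timedelta(hours=7))  # 0
print(at_4h, at_5h, at_7h)
```

Under this model, one morning spike keeps weighing on everything you do until it leaves the window.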
/context - Context breakdown
    Context Usage Breakdown:
    ─────────────────────────────────────────────────
    Tool outputs: 38,420 tokens (45%)
    Conversation: 28,150 tokens (33%)
    System prompt: 12,000 tokens (14%)
    MCP resources: 6,890 tokens (8%)

    Total: 85,460 tokens
    Optimization tip: Consider /compact to reduce tool output history

This showed exactly where my tokens were going. Tool outputs from file reads were consuming nearly half my context.
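The percentages are each category's share of the total, which you can check by hand. A small sketch using the numbers from the breakdown above:

```python
breakdown = {
    "Tool outputs": 38_420,
    "Conversation": 28_150,
    "System prompt": 12_000,
    "MCP resources": 6_890,
}

total = sum(breakdown.values())
shares = {name: round(100 * tokens / total) for name, tokens in breakdown.items()}
largest = max(breakdown, key=breakdown.get)

print(total)    # 85460
print(shares)   # {'Tool outputs': 45, 'Conversation': 33, 'System prompt': 14, 'MCP resources': 8}
print(largest)  # Tool outputs
```

Whichever category comes out largest is the first place to look when deciding what to compact.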
How to reduce token usage
After learning to monitor usage, I needed ways to reduce it. These commands help:
/compact - Compress context
    > /compact Keep the current implementation plan and API design decisions

    Compacting conversation...
    ─────────────────────────────────────────────────
    Preserved:
    - API design decisions from earlier discussion
    - Current implementation plan
    - Key bug fixes applied

    Compressed:
    - 12 file read outputs
    - 8 debugging iterations
    - Full code history

    Result: Context reduced from 142,000 to 23,000 tokens

The key insight: you can pass instructions to /compact to tell it what to preserve. Without instructions, it makes its own decisions about what’s important.
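To put that result in perspective, here is the arithmetic as a short sketch. It assumes, as a simplification, that the next message resends the whole remaining context as input.

```python
before_tokens = 142_000
after_tokens = 23_000

saved = before_tokens - after_tokens
reduction_pct = round(100 * saved / before_tokens, 1)

print(saved)          # 119000 fewer tokens carried by the next message
print(reduction_pct)  # 83.8
```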
/clear - Reset conversation entirely
    > /clear

    Conversation history cleared. Context released.
    Starting fresh session.

This is the nuclear option. Use it when the conversation has grown too large and you’re starting a new task.
My workflow now
After that 22% lesson, I changed how I work with Claude Code:
    ┌─────────────────────────────────────────────────────────┐
    │ 1. Start session → Run /cost to note baseline           │
    │                                                         │
    │ 2. Work on task  → Monitor periodically with /stats     │
    │                                                         │
    │ 3. Before break  → /compact "Save [key decisions]"      │
    │                                                         │
    │ 4. Return        → Check /cost delta (not "hey"!)       │
    │                                                         │
    │ 5. End of task   → Either /clear or save summary        │
    │                                                         │
    │ NEVER:  Revive old long chats with casual greetings     │
    │ ALWAYS: Compact before walking away                     │
    │ CHECK:  /stats daily to catch usage patterns            │
    └─────────────────────────────────────────────────────────┘

Let me show you what this looks like in practice:
Scenario 1: Working on a feature
    Start: /cost shows 0 tokens
    [Read 5 files, write code, debug errors, run tests]
    Check: /cost shows 45,000 tokens
    Before lunch: /compact "Preserve the auth logic and pending API changes"
    Lunch break...
    Return: /cost shows 8,000 tokens (compressed!)
    [Continue work]
    End of day: /clear (starting fresh tomorrow)

Scenario 2: The “hey” trap (what NOT to do)
    Yesterday: Long session, 100K+ tokens accumulated
               Left conversation open

    Today: Typed "hey" to resume
           Cost: 22% of daily limit
           Reason: Full context resurrection

    Better approach:
    Yesterday: /compact "Save progress on feature X"
               Close conversation
    Today: Start new conversation with fresh context
           Reference saved summary if needed

The cache factor
The status line shows two important metrics:
    cache_creation_input_tokens: 12,450 ← First-time cost
    cache_read_input_tokens: 8,320 ← Reduced cost

cache_creation_input_tokens - New tokens being cached. This is the “first-time cost” when Claude hasn’t seen this context before.
cache_read_input_tokens - Tokens read from cache. These cost less because they’re reused from a previous cache.
Maximizing cache hits reduces your effective token costs. This is why staying in the same conversation can sometimes be cheaper than starting new ones - if the context hasn’t changed much.
But there’s a balance. If the conversation has grown too large, the resurrection cost outweighs any cache benefit.
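The trade-off can be sketched with a rough cost model. The multipliers below are assumptions for illustration (cache writes at 1.25x and cache reads at 0.10x the normal input rate); how a subscription plan actually counts cached tokens toward its limits may differ, and `effective_input_tokens` is a hypothetical helper, not a Claude Code function.

```python
def effective_input_tokens(uncached, cache_creation, cache_read,
                           creation_multiplier=1.25, read_multiplier=0.10):
    """Weight each kind of input token by its assumed relative cost."""
    return (uncached
            + cache_creation * creation_multiplier
            + cache_read * read_multiplier)


context = 100_000  # hypothetical size of the accumulated conversation
greeting = 5       # the new message itself

# Cache still available: the old context is read back at the reduced rate.
warm = effective_input_tokens(greeting, cache_creation=0, cache_read=context)

# Cache no longer available: the whole context is written to the cache again.
cold = effective_input_tokens(greeting, cache_creation=context, cache_read=0)

print(round(warm), round(cold))
```

Under these assumptions, the same greeting is more than ten times more expensive when the cached context is gone, which matches what I saw when I came back the next morning.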
Configuration options
You can configure auto-compact behavior:
    {
      "autoCompact": false
    }

Or set a custom threshold:

    # Triggers auto-compact at 300k tokens (default is lower)
    export CLAUDE_CODE_AUTO_COMPACT_WINDOW=300000

I prefer to control compaction manually with the /compact command. Auto-compact at arbitrary moments can interrupt workflow and lose important context.
You can also set up a PreCompact hook to preserve critical context:
    {
      "hooks": {
        "PreCompact": [
          {
            "hooks": [
              {
                "type": "prompt",
                "prompt": "Before compacting, save any critical in-progress work to a task file"
              }
            ]
          }
        ]
      }
    }

This ensures important work gets saved before automatic compaction triggers.
Common mistakes I made
| Mistake | Impact | What I do now |
|---|---|---|
| Reviving old chats with “hey” | 22%+ token burn | Start fresh or compact first |
| Ignoring context bloat | Degraded reasoning | Monitor /context weekly |
| Not checking /cost | Surprise overages | Check before and after tasks |
| Letting auto-compact trigger mid-task | Lost context | Compact at logical boundaries |
| Reading entire large files | 84K+ tokens per read | Use targeted reads or MCP indexing |
The biggest lesson: a simple greeting in a large conversation can cost more than a full coding session. Always compact before walking away, and never revive old chats without thinking about the context cost.
Summary
In this post, I showed how a simple “hey” cost me 22% of my daily usage limits and what I learned about Claude Code token management.
The key commands are:
- /cost - Check current session usage
- /stats - View daily usage trends
- /context - See where tokens are going
- /compact [instructions] - Compress context while preserving key points
- /clear - Reset conversation entirely
The most important habit: never revive old long conversations. Either compact before you leave, or start fresh when you return. The context resurrection cost is simply too high.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to reach me, you can contact me by email.