Saying 'Hey' Cost Me 22% of My Usage Limits: Claude Code Token Management

Mar 26, 2026

Problem

I opened Claude Code one morning and typed a simple greeting: “hey”

The response came back instantly. But then I ran /cost to check my session, and my jaw dropped:

Input tokens: 127,432
Output tokens: 847
Total: 128,279 tokens

Percentage of daily limit used: 22%

Twenty-two percent. For saying “hey.”

I had been working on a long conversation the day before. The context had accumulated thousands of lines of code, multiple file reads, and extensive discussion. When I came back the next morning and said “hey,” Claude had to reload the entire context window just to respond to my greeting.

This was a painful lesson in how Claude Code manages tokens. Here’s what I learned about checking usage and keeping it under control.

What happened?

I searched for an explanation and found a Reddit thread titled “Saying hey cost me 22% of my usage limits” with 500+ upvotes. The comments revealed I wasn’t alone.

The issue: when you return to an old conversation and send any message, Claude must “resurrect” the entire context. Every file you read, every code block, and every discussion gets sent back to the model as input tokens, however short your new message is.

┌─────────────────────────────────────────────────────────┐
│  Long conversation from yesterday                        │
│  ├── 50 file reads (each 500-2000 lines)               │
│  ├── 30 code suggestions                                │
│  ├── Multiple debugging sessions                        │
│  └── Extensive architecture discussion                  │
│                                                         │
│  You type: "hey"                                        │
│                                                         │
│  Claude must reload ALL of this just to respond        │
│  Result: 100K+ tokens consumed for a greeting          │
└─────────────────────────────────────────────────────────┘
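To make the mechanism concrete, here is a minimal sketch of why a one-word message can be so expensive. It rests on a simplification of mine: every turn sends the whole prior history as input. The turn sizes are made-up numbers chosen to resemble the diagram above, not measurements, and `input_tokens_for_turn` is a hypothetical helper.

```python
# Toy model: each turn re-sends the entire prior history as input.
# The turn sizes below are illustrative, not measured values.

def input_tokens_for_turn(history_tokens: list[int], new_message_tokens: int) -> int:
    """Input tokens for one turn: all prior turns plus the new message."""
    return sum(history_tokens) + new_message_tokens

# Hypothetical long session: 50 file reads, 30 suggestions, other discussion.
history = [2_000] * 50 + [500] * 30 + [12_000]

hey_cost = input_tokens_for_turn(history, new_message_tokens=2)
fresh_cost = input_tokens_for_turn([], new_message_tokens=2)

print(f"'hey' in the old conversation: {hey_cost:,} input tokens")
print(f"'hey' in a new conversation:   {fresh_cost:,} input tokens")
```

A real fresh session still pays for the system prompt and any loaded tools, so the second number would not be this small in practice. The gap between the two is the point.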

One commenter explained: “Stop reviving old, long conversations. It’s cheaper to start a new chat.”

Another added: “Use /compact command to summarize chat history before walking away.”

This was my first mistake. But there were more.

How to check token usage

Claude Code has three commands for monitoring usage:

/cost - Current session stats

Session Statistics:
───────────────────────
Input tokens:      45,231
Output tokens:     3,128
Cache creation:    12,450
Cache read:        8,320

Total tokens:      48,359
Session cost:      $0.42

This shows the immediate cost of your current session. I now run this before and after major tasks.
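If you want the before/after comparison as a number rather than by eye, a small script can do the subtraction. This sketch assumes the output looks like the sample above, with one `Label:  12,345` pair per line. The layout may differ between versions, and `parse_cost` and `delta` are names I made up for illustration.

```python
import re

# Sample text copied from the /cost output shown above. The exact layout
# is an assumption; adjust the pattern if your output differs.
SAMPLE = """\
Input tokens:      45,231
Output tokens:     3,128
Cache creation:    12,450
Cache read:        8,320
"""

def parse_cost(text: str) -> dict[str, int]:
    """Turn 'Label:  12,345' lines into {label: 12345}."""
    stats = {}
    pattern = r"^(.+?):\s+([\d,]+)\s*$"
    for label, number in re.findall(pattern, text, flags=re.MULTILINE):
        stats[label.strip()] = int(number.replace(",", ""))
    return stats

def delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Tokens consumed between two snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}

before = {"Input tokens": 0, "Output tokens": 0, "Cache creation": 0, "Cache read": 0}
after = parse_cost(SAMPLE)
print(delta(before, after))
```

Paste the text from before and after a task into two variables and the delta tells you what that task alone cost.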

/stats - Daily usage visualization

Daily Usage (Last 7 Days):
─────────────────────────────────────────────────
Mon ████████████████████░░░░ 78%
Tue ██████████████░░░░░░░░░░ 52%
Wed ████████████████████████ 95%  ← Limit hit
Thu ████████░░░░░░░░░░░░░░░░ 28%
Fri ████████████████░░░░░░░░ 62%
Sat ░░░░░░░░░░░░░░░░░░░░░░░░ 5%
Sun ░░░░░░░░░░░░░░░░░░░░░░░░ 0%

Rate limit window: 5-hour rolling

This revealed I had hit my limit on Wednesday. The 5-hour rolling window means spikes affect availability for hours.
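Here is how I picture a rolling window, as a rough sketch. The real accounting behind the limit is not something I have seen documented in this form, so treat the `RollingUsage` class and the second usage figure as illustrative assumptions. The spike figure is the total from my "hey" session.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60 * 60  # 5-hour rolling window, as reported by /stats

class RollingUsage:
    """Tracks token usage inside a rolling time window (hypothetical model)."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp_seconds, tokens), oldest first

    def record(self, timestamp: float, tokens: int) -> None:
        self.events.append((timestamp, tokens))

    def used(self, now: float) -> int:
        """Tokens still counted at time `now`; older events are dropped."""
        while self.events and self.events[0][0] <= now - self.window_seconds:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

usage = RollingUsage()
usage.record(timestamp=0, tokens=128_279)        # the "hey" spike at t = 0
usage.record(timestamp=2 * 3600, tokens=40_000)  # normal work two hours later

print(usage.used(now=4 * 3600))  # spike still counts four hours later
print(usage.used(now=6 * 3600))  # spike has aged out, only the later work remains
```

In this model a single spike keeps counting against you until a full five hours have passed, which matches what I saw on Wednesday.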

/context - Context breakdown

Context Usage Breakdown:
─────────────────────────────────────────────────
Tool outputs:      38,420 tokens (45%)
Conversation:      28,150 tokens (33%)
System prompt:     12,000 tokens (14%)
MCP resources:      6,890 tokens (8%)

Total:             85,460 tokens
Optimization tip: Consider /compact to reduce tool output history

This showed exactly where my tokens were going. Tool outputs from file reads were consuming nearly half my context.
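The percentages are easy to recompute, which is handy if you track these numbers over time. This sketch only uses the token counts from the sample above; the variable names are mine.

```python
# Token counts copied from the /context sample above.
breakdown = {
    "Tool outputs": 38_420,
    "Conversation": 28_150,
    "System prompt": 12_000,
    "MCP resources": 6_890,
}

total = sum(breakdown.values())
shares = {name: round(100 * tokens / total) for name, tokens in breakdown.items()}
largest = max(breakdown, key=breakdown.get)

for name, tokens in sorted(breakdown.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<14} {tokens:>7,} tokens ({shares[name]}%)")
print(f"Total: {total:,} tokens; biggest consumer: {largest}")
```

Whatever category comes out on top is the first candidate for trimming. In my case that was tool output from file reads.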

How to reduce token usage

After learning to monitor usage, I needed ways to reduce it. These commands help:

/compact - Compress context

> /compact Keep the current implementation plan and API design decisions

Compacting conversation...
─────────────────────────────────────────────────
Preserved:
- API design decisions from earlier discussion
- Current implementation plan
- Key bug fixes applied

Compressed:
- 12 file read outputs
- 8 debugging iterations
- Full code history

Result: Context reduced from 142,000 to 23,000 tokens

The key insight: you can pass instructions to /compact to tell it what to preserve. Without instructions, it makes its own decisions about what’s important.
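To get a feel for what that reduction is worth, here is a back-of-the-envelope calculation using the figures from the sample above. It assumes every remaining turn re-sends the context and ignores both caching and the context growing again, so read it as an upper bound. `tokens_saved` is a hypothetical helper, and the 10 remaining turns are an arbitrary choice.

```python
def tokens_saved(before: int, after: int, remaining_turns: int) -> int:
    """Input tokens avoided if every remaining turn re-sends the context."""
    return (before - after) * remaining_turns

before, after = 142_000, 23_000  # figures from the /compact sample above
reduction = 1 - after / before

print(f"Context reduced by {reduction:.0%}")
print(f"Saved over 10 more turns: {tokens_saved(before, after, 10):,} input tokens")
```

The saving compounds with every message you send afterwards, which is why compacting early in a long session pays off more than compacting at the end.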

/clear - Reset conversation entirely

> /clear

Conversation history cleared. Context released.
Starting fresh session.

This is the nuclear option. Use it when the conversation has grown too large and you’re starting a new task.

My workflow now

After that 22% lesson, I changed how I work with Claude Code:

┌─────────────────────────────────────────────────────────┐
│  1. Start session → Run /cost to note baseline         │
│                                                         │
│  2. Work on task → Monitor periodically with /stats    │
│                                                         │
│  3. Before break → /compact "Save [key decisions]"     │
│                                                         │
│  4. Return → Check /cost delta (not "hey"!)            │
│                                                         │
│  5. End of task → Either /clear or save summary        │
│                                                         │
│  NEVER: Revive old long chats with casual greetings    │
│  ALWAYS: Compact before walking away                   │
│  CHECK: /stats daily to catch usage patterns           │
└─────────────────────────────────────────────────────────┘

Let me show you what this looks like in practice:

Scenario 1: Working on a feature

Start: /cost shows 0 tokens

[Read 5 files, write code, debug errors, run tests]

Check: /cost shows 45,000 tokens

Before lunch: /compact "Preserve the auth logic and pending API changes"

Lunch break...

Return: /context shows 8,000 tokens (compressed!)

[Continue work]

End of day: /clear (starting fresh tomorrow)

Scenario 2: The “hey” trap (what NOT to do)

Yesterday: Long session, 100K+ tokens accumulated
           Left conversation open

Today: Typed "hey" to resume
       Cost: 22% of daily limit
       Reason: Full context resurrection

Better approach:
Yesterday: /compact "Save progress on feature X"
           Close conversation
Today: Start new conversation with fresh context
       Reference saved summary if needed

The cache factor

The status line shows two important metrics:

cache_creation_input_tokens: 12,450  ← First-time cost
cache_read_input_tokens:     8,320   ← Reduced cost

cache_creation_input_tokens - New tokens being cached. This is the “first-time cost” when Claude hasn’t seen this context before.

cache_read_input_tokens - Tokens read from cache. These cost less because they’re reused from a previous cache.

Maximizing cache hits reduces your effective token costs. This is why staying in the same conversation can sometimes be cheaper than starting new ones - if the context hasn’t changed much.

But there’s a balance. If the conversation has grown too large, or you have been away long enough for the cache to expire, the resurrection cost outweighs any cache benefit.
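A rough cost model shows how much the cache matters. The numbers here are assumptions for illustration: a hypothetical base price per million input tokens, a 1.25x multiplier for cache writes, and a 0.1x multiplier for cache reads. Subscription limits may be counted differently, so check current pricing before relying on any of it.

```python
# Assumed pricing, for illustration only: a base input price plus
# multipliers for cache writes and cache reads.
BASE_PRICE_PER_MTOK = 3.00     # hypothetical USD per million input tokens
CACHE_WRITE_MULTIPLIER = 1.25  # assumed surcharge for creating a cache entry
CACHE_READ_MULTIPLIER = 0.10   # assumed discount for reading from cache

def input_cost(uncached: int, cache_creation: int, cache_read: int) -> float:
    """Estimated USD cost of the input side of one request."""
    weighted = (
        uncached
        + cache_creation * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return weighted * BASE_PRICE_PER_MTOK / 1_000_000

context = 100_000
warm = input_cost(uncached=0, cache_creation=0, cache_read=context)  # cache still valid
cold = input_cost(uncached=0, cache_creation=context, cache_read=0)  # cache expired

print(f"warm cache: ${warm:.4f}")
print(f"cold cache: ${cold:.4f}")
```

Under these assumptions the same 100K context costs over ten times more on a cold cache than on a warm one. That is the difference between sending a follow-up a minute later and saying "hey" the next morning.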

Configuration options

You can configure auto-compact behavior:

{
  "autoCompact": false
}

Or set a custom threshold:

# Triggers auto-compact at 300k tokens (default is lower)
export CLAUDE_CODE_AUTO_COMPACT_WINDOW=300000

I prefer to control compaction manually with the /compact command. Auto-compact at arbitrary moments can interrupt workflow and lose important context.

You can also set up a PreCompact hook to preserve critical context:

{
  "hooks": {
    "PreCompact": [{
      "hooks": [{
        "type": "prompt",
        "prompt": "Before compacting, save any critical in-progress work to a task file"
      }]
    }]
  }
}

This ensures important work gets saved before automatic compaction triggers.

Common mistakes I made

Mistake	Impact	What I do now
Reviving old chats with “hey”	22%+ token burn	Start fresh or compact first
Ignoring context bloat	Degraded reasoning	Monitor /context weekly
Not checking /cost	Surprise overages	Check before and after tasks
Letting auto-compact trigger mid-task	Lost context	Compact at logical boundaries
Reading entire large files	84K+ tokens per read	Use targeted reads or MCP indexing

The biggest lesson: a simple greeting in a large conversation can cost more than a full coding session. Always compact before walking away, and never revive old chats without thinking about the context cost.

Summary

In this post, I showed how a simple “hey” cost me 22% of my daily usage limits and what I learned about Claude Code token management.

The key commands are:

/cost - Check current session usage
/stats - View daily usage trends
/context - See where tokens are going
/compact [instructions] - Compress context while preserving key points
/clear - Reset conversation entirely

The most important habit: never revive old long conversations. Either compact before you leave, or start fresh when you return. The context resurrection cost is simply too high.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit: 'Saying hey cost me 22% of my usage limits'
👨‍💻 Claude Code Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!