
How to Monitor Claude Code Cache Statistics and Token Usage

Problem

I noticed my Claude Code token usage was higher than expected. I wanted to check if prompt caching was working correctly, but I couldn’t find any built-in command to show cache statistics.

When I searched online, I found that Claude Code stores detailed usage data in JSONL session files. Here’s how I monitor my cache ratio and diagnose caching problems.

Where Session Data Lives

Claude Code stores session data in JSONL (JSON Lines) format:

~/.claude/projects/*/*.jsonl

Each line in these files is a separate JSON object. Lines that record message data can include a usage field with token metrics; lines without one can be skipped.
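As a minimal sketch of reading one such line in Python: the helper below parses a line and returns its usage object, if any. The name extract_usage is hypothetical, and the two places it looks for usage (top level, or nested under "message") are an assumption based on what the scripts later in this post check, not a documented schema.

```python
import json
from typing import Optional


def extract_usage(line: str) -> Optional[dict]:
    """Return the usage dict from one JSONL line, or None if there is none."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None  # blank or non-JSON line
    if not isinstance(record, dict):
        return None
    # Assumption: usage sits either at the top level or under "message".
    message = record.get("message")
    nested = message.get("usage") if isinstance(message, dict) else None
    usage = record.get("usage") or nested
    return usage if isinstance(usage, dict) else None


# Made-up sample line for illustration only.
sample = '{"message": {"usage": {"input_tokens": 4, "cache_read_input_tokens": 15451}}}'
print(extract_usage(sample))
print(extract_usage("not json"))
```

Returning None instead of raising keeps a loop over a whole file simple, since some lines will not carry usage data.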

To find your latest session file:

ls -lt ~/.claude/projects/*/*.jsonl | head -n 1
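The same lookup can be sketched in Python, which is handy if you want to feed the path into a script. The function name latest_session_file is hypothetical; it only assumes the directory layout shown above (one folder per project, JSONL files inside).

```python
from pathlib import Path
from typing import Optional


def latest_session_file(root: Path) -> Optional[Path]:
    """Return the most recently modified *.jsonl file one level below root."""
    candidates = list(root.glob("*/*.jsonl"))
    if not candidates:
        return None  # no session files found under root
    # Pick the file with the newest modification time, like `ls -t | head -n 1`.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

To use it on the default location, call latest_session_file(Path.home() / ".claude" / "projects").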

Usage Fields to Monitor

The usage object contains these key fields:

  • cache_read_input_tokens: Cached content read from the cache (cheap)
  • cache_creation_input_tokens: New content written to the cache (expensive, one-time cost)
  • input_tokens: Non-cached input tokens
  • output_tokens: Output tokens generated

Here’s what each field means in practice:

cache_read_input_tokens → Tokens retrieved from cache (cheap, ~10% of input cost)
cache_creation_input_tokens → Tokens saved to cache (expensive, ~125% of input cost, one-time)
input_tokens → Tokens sent without caching (normal input cost)
output_tokens → Tokens generated by Claude (output cost)
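To make the cost difference concrete, here is a small sketch that weights each input-side field by the approximate multipliers listed above (about 10% for cache reads, about 125% for cache writes). The function name relative_input_cost is hypothetical, the multipliers are the rough figures from this post rather than exact prices, and output tokens are left out because no multiplier is given for them here.

```python
# Relative input-side cost, where 1.0 is the price of one normal input token.
# These multipliers are the approximate figures quoted in this post.
READ_MULT = 0.10   # cache_read_input_tokens
WRITE_MULT = 1.25  # cache_creation_input_tokens


def relative_input_cost(usage: dict) -> float:
    """Input-side cost of one turn, in units of normal input tokens."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    plain = usage.get("input_tokens", 0)
    return read * READ_MULT + created * WRITE_MULT + plain


# Made-up turn for illustration.
turn = {"cache_read_input_tokens": 15000,
        "cache_creation_input_tokens": 500,
        "input_tokens": 10}
print(relative_input_cost(turn))
```

With these weights, 15,000 tokens read from cache cost roughly as much as 1,500 uncached input tokens, which is why a high read share matters.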

Calculate Cache Ratio

The cache ratio tells you how effectively caching is working:

ratio = cache_read_input_tokens / (cache_read_input_tokens + cache_creation_input_tokens + input_tokens)

Interpretation:

  • 90%+: Excellent caching
  • 50-70%: Acceptable, room for improvement
  • Below 30%: Caching problem or fresh session
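The formula and the bands above can be sketched as two small functions. The names cache_ratio and interpret are hypothetical; the thresholds are the ones listed above, and ratios that fall between the listed bands are reported as such rather than guessed at.

```python
def cache_ratio(usage: dict) -> float:
    """cache_read / (cache_read + cache_creation + input), or 0.0 if empty."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    plain = usage.get("input_tokens", 0)
    total = read + created + plain
    return read / total if total else 0.0


def interpret(ratio: float) -> str:
    """Map a ratio (0.0 to 1.0) onto the bands listed in this post."""
    if ratio >= 0.90:
        return "excellent"
    if 0.50 <= ratio <= 0.70:
        return "acceptable"
    if ratio < 0.30:
        return "caching problem or fresh session"
    return "between the listed bands"


# Made-up turn for illustration.
usage = {"cache_read_input_tokens": 15000, "cache_creation_input_tokens": 500}
r = cache_ratio(usage)
print(f"{r:.0%} -> {interpret(r)}")
```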

Quick Check Script

I wrote a short shell command with an embedded Python script to check cache stats. Because the glob can match several files, tail prints the last 50 lines of each one:

check_cache.py
tail -n 50 ~/.claude/projects/*/*.jsonl | python3 -c "
import sys, json
for line in sys.stdin:
    try:
        d = json.loads(line.strip())
    except ValueError:
        continue  # skip blank lines and the file headers that tail prints
    # usage may sit at the top level or under 'message'
    u = d.get('usage') or d.get('message', {}).get('usage')
    if not u or 'cache_read_input_tokens' not in u:
        continue
    cr = u.get('cache_read_input_tokens', 0)
    cc = u.get('cache_creation_input_tokens', 0)
    total = cr + cc + u.get('input_tokens', 0)
    if total:
        print(f'CR:{cr:>7,} CC:{cc:>7,} ratio:{cr/total*100:.0f}%')
"

When I run this during an active session:

CR: 15,451 CC: 8,234 ratio:35%
CR: 15,451 CC: 12,567 ratio:28%
CR: 15,451 CC: 18,901 ratio:22%

I noticed the cache_read (CR) was stuck at 15,451 while cache_creation (CC) kept growing. This indicated a caching problem.

Monitor in Real-time

For ongoing monitoring during a session, I use watch:

realtime_monitor.sh
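watch -n 5 'tail -n 20 ~/.claude/projects/*/*.jsonl | python3 -c "
import sys, json
for line in sys.stdin:
    try:
        d = json.loads(line.strip())
    except ValueError:
        continue  # skip blank lines and the file headers that tail prints
    u = d.get(\"usage\") or d.get(\"message\", {}).get(\"usage\") or {}
    cr = u.get(\"cache_read_input_tokens\", 0)
    cc = u.get(\"cache_creation_input_tokens\", 0)
    # same denominator as the ratio formula: read + created + uncached input
    total = cr + cc + u.get(\"input_tokens\", 0)
    if cr or cc:
        print(f\"Read: {cr:,} Created: {cc:,} Ratio: {cr/total*100:.0f}%\")
"'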

This updates every 5 seconds so I can see how caching performs during my work.

What to Look For

Healthy Caching Pattern

In a healthy session, cache_read should grow from turn to turn while cache_creation drops after the first turn and stays low:

Turn 1: CR: 0 CC: 10,000 ratio:0% (building cache)
Turn 2: CR: 8,000 CC: 2,000 ratio:80% (reading from cache)
Turn 3: CR:12,000 CC: 1,000 ratio:92% (mostly cache reads)
Turn 4: CR:15,000 CC: 500 ratio:97% (excellent caching)

Problem Pattern

If cache_read stays flat while cache_creation grows, caching is broken:

Turn 1: CR: 0 CC: 10,000 ratio:0%
Turn 2: CR: 5,000 CC: 15,000 ratio:25%
Turn 3: CR: 5,000 CC: 20,000 ratio:20%
Turn 4: CR: 5,000 CC: 25,000 ratio:17%

The stuck cache_read with growing cache_creation means the cache prefix keeps changing, forcing new content to be cached repeatedly.
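The problem pattern can be checked mechanically. The sketch below is one possible heuristic, not an official diagnostic: the function name prefix_looks_broken and the default window of 3 turns are assumptions chosen for illustration. It flags a session when cache_read is identical across the last few turns while cache_creation rises on each of them.

```python
def prefix_looks_broken(turns, window=3):
    """turns: list of (cache_read, cache_creation) pairs, oldest first.

    Returns True when, over the last `window` turns, cache_read did not
    change at all and cache_creation strictly increased each turn.
    """
    if len(turns) < window:
        return False  # not enough turns to judge
    recent = turns[-window:]
    reads = [cr for cr, _ in recent]
    creations = [cc for _, cc in recent]
    read_flat = len(set(reads)) == 1
    creation_growing = all(a < b for a, b in zip(creations, creations[1:]))
    return read_flat and creation_growing


# The two example patterns from this section.
healthy = [(0, 10000), (8000, 2000), (12000, 1000), (15000, 500)]
problem = [(0, 10000), (5000, 15000), (5000, 20000), (5000, 25000)]
print(prefix_looks_broken(healthy), prefix_looks_broken(problem))
```

A single flat turn is not flagged, since the check only fires once the pattern holds over the whole window.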

Diagnostic Patterns Table

| Pattern | Meaning | Action |
|---|---|---|
| CR stuck, CC grows | Cache prefix broken | Check for structural changes in prompts |
| CR grows, CC stable | Healthy caching | Normal behavior, continue working |
| High CC on every turn | No caching happening | Check API settings or session structure |
| Ratio drops on resume | Session resume issue | May be normal on first turn after resume |

Session vs Resume Pattern

The cache ratio behaves differently depending on session state:

Fresh session: Low cache ratio is expected because the cache is being built. The first few turns show high cache_creation and low cache_read.

Subsequent turns in same session: Ratio should improve significantly as cache_read grows and cache_creation stabilizes.

Resumed session: The first turn after resume may show a lower ratio due to structural changes. Subsequent turns should recover to high ratios.

Common Mistakes

Not knowing where session files are. I initially searched for a --stats flag before realizing the data was in JSONL files.

Expecting first turn to have high ratio. A fresh session always starts with low cache ratio. Monitor across multiple turns to see the pattern.

Ignoring cache_creation growth. If cache_creation keeps growing turn after turn, that’s a red flag even if cache_read is non-zero.

Not checking stats regularly. Without monitoring, I wouldn’t have noticed my caching was broken until I saw the bill.

Extract Specific Usage Reports

To print the usage entries found in the last 10 lines of the latest session file with jq:

tail -n 10 "$(ls -t ~/.claude/projects/*/*.jsonl | head -n 1)" | jq -r '(.usage // .message.usage?) | select(. != null) | "read: \(.cache_read_input_tokens // 0) created: \(.cache_creation_input_tokens // 0)"'

To count the lines in each session file (one JSON record per line, so this is only a rough upper bound on the number of turns):

wc -l ~/.claude/projects/*/*.jsonl
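For a per-session report that goes beyond a line count, a sketch like the following sums the usage fields over one file. The function name summarize_session is hypothetical, and it assumes usage sits either at the top level of a record or under "message", as the scripts earlier in this post do. Treat lines_with_usage as a rough proxy for turns, not an exact count.

```python
import json

FIELDS = ("cache_read_input_tokens", "cache_creation_input_tokens",
          "input_tokens", "output_tokens")


def summarize_session(path):
    """Sum usage fields over every line of one session file that has them."""
    totals = {"lines": 0, "lines_with_usage": 0}
    totals.update({key: 0 for key in FIELDS})
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            totals["lines"] += 1
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # blank or malformed line
            if not isinstance(record, dict):
                continue
            # Assumption: usage is top-level or nested under "message".
            message = record.get("message")
            nested = message.get("usage") if isinstance(message, dict) else None
            usage = record.get("usage") or nested
            if not isinstance(usage, dict):
                continue
            totals["lines_with_usage"] += 1
            for key in FIELDS:
                totals[key] += usage.get(key) or 0
    return totals
```

Call it with the path of a session file, for example the newest one reported by the ls command near the top of this post.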

Summary

In this post, I showed how to monitor Claude Code cache statistics by reading JSONL session files. The key metrics are cache_read_input_tokens (grows when caching works) and cache_creation_input_tokens (should stabilize after initial turns). A stuck cache_read with growing cache_creation indicates caching problems. Use the scripts I provided to monitor your sessions and catch issues early.
