How to Monitor Claude Code Cache Statistics and Token Usage

Apr 1, 2026

Problem

I noticed my Claude Code token usage was higher than expected. I wanted to check if prompt caching was working correctly, but I couldn’t find any built-in command to show cache statistics.

When I searched online, I found that Claude Code stores detailed usage data in JSONL session files. Here’s how I monitor my cache ratio and diagnose caching problems.

Where Session Data Lives

Claude Code stores session data in JSONL (JSON Lines) format:

~/.claude/projects/*/*.jsonl

Each line in these files is a separate JSON object containing message data. Many entries include a usage field with token metrics, either at the top level or nested under message.
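
As a minimal sketch of reading one such line in Python: the helper below (extract_usage is my own name, not part of Claude Code) checks the top level first and then under message, the same two places the quick check script in this post looks. The sample line is made up for illustration; real entries carry many more fields.

```python
import json

def extract_usage(line):
    """Return the usage dict from one JSONL line, or None if there is none."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None  # blank line, partial write, or non-JSON text
    if not isinstance(entry, dict):
        return None
    usage = entry.get("usage")
    if not isinstance(usage, dict):
        # Fall back to the nested location under "message"
        message = entry.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
    return usage if isinstance(usage, dict) else None

# Hypothetical line for illustration only
sample = '{"message": {"usage": {"input_tokens": 12, "cache_read_input_tokens": 8000}}}'
print(extract_usage(sample))
```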

To find your latest session file:

ls -lt ~/.claude/projects/*/*.jsonl | head -n 1
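
If you would rather do the lookup from Python, here is a rough equivalent. It assumes the same directory layout as the glob above; latest_session_file is a hypothetical helper name, and the optional root parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path

def latest_session_file(root=None):
    """Return the most recently modified *.jsonl under root/*/, or None."""
    root = Path(root) if root is not None else Path.home() / ".claude" / "projects"
    if not root.is_dir():
        return None  # no session directory on this machine
    files = list(root.glob("*/*.jsonl"))
    # Pick the file with the newest modification time
    return max(files, key=lambda p: p.stat().st_mtime) if files else None

print(latest_session_file())
```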

Usage Fields to Monitor

The usage object contains these key fields:

cache_read_input_tokens: Cached content read from the cache (cheap)
cache_creation_input_tokens: New content written to the cache (expensive, one-time cost)
input_tokens: Non-cached input tokens
output_tokens: Output tokens generated

Here’s what each field means in practice:

cache_read_input_tokens  → Tokens retrieved from cache (cheap, ~10% of input cost)
cache_creation_input_tokens → Tokens saved to cache (expensive, ~125% of input cost, one-time)
input_tokens             → Tokens sent without caching (normal input cost)
output_tokens            → Tokens generated by Claude (output cost)
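
To make the relative costs concrete, here is a small sketch that converts a usage object into "uncached input token" units. The 0.10 and 1.25 multipliers are just the approximate figures quoted above, so treat them as assumptions rather than current pricing; input_cost_units is a name I made up, and output tokens are left out because no multiplier is given for them here.

```python
# Approximate multipliers relative to the normal input-token price,
# taken from the figures quoted in this post; actual pricing may differ.
READ_MULT = 0.10
CREATE_MULT = 1.25

def input_cost_units(usage):
    """Input-side cost in units of one uncached input token."""
    return (
        usage.get("cache_read_input_tokens", 0) * READ_MULT
        + usage.get("cache_creation_input_tokens", 0) * CREATE_MULT
        + usage.get("input_tokens", 0)
    )

# The same 10,000 tokens, billed three different ways
cached = {"cache_read_input_tokens": 10_000}
uncached = {"input_tokens": 10_000}
written = {"cache_creation_input_tokens": 10_000}
print(input_cost_units(cached), input_cost_units(uncached), input_cost_units(written))
```

Under these assumed multipliers, reading a prefix from the cache costs about a tenth of sending it uncached, while writing it to the cache costs about a quarter more.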

Calculate Cache Ratio

The cache ratio tells you how effectively caching is working:

ratio = cache_read_input_tokens / (cache_read_input_tokens + cache_creation_input_tokens + input_tokens)

Interpretation:

90%+: Excellent caching
50-70%: Acceptable, room for improvement
Below 30%: Caching problem or fresh session
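
The formula and bands above can be sketched in Python as follows. cache_ratio and band are hypothetical helper names; band only labels the ranges listed above and reports anything outside them as falling between the listed bands.

```python
def cache_ratio(usage):
    """cache_read / (cache_read + cache_creation + input), or None if empty."""
    cr = usage.get("cache_read_input_tokens", 0)
    cc = usage.get("cache_creation_input_tokens", 0)
    total = cr + cc + usage.get("input_tokens", 0)
    return cr / total if total else None

def band(ratio):
    """Label a ratio using only the bands listed in this post."""
    pct = ratio * 100
    if pct >= 90:
        return "excellent"
    if 50 <= pct <= 70:
        return "acceptable"
    if pct < 30:
        return "problem or fresh session"
    return "between the listed bands"

# Made-up usage values for illustration
usage = {"cache_read_input_tokens": 8000,
         "cache_creation_input_tokens": 2000,
         "input_tokens": 0}
r = cache_ratio(usage)
print(f"{r:.0%} -> {band(r)}")
```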

Quick Check Script

I wrote a short Python script, run inline from the shell, that checks cache stats in the last 50 lines of the most recent session file:

tail -n 50 "$(ls -t ~/.claude/projects/*/*.jsonl | head -n 1)" | python3 -c "
import sys, json
for line in sys.stdin:
    try:
        d = json.loads(line)
    except json.JSONDecodeError:
        continue  # skip blank or partially written lines
    # usage may sit at the top level or under message
    u = d.get('usage') or d.get('message', {}).get('usage')
    if not u or 'cache_read_input_tokens' not in u:
        continue
    cr = u.get('cache_read_input_tokens', 0)
    cc = u.get('cache_creation_input_tokens', 0)
    total = cr + cc + u.get('input_tokens', 0)
    if total:
        print(f'CR:{cr:>7,}  CC:{cc:>7,}  ratio:{cr/total*100:.0f}%')
"

When I run this during an active session:

CR: 15,451  CC:  8,234  ratio:35%
CR: 15,451  CC: 12,567  ratio:28%
CR: 15,451  CC: 18,901  ratio:22%

I noticed the cache_read (CR) was stuck at 15,451 while cache_creation (CC) kept growing. This indicated a caching problem.

Monitor in Real-time

For ongoing monitoring during a session, I use watch:

watch -n 5 'tail -n 20 "$(ls -t ~/.claude/projects/*/*.jsonl | head -n 1)" | python3 -c "
import sys, json
for line in sys.stdin:
    try:
        d = json.loads(line)
    except json.JSONDecodeError:
        continue  # skip blank or partially written lines
    u = d.get(\"usage\") or d.get(\"message\", {}).get(\"usage\") or {}
    cr = u.get(\"cache_read_input_tokens\", 0)
    cc = u.get(\"cache_creation_input_tokens\", 0)
    total = cr + cc + u.get(\"input_tokens\", 0)
    if cr or cc:
        print(f\"Read: {cr:,}  Created: {cc:,}  Ratio: {cr/total*100:.0f}%\")
"'

This updates every 5 seconds so I can see how caching performs during my work.

What to Look For

Healthy Caching Pattern

In a healthy session, cache_read should grow while cache_creation stays stable:

Turn 1: CR:     0  CC: 10,000  ratio:0%   (building cache)
Turn 2: CR: 8,000  CC:  2,000  ratio:80%  (reading from cache)
Turn 3: CR:12,000  CC:  1,000  ratio:92%  (mostly cache reads)
Turn 4: CR:15,000  CC:    500  ratio:97%  (excellent caching)

Problem Pattern

If cache_read stays flat while cache_creation grows, caching is broken:

Turn 1: CR:     0  CC: 10,000  ratio:0%
Turn 2: CR: 5,000  CC: 15,000  ratio:25%
Turn 3: CR: 5,000  CC: 20,000  ratio:20%
Turn 4: CR: 5,000  CC: 25,000  ratio:17%

The stuck cache_read with growing cache_creation means the cache prefix keeps changing, forcing new content to be cached repeatedly.
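
This check can be automated. The sketch below is one possible heuristic, not an official diagnostic: it flags a session when the last few turns show an identical cache_read value while cache_creation strictly increases. The function name and the min_turns default are my own choices, and the two sample lists reuse the numbers from the healthy and problem patterns in this section.

```python
def prefix_looks_broken(turns, min_turns=3):
    """True if cache reads are flat while cache writes keep growing.

    turns is a list of (cache_read, cache_creation) pairs, oldest first.
    Only the last min_turns entries are inspected.
    """
    recent = turns[-min_turns:]
    if len(recent) < min_turns:
        return False  # not enough data to call it a pattern
    reads = [cr for cr, _ in recent]
    writes = [cc for _, cc in recent]
    reads_flat = len(set(reads)) == 1
    writes_growing = all(b > a for a, b in zip(writes, writes[1:]))
    return reads_flat and writes_growing

healthy = [(0, 10_000), (8_000, 2_000), (12_000, 1_000), (15_000, 500)]
broken = [(0, 10_000), (5_000, 15_000), (5_000, 20_000), (5_000, 25_000)]
print(prefix_looks_broken(healthy), prefix_looks_broken(broken))  # False True
```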

Diagnostic Patterns Table

Pattern                  Meaning                Action
CR stuck, CC grows       Cache prefix broken    Check for structural changes in prompts
CR grows, CC stable      Healthy caching        Normal behavior, continue working
High CC on every turn    No caching happening   Check API settings or session structure
Ratio drops on resume    Session resume issue   May be normal on first turn after resume

Session vs Resume Pattern

The cache ratio behaves differently depending on session state:

Fresh session: Low cache ratio is expected because the cache is being built. The first few turns show high cache_creation and low cache_read.

Subsequent turns in same session: Ratio should improve significantly as cache_read grows and cache_creation stabilizes.

Resumed session: The first turn after resume may show a lower ratio due to structural changes. Subsequent turns should recover to high ratios.

Common Mistakes

Not knowing where session files are. I initially searched for a --stats flag before realizing the data was in JSONL files.

Expecting the first turn to have a high ratio. A fresh session always starts with a low cache ratio. Monitor across multiple turns to see the pattern.

Ignoring cache_creation growth. If cache_creation keeps growing turn after turn, that’s a red flag even if cache_read is non-zero.

Not checking stats regularly. Without monitoring, I wouldn’t have noticed my caching was broken until I saw the bill.

Extract Specific Usage Reports

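To see the usage entries among the last 10 lines of the most recent session file with jq:

tail -n 10 "$(ls -t ~/.claude/projects/*/*.jsonl | head -n 1)" | jq -r '(.usage // .message.usage?) | select(type == "object") | "read: \(.cache_read_input_tokens // 0) created: \(.cache_creation_input_tokens // 0)"'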

To count the entries (lines) in each session file, which gives a rough upper bound on the number of turns:

wc -l ~/.claude/projects/*/*.jsonl
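
Line counts include every record in the file, not only model responses. A rough way to separate the two is sketched below, assuming that entries carrying a usage object correspond to model responses (that is my assumption, not something I have seen documented). count_entries is a hypothetical helper, and the demo lines, including their type values, are made up.

```python
import json

def count_entries(lines):
    """Return (json_entries, entries_with_usage) for an iterable of lines."""
    total = with_usage = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # blank or malformed line
        if not isinstance(entry, dict):
            continue
        total += 1
        message = entry.get("message")
        nested = message.get("usage") if isinstance(message, dict) else None
        if entry.get("usage") or nested:
            with_usage += 1
    return total, with_usage

# Hypothetical three-line session: a user message, a response, a blank line
demo = [
    '{"type": "user", "message": {"content": "hi"}}',
    '{"type": "assistant", "message": {"usage": {"output_tokens": 5}}}',
    "",
]
print(count_entries(demo))  # (2, 1)
```

To run it on a real file, pass an open file handle, for example count_entries(open(path)).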

Summary

In this post, I showed how to monitor Claude Code cache statistics by reading JSONL session files. The key metrics are cache_read_input_tokens (grows when caching works) and cache_creation_input_tokens (should stabilize after initial turns). A stuck cache_read with growing cache_creation indicates caching problems. Use the scripts I provided to monitor your sessions and catch issues early.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion: Claude Code Cache Statistics
👨‍💻 Claude Code Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!