Skip to content

How to Fix Claude Code Prompt Caching Bug with Two-Line Patch

Problem

When I resumed a Claude Code session, my prompt caching stopped working. The cache read ratio dropped from 99% to 26%, burning through my token budget way faster than expected.

# Before resume (working cache)
cache_read_input_tokens: 85,420
cache_creation_input_tokens: 12,500
input_tokens: 97,920
# After resume (broken cache)
cache_read_input_tokens: 25,100
cache_creation_input_tokens: 72,400
input_tokens: 97,500

The cache ratio tanked from 87% to 26% on every session resume.

Environment

  • Claude Code v2.1.81
  • Node.js v20.x
  • macOS Sonoma
  • Claude Max subscription (x20)

What happened?

I was using Claude Code for a long coding session. Everything worked great until I closed the terminal and resumed later. The session picked up where I left off, but my token usage spiked dramatically.

I checked the JSONL logs in ~/.claude/projects/ and saw this pattern:

Session 1 (fresh): cache ratio ~87%
Session 2 (resume): cache ratio ~26%
Session 3 (resume): cache ratio ~26%

Every resumed session was rebuilding the cache from scratch instead of reusing it.

Here’s what a healthy cache ratio looks like:

cache_read_input_tokens: 85,420
cache_creation_input_tokens: 12,500
used_input_tokens: 97,920
# Ratio: 85,420 / 97,920 = 87% cache hit

And a broken one:

cache_read_input_tokens: 25,100
cache_creation_input_tokens: 72,400
used_input_tokens: 97,500
# Ratio: 25,100 / 97,500 = 26% cache hit

I dug into the community discussions and found a Reddit thread where someone had already reverse-engineered the issue.

How to solve it?

The fix requires two steps: installing via npm instead of the standalone binary, and patching the db8 function in cli.js.

Step 1: Install via npm

The standalone binary has an additional bug involving sentinel value replacement. Use the npm package instead:

Terminal window
mkdir -p ~/cc-cache-fix && cd ~/cc-cache-fix
npm install @anthropic-ai/[email protected]

Step 2: Backup the original file

Always backup before patching:

Terminal window
cp node_modules/@anthropic-ai/claude-code/cli.js node_modules/@anthropic-ai/claude-code/cli.js.orig

Step 3: Apply the patch

I used a Python one-liner to patch the file. The patch finds the db8 function and adds two lines to preserve deferred_tools_delta and mcp_instructions_delta attachments:

patch.py
python3 -c "
import sys
path = 'node_modules/@anthropic-ai/claude-code/cli.js'
with open(path) as f: src = f.read()
old = 'if(A.attachment.type===\"hook_additional_context\"&&a6(process.env.CLAUDE_CODE_SAVE_HOOK_ADDITIONAL_CONTEXT))return!0;return!1}'
new = old.replace('return!1}',
'if(A.attachment.type===\"deferred_tools_delta\")return!0;'
'if(A.attachment.type===\"mcp_instructions_delta\")return!0;'
'return!1}')
if old not in src:
print('ERROR: pattern not found, wrong version?'); sys.exit(1)
src = src.replace(old, new, 1)
with open(path, 'w') as f: f.write(src)
print('Patched successfully!')
"

Run it from your ~/cc-cache-fix directory:

Terminal window
cd ~/cc-cache-fix
python3 -c "..."

Step 4: Verify the patch

Check that the patch was applied:

Terminal window
grep -c "deferred_tools_delta" node_modules/@anthropic-ai/claude-code/cli.js
# Should return 1 or more

Step 5: Run the patched version

Start Claude Code with node:

Terminal window
node node_modules/@anthropic-ai/claude-code/cli.js

Step 6 (optional): Create a wrapper script

For convenience, create a wrapper:

claude-patched
mkdir -p ~/.local/bin
cat > ~/.local/bin/claude-patched << 'EOF'
#!/usr/bin/env bash
exec node ~/cc-cache-fix/node_modules/@anthropic-ai/claude-code/cli.js "$@"
EOF
chmod +x ~/.local/bin/claude-patched

Now you can run claude-patched from anywhere.

Step 7: Verify results

After patching, resume a session and check the cache ratio:

Terminal window
# Find the latest JSONL log
ls -lt ~/.claude/projects/*/logs/*.jsonl | head -1
# Check cache metrics in the last few lines
tail -20 ~/.claude/projects/*/logs/$(ls -t ~/.claude/projects/*/logs/*.jsonl | head -1 | xargs basename)

The cache ratio should now be around 72-99% on consecutive resumes.

The reason

The bug occurs in the db8 function, which filters attachments during session resume. This function has an allowlist of attachment types to preserve, but it was missing two critical types:

  • deferred_tools_delta: Contains tool definitions that were deferred from previous messages
  • mcp_instructions_delta: Contains MCP server instructions that were cached

When the session resumes, the db8 function returns false for these attachment types, causing them to be stripped from the conversation context. The API then has to rebuild the cache from scratch instead of reusing the existing cached prompts.

The patched function looks like this (minified):

// The db8 function (simplified for clarity)
function db8(A) {
// ... existing allowlist checks ...
if(A.attachment.type==="hook_additional_context"&&a6(process.env.CLAUDE_CODE_SAVE_HOOK_ADDITIONAL_CONTEXT))return!0;
// These two lines were missing:
if(A.attachment.type==="deferred_tools_delta")return!0;
if(A.attachment.type==="mcp_instructions_delta")return!0;
return!1 // Default: strip the attachment
}

By adding these two checks, the function now preserves the cached tool and MCP instruction data, allowing the prompt cache to work correctly across session resumes.

Common mistakes

Using the standalone binary - The standalone binary has a separate bug with sentinel value replacement. Always use the npm package for this fix.

Patching the wrong version - Function names are minified and change between versions. The patch script matches the exact db8 function signature from v2.1.81.

Expecting 99% on first resume - The first resume after patching will have a lower cache ratio due to structural reset. Check the second and third resumes for improvement.

Not backing up - Always backup cli.js before patching. If something goes wrong, restore from cli.js.orig.

Caveats

This fix has been tested on Claude Code v2.1.81 with Claude Max (x20). Results may vary on other versions.

The patch only fixes input token caching. Output tokens are still consumed normally.

Modifying the client code may have ToS implications. Use at your own discretion.

Summary

In this post, I showed how to fix the Claude Code prompt caching bug by patching the db8 function to preserve deferred_tools_delta and mcp_instructions_delta attachments. The key point is that the npm package plus a two-line patch restores cache ratio from ~26% to ~72-99% on resumed sessions.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments