How to Manage AI Agent Context with Handoff Files to Prevent Re-reading Waste

Jun 8, 2026

Problem

AI coding agents have no persistent memory between sessions. Every time I start a new session, the AI re-reads my project structure, re-analyzes the same files, and re-discovers past findings. This repetition is the single largest source of token waste in multi-session workflows.

I noticed that the first 20-40% of tokens in each session went to re-establishing context: the AI scanned directories, re-read key files, and re-explained things it had already worked out in the previous session.

The solution: a handoff file

A compact handoff file eliminates that overhead by giving the AI a cheat sheet for the current task. I maintain a handoff.md in the project root and instruct the AI to read it first in every session.

Here’s the template I use:

## Goal
Fix OAuth token refresh bug in auth/callback.ts

## Key Files
- src/auth/callback.ts — OAuth handler, line 42 has the bug
- src/auth/tokens.ts — token refresh logic
- tests/auth/callback.test.ts — failing test

## Already Tried
- Added console.log at line 42 — token is valid but expires immediately
- Checked .env — TOKEN_TTL is set to 3600 (correct)
- Verified refresh endpoint responds 200 on curl test

## Known Errors
- "Invalid grant" when refresh token is reused
- Token store returns null after first refresh

## Decisions
- Will switch from implicit to explicit refresh flow
- Will add token reuse detection

## Next Steps
- Implement explicit refresh in auth/callback.ts
- Write test for token reuse scenario

## DO NOT READ (unless necessary)
- node_modules/, dist/, coverage/, .git/
- *.test.ts files except callback.test.ts
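Because the template is just `## ` headings followed by short lines, it is also easy to read from a script. The following is a minimal sketch, not part of my actual tooling: the function name `parse_handoff` is hypothetical, and it assumes every section starts with a `## ` heading as in the template above.

```python
def parse_handoff(text: str) -> dict[str, list[str]]:
    """Split handoff.md text into {section heading: [entries]}."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A new section begins; remember its name.
            current = line[3:].strip()
            sections[current] = []
        elif line and current is not None:
            # Drop a leading "- " bullet marker; keep plain lines unchanged.
            sections[current].append(line[2:] if line.startswith("- ") else line)
    return sections


SAMPLE = """## Goal
Fix OAuth token refresh bug in auth/callback.ts

## Next Steps
- Implement explicit refresh in auth/callback.ts
- Write test for token reuse scenario
"""

if __name__ == "__main__":
    for name, entries in parse_handoff(SAMPLE).items():
        print(name, "->", entries)
```

A parser like this is only useful if you want to check or update the file automatically; the AI itself reads the markdown directly.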

And here’s the instruction template I paste at the start of each session:

Before doing anything else:
1. Read handoff.md
2. Do not scan directories listed under DO NOT READ
3. Continue from Next Steps
4. Be concise — show only the patch and the reason
5. After each significant finding, update handoff.md
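Step 5 is normally done by the AI itself, but the same update can be scripted if you record findings by hand. This is a rough sketch under my own assumptions: `add_item` is a hypothetical helper, it expects headings to match exactly (for example `## Already Tried`), and it creates the section at the end of the file if it does not exist.

```python
def add_item(text: str, section: str, item: str) -> str:
    """Return handoff text with "- item" appended to the end of "## section"."""
    lines = text.splitlines()
    heading = f"## {section}"
    try:
        start = lines.index(heading)
    except ValueError:
        # Unknown section: add it at the end, separated by one blank line.
        spacer = [] if not lines or lines[-1] == "" else [""]
        return "\n".join(lines + spacer + [heading, f"- {item}"]) + "\n"

    # The section ends at the next heading or at the end of the file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("## "):
            end = i
            break

    # Step back over trailing blank lines so the new bullet joins the list.
    insert_at = end
    while insert_at > start + 1 and lines[insert_at - 1].strip() == "":
        insert_at -= 1

    lines.insert(insert_at, f"- {item}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    before = "## Already Tried\n- Checked .env\n\n## Next Steps\n- Write test\n"
    print(add_item(before, "Already Tried", "Verified refresh endpoint with curl"))
```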

Why this works

Before the handoff file, each session wasted 20-40% of its tokens on re-establishing context. Now every session after the first starts at Next Steps instead of square one. Combined with an exclusion list for directories like node_modules, .venv, dist, and build, this eliminated the most expensive pattern: repeated full-context loading.
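To get a feel for how much an exclusion list can matter in a given project, you can measure how many bytes sit inside excluded directories compared with everything else. The sketch below is an illustration only: `is_excluded` and `measure` are hypothetical names, the directory set is an assumption you should adapt, and file size in bytes is only a crude stand-in for tokens.

```python
import os
from pathlib import Path

# Assumed exclusion list; adjust it to match your own handoff file.
EXCLUDED = frozenset({"node_modules", ".venv", "dist", "build", "coverage", ".git"})


def is_excluded(rel_path: str, excluded=EXCLUDED) -> bool:
    """True if any component of the relative path is an excluded name."""
    return any(part in excluded for part in Path(rel_path).parts)


def measure(root: str, excluded=EXCLUDED) -> tuple[int, int]:
    """Return (bytes in kept files, bytes in files under excluded directories)."""
    kept = skipped = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        dir_is_excluded = is_excluded(rel_dir, excluded)
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # broken symlink or unreadable file
            if dir_is_excluded:
                skipped += size
            else:
                kept += size
    return kept, skipped


if __name__ == "__main__":
    kept, skipped = measure(".")
    print(f"kept: {kept} bytes, under excluded directories: {skipped} bytes")
```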

In my case, this was the second most impactful technique after data compaction, and it contributed significantly to the reduction from 138M to 20M tokens per day.

Key rules for a good handoff file

Keep it to one screen, roughly 30-50 lines
Update it after each session
Include file paths and brief descriptions, not file contents
Tell the AI to read it first explicitly
Periodically compact it: remove dead ends and outdated info
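Some of these rules can be checked mechanically. Here is a small sketch of such a check, with several assumptions of mine baked in: the name `lint_handoff` is hypothetical, the 50-line limit comes from the upper end of the range above, the list of required sections is my own choice, and a fenced code block is treated as a sign that file contents were pasted in.

```python
MAX_LINES = 50                                   # upper end of "30-50 lines"
REQUIRED = ("Goal", "Key Files", "Next Steps")   # assumed minimum set of sections
FENCE = "`" * 3                                  # markdown code fence marker


def lint_handoff(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    problems = []
    lines = text.splitlines()

    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines (limit {MAX_LINES})")

    headings = {line[3:].strip() for line in lines if line.startswith("## ")}
    for name in REQUIRED:
        if name not in headings:
            problems.append(f"missing section: {name}")

    if any(line.lstrip().startswith(FENCE) for line in lines):
        problems.append("contains a code block; prefer file paths and short notes")

    return problems


if __name__ == "__main__":
    sample = "## Goal\nFix bug\n\n## Key Files\n- src/auth/callback.ts\n"
    for problem in lint_handoff(sample):
        print(problem)
```

Whether the file is up to date or free of dead ends still needs a human (or the AI) to judge; a check like this only catches the structural problems.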

Common mistakes

Making the handoff file too long (it should fit on one screen)
Not updating it after each session
Including file contents instead of file paths and brief descriptions
Forgetting to tell the AI to read it first
Letting dead-end research accumulate in the handoff
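Deciding which entries are dead ends takes judgment, but a crude way to stop a section from growing without bound is to keep only its most recent entries. The following sketch does that; `compact_section` is a hypothetical helper, and treating the oldest bullets as the least relevant is an assumption that will not always hold.

```python
def compact_section(text: str, section: str, keep_last: int) -> str:
    """Keep only the last `keep_last` bullet lines under "## section"."""
    lines = text.splitlines()
    if not lines:
        return text

    # Collect the indices of bullet lines that belong to the target section.
    bullet_idx = []
    current = None
    for i, line in enumerate(lines):
        if line.startswith("## "):
            current = line[3:].strip()
        elif current == section and line.startswith("- "):
            bullet_idx.append(i)

    # Everything except the most recent `keep_last` bullets gets dropped.
    if keep_last > 0:
        drop = set(bullet_idx[:-keep_last])
    else:
        drop = set(bullet_idx)

    kept = [line for i, line in enumerate(lines) if i not in drop]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    handoff = "## Already Tried\n- a\n- b\n- c\n\n## Next Steps\n- d\n"
    print(compact_section(handoff, "Already Tried", keep_last=2))
```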

Summary

In this post, I showed how to use a handoff file to prevent AI agents from re-discovering context in every session. The key point is that a well-maintained handoff file of 30-50 lines eliminates the 20-40% token overhead from repeated context loading. Keep it short, keep it current, and tell the AI to read it first.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are the most important links from this article, along with some further resources on this topic:

👨‍💻 Reddit Discussion: Codex context optimization

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!