How to Manage AI Agent Context with Handoff Files to Prevent Re-reading Waste
Problem
AI coding agents have no persistent memory between sessions. Every time I start a new session, the AI re-reads my project structure, re-analyzes the same files, and re-discovers past findings. In my multi-session workflows, this repetition was the single largest source of token waste.
I noticed that the first 20-40% of tokens in each session were spent on context re-establishment. The AI was scanning directories, re-reading key files, and re-explaining things it already knew from the previous session.
The solution: a handoff file
A compact handoff file eliminates that overhead by giving the AI a cheat sheet for the current task. I maintain a handoff.md in the project root and instruct the AI to read it first in every session.
Here’s the template I use:
    ## Goal
    Fix OAuth token refresh bug in auth/callback.ts

    ## Key Files
    - src/auth/callback.ts: OAuth handler, line 42 has the bug
    - src/auth/tokens.ts: token refresh logic
    - tests/auth/callback.test.ts: failing test

    ## Already Tried
    - Added console.log at line 42: token is valid but expires immediately
    - Checked .env: TOKEN_TTL is set to 3600 (correct)
    - Verified refresh endpoint responds 200 on curl test

    ## Known Errors
    - "Invalid grant" when refresh token is reused
    - Token store returns null after first refresh

    ## Decisions
    - Will switch from implicit to explicit refresh flow
    - Will add token reuse detection

    ## Next Steps
    - Implement explicit refresh in auth/callback.ts
    - Write test for token reuse scenario

    ## DO NOT READ (unless necessary)
    - node_modules/, dist/, coverage/, .git/
    - *.test.ts files except callback.test.ts

And here’s the instruction template I paste at the start of each session:
    Before doing anything else:
    1. Read handoff.md
    2. Do not scan directories listed under DO NOT READ
    3. Continue from Next Steps
    4. Be concise: show only the patch and the reason
    5. After each significant finding, update handoff.md

Why this works
Before the handoff file, each session spent 20-40% of its tokens re-establishing context. Now every session after the first starts at Next Steps instead of square one. Combined with an exclusion list for directories like node_modules, .venv, dist, and build, this eliminated the most expensive pattern: repeated full-context loading.
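The exclusion list only helps if the agent honors it, but it can be useful to see what is left once those directories are skipped, for example to paste a short file listing into a session. The following is a minimal sketch of that idea, not part of my original setup: `list_relevant_files` and `EXCLUDED_DIRS` are hypothetical names, and the set simply combines the directories mentioned in this post.

```python
import os
from pathlib import Path

# Assumed exclusion list: the directories named in this post.
EXCLUDED_DIRS = {"node_modules", ".venv", "dist", "build", "coverage", ".git"}


def list_relevant_files(root):
    """Return relative file paths under root, skipping excluded directories."""
    root = Path(root)
    found = []
    for current, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            found.append((Path(current) / name).relative_to(root).as_posix())
    return found
```

Calling `list_relevant_files(".")` from the project root returns only the paths outside the excluded directories. Pruning during the walk, rather than filtering afterwards, means the large directories are never traversed at all, which mirrors what the DO NOT READ section asks of the AI.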
In my case, this was the second most impactful technique after data compaction, and it contributed significantly to cutting my usage from 138M to 20M tokens per day, a reduction of roughly 85%.
Key rules for a good handoff file
- Keep it to one screen, roughly 30-50 lines
- Update it after each session
- Include file paths and brief descriptions, not file contents
- Tell the AI to read it first explicitly
- Periodically compact it: remove dead ends and outdated info
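Some of these rules can be checked mechanically. Here is a rough sketch of such a check, under a few assumptions of mine: the 50-line limit is the upper end of the range above, a fenced code block is treated as a sign that file contents were pasted in, and the three required sections are a minimum I picked for illustration. `check_handoff` is a hypothetical helper, not an existing tool.

```python
MAX_LINES = 50  # upper end of the "one screen" rule
REQUIRED_SECTIONS = ("## Goal", "## Key Files", "## Next Steps")  # assumed minimum
FENCE = "`" * 3  # Markdown code-fence marker


def check_handoff(text):
    """Return a list of human-readable problems found in the handoff text."""
    problems = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines (limit {MAX_LINES})")
    # A fenced block usually means file contents were pasted in.
    if any(line.lstrip().startswith(FENCE) for line in lines):
        problems.append("contains a fenced code block; use a path and a short note")
    headings = {line.strip() for line in lines if line.startswith("## ")}
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems
```

Running it on the contents of handoff.md at the end of a session returns an empty list when the file passes. It cannot tell whether an entry is a dead end, so compaction still needs a human or AI pass.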
Common mistakes
- Making the handoff file too long (it should fit on one screen)
- Not updating it after each session
- Including file contents instead of file paths and brief descriptions
- Forgetting to tell the AI to read it first
- Letting dead-end research accumulate in the handoff
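Forgetting to update the file is easier to avoid when an update is a single call. The sketch below appends one bullet to a named section of the handoff text. It assumes the layout of the template in this post (section headings written exactly as `## Name` on their own line, entries as `- ` bullets), and `add_entry` is a hypothetical helper.

```python
def add_entry(text, section, entry):
    """Return handoff text with '- entry' appended to the '## section' list."""
    lines = text.splitlines()
    heading = f"## {section}"
    try:
        start = lines.index(heading)
    except ValueError:
        # Unknown section: create it at the end of the file.
        return "\n".join(lines + ["", heading, f"- {entry}"]) + "\n"
    # The section ends at the next '## ' heading or at end of file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("## "):
            end = i
            break
    # Step back over trailing blank lines so the bullet joins the list.
    while end > start + 1 and lines[end - 1].strip() == "":
        end -= 1
    lines.insert(end, f"- {entry}")
    return "\n".join(lines) + "\n"
```

For example, `add_entry(text, "Already Tried", "Verified refresh endpoint")` places the new bullet directly under the last existing entry of that section and leaves the other sections unchanged. The function works on strings, so reading and writing handoff.md is left to the caller.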
Summary
In this post, I showed how to use a handoff file to prevent AI agents from re-discovering context in every session. The key point is that a well-maintained handoff file of 30-50 lines eliminates the 20-40% token overhead from repeated context loading. Keep it short, keep it current, and tell the AI to read it first.