
How to Manage AI Agent Context with Handoff Files to Prevent Re-reading Waste

Problem

AI coding agents have no persistent memory between sessions. Every time I start a new session, the AI re-reads my project structure, re-analyzes the same files, and re-discovers past findings. In my multi-session workflows, this repetition was one of the largest sources of token waste.

I noticed that the first 20-40% of tokens in each session were spent on context re-establishment. The AI was scanning directories, re-reading key files, and re-explaining things it already knew from the previous session.

The solution: a handoff file

A compact handoff file eliminates that overhead by giving the AI a cheat sheet for the current task. I maintain a handoff.md in the project root and instruct the AI to read it first in every session.

Here’s the template I use:

handoff.md
## Goal
Fix OAuth token refresh bug in src/auth/callback.ts
## Key Files
- src/auth/callback.ts — OAuth handler, line 42 has the bug
- src/auth/tokens.ts — token refresh logic
- tests/auth/callback.test.ts — failing test
## Already Tried
- Added console.log at line 42 — token is valid but expires immediately
- Checked .env — TOKEN_TTL is set to 3600 (correct)
- Verified refresh endpoint responds 200 on curl test
## Known Errors
- "Invalid grant" when refresh token is reused
- Token store returns null after first refresh
## Decisions
- Will switch from implicit to explicit refresh flow
- Will add token reuse detection
## Next Steps
- Implement explicit refresh in src/auth/callback.ts
- Write test for token reuse scenario
## DO NOT READ (unless necessary)
- node_modules/, dist/, coverage/, .git/
- *.test.ts files except callback.test.ts
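
If you want to inspect or check the handoff file with a script, it helps to split it into sections first. Here is a minimal Python sketch, assuming every section starts with a "## " heading as in the template above. The name parse_handoff and the SAMPLE text are hypothetical and only for illustration.

```python
from typing import Dict, List


def parse_handoff(text: str) -> Dict[str, List[str]]:
    """Split handoff text into {section title: [non-empty lines]}.

    Assumption: every section starts with a '## ' heading, as in the template.
    Lines that appear before the first heading are ignored.
    """
    sections: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line and current is not None:
            sections[current].append(line)
    return sections


# Hypothetical excerpt of a handoff file, used only to demonstrate the parser.
SAMPLE = """\
## Known Errors
- "Invalid grant" when refresh token is reused
- Token store returns null after first refresh
## Next Steps
- Write test for token reuse scenario
"""

if __name__ == "__main__":
    for title, lines in parse_handoff(SAMPLE).items():
        print(f"{title}: {len(lines)} line(s)")
```

The result is a plain dictionary, so a follow-up script can, for example, print only the Next Steps section.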

And here’s the instruction template I paste at the start of each session:

session_instruction.txt
Before doing anything else:
1. Read handoff.md
2. Do not scan directories listed under DO NOT READ
3. Continue from Next Steps
4. Be concise — show only the patch and the reason
5. After each significant finding, update handoff.md
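
If your agent cannot read files on its own, or you want to save it one tool call, you could paste the instruction and the handoff together. The sketch below is one possible way to do that in Python. It assumes both files exist under the names used in this post. The function build_session_prompt and the separator line are my own illustration, not a feature of any agent.

```python
from pathlib import Path


def build_session_prompt(instruction_path: Path, handoff_path: Path) -> str:
    """Combine the standing instructions with the current handoff contents.

    Assumption: both files are small UTF-8 text files. The separator line is
    arbitrary and only marks where the handoff begins.
    """
    instruction = instruction_path.read_text(encoding="utf-8").strip()
    handoff = handoff_path.read_text(encoding="utf-8").strip()
    return f"{instruction}\n\n--- handoff.md ---\n{handoff}\n"
```

Calling build_session_prompt(Path("session_instruction.txt"), Path("handoff.md")) returns a single string that can be pasted as the first message of a session.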

Why this works

Before I introduced the handoff file, each session wasted 20-40% of its tokens on re-establishing context. Now every session after the first starts at Next Steps instead of square one. Combined with an exclusion list for directories like node_modules, .venv, dist, and build, this eliminated the most expensive pattern: repeated full-context loading.
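
The exclusion list steers the agent, but the same idea applies to any helper script that gathers file listings for it. As a rough sketch, assuming a Python helper and the directory names mentioned in this post, pruning can look like this. The names EXCLUDED_DIRS and iter_project_files are hypothetical.

```python
import os
from typing import Iterator, Set

# Assumption: directory names taken from the examples in this post.
EXCLUDED_DIRS: Set[str] = {"node_modules", ".venv", "dist", "build", "coverage", ".git"}


def iter_project_files(root: str, excluded: Set[str] = EXCLUDED_DIRS) -> Iterator[str]:
    """Yield file paths under root, skipping excluded directories entirely."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because the directory list is modified in place, excluded trees are never opened at all, which is the point of the exclusion list: the cost is avoided rather than filtered out afterwards.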

In my case, this was the second most impactful technique after data compaction, and it contributed significantly to reducing my usage from 138M to 20M tokens per day.
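
For scale, the two daily figures imply an overall reduction of roughly 85%. That number covers all techniques combined, not the handoff file alone. The short calculation below only restates the figures from this post; the variable names are mine.

```python
# Daily token usage before and after all optimizations (figures from this post).
before_tokens_per_day = 138_000_000
after_tokens_per_day = 20_000_000

# Fraction of the original daily usage that was saved.
reduction = 1 - after_tokens_per_day / before_tokens_per_day
print(f"Overall reduction: {reduction:.1%}")
```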

Key rules for a good handoff file

  • Keep it to one screen, roughly 30-50 lines
  • Update it after each session
  • Include file paths and brief descriptions, not file contents
  • Tell the AI to read it first explicitly
  • Periodically compact it: remove dead ends and outdated info
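
Some of these rules can be checked mechanically. The following Python sketch is one possible lint pass, under my own assumptions: a 50-line limit taken from the rule above, a hypothetical list of required sections, and the heuristic that a fenced code block means file contents were pasted in. The name lint_handoff is illustrative only.

```python
import re
from typing import List

# Assumption: which sections count as required is a choice, not a standard.
REQUIRED_SECTIONS = ["Goal", "Key Files", "Already Tried", "Next Steps"]
MAX_LINES = 50  # upper end of the "30-50 lines" rule
FENCE = "`" * 3  # a Markdown code fence, built this way to keep this example readable


def lint_handoff(text: str, max_lines: int = MAX_LINES) -> List[str]:
    """Return a list of warnings; an empty list means no rule was violated."""
    warnings: List[str] = []
    lines = text.splitlines()
    if len(lines) > max_lines:
        warnings.append(f"too long: {len(lines)} lines (limit {max_lines})")
    headings = {m.group(1).strip() for m in re.finditer(r"^##\s+(.+)$", text, re.MULTILINE)}
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            warnings.append(f"missing section: {section}")
    if FENCE in text:
        warnings.append("contains a fenced code block; prefer file paths over file contents")
    return warnings
```

Running lint_handoff on the contents of handoff.md before each session is a cheap way to notice when the file has started to grow past one screen.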

Common mistakes

  • Making the handoff file too long (it should fit on one screen)
  • Not updating it after each session
  • Including file contents instead of file paths and brief descriptions
  • Forgetting to tell the AI to read it first
  • Letting dead-end research accumulate in the handoff
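
Forgetting to update the file is the easiest of these mistakes to make. One way to catch it, sketched here in Python under the assumption that file modification times are a good enough signal, is to compare the handoff against the files it lists. The function handoff_is_stale is a hypothetical helper.

```python
import os
from typing import Iterable


def handoff_is_stale(handoff_path: str, source_paths: Iterable[str]) -> bool:
    """Return True if any existing source file was modified after the handoff.

    Assumption: modification time approximates "changed since the last
    handoff update". Paths that do not exist are ignored.
    """
    handoff_mtime = os.path.getmtime(handoff_path)
    return any(
        os.path.exists(path) and os.path.getmtime(path) > handoff_mtime
        for path in source_paths
    )
```

For the example in this post, the call would be handoff_is_stale("handoff.md", ["src/auth/callback.ts", "src/auth/tokens.ts"]). A True result is a reminder to update the handoff before closing the session.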

Summary

In this post, I showed how to use a handoff file to prevent AI agents from re-discovering context in every session. The key point is that a well-maintained handoff file of 30-50 lines removes the 20-40% token overhead I was seeing from repeated context loading. Keep it short, keep it current, and tell the AI to read it first.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
