How Big Should Your AGENTS.md File Be: Sizing Guidelines for AI Instructions
I stared at my AGENTS.md file—720 lines of “helpful” instructions I’d accumulated over months. My AI assistant was struggling. It would miss obvious things, ignore my preferences, and sometimes hallucinate requirements I never asked for.
The irony hit me: my instructions file was so long it was confusing the assistant instead of helping it.
The Problem: No One Tells You the Magic Number
When I started using AI coding assistants, I went overboard. I pasted my entire team’s style guide. Added every convention I could think of. Included historical decisions, edge cases, and “just in case” rules.
The result? A bloated instruction file that:
- Consumed precious context tokens
- Polluted every interaction with irrelevant information
- Made debugging harder (which instruction caused this behavior?)
- Slowed down the assistant’s responses
I searched for guidance. Found conflicting advice:
- "Just make it comprehensive enough"
- "Include everything the assistant might need"
- "More context is better"
That’s when I found concrete numbers that changed my approach.
The Numbers That Actually Matter
From practical experience and community discussions:
- Under 150 lines: The recommended starting point from docs.factory.ai
- Over 700 lines: Definitively too long—Reddit discussions confirmed this causes problems
- My observation: Around 100-150 lines is the sweet spot for most projects
Here’s a decision framework I now follow:
| Lines | Status | Action |
|----------|------------|-------------------------------------|
| <100 | Good | Add content if assistant misses context |
| 100-150 | Acceptable | Review for skill extraction opportunities |
| 150-200 | Warning | Actively refactor into skills |
| >200 | Too Large | Immediate refactor required |
The Iterative Approach That Works
Instead of starting with a comprehensive document, I now follow this process:
1. Start Small (50-100 lines)
Begin with only core conventions:
    # Project Overview
    Building a REST API for task management.

    # Tech Stack
    - Python 3.11, FastAPI, SQLAlchemy
    - PostgreSQL database
    - Redis for caching

    # Coding Standards
    - Type hints required on all functions
    - Pydantic models for all API schemas
    - Test coverage minimum 80%

    # File Organization
    src/
      api/       # FastAPI routes
      models/    # SQLAlchemy models
      schemas/   # Pydantic models
      services/  # Business logic
2. Add Content Only When Needed
I watch for patterns where the assistant misses context:
    # Assistant kept forgetting to add error handling
    # So I added this to AGENTS.md:

    # Error Handling
    - All API endpoints must have try/except blocks
    - Return consistent error format: {"error": "message"}
    - Log exceptions with stack trace
3. Extract to Skills When Growing
When my file approached 150 lines, I noticed different categories emerging. Instead of keeping everything in AGENTS.md, I extracted specialized instructions:
    AGENTS.md (kept general)
    ├── Project overview
    ├── Tech stack
    └── Core coding standards

    skills/
    ├── testing.md (extracted)
    ├── api-design.md (extracted)
    └── database-patterns.md (extracted)
This way, specialized instructions only load when relevant to the task.
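To make the "only load when relevant" idea concrete, here is a minimal sketch of keyword-based skill selection. It is an illustration only, not how any particular assistant loads skills: the `SKILL_KEYWORDS` mapping, `select_skills`, and `build_context` are hypothetical names, and plain substring matching is a deliberately crude assumption.

```python
from pathlib import Path

# Hypothetical mapping from skill file to trigger keywords.
# The keywords are assumptions chosen for illustration.
SKILL_KEYWORDS = {
    "skills/testing.md": ("test", "pytest", "coverage"),
    "skills/api-design.md": ("endpoint", "route", "api"),
    "skills/database-patterns.md": ("database", "migration", "query"),
}


def select_skills(task: str) -> list[str]:
    """Return the skill files whose keywords appear in the task description."""
    text = task.lower()
    return [
        path
        for path, keywords in SKILL_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


def build_context(task: str, root: Path = Path(".")) -> str:
    """Join AGENTS.md with only the skill files relevant to the task."""
    paths = [root / "AGENTS.md", *(root / p for p in select_skills(task))]
    # Skip files that do not exist so the sketch works in a partial repo.
    return "\n\n".join(p.read_text(encoding="utf-8") for p in paths if p.exists())
```

With this mapping, a task such as "Add a migration for the tasks table" pulls in only `skills/database-patterns.md`, while a task with no matching keyword gets AGENTS.md alone.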
Common Mistakes I Made
Mistake 1: Copy-Pasting Entire Style Guides
    # Coding Standards
    [Copied entire 200-line company style guide...]
    [Every rule, even irrelevant ones...]
    [Historical decisions from 2021...]
The assistant would reference outdated conventions for simple tasks, wasting tokens and confusing outputs.
Mistake 2: Adding Reactively Without Cleanup
Every time the assistant made a mistake, I added a new instruction. Never removed anything. The file grew without bound.
Now I follow a rule: Add one instruction, review two existing ones.
Mistake 3: Not Measuring Impact
I added instructions blindly without checking if they improved outputs. Some made things worse.
Current approach: When I add an instruction, I note it. After a week, I check: “Did this actually help?” If not, it goes.
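One lightweight way to keep that habit is a small log of what was added and when. The sketch below is an assumption about how such a log might look, not a tool I am claiming exists: `InstructionNote` and `due_for_review` are hypothetical names, and the seven-day window simply mirrors the "after a week" check.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

REVIEW_AFTER = timedelta(days=7)  # mirrors the one-week check described above


@dataclass
class InstructionNote:
    """One instruction added to AGENTS.md, plus its review outcome."""
    text: str
    added_on: date
    helped: Optional[bool] = None  # None until the instruction has been reviewed


def due_for_review(notes: list[InstructionNote], today: date) -> list[InstructionNote]:
    """Return unreviewed notes that were added at least a week ago."""
    return [
        note
        for note in notes
        if note.helped is None and today - note.added_on >= REVIEW_AFTER
    ]
```

Anything returned by `due_for_review` gets the "Did this actually help?" question; notes marked `helped=False` are the ones to delete from the file.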
The Token Economics Perspective
Every line in your instruction file has a cost:
Assume ~4 tokens per line on average (a rough, low-end figure; many instruction lines tokenize to more):
    100 lines = ~400 tokens (negligible impact)
    200 lines = ~800 tokens (notable overhead)
    700 lines = ~2800 tokens (significant waste)
With context windows being precious real estate, I want every token to count. Instructions that are too broad for the current task waste context.
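The arithmetic above is easy to reproduce, and a character-based estimate makes a useful cross-check. The sketch below assumes the commonly cited rule of thumb of roughly 4 characters per token for English text; real counts depend on the tokenizer, so treat both numbers as ballpark figures. The function names are hypothetical.

```python
def estimate_tokens_by_lines(line_count: int, tokens_per_line: int = 4) -> int:
    """Reproduce the article's estimate: lines times an assumed tokens-per-line."""
    return line_count * tokens_per_line


def estimate_tokens_by_chars(text: str, chars_per_token: float = 4.0) -> int:
    """Rough cross-check: about 4 characters per token (heuristic, not exact)."""
    return round(len(text) / chars_per_token)


# The three data points from the text.
for lines in (100, 200, 700):
    print(f"{lines} lines = ~{estimate_tokens_by_lines(lines)} tokens")
```

If the character-based estimate for your own file comes out well above the line-based one, the per-line assumption is too optimistic for your instructions and the real overhead is larger.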
What to Keep in Your AGENTS.md
Focus on truly global information:
    # Project Overview
    Brief, high-level description.

    # Tech Stack
    Languages, frameworks, databases.

    # Coding Standards
    Only the non-negotiable rules.

    # File Organization
    Structure that affects all work.

    # Common Commands
    Build, test, run commands.
What to Extract to Skills
Move task-specific instructions:
    # Database Patterns
    - Use SQLAlchemy ORM
    - Migrations via Alembic
    - Index foreign keys
    - Soft deletes with `deleted_at` column

    # Query Patterns
    - Always use pagination for list endpoints
    - Use connection pooling
    - Avoid N+1 queries with joinedload
This only loads when I’m doing database work.
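To spot which sections are candidates for extraction, it can help to count lines per heading. This is a minimal sketch under a few assumptions: sections start with a top-level `# ` heading as in the examples here, blank lines are ignored, and `section_line_counts` is a hypothetical helper name. It does not treat fenced code specially, so a `# ` comment inside a code block would be miscounted as a heading.

```python
def section_line_counts(markdown: str) -> dict[str, int]:
    """Count non-blank lines under each top-level '# ' heading."""
    counts: dict[str, int] = {}
    current = "(preamble)"  # bucket for any lines before the first heading
    for line in markdown.splitlines():
        if line.startswith("# "):
            current = line[2:].strip()
            counts.setdefault(current, 0)
        elif line.strip():
            counts[current] = counts.get(current, 0) + 1
    return counts


sample = """# Tech Stack
- Python 3.11, FastAPI, SQLAlchemy
- PostgreSQL database

# Database Patterns
- Use SQLAlchemy ORM
- Migrations via Alembic
- Index foreign keys
"""

# Largest sections first: these are the first candidates to move into skills/.
for name, count in sorted(section_line_counts(sample).items(), key=lambda kv: -kv[1]):
    print(f"{count:3d}  {name}")
```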
Summary
After months of experimentation, my guidelines are:
- Start at ~50-100 lines with core conventions only
- Add incrementally when the assistant misses context
- Refactor into skills at 150 lines
- Never exceed 200 lines in global instructions
- Measure impact of each instruction
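The thresholds from the decision framework can be turned into a quick check. This is a sketch, not a standard tool: `size_status` and `check_file` are hypothetical names, and because the table's ranges overlap at 150 and 200, I am assuming both boundary values fall into the lower band.

```python
from pathlib import Path


def size_status(line_count: int) -> str:
    """Map a line count to the status bands used in this article."""
    if line_count < 100:
        return "Good"
    if line_count <= 150:  # assumption: 150 itself still counts as Acceptable
        return "Acceptable"
    if line_count <= 200:  # assumption: 200 itself still counts as Warning
        return "Warning"
    return "Too Large"


def check_file(path: str = "AGENTS.md") -> str:
    """Read the file and report its line count and status band."""
    line_count = len(Path(path).read_text(encoding="utf-8").splitlines())
    return f"{path}: {line_count} lines ({size_status(line_count)})"
```

Calling `check_file()` from the repository root, for example in a pre-commit hook, gives an early warning before the file drifts past 150 lines.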
The goal isn’t comprehensiveness—it’s effectiveness. Every line should earn its place in the context window.