Why Does AI Coding Productivity Drop in Large Codebases?
Problem
I’ve been using AI coding assistants extensively over the past year, and I noticed something troubling: my productivity gains weren’t consistent. In greenfield projects, I felt like a superhero—coding 10x faster than before. But as projects grew, that superpower faded. In mid-sized codebases, I was down to 2-3x productivity. In large legacy systems, AI barely helped at all.
This wasn’t just my imagination. A Reddit thread confirmed I wasn’t alone:
“Claude has been great for spinning things up quickly. 10x improvement at least. When codebase gets to mid-size territory, productivity drops to 2-3x. In large codebases, pretty much forget it.”
Why does this happen? And more importantly, what can we do about it?
The Context Window Bottleneck
The root cause isn’t model capability—it’s context management. Large language models have finite context windows, but more importantly, they lack the mental model that developers build over time.
Let me break down the math:
Small Project: 10,000 lines (~250K tokens)
├─ Roughly fits in the context window
└─ Productivity: 10x

Mid-sized: 100,000 lines (~2.5M tokens)
├─ Exceeds context by 12x
└─ Productivity: 2-3x

Large Enterprise: 1,000,000+ lines (~25M+ tokens)
├─ Exceeds context by 100x+
└─ Productivity: 1-1.5x (near parity)

The numbers are stark. But the real problem goes deeper than token limits.
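For reference, the arithmetic behind these estimates can be sketched in a few lines of Python. Both constants are assumptions picked because they reproduce the figures above: about 25 tokens per line of code, and a 200K-token context window (the size implied by the "12x" ratio). Real values vary with the language, the tokenizer, and the model.

```python
# Assumed values, not measurements: they are chosen to reproduce the
# article's estimates (25 tokens per line, 200K-token context window).
TOKENS_PER_LINE = 25
CONTEXT_WINDOW_TOKENS = 200_000


def estimate_tokens(lines_of_code: int) -> int:
    """Rough token count for a codebase with the given number of lines."""
    return lines_of_code * TOKENS_PER_LINE


def context_ratio(lines_of_code: int) -> float:
    """How many context windows the whole codebase would occupy."""
    return estimate_tokens(lines_of_code) / CONTEXT_WINDOW_TOKENS


for label, lines in [("small", 10_000), ("mid-sized", 100_000), ("large", 1_000_000)]:
    tokens = estimate_tokens(lines)
    print(f"{label}: ~{tokens:,} tokens, {context_ratio(lines):.2f}x the window")
```

Under these assumptions the small project comes out slightly above one window (1.25x), so "fits" should be read as approximate, while the mid-sized and large cases overshoot by 12.5x and 125x.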
What Actually Breaks Down
When I tried to work on a large codebase with AI assistance, I saw three specific failures:
1. Architecture Blindness
The AI doesn’t understand the architectural decisions made months ago. It suggests patterns that contradict established conventions. One Reddit user put it well:
“System is huge, logic is complex, CC begins struggling with simple things, missing old, new, and deprecated approaches”
I experienced this firsthand. I asked the AI to add a feature, and it proposed using a library we had already deprecated in favor of a custom implementation. It had no way of knowing—we never documented that decision.
2. Mental Model Deficit
Here’s the uncomfortable truth:
“If you’ve generated the whole codebase, your brain has no idea how anything works”
When AI generates code, you lose the understanding that comes from writing it yourself. This creates a vicious cycle: you rely more on AI, understand less, and the AI’s suggestions become less useful because you can’t validate them.
3. Context Dilution
Even when you provide context, the AI gets overwhelmed. It starts mixing approaches, forgetting constraints, and generating code that “works” but breaks subtle invariants.
Strategies That Actually Work
After experimenting with different approaches, here’s what helped me regain productivity:
Work Within Bounded Contexts
Instead of pointing AI at the entire codebase, I narrow its scope. One commenter noted:
“You can work on a large codebase but if you’re doing this as part of a large team, I actually think it works well because every team focuses only on a limited part”
I create focused sessions where I only provide files relevant to the specific feature. This dramatically improves accuracy.
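A focused session like this can be approximated with a small helper that takes a hand-picked list of files and keeps only those that fit a token budget. This is a minimal sketch, not a feature of any particular assistant: `build_session_context`, the budget, and the "4 characters per token" heuristic are all assumptions made for illustration.

```python
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """Very rough heuristic (an assumption): about 4 characters per token."""
    return len(text) // 4


def build_session_context(root: Path, relevant: list[str], token_budget: int) -> list[Path]:
    """Return the hand-picked files that fit in the budget, in the order given."""
    selected: list[Path] = []
    used = 0
    for rel in relevant:
        path = root / rel
        if not path.is_file():
            continue  # skip paths that do not exist
        cost = estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
        if used + cost > token_budget:
            continue  # does not fit in the remaining budget; skip it
        selected.append(path)
        used += cost
    return selected
```

Listing files in priority order matters here, because earlier entries claim the budget first.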
Context Engineering
I use an ignore file (for example .claudeignore, where the assistant supports one) to exclude irrelevant files and pre-filter context. I also:
- Provide architectural decision records (ADRs) as context
- Include pattern documentation upfront
- Reference specific files rather than asking broad questions
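The filtering and ordering described above can be sketched as a pre-processing step that runs before anything is handed to the assistant. The patterns, the `docs/adr/` location, and the function names are hypothetical. Matching uses Python's `fnmatch`, which is simpler than real gitignore semantics (for instance, `*` also matches across `/`).

```python
from fnmatch import fnmatch

# Hypothetical ignore patterns, in the spirit of an ignore file.
IGNORE_PATTERNS = ["node_modules/*", "dist/*", "*.min.js", "*.lock"]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """True if the path matches any shell-style ignore pattern."""
    return any(fnmatch(path, pattern) for pattern in patterns)


def order_context(paths: list[str], patterns: list[str], docs_prefix: str = "docs/adr/") -> list[str]:
    """Drop ignored files, then place ADR documents ahead of source files."""
    kept = [p for p in paths if not is_ignored(p, patterns)]
    # sorted() is stable, so files keep their relative order within each group.
    return sorted(kept, key=lambda p: not p.startswith(docs_prefix))


candidates = [
    "src/billing/invoice.py",
    "node_modules/lodash/index.js",
    "docs/adr/0007-custom-http-client.md",
    "static/app.min.js",
    "src/billing/tax.py",
]
print(order_context(candidates, IGNORE_PATTERNS))
```

Putting decision records first means the assistant reads the conventions before it reads the code that is supposed to follow them.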
Incremental Ownership
I no longer let AI generate everything. Instead:
- I write the core logic myself
- AI generates boilerplate
- I review and understand each piece
- I document as I go
This maintains my mental model while still leveraging AI for tedious work.
Documentation as Context
I treat documentation as a first-class context tool:
Architecture Decision Records (ADRs)
├─ Why we chose X over Y
├─ What patterns we use where
└─ What's deprecated and why

API Contracts
├─ Input/output schemas
├─ Error handling patterns
└─ Authentication flows

Pattern Library
├─ Common code patterns
├─ Domain-specific conventions
└─ Testing patterns

The Practical Trade-off
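I’ve learned to set realistic expectations. These are the rough gains I now expect when I apply the best-fitting strategy for each codebase size: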
| Codebase Size | AI Productivity | Best Strategy |
|---|---|---|
| <10K lines | 10x | Full AI assistance |
| 10-100K lines | 3-5x | Bounded contexts + documentation |
| >100K lines | 1.5-2x | AI for boilerplate only |
The key insight: context management is a skill. The better the context I provide, the more useful the AI becomes.
Summary
In this post, I explored why AI coding productivity drops in large codebases and what to do about it. The key point is that context management is the bottleneck, not model capability. By working within bounded contexts, engineering context carefully, and maintaining ownership of core logic, you can extend AI’s usefulness well beyond greenfield projects.
The productivity drop is real, but its severity is not inevitable. With the right strategies, you can keep meaningful productivity gains even as your codebase grows, though probably not the full 10x of a greenfield project.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.