
What Tasks Can LLMs NOT Do? Real Developer Limitations Revealed

Problem

I asked Claude to help debug a failing integration with a legacy payment system. The code was messy, undocumented, and written by someone who left three years ago.

Claude suggested adding logging, checking timeouts, and “reviewing the documentation.” But the documentation didn’t exist. And the real problem wasn’t in the code I was looking at—it was in a helper function buried in a different file, called indirectly through three layers of abstraction.

I spent two hours following Claude’s suggestions before I realized: LLMs cannot reverse engineer unfamiliar systems. They don’t know what questions to ask. They can’t trace execution paths through code they haven’t seen.

This wasn’t the first time. I’ve learned that certain tasks simply don’t work with AI assistance. And understanding these limitations has saved me countless hours of wasted effort.

Environment

  • Claude (Anthropic) for code assistance
  • Various legacy codebases (Python, Java, Erlang)
  • Production debugging scenarios
  • Architecture review meetings

What LLMs Cannot Do

1. Reverse Engineering Legacy Systems

When I need to understand why something works (or doesn’t work), LLMs fail consistently.

reverse-engineering-attempt.txt
Me: "Why is this payment integration failing intermittently?"
Claude: "Let me analyze the code. Common causes include:
- Network timeouts
- Invalid credentials
- Rate limiting
- Missing error handling
Suggested fixes:
1. Add retry logic
2. Verify API credentials
3. Implement exponential backoff"

The suggestions were generic and wrong. The actual problem was a race condition in a state machine I hadn’t shown Claude, triggered only when two specific payment types arrived simultaneously.

LLMs cannot:

  • Trace execution through unfamiliar codebases
  • Understand historical design decisions
  • Identify implicit dependencies
  • Spot subtle side effects across modules

Real troubleshooting requires understanding the system as a whole. LLMs see code snippets without context.
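To make the race-condition failure mode from this section concrete, here is a minimal Python sketch. It is a hypothetical illustration, not the actual payment system: the `Order` class, the `handle_payment` function, and the payment type names are all invented. A generator `yield` stands in for the point where a real handler would wait on I/O, so the interleaving is deterministic and reproducible.

```python
# Hypothetical illustration only: none of these names come from the real system.


class Order:
    def __init__(self):
        self.state = "pending"
        self.captured_by = []


def handle_payment(order, kind):
    """Check-then-act handler. The bare `yield` marks a point where another
    handler can run, for example while this one waits on the network."""
    if order.state == "pending":          # check
        yield                             # another handler may run here
        order.state = "captured"          # act
        order.captured_by.append(kind)


def run_sequential(order):
    # Each handler runs to completion before the next one starts.
    for kind in ("card", "voucher"):
        for _ in handle_payment(order, kind):
            pass


def run_interleaved(order):
    # Both handlers pass the check before either one performs the act.
    first = handle_payment(order, "card")
    second = handle_payment(order, "voucher")
    next(first)
    next(second)
    for handler in (first, second):
        next(handler, None)               # resume; both capture the order


seq, inter = Order(), Order()
run_sequential(seq)
run_interleaved(inter)
print("sequential: ", seq.captured_by)
print("interleaved:", inter.captured_by)
```

Each handler looks correct when read alone. The double capture appears only when both run against the same order at the same time, which is why pasting one handler into a chat window gives an assistant nothing to find.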

2. Making Architecture Decisions

I asked Claude to help decide between microservices and a monolithic architecture for a new project.

architecture-question.txt
Me: "Should we use microservices or monolith for this e-commerce platform?"
Claude: "Both approaches have trade-offs:
Microservices benefits:
- Independent deployment
- Technology flexibility
- Team autonomy
- Scalability per service
Monolith benefits:
- Simple deployment
- Easier debugging
- No network overhead
- Faster development
Consider your team size, expected traffic, and future plans."

This is correct but useless. Architecture decisions require understanding:

  • Team expertise and hiring plans
  • Budget constraints
  • Business timeline
  • Existing infrastructure
  • Regulatory requirements
  • Future scalability needs

Claude cannot know my team’s skills, my company’s budget, or my business roadmap. It gives generic pros/cons lists instead of actionable recommendations.

Architecture remains a human decision because it requires organizational context AI cannot access.

3. Understanding AI-Generated Code

The strangest limitation: LLMs struggle to explain their own generated code when it’s complex.

I generated a data processing pipeline with Claude. When I asked why it was failing on edge cases, Claude couldn’t trace through its own logic.

generated-code-debug.txt
Me: "Why does this pipeline fail when input has nested arrays?"
Claude: "Let me check... The code should handle arrays. Try adding:
- Array validation
- Recursive flattening
- Type checking"

But the code already had type checking. The bug was in how the recursion terminated. Claude had generated working code for simple cases but didn’t understand its own implementation deeply enough to debug edge cases.
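The pipeline itself is not shown here, so the following Python sketch is only an assumed example of this class of bug: a recursive flatten whose base case fires too early. Both function names are hypothetical. The buggy version passes the simple case and silently gives up on deeper nesting, even though it does "type checking".

```python
def flatten_buggy(items):
    """Hypothetical example of a termination bug.

    The base case returns as soon as the FIRST element is not a list,
    so lists nested later in the sequence are never expanded.
    """
    if not items or not isinstance(items[0], list):
        return list(items)
    return flatten_buggy(items[0]) + flatten_buggy(items[1:])


def flatten(items):
    """Terminates only on an empty list and inspects every element."""
    if not items:
        return []
    head, tail = items[0], items[1:]
    if isinstance(head, list):
        return flatten(head) + flatten(tail)
    return [head] + flatten(tail)


simple = [[1, 2], [3]]
nested = [1, [2, [3]]]

print(flatten_buggy(simple))   # simple case looks fine
print(flatten_buggy(nested))   # nested lists survive untouched
print(flatten(nested))
```

Adding more validation around `flatten_buggy` would not help. The fix is in the base case, and finding it requires stepping through the recursion by hand on a failing input.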

This matches a point made in a Reddit discussion on AI-generated code:

“The issue is not the last 10%, but understanding the 90% made by someone else (some AI). And the issue will be understanding the 95% crap, in order to fix the last 5%. That actually makes it worse.”

Reading AI-generated code often takes more cognitive effort than reading human-written code because:

  • Variable names might be inconsistent
  • Abstractions might be arbitrary
  • The “why” behind decisions isn’t documented
  • Optimization choices might be unclear

4. Using Idiomatic Library Patterns

I needed to work with an Erlang library. Claude generated a solution that worked, but it was much longer than necessary.

llm-overengineered.erlang
% Claude generated this (20+ lines):
handle_message(Msg) ->
    custom_proplists_get(key, Msg).

custom_proplists_get(Key, [H|T]) ->
    case H of
        {Key, Value} -> Value;
        _ -> custom_proplists_get(Key, T)
    end;
custom_proplists_get(_, []) -> undefined.
% ... more helper functions ...

idiomatic-solution.erlang
% The idiomatic Erlang approach (5 lines):
handle_message(Msg) ->
    proplists:get_value(key, Msg).
% The standard library already provides this!

Claude didn’t know that Erlang’s proplists module already had this function. It reimplemented standard library functionality because it lacked deep knowledge of library idioms.

This happens across languages:

  • Python: AI reimplements itertools functions
  • JavaScript: AI writes custom array methods instead of using built-ins
  • Java: AI creates utility classes that duplicate Apache Commons
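For the Python case, a short sketch of what this looks like in practice. The two `*_manual` helpers are hypothetical stand-ins for the kind of code an assistant tends to produce. `itertools.chain.from_iterable` and `itertools.accumulate` are the standard library functions that already cover them.

```python
from itertools import accumulate, chain


def flatten_once_manual(rows):
    """Hand-rolled one-level flatten (hypothetical generated helper)."""
    out = []
    for row in rows:
        for value in row:
            out.append(value)
    return out


def running_total_manual(values):
    """Hand-rolled running sum (hypothetical generated helper)."""
    totals, current = [], 0
    for value in values:
        current += value
        totals.append(current)
    return totals


rows = [[1, 2], [3], [4, 5]]
amounts = [1, 2, 3, 4]

# The standard library equivalents are one expression each.
flat = list(chain.from_iterable(rows))
totals = list(accumulate(amounts))

print(flat == flatten_once_manual(rows))
print(totals == running_total_manual(amounts))
```

The hand-written versions are not wrong. They are extra code to read, test, and maintain that the standard library makes unnecessary.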

Why These Limitations Matter

Understanding what LLMs cannot do helps avoid:

  1. Wasted debugging time - Stop asking AI to troubleshoot unfamiliar systems
  2. Technical debt - AI-generated code needs extra scrutiny for overengineering
  3. Skill erosion - Don’t let AI replace fundamental engineering judgment
  4. Fragile codebases - Generated code that nobody fully understands breaks unexpectedly

The Practical Approach

I use LLMs for what they do well and avoid them for what they cannot do:

Tasks I delegate to AI:

  • SQL query generation
  • Regex pattern creation
  • Boilerplate code
  • Standard CRUD operations
  • Code explanation for documented APIs

Tasks I never delegate:

  • Architecture decisions
  • Reverse engineering legacy systems
  • Debugging complex multi-system failures
  • Security-sensitive implementations
  • Code review for maintainability

How to Fix This

When AI cannot help, I use these strategies:

For reverse engineering:

reverse-engineering-checklist.md
1. Start with the failure symptom
2. Trace backward from the output
3. Identify all code paths that could produce this output
4. Check each path's assumptions
5. Look for implicit dependencies
6. Document what you find for future reference
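Steps 2 and 3 can be partly mechanized. The sketch below is one possible approach in Python, assuming you can temporarily decorate the suspect helper. `log_callers` is a hypothetical decorator, and `normalize_amount`, `build_request`, and `submit_payment` are invented stand-ins for a helper reached through several layers. It relies only on `traceback.extract_stack` and `functools.wraps` from the standard library.

```python
import functools
import traceback


def log_callers(func):
    """Record the chain of function names that led to each call of `func`."""
    calls = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # extract_stack() lists frames oldest first; the last entry is this
        # wrapper itself, so drop it.
        stack = traceback.extract_stack()[:-1]
        calls.append([frame.name for frame in stack])
        return func(*args, **kwargs)

    wrapper.calls = calls
    return wrapper


@log_callers
def normalize_amount(amount):      # hypothetical buried helper
    return int(amount * 100)


def build_request(amount):         # hypothetical middle layer
    return {"amount_cents": normalize_amount(amount)}


def submit_payment(amount):        # hypothetical entry point
    return build_request(amount)


submit_payment(10.5)
for chain_of_calls in normalize_amount.calls:
    print(" -> ".join(chain_of_calls))
```

Running the failing scenario with the decorator in place shows which entry points actually reach the helper, which narrows the set of code paths to check in step 4.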

For architecture decisions:

architecture-decision-framework.md
1. Write down constraints (team, budget, timeline)
2. List future requirements (traffic, features, compliance)
3. Evaluate each option against constraints
4. Create an Architecture Decision Record (ADR)
5. Get team consensus
6. Plan migration path if needed
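Step 3 can be written down as a simple weighted decision matrix. The Python sketch below is only an illustration: the constraint names, weights, and 1 to 5 scores are made-up placeholders, and the resulting ranking is not a recommendation. The real numbers have to come from the people who know the team, the budget, and the roadmap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    name: str
    weight: int  # importance from 1 (minor) to 5 (critical); placeholder values


def score(option_scores, constraints):
    """Weighted sum of how well one option satisfies each constraint."""
    return sum(c.weight * option_scores[c.name] for c in constraints)


# All values below are invented for illustration.
constraints = [
    Constraint("team_experience", 5),
    Constraint("budget", 4),
    Constraint("time_to_market", 4),
    Constraint("independent_scaling", 2),
]

options = {
    "monolith": {
        "team_experience": 4, "budget": 5,
        "time_to_market": 5, "independent_scaling": 2,
    },
    "microservices": {
        "team_experience": 2, "budget": 2,
        "time_to_market": 2, "independent_scaling": 5,
    },
}

ranked = sorted(options, key=lambda name: score(options[name], constraints),
                reverse=True)
for name in ranked:
    print(name, score(options[name], constraints))
```

The value of the exercise is less the final number than the argument over weights and scores, which is a good starting point for the ADR in step 4.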

For understanding generated code:

ai-code-review-checklist.md
1. Read every line before committing
2. Add comments explaining non-obvious choices
3. Rename variables for consistency
4. Remove unnecessary abstractions
5. Check for standard library alternatives
6. Write tests that cover edge cases

Summary

In this post, I explored the tasks LLMs fundamentally cannot do well. The key limitations are reverse engineering unfamiliar systems, making architecture decisions, understanding complex generated code, and using idiomatic library patterns.

These limitations matter because they define where human expertise remains essential. AI accelerates routine coding but cannot replace judgment, context understanding, or deep system knowledge.

The practical approach is knowing the boundaries. Use LLMs for SQL, regex, and boilerplate. Keep architecture, reverse engineering, and complex debugging in human hands. And always read AI-generated code with extra scrutiny—understanding the 90% often takes more effort than writing it from scratch.

Final Words + More Resources

My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to contact me, you can reach me by email.
