What Tasks Can LLMs NOT Do? Real Developer Limitations Revealed

Mar 28, 2026

Problem

I asked Claude to help debug a failing integration with a legacy payment system. The code was messy, undocumented, and written by someone who left three years ago.

Claude suggested adding logging, checking timeouts, and “reviewing the documentation.” But the documentation didn’t exist. And the real problem wasn’t in the code I was looking at—it was in a helper function buried in a different file, called indirectly through three layers of abstraction.

I spent two hours following Claude’s suggestions before I realized: an LLM cannot reverse engineer a system it has only seen a fragment of. It doesn’t know what questions to ask, and it can’t trace execution paths through code it hasn’t been shown.

This wasn’t the first time. I’ve learned that certain tasks simply don’t work with AI assistance. And understanding these limitations has saved me countless hours of wasted effort.

Environment

Claude (Anthropic) for code assistance
Various legacy codebases (Python, Java, Erlang)
Production debugging scenarios
Architecture review meetings

What LLMs Cannot Do

1. Reverse Engineering Legacy Systems

When I need to understand why something in an unfamiliar system works (or doesn’t work), LLMs consistently fall back on generic checklists.

Me: "Why is this payment integration failing intermittently?"

Claude: "Let me analyze the code. Common causes include:
- Network timeouts
- Invalid credentials
- Rate limiting
- Missing error handling

Suggested fixes:
1. Add retry logic
2. Verify API credentials
3. Implement exponential backoff"

The suggestions were generic and wrong. The actual problem was a race condition in a state machine I hadn’t shown Claude, triggered only when two specific payment types arrived simultaneously.
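
To make this class of bug concrete, here is a minimal Python sketch of a check-then-act race. It is not the actual payment code, which this post does not show: the `Order` class, its states, and the "card" and "voucher" payment types are all hypothetical. The first function replays the bad schedule deterministically (both handlers check before either acts); the second shows one common fix, putting the check and the transition under a single lock.

```python
import threading


class Order:
    """Hypothetical order whose payment must be captured exactly once."""

    def __init__(self):
        self.state = "pending"
        self.captures = []
        self.lock = threading.Lock()


def can_capture(order):
    # Step 1 of the unsafe path: the check.
    return order.state == "pending"


def capture(order, payment_type):
    # Step 2 of the unsafe path: the act.
    order.captures.append(payment_type)
    order.state = "captured"


def capture_atomic(order, payment_type):
    # Safe path: the check and the act happen under one lock.
    with order.lock:
        if order.state != "pending":
            return False
        capture(order, payment_type)
        return True


def simulate_bad_interleaving():
    # Replays the schedule where both handlers run their check
    # before either one runs its act.
    order = Order()
    card_ok = can_capture(order)
    voucher_ok = can_capture(order)
    if card_ok:
        capture(order, "card")
    if voucher_ok:
        capture(order, "voucher")
    return order.captures


def run_atomic_with_threads():
    # With the lock, exactly one of the two handlers wins.
    order = Order()
    threads = [
        threading.Thread(target=capture_atomic, args=(order, kind))
        for kind in ("card", "voucher")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order.captures
```

Nothing in either handler looks wrong when read in isolation. The double capture only appears when the two are interleaved, which is why a snippet-level review tends to miss it.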

LLMs cannot:

Trace execution through unfamiliar codebases
Understand historical design decisions
Identify implicit dependencies
Spot subtle side effects across modules

Real troubleshooting requires understanding the system as a whole. LLMs see code snippets without context.

2. Making Architecture Decisions

I asked Claude to help decide between microservices and a monolithic architecture for a new project.

Me: "Should we use microservices or monolith for this e-commerce platform?"

Claude: "Both approaches have trade-offs:

Microservices benefits:
- Independent deployment
- Technology flexibility
- Team autonomy
- Scalability per service

Monolith benefits:
- Simple deployment
- Easier debugging
- No network overhead
- Faster development

Consider your team size, expected traffic, and future plans."

This is correct but useless. Architecture decisions require understanding:

Team expertise and hiring plans
Budget constraints
Business timeline
Existing infrastructure
Regulatory requirements
Future scalability needs

Claude cannot know my team’s skills, my company’s budget, or my business roadmap. It gives generic pros/cons lists instead of actionable recommendations.

Architecture remains a human decision because it requires organizational context AI cannot access.

3. Understanding AI-Generated Code

The strangest limitation: LLMs struggle to explain their own generated code when it’s complex.

I generated a data processing pipeline with Claude. When I asked why it was failing on edge cases, Claude couldn’t trace through its own logic.

Me: "Why does this pipeline fail when input has nested arrays?"

Claude: "Let me check... The code should handle arrays. Try adding:
- Array validation
- Recursive flattening
- Type checking"

But the code already had type checking. The bug was in how the recursion terminated. Claude had generated working code for simple cases but didn’t understand its own implementation deeply enough to debug edge cases.
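
The pipeline itself is not reproduced here, so the following is only a generic illustration of why the termination condition matters more than type checking in recursive flattening. The `flatten` function is a hypothetical stand-in, not the code Claude generated.

```python
def flatten(value):
    """Recursively flatten nested lists and tuples into one flat list."""
    # Base case: anything that is not a list or tuple is a leaf.
    # Strings are deliberately treated as leaves. Iterating over a
    # one-character string yields that same string again, so recursing
    # into strings would never reach a base case.
    if not isinstance(value, (list, tuple)):
        return [value]
    flat = []
    for item in value:
        flat.extend(flatten(item))
    return flat
```

A version that recursed into "anything iterable" would pass every test with numeric input and then hit the recursion limit on the first string. That is the kind of edge case that only shows up when you trace the base case by hand.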

This matches a point made in the Reddit discussion listed in the resources at the end of this post:

“The issue is not the last 10%, but understanding the 90% made by someone else (some AI). And the issue will be understanding the 95% crap, in order to fix the last 5%. That actually makes it worse.”

Reading AI-generated code often takes more cognitive effort than reading human-written code because:

Variable names might be inconsistent
Abstractions might be arbitrary
The “why” behind decisions isn’t documented
Optimization choices might be unclear

4. Using Idiomatic Library Patterns

I needed to work with an Erlang library. Claude generated a solution that worked, but it was several times longer than necessary.

% Claude generated this (excerpt; 20+ lines with the remaining helpers):
handle_message(Msg) ->
    custom_proplists_get(key, Msg).

% Hand-rolled lookup: walk the list until a {Key, Value} tuple matches.
custom_proplists_get(Key, [H|T]) ->
    case H of
        {Key, Value} -> Value;
        _ -> custom_proplists_get(Key, T)
    end;
custom_proplists_get(_, []) -> undefined.
% ... more helper functions ...

% The idiomatic Erlang approach (a single standard library call):
handle_message(Msg) ->
    proplists:get_value(key, Msg).
% proplists:get_value/2 returns undefined when the key is absent,
% so the standard library already provides this!

Claude didn’t know that Erlang’s proplists module already had this function. It reimplemented standard library functionality because it lacked deep knowledge of library idioms.

This happens across languages:

Python: AI reimplements itertools functions
JavaScript: AI writes custom array methods instead of using built-ins
Java: AI creates utility classes that duplicate Apache Commons
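
For the Python case, here is a small sketch of what that reimplementation tends to look like next to the `itertools` equivalent. The function names are made up for illustration; `itertools.accumulate` and `itertools.groupby` are the real standard library calls.

```python
from itertools import accumulate, groupby


def running_totals_by_hand(values):
    # The kind of loop that often gets generated.
    totals, total = [], 0
    for v in values:
        total += v
        totals.append(total)
    return totals


def running_totals(values):
    # Same result with the standard library.
    return list(accumulate(values))


def run_lengths_by_hand(values):
    # Hand-rolled run-length encoding of consecutive equal items.
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def run_lengths(values):
    # groupby groups consecutive equal items, which is what is needed here.
    return [(key, sum(1 for _ in group)) for key, group in groupby(values)]
```

Both hand-written versions are correct. The cost is the extra code a reviewer has to read and trust.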

Why These Limitations Matter

Understanding what LLMs cannot do helps avoid:

Wasted debugging time - Stop asking AI to troubleshoot unfamiliar systems
Technical debt - AI-generated code needs extra scrutiny for overengineering
Skill erosion - Don’t let AI replace fundamental engineering judgment
Fragile codebases - Generated code that nobody fully understands breaks unexpectedly

The Practical Approach

I use LLMs for what they do well and avoid them for what they cannot do:

Tasks I delegate to AI:

SQL query generation
Regex pattern creation
Boilerplate code
Standard CRUD operations
Code explanation for documented APIs

Tasks I never delegate:

Architecture decisions
Reverse engineering legacy systems
Debugging complex multi-system failures
Security-sensitive implementations
Code review for maintainability

How to Fix This

When AI cannot help, I use these strategies:

For reverse engineering:

1. Start with the failure symptom
2. Trace backward from the output
3. Identify all code paths that could produce this output
4. Check each path's assumptions
5. Look for implicit dependencies
6. Document what you find for future reference
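
Steps 2 and 3 can be partly mechanized. The sketch below is one possible way to do it in Python, assuming you can temporarily decorate the function that produces the bad output: it records the chain of callers each time that function runs. The `record_callers` decorator and the three payment functions are hypothetical names for illustration.

```python
import functools
import inspect


def record_callers(log):
    """Decorator factory: append the chain of calling function names to `log`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stack()[0] is this wrapper; the rest are its callers,
            # innermost first. Reverse so the chain reads outermost to innermost.
            callers = [frame.function for frame in inspect.stack(context=0)[1:]]
            log.append(list(reversed(callers)) + [func.__name__])
            return func(*args, **kwargs)

        return wrapper

    return decorator


CALL_LOG = []


@record_callers(CALL_LOG)
def normalize_amount(amount):
    # Stand-in for the buried helper that produces the suspicious output.
    return round(amount, 2)


def build_payload(amount):
    return {"amount": normalize_amount(amount)}


def submit_payment(amount):
    return build_payload(amount)
```

After calling `submit_payment(...)`, the last entry in `CALL_LOG` ends with `submit_payment`, `build_payload`, `normalize_amount`, which shows how the helper was reached through the layers above it. Remove the decorator once the path is documented.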

For architecture decisions:

1. Write down constraints (team, budget, timeline)
2. List future requirements (traffic, features, compliance)
3. Evaluate each option against constraints
4. Create an Architecture Decision Record (ADR)
5. Get team consensus
6. Plan migration path if needed
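
Step 3 can be as simple as a weighted decision matrix. The sketch below is a rough aid, not a decision procedure: the criteria, weights, and 1-5 ratings are invented for illustration, and supplying real ones is exactly the part that needs people who know the team and the business.

```python
def score_options(weights, ratings):
    """Return {option: weighted average rating}.

    weights: {criterion: importance}
    ratings: {option: {criterion: rating on a 1-5 scale}}
    """
    total_weight = sum(weights.values())
    return {
        option: sum(weights[name] * by_criterion[name] for name in weights)
        / total_weight
        for option, by_criterion in ratings.items()
    }


# Hypothetical inputs: a small team with little distributed-systems experience.
WEIGHTS = {
    "team_experience": 3.0,
    "time_to_market": 2.0,
    "independent_scaling": 1.0,
}

RATINGS = {
    "monolith": {
        "team_experience": 5,
        "time_to_market": 4,
        "independent_scaling": 2,
    },
    "microservices": {
        "team_experience": 2,
        "time_to_market": 2,
        "independent_scaling": 5,
    },
}

SCORES = score_options(WEIGHTS, RATINGS)
```

With these made-up numbers the monolith scores higher. Change the weights to reflect a different team and the result can flip, which is why the numbers belong in the ADR next to the decision.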

For understanding generated code:

1. Read every line before committing
2. Add comments explaining non-obvious choices
3. Rename variables for consistency
4. Remove unnecessary abstractions
5. Check for standard library alternatives
6. Write tests that cover edge cases
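
For step 6, a table of edge cases run through `unittest` subtests is usually enough to pin down behavior. The `chunk` function below is a hypothetical stand-in for whatever the AI generated; the point is the shape of the test, with one labeled row per edge case.

```python
import unittest


def chunk(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkEdgeCases(unittest.TestCase):
    def test_edge_cases(self):
        cases = [
            ("empty input", [], 3, []),
            ("size larger than input", [1, 2], 5, [[1, 2]]),
            ("uneven final chunk", [1, 2, 3], 2, [[1, 2], [3]]),
            ("size of one", [1, 2], 1, [[1], [2]]),
        ]
        for label, items, size, expected in cases:
            # subTest reports each failing row separately.
            with self.subTest(label):
                self.assertEqual(chunk(items, size), expected)

    def test_rejects_non_positive_size(self):
        with self.assertRaises(ValueError):
            chunk([1, 2, 3], 0)
```

Saved in a file, this runs with `python -m unittest`. Writing the table also forces you to decide what the code should do for each edge case before reading what it actually does.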

Summary

In this post, I explored the tasks LLMs fundamentally cannot do well. The key limitations are reverse engineering unfamiliar systems, making architecture decisions, understanding complex generated code, and using idiomatic library patterns.

These limitations matter because they define where human expertise remains essential. AI accelerates routine coding but cannot replace judgment, context understanding, or deep system knowledge.

The practical approach is knowing the boundaries. Use LLMs for SQL, regex, and boilerplate. Keep architecture, reverse engineering, and complex debugging in human hands. And always read AI-generated code with extra scrutiny—understanding the 90% often takes more effort than writing it from scratch.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion: Truly Weird Times in Tech
👨‍💻 LLM Hallucination Research
👨‍💻 Claude Code Documentation
👨‍💻 Software Architecture Fundamentals
