
What Prompts Make AI Write More Concise Code?

Problem

I asked Claude to refactor some Python code. It came back 50% longer than the original.

Me: "Refactor this code to be cleaner."
Claude: [Returns 180 lines for my 120-line function]
Me: "I said cleaner, not longer."
Claude: "I added comments and helper functions for better readability..."

This happens constantly. AI assistants tend toward verbosity by default. They add comments, expand logic, create extra helper functions, and somehow end up with more code than you started with.

Vague prompts like “make this cleaner” or “refactor this” produce unpredictable results. Sometimes you get cleaner code. Sometimes you get a code novel.

Why AI Generates Verbose Code

Before fixing this, I needed to understand why it happens.

Training Bias

AI models learn from public codebases. And public codebases contain a lot of verbose code with extensive comments and defensive programming. The model learns that “good code” often looks like this:

verbose_example.py
from typing import List

def calculate_total(prices: List[float]) -> float:
    """
    Calculate the total sum of all prices.

    Args:
        prices: A list of price values as floats.

    Returns:
        The total sum as a float.

    Raises:
        ValueError: If prices is empty or contains a negative price.
    """
    if not prices:
        raise ValueError("Prices list cannot be empty")
    total = 0.0
    for price in prices:
        if price < 0:
            raise ValueError(f"Price cannot be negative: {price}")
        total += price
    return total

This is “good code” by many standards. But if you want concise code, the model needs different signals.
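For contrast, here is a rough sketch of what a concise version with the same checks might look like. It is one possible rewrite, not the output of any particular prompt, and it assumes Python 3.9+ for the built-in `list[float]` annotation.

```python
def calculate_total(prices: list[float]) -> float:
    """Sum prices; reject empty lists and negative values."""
    if not prices:
        raise ValueError("Prices list cannot be empty")
    # Find the first negative price, if any, so the error matches the verbose version.
    if (bad := next((p for p in prices if p < 0), None)) is not None:
        raise ValueError(f"Price cannot be negative: {bad}")
    return sum(prices, 0.0)  # start at 0.0 so the result is always a float
```

The behavior is the same, but the docstring is one line and the loop is replaced by `sum`.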

Safety Bias

Models are trained to be helpful and explicit, so they explain themselves through comments and expanded logic. From the model’s perspective this is the safer default: better to over-explain than under-explain.

No Natural Brevity Constraint

When you ask someone to “write good code,” they might think about correctness, performance, or maintainability. “Brevity” doesn’t naturally appear in that list unless you explicitly add it.

What I Tested

I ran experiments with three categories of prompts:

  1. Direct quantitative constraints - Explicit line count targets
  2. Semantic deduplication - Target the root cause of verbosity
  3. Persona-based prompts - Use character framing to trigger different evaluation patterns

Here’s what worked and what didn’t.

Category 1: Direct Quantitative Constraints

What Didn’t Work

Make this code cleaner.
Refactor this.
Improve this code.

These vague requests produced inconsistent results. Sometimes shorter, sometimes longer, always unpredictable.

What Worked

prompt-direct.txt
Reduce this code to under 30 lines while maintaining functionality.
Constraints:
- Preserve all current behavior
- Keep error handling
- No comments unless absolutely necessary

AI responds well to measurable, testable constraints. “Under 30 lines” is a clear target. The model can evaluate its own output against this criterion.
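The same criterion can be checked mechanically on your side. A minimal sketch, assuming "lines" means non-blank, non-comment lines (`wc -l` would also count blank lines). The helper names `count_code_lines` and `meets_line_target` are hypothetical.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are neither blank nor comment-only."""
    return sum(
        1
        for line in source.splitlines()
        if (stripped := line.strip()) and not stripped.startswith("#")
    )


def meets_line_target(source: str, max_lines: int = 30) -> bool:
    """True if the code is strictly under the target, as in 'under 30 lines'."""
    return count_code_lines(source) < max_lines


snippet = """
# helper
def add(a, b):
    return a + b
"""
print(count_code_lines(snippet), meets_line_target(snippet))
```

Docstring lines still count as code here, which is a deliberate simplification.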

Example

Before (45 lines):

before_reduction.py
def process_user(user_id):
    # Get user from database
    user = db.query(User).filter(User.id == user_id).first()
    if user is None:
        return None
    # Validate user status
    if user.status != 'active':
        return None
    # Get user orders
    orders = db.query(Order).filter(Order.user_id == user_id).all()
    if orders is None:
        return None
    # Calculate total
    total = 0
    for order in orders:
        total += order.amount
    return {'user': user, 'orders': orders, 'total': total}

def process_product(product_id):
    # Get product from database
    product = db.query(Product).filter(Product.id == product_id).first()
    if product is None:
        return None
    # Validate product status
    if product.status != 'active':
        return None
    # Get product reviews
    reviews = db.query(Review).filter(Review.product_id == product_id).all()
    if reviews is None:
        return None
    # Calculate average rating
    total_rating = 0
    for review in reviews:
        total_rating += review.rating
    avg_rating = total_rating / len(reviews) if reviews else 0
    return {'product': product, 'reviews': reviews, 'avg_rating': avg_rating}

After applying the prompt (18 lines):

after_reduction.py
def fetch_and_validate(model, item_id, status='active'):
    item = db.query(model).filter(model.id == item_id).first()
    return item if item and item.status == status else None

def process_user(user_id):
    if not (user := fetch_and_validate(User, user_id)):
        return None
    orders = db.query(Order).filter(Order.user_id == user_id).all() or []
    return {'user': user, 'orders': orders, 'total': sum(o.amount for o in orders)}

def process_product(product_id):
    if not (product := fetch_and_validate(Product, product_id)):
        return None
    reviews = db.query(Review).filter(Review.product_id == product_id).all() or []
    avg = sum(r.rating for r in reviews) / len(reviews) if reviews else 0
    return {'product': product, 'reviews': reviews, 'avg_rating': avg}

60% reduction. Same functionality. The key was giving a specific line target.

Category 2: Semantic Deduplication

This targets the root cause of verbosity: duplicated logic masked by slight variations.

The Prompt

prompt-dedup.txt
Semantically deduplicate all sequences of 5+ statements that appear 2+ times.
Process:
1. Find repeated code patterns (same logic, different variable names)
2. Extract into helper functions or utilities
3. Preserve exact behavior
4. No comments unless non-obvious

Why This Works

AI often generates similar code blocks with minor variations. A prompt that specifically targets “sequences of 5+ statements used 2+ times” tells the model exactly what to look for.

This is more targeted than “make it shorter.” You’re identifying the specific pattern that causes bloat.

Pattern Recognition

The deduplication prompt helps the model recognize patterns like:

pattern-diagram.txt
process_user()        process_product()
----------------      -----------------
get from DB           get from DB           <-- DUPLICATE
check null            check null            <-- DUPLICATE
check status          check status          <-- DUPLICATE
get related items     get related items     <-- DUPLICATE
calculate total       calculate avg         <-- SIMILAR (different calc)
return dict           return dict           <-- DUPLICATE

Once identified, the model can extract the common logic.
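To get a feel for what "same logic, different variable names" means in practice, here is a rough sketch that finds such runs with Python's `ast` module. It is a deliberately coarse heuristic and not how the model does it: names, attribute names, and literals are all blanked out, so only the shape of each statement is compared. `Normalize`, `find_repeated_sequences`, and `SAMPLE` are hypothetical names for illustration.

```python
import ast
from collections import defaultdict


class Normalize(ast.NodeTransformer):
    """Blank out names, attribute names, and literals so renamed copies look identical."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_Attribute(self, node):
        self.generic_visit(node)  # also normalize the object the attribute hangs off
        node.attr = "_"
        return node

    def visit_Constant(self, node):
        return ast.copy_location(ast.Constant(value=None), node)


def find_repeated_sequences(source: str, window: int = 5) -> list[list[int]]:
    """Return groups of start lines for runs of `window` statements with the same shape."""
    tree = Normalize().visit(ast.parse(source))
    seen = defaultdict(list)
    for node in ast.walk(tree):
        # Only `body` blocks are scanned; `else`/`finally` blocks are ignored in this sketch.
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue
        for i in range(len(body) - window + 1):
            run = body[i:i + window]
            key = "\n".join(ast.dump(stmt) for stmt in run)
            seen[key].append(run[0].lineno)
    return [lines for lines in seen.values() if len(lines) >= 2]


SAMPLE = '''
def load_user(uid):
    row = db.get(uid)
    name = row.name
    age = row.age
    tag = name.upper()
    return (tag, age)

def load_item(iid):
    rec = store.fetch(iid)
    title = rec.title
    price = rec.price
    label = title.lower()
    return (label, price)
'''

print(find_repeated_sequences(SAMPLE))
```

Each group in the result lists the line numbers where a structurally identical run of five statements starts. Those are candidates for extraction into a shared helper, and a human or the model still has to judge whether the copies really mean the same thing.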

Category 3: Persona-Based Prompts

This approach uses character framing to trigger different evaluation patterns from the model’s training data.

The Prompt

prompt-persona.txt
Review this code as Linus Torvalds. Be harsh. Criticize every unnecessary line.
Remove anything that doesn't directly solve the problem.
If you wouldn't merge it into the Linux kernel, rewrite it.

Why This Works

“Linus Torvalds” in training data is associated with blunt, efficiency-focused code review. The persona likely steers the model toward a different cluster of patterns, ones that value terseness and efficiency over explanation.

Other Effective Personas

"You are a code golf champion. Achieve the same result with fewer characters."
"Review as a senior engineer who hates bloat. Every line must justify its existence."
"You are optimizing for a memory-constrained embedded system. Minimize code size."

Each persona triggers different evaluation criteria. “Code golf champion” focuses on character count. “Embedded systems engineer” focuses on memory. Both produce shorter code than “helpful assistant.”

Common Mistakes to Avoid

Mistake 1: Vague Requests

# WRONG
Make this code better
Clean up this code
Refactor this

These produce unpredictable results. “Better” could mean adding comments, expanding error handling, or creating helper functions, all of which increase line count.

Mistake 2: Conflicting Constraints

# WRONG
Make this code shorter but add comprehensive comments and error handling

The model will try to satisfy all constraints, producing verbose code with extensive comments.

Mistake 3: No Verification

Always measure before and after:

verify_loc.sh
# Before
wc -l original.py
# 120 original.py
# After
wc -l refactored.py
# 75 refactored.py
# Run tests to verify functionality preserved
pytest tests/

Advanced Techniques

Iterative Reduction

Multiple passes with specific targets often outperform single aggressive requests:

iterative-prompt.txt
# Pass 1
Reduce this code by 30%.
# Pass 2 (on result)
Reduce the result by another 20%.
# Pass 3
Final polish - remove any remaining redundancy.
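Note that the percentages compound: 30% followed by 20% of the result is 44% overall, not 50%. A small sketch of the arithmetic, using the 120-line file from earlier as the starting point (`pass_targets` is a hypothetical helper, and percentages are whole numbers to keep the math exact):

```python
def pass_targets(start_lines: int, reductions_pct: list[int]) -> list[int]:
    """Line-count target after each pass, rounding to whole lines."""
    targets = []
    remaining = start_lines
    for pct in reductions_pct:
        remaining = round(remaining * (100 - pct) / 100)
        targets.append(remaining)
    return targets


targets = pass_targets(120, [30, 20])
print(targets)  # [84, 67]
print(f"overall reduction: {1 - targets[-1] / 120:.0%}")  # overall reduction: 44%
```

Having the concrete targets handy makes it easy to turn each pass into a measurable request, for example "reduce this to 84 lines or fewer."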

Constraint Stacking

Multiple constraints create guardrails that prevent verbose patterns:

stacked-constraints.txt
Constraints:
- Under 30 lines
- Under 5 functions
- Maximum 3 nesting levels
- No duplicate logic
- No unused variables
- No comments (code should be self-documenting)
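Several of these guardrails can be verified automatically on the returned code. A minimal sketch using the standard `ast` module, under stated assumptions: "nesting level" is taken to mean nested `if`/`for`/`while`/`with`/`try` blocks, and only the first three constraints are checked. `max_nesting` and `check_constraints` are hypothetical names.

```python
import ast

NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Deepest chain of nested control-flow blocks at or below `node`."""
    if isinstance(node, NESTING_NODES):
        depth += 1
    children = [max_nesting(child, depth) for child in ast.iter_child_nodes(node)]
    return max([depth] + children)


def check_constraints(source: str, max_lines: int = 30,
                      max_functions: int = 5, max_depth: int = 3) -> dict[str, bool]:
    """Check 'under N lines', 'under N functions', and 'maximum N nesting levels'."""
    tree = ast.parse(source)
    code_lines = [line for line in source.splitlines() if line.strip()]
    functions = [n for n in ast.walk(tree)
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    return {
        "lines": len(code_lines) < max_lines,
        "functions": len(functions) < max_functions,
        "nesting": max_nesting(tree) <= max_depth,
    }


EXAMPLE = """
def drain(xs):
    for x in xs:
        if x:
            while x > 0:
                x -= 1
    return xs
"""
print(check_constraints(EXAMPLE))
```

One caveat of this sketch: Python parses `elif` as an `if` nested inside the `else` branch, so long `elif` chains are reported as deeper than they look on the page.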

Language-Specific Prompts

Different languages have different idioms for brevity:

language-idioms.txt
# Python
Use list comprehensions, walrus operators, itertools, and the standard library.
# JavaScript
Use array methods (map, filter, reduce), optional chaining, nullish coalescing.
# Go
Embrace early returns, minimize error handling verbosity, use Go idioms.

Language-specific idioms unlock brevity patterns that exist in the training data for each language.
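As an illustration of what the Python line is asking for, here is a small before/after sketch. The function names and data shape are made up for this example. The concise version combines a list comprehension, the walrus operator, and `itertools.chain.from_iterable`.

```python
from itertools import chain


def emails_verbose(groups):
    """Collect lowercased emails from nested groups of user dicts."""
    result = []
    for group in groups:
        for user in group:
            email = user.get("email")
            if email:
                result.append(email.lower())
    return result


def emails_concise(groups):
    """Same behavior using a comprehension, the walrus operator, and itertools."""
    return [
        email.lower()
        for user in chain.from_iterable(groups)
        if (email := user.get("email"))
    ]


groups = [[{"email": "A@x.io"}, {"name": "no-email"}], [{"email": "B@y.io"}]]
print(emails_concise(groups))  # ['a@x.io', 'b@y.io']
```

Whether the concise form is actually clearer is a judgment call. The idioms shorten code, but stacking too many of them in one expression can hurt readability.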

Summary

In this post, I showed which prompts actually work for making AI write concise code:

Approach                      Effectiveness   When to Use
--------------------------    -------------   -------------------------------
Direct line constraints       High            When you have a clear target
Semantic deduplication        High            When code has repeated patterns
Persona-based                 Medium-High     When you want harsher evaluation
Vague requests (“cleaner”)    Low             Never

The key insight is specificity. Vague requests produce verbose results. Measurable constraints with clear intent produce consistent LOC reduction.

And always verify:

  1. Count lines before and after
  2. Run tests to confirm functionality preserved
  3. Check for code quality (don’t trade readability for brevity)

Final Words + More Resources

My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to contact me, you can reach me by email.

