Why “Be Concise” Fails with Claude: Use Hard Constraints Instead
Problem
I typed a simple prompt:
    Explain quantum computing. Be concise.

Claude responded with a 47-sentence wall of text. I asked it to be concise, and it gave me a lecture.
This kept happening. “Brief summary” produced three paragraphs. “Quick answer” gave me a full essay. “Keep it short” resulted in something that was definitely not short.
Every “concise” prompt produced verbose output. I was wasting tokens and time.
What Happened?
I found a Reddit discussion about Claude prompting tricks. One tip stood out:
Stop saying “be concise” — use hard limits instead. “Answer in 3 sentences max” or “5 bullet points only.” Vague instructions = verbose answers. Hard constraints work every time.
The comment section confirmed this pattern. Users reported consistent success with numeric limits and consistent failure with vague adjectives.
Let me test this.
Test 1: Vague instruction
    User: Explain quantum computing. Be concise.

    Claude: Quantum computing is a revolutionary approach to computation that harnesses
    the principles of quantum mechanics to process information in fundamentally
    different ways than classical computers. Unlike classical bits that can only
    exist in states of 0 or 1, quantum computers use quantum bits (qubits) that
    can exist in superposition, simultaneously representing both 0 and 1. This,
    combined with quantum entanglement and interference, allows quantum computers
    to perform certain calculations exponentially faster than classical systems.
    Applications include cryptography, drug discovery, optimization problems, and
    machine learning. Major tech companies like IBM, Google, and others are racing
    to build practical quantum computers, though significant challenges remain in
    error correction and qubit stability. The field has seen remarkable progress...
    [Response continues for 5 more paragraphs]

The output kept going. “Be concise” meant nothing.
Test 2: Hard constraint
    User: Explain quantum computing in exactly 3 sentences.

    Claude: Quantum computers use qubits that can exist in multiple states
    simultaneously, unlike classical bits limited to 0 or 1. This superposition,
    combined with entanglement, enables exponential speedups for specific problems
    like cryptography and optimization. Current challenges include maintaining qubit
    stability and error correction for practical applications.

Three sentences. Exactly three. No more, no less.
Why does “exactly 3 sentences” work when “be concise” fails?
Why Vague Instructions Fail
“Be concise” fails because it is subjective. LLMs cannot measure “concise.”
No Measurable Definition
What does “concise” mean to an LLM?
- One sentence?
- One paragraph?
- Half a page?
- Under 100 words?
The model has seen “concise” used in many contexts. A research paper might call 500 words concise. A tweet is concise at 20 words. A textbook might consider a 5-page summary concise.
Without a measurable definition, the model guesses based on training patterns. And training patterns favor thoroughness. The model is trained to be helpful, and helpful often means complete.
The Helpfulness Trap
LLMs are trained to be helpful. When in doubt, they err on the side of more information.
    User: Explain quantum computing. Be concise.

    Model thinking: "The user wants to understand quantum computing. A complete
    explanation would cover qubits, superposition, entanglement, applications,
    and current challenges. I should provide a thorough overview to be helpful."

    Result: Long response because "helpful" overrides "concise."

The instruction “be concise” competes with the training objective “be helpful.” Helpful usually wins.
No Stop Condition
Vague instructions give no signal for when to stop:

    "Be concise" → When do I stop? After 1 paragraph? 2? 3?
    "Keep it short" → How short? What's the boundary?
    "Don't be verbose" → Defines what NOT to do, not what to do

The model keeps generating until it feels “done,” which for an LLM often means covering all relevant points.
Why Hard Constraints Work
Numeric constraints work because they are measurable and verifiable.
Measurable

    "3 sentences" → Count: 1, 2, 3. Stop.
    "5 bullet points" → Count: 1, 2, 3, 4, 5. Stop.
    "100 words max" → Count words. Stop at 100.

The model can track and verify these limits during generation.
Consistent Results
Same input, same constraint, same output length every time:
    User: List the benefits of exercise in exactly 5 bullet points.

    Claude: Here are 5 benefits of exercise:
    1. Improved cardiovascular health
    2. Better weight management
    3. Enhanced mood and mental health
    4. Stronger muscles and bones
    5. Better sleep quality

Exactly 5. Not 4, not 6. Every time.
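Because these limits are countable, a script can check them too. Here is a minimal sketch of how I might verify a response against a sentence or bullet limit. The helper names `count_sentences` and `count_bullets` are my own, made up for illustration, and the regexes are rough heuristics: abbreviations such as "e.g." followed by a space will inflate the sentence count.

```python
import re


def count_sentences(text: str) -> int:
    """Count sentences by splitting after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def count_bullets(text: str) -> int:
    """Count lines that start with '-', '*', '•' or a number like '1.' or '1)'."""
    bullet = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")
    return sum(1 for line in text.splitlines() if bullet.match(line))


# Hypothetical reply, copied from the exercise example above.
reply = (
    "Here are 5 benefits of exercise:\n"
    "1. Improved cardiovascular health\n"
    "2. Better weight management\n"
    "3. Enhanced mood and mental health\n"
    "4. Stronger muscles and bones\n"
    "5. Better sleep quality"
)
assert count_bullets(reply) == 5
```

A check like this only works for countable constraints. There is nothing comparable I could write for "be concise".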
Constraint Types Ranked by Reliability
I tested different constraint types. Here is what works best:
+-------------------+-------------+-------------------------------------------+
| Constraint Type   | Reliability | Notes                                     |
+-------------------+-------------+-------------------------------------------+
| Bullet/number     | Highest     | "5 bullet points" - model counts items    |
| limits            |             | during generation                         |
+-------------------+-------------+-------------------------------------------+
| Sentence limits   | High        | "3 sentences" - model tracks periods      |
+-------------------+-------------+-------------------------------------------+
| Paragraph limits  | Medium      | "2 paragraphs" - paragraphs are flexible  |
+-------------------+-------------+-------------------------------------------+
| Word limits       | Low         | "100 words" - models struggle to count    |
|                   |             | words accurately during generation        |
+-------------------+-------------+-------------------------------------------+

Word limits are least reliable because models generate tokens, not words, and counting during generation is imprecise. A “100 word” limit often produces 85-120 words.
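When I really do need a word limit, one option is to count words after generation instead of trusting the model to do it. The following is only a sketch under a simplifying assumption: a "word" is anything separated by whitespace. Both `word_count` and `trim_to_word_limit` are hypothetical helpers I wrote for this example, not part of any SDK.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def trim_to_word_limit(text: str, limit: int) -> str:
    """Keep whole sentences from the start while the running total stays within limit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept, total = [], 0
    for sentence in sentences:
        n = word_count(sentence)
        if total + n > limit:
            break  # adding this sentence would exceed the limit
        kept.append(sentence)
        total += n
    return " ".join(kept)


sample = "One two three. Four five six seven. Eight nine."
trimmed = trim_to_word_limit(sample, 7)
assert trimmed == "One two three. Four five six seven."
```

Trimming at sentence boundaries avoids cutting a thought in half, at the price of sometimes landing well under the limit.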
Common Mistakes
I made these mistakes. Maybe you will too.
Mistake 1: Using Relative Terms

    ❌ "Be brief" → Brief compared to what? A tweet? A chapter? A sentence?
    ❌ "Keep it short" → How short is short? 50 words? 100 words?
    ❌ "Don't write too much" → Defines what NOT to do, gives no target

Mistake 2: Mixing Vague and Specific

    ❌ "Be concise and answer in 3 sentences." → Redundant and confusing → Model may try to satisfy both, creating conflict
    ✅ "Answer in 3 sentences." → One clear constraint, no ambiguity

Mistake 3: Wrong Constraint Type for the Task

    ❌ "Explain in 50 words or less." → Word counting is unreliable during generation
    ✅ "Explain in 3 sentences." → Sentence counting is reliable
    ✅ "Explain in 5 bullet points." → Bullet counting is most reliable

Mistake 4: Leaving Format Unspecified

    ❌ "Answer in 3 sentences." → Might give 3 very long sentences
    ✅ "Answer in 3 short sentences, each under 15 words." → Constrains both count and length

Practical Prompting Patterns
I now use these patterns consistently:
    PATTERNS = {
        # Quick answers
        "quick_fact": "Answer in 1-2 sentences.",

        # Summaries
        "summary": "Summarize in exactly 3 bullet points.",

        # Comparisons
        "comparison": "Compare X and Y in exactly 4 bullet points.",

        # Explanations
        "explanation": "Explain in 3 short sentences maximum.",

        # Lists
        "list": "List exactly 5 key points, no more.",

        # Deep dive (when you actually want detail)
        "detailed": "Explain thoroughly with 5-7 paragraphs.",
    }

Notice even “detailed” uses numbers. When I want length, I specify how much length.
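To show how a table of constraints like this could be applied, here is a small sketch that appends one to a task. The names `LENGTH_PATTERNS` and `build_prompt` are hypothetical, and the dictionary is a trimmed copy of my patterns so the example runs on its own.

```python
# Trimmed, illustrative copy of the patterns; extend as needed.
LENGTH_PATTERNS = {
    "quick_fact": "Answer in 1-2 sentences.",
    "summary": "Summarize in exactly 3 bullet points.",
    "list": "List exactly 5 key points, no more.",
}


def build_prompt(task: str, pattern: str) -> str:
    """Return the task followed by the hard constraint for the given pattern."""
    try:
        constraint = LENGTH_PATTERNS[pattern]
    except KeyError:
        raise ValueError(f"Unknown pattern: {pattern!r}") from None
    return f"{task.strip()}\n\n{constraint}"


prompt = build_prompt("What is a qubit?", "quick_fact")
assert prompt == "What is a qubit?\n\nAnswer in 1-2 sentences."
```

Raising on an unknown key is deliberate: silently sending a prompt with no constraint would bring back the verbose output this whole approach is meant to avoid.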
Why This Matters Beyond Chat
Token costs add up. If every response is 2x longer than needed because of vague prompts:
- Output-token costs on the API roughly double
- Response times increase
- Context windows fill faster
- Downstream parsing becomes harder
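A quick back-of-the-envelope calculation makes the cost point concrete. Every number here is made up for illustration: the price per million output tokens, the token counts, and the request volume are assumptions, not real pricing or measurements.

```python
def output_cost(tokens_per_response: int, responses: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of the output tokens for a batch of responses."""
    return tokens_per_response * responses * usd_per_million_tokens / 1_000_000


PRICE = 10.0        # hypothetical USD per million output tokens
REQUESTS = 10_000   # hypothetical request volume

constrained = output_cost(150, REQUESTS, PRICE)  # assumed 150 tokens per reply
verbose = output_cost(300, REQUESTS, PRICE)      # assumed 2x longer replies

print(f"constrained: ${constrained:.2f}, verbose: ${verbose:.2f}")
assert abs(verbose - 2 * constrained) < 1e-9
```

Only the output side scales this way. The input tokens for the prompt cost the same in both cases.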
For automated workflows, predictable output length matters:
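    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    article = "..."  # placeholder for the article text to summarize

    # BAD: Unpredictable length
    response = client.messages.create(
        model="claude-3-5-sonnet",  # replace with a model ID available to you
        max_tokens=1024,  # required by the Messages API
        messages=[{"role": "user", "content": f"Summarize this article. Be concise.\n\n{article}"}],
    )
    # Response could be 100 words or 500 words

    # GOOD: Predictable length
    response = client.messages.create(
        model="claude-3-5-sonnet",  # replace with a model ID available to you
        max_tokens=1024,  # required by the Messages API
        messages=[{"role": "user", "content": f"Summarize this article in exactly 3 bullet points.\n\n{article}"}],
    )
    # Response should have exactly 3 bullet points

Summary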
Vague instructions like “be concise” fail because they lack measurable boundaries. LLMs cannot objectively measure “conciseness” — they can measure “3 sentences.”
The Reddit commenter who said all prompt tips boil down to “give more detailed prompts” was right. Specificity beats ambiguity. When you want shorter responses, specify exactly how short using numbers.
The pattern is simple:

    ❌ Vague: "Be [adjective]"
    ✅ Specific: "[Action] in [number] [unit]"

Replace subjective words with countable constraints. Your prompts will work every time.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
Here are the most important links from this article, along with some further resources on this topic:
- Reddit: 6 Claude Prompting Tricks I Wish I Knew on Day One
- Anthropic: Prompt Engineering Guide
- OpenAI: Prompt Engineering Best Practices
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!