Skip to content

How to Reduce Claude AI Hallucinations: 3 System Prompt Instructions That Work

Problem

I was building a customer support bot using Claude AI. The bot kept giving confident answers to questions it had no information about:

Example hallucination
Customer: What's the warranty period for the X-500 model?
Bot: The X-500 model comes with a 2-year warranty that covers manufacturing defects and includes free shipping for repairs.
[Reality: No warranty information exists in our FAQ database]

The answer sounded authoritative. It was completely wrong. And the customer made decisions based on fabricated information.

This is the hallucination problem. AI models like Claude are trained to be helpful, which creates a default behavior of always providing an answer. When the model lacks knowledge, it fills gaps with plausible fiction rather than admitting uncertainty.

What I tried first

My initial approach was naive. I added generic instructions to the system prompt:

First attempt (did not work)
You are a helpful customer support assistant. Be accurate and don't make things up.

Claude still hallucinated. The problem? “Don’t make things up” is vague. Claude doesn’t know what “making things up” means in concrete terms—it just continues generating helpful-sounding responses.

I then tried adding consequences:

Second attempt (still didn't work)
You are a helpful customer support assistant. If you provide incorrect information, customers will be harmed. Be very careful.

This made Claude more verbose and apologetic, but it still generated incorrect answers. The issue was that Claude’s training objective (being helpful) conflicted with my vague instruction.

What actually worked

I found a Reddit thread where someone discovered three specific instructions from Anthropic’s own documentation. These instructions fundamentally changed Claude’s output quality for factual tasks:

Instruction 1: Allow “I don’t know”

System prompt instruction #1
If you don't have enough information to answer a question, say "I don't have enough information to answer that" rather than constructing a plausible-sounding answer.

This was the game-changer. Without explicit permission, Claude’s default behavior is to always provide an answer. This instruction removes that pressure.

Instruction 2: Require citations for every claim

System prompt instruction #2
Every claim you make must be backed by a source. If you cannot find a source, retract the claim.

This forces Claude to anchor responses in actual data rather than generating plausible-sounding assertions.

Instruction 3: Use direct quotes

System prompt instruction #3
Before analyzing documents, extract word-for-word quotes from the source material. This prevents paraphrase-drift where meaning subtly changes during summarization.

This is especially useful for document analysis tasks where slight rewordings can change meaning.

How I implemented it

Here’s the complete system prompt I now use for my customer support bot:

Customer support system prompt
You are a customer support assistant for Acme Corp.
## Response Rules
1. **Uncertainty Acknowledgment**
If you don't have sufficient information from the provided FAQ context,
respond with: "I don't have enough information to answer that. Please
contact [email protected] for assistance."
DO NOT invent answers that sound plausible.
2. **Citation Requirement**
Every factual claim must reference the source document.
Format: [Claim] (Source: FAQ #X)
If no source exists for a claim, do not make the claim.
3. **Direct Quote Grounding**
When referencing FAQ answers, use exact quotes before paraphrasing.
Quote format: "exact text from FAQ"
## Available Context
[Faq content will be provided here]

Results after applying the instructions

The difference was dramatic:

Before: Without hallucination reduction
Customer: What's the return policy for opened electronics?
Bot: Opened electronics can be returned within 30 days with a 15% restocking fee.
Please bring the item to any store location with your receipt.
[Reality: Our FAQ has no policy for opened electronics]
Customer: Can I use my store credit online?
Bot: Yes, store credit can be used for online purchases. Simply enter
your store credit number at checkout.
[Reality: No online store credit feature exists]
After: With hallucination reduction instructions
Customer: What's the return policy for opened electronics?
Bot: I don't have enough information to answer that. Our FAQ covers returns
for unopened items within 30 days, but doesn't specify opened electronics.
Please contact [email protected] for the current policy.
Customer: Can I use my store credit online?
Bot: I don't have enough information to answer that. The provided FAQ doesn't
mention online store credit usage. Please contact [email protected].

My customer support bot went from confidently wrong to safely uncertain. I estimate a 70%+ reduction in hallucinations.

Why these instructions work

The key insight is understanding Claude’s training objective:

Why hallucinations happen
Default behavior: Always provide a helpful answer
Problem: When knowledge is missing, Claude generates plausible fiction
to satisfy the "be helpful" objective
Solution: Explicitly redefine "being helpful" to include "admitting
ignorance when appropriate"

These instructions work because they:

  1. Override the completion bias — LLMs are trained to complete patterns, not to stop and say “I don’t know.” Explicit permission breaks this pattern.

  2. Create verifiable outputs — Citations let users check claims against sources. This makes hallucinations detectable rather than invisible.

  3. Ground analysis in evidence — Direct quotes prevent the “telephone game” effect where each paraphrase drifts further from the source.

When NOT to use these instructions

I learned this the hard way. These instructions have a cost:

Creative task example (with hallucination reduction)
User: Write a short story about a robot learning to paint.
Bot: I don't have enough information to write a story about a robot
learning to paint. No source material exists about robot painting
experiences.
[Result: Useless for creative tasks]

The tradeoff is real. An arXiv paper (2307.02185) found that citation constraints reduce creative output. These instructions are for:

  • Customer support bots
  • Research assistants
  • Document analysis
  • Fact-checking workflows
  • Legal/medical/compliance applications

They are NOT for:

  • Creative writing
  • Brainstorming
  • Coding assistance
  • General conversation

Mode toggle implementation

I now use a toggle system in my applications:

Mode toggle prompt
# Mode Selection
Current mode: [research/creative]
When in RESEARCH mode, apply these rules:
- Acknowledge uncertainty explicitly
- Require citations for claims
- Use direct quotes before analysis
When in CREATIVE mode, apply these rules:
- Generate freely without citation constraints
- Focus on originality and engagement
- No "I don't know" limitations

This gives users the best of both worlds: rigorous accuracy when it matters, creative freedom when appropriate.

Common mistakes I made

Mistake 1: Applying to all use cases

I initially added these instructions globally. My coding assistant became useless:

Mistake: Applying globally to coding tasks
Me: How do I center a div in CSS?
Bot: I don't have enough information from the provided sources to answer
how to center a div. Please provide documentation.
[Bad: General knowledge shouldn't require sources]

Mistake 2: Trusting verbalized confidence

Claude sometimes claims high confidence while being wrong:

Mistake: Trusting verbalized confidence
Bot: I'm 95% confident this answer is correct based on FAQ #4.
[FAQ #4 is completely unrelated to the question]

Confidence scores from LLMs are not reliable indicators of accuracy. Always verify claims.

Mistake 3: Ignoring context window saturation

After long conversations, Claude sometimes stops following these instructions:

Mistake: Context window saturation
[After 50+ messages in a conversation]
Bot: The return policy is 60 days. [No citation provided]
[Instruction was ignored due to context dilution]

Start fresh conversations for critical tasks, or reiterate instructions periodically.

Summary

Three system prompt instructions from Anthropic’s documentation fundamentally changed my Claude AI applications:

  1. Allow “I don’t know” — Explicit permission to admit uncertainty
  2. Require citations — Force verifiable, sourced claims
  3. Use direct quotes — Ground analysis in exact source text

These instructions address the core hallucination problem: Claude’s training to always be helpful. By redefining “helpful” to include admitting ignorance, you get accurate responses instead of confident fiction.

Use these selectively for research, customer support, and any application where accuracy matters more than creativity. Avoid them for creative tasks where citation constraints hurt output quality.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments