Why AI Coding Assistants Fail Tool Calls: Context Pollution and Model Distraction

Problem

I switched from OpenCode to KiloCode with Kimi K2.5, and suddenly my AI coding assistant started failing tool calls. The model would hallucinate files that didn’t exist, call tools with wrong parameters, or produce incoherent responses. Same model, different results.

After debugging this for hours, I discovered the real culprit: context pollution. The harness I was using pushed too much irrelevant information into the context window, and Kimi K2.5 got distracted.

What I Tried First

My initial assumption was that Kimi K2.5 was somehow broken or incompatible with KiloCode. I tried:

  1. Switching models - Claude worked better but still had occasional failures
  2. Restarting the harness - No improvement
  3. Using simpler prompts - Helped slightly but lost functionality

None of these addressed the root cause. Then I found a Reddit discussion that pointed me in the right direction.

The Real Issue: Context Pollution

A user named dsvost on the OpenCode subreddit explained exactly what was happening:

“Most probably just cause KiloCode fill context with a lot of not sense. Limit tabs to 1 in settings, so it will not push everything not related. And I found that Kimi k2.5 is very ‘distracting’ and start fails tools call if context contain too much stuff not related to the current prompt.”

This matched my experience perfectly. The issue wasn’t the model—it was what I was feeding into it.

How AI Coding Assistants Use Context

When you use an AI coding assistant, it sends the model a large context containing:

  • Open files from your editor
  • Project structure and file tree
  • Relevant documentation
  • Conversation history
  • Tool definitions and schemas
  • System prompts

This context is supposed to help the model understand your codebase and make intelligent decisions. But when that context contains irrelevant or noisy information, the model’s attention mechanism gets pulled in wrong directions.

Think of it like trying to focus on a conversation in a noisy room. The more background noise, the harder it is to focus on what matters.
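To get a feel for how quickly open tabs can dominate the context, here is a minimal sketch that estimates each part's share of the total. The part labels and sizes are made up for illustration, and the four-characters-per-token rule is only a rough heuristic, not a real tokenizer.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


@dataclass
class ContextPart:
    label: str  # e.g. "system prompt" or "other open tabs"
    text: str   # the raw text the harness would send for this part


def estimate_tokens(text: str) -> int:
    """Approximate a token count from the character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def token_share(parts: list[ContextPart]) -> dict[str, float]:
    """Return each part's fraction of the estimated total (labels must be unique)."""
    counts = {part.label: estimate_tokens(part.text) for part in parts}
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical sizes: one file being edited plus several unrelated tabs.
parts = [
    ContextPart("system prompt", "x" * 2_000),
    ContextPart("tool schemas", "x" * 6_000),
    ContextPart("current file", "x" * 8_000),
    ContextPart("other open tabs", "x" * 64_000),
]

for label, share in token_share(parts).items():
    print(f"{label:16s} {share:6.1%}")
```

With these invented numbers, the unrelated tabs make up 80% of the estimated context while the file actually being edited is only 10%.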

Why Tool Calls Fail Specifically

Tool calls are particularly sensitive to context pollution because they require:

  1. Accurate parameter extraction - The model must identify specific values from context
  2. Correct function selection - Choose from dozens of available tools
  3. Valid JSON generation - Produce syntactically correct structured output

When the context is polluted, the model’s attention mechanism gets pulled toward irrelevant code, outdated examples, or conflicting patterns. This leads to:

  • Hallucinated file paths that don’t exist
  • Wrong parameter types or missing required fields
  • Calling tools that don’t match the user’s intent
  • Incoherent reasoning chains
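The failure modes above can be caught mechanically before a tool call is executed. The following is a minimal sketch, not how KiloCode or OpenCode actually validate calls: the `read_file` schema, its field names, and the `known_files` set are all hypothetical.

```python
import json

# Hypothetical schema for a read_file tool: required field name -> expected type.
READ_FILE_SCHEMA = {"path": str, "start_line": int}


def validate_tool_call(raw: str, schema: dict[str, type], known_files: set[str]) -> list[str]:
    """Return a list of problems found in a model-produced tool call (empty if none)."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    for name, expected in schema.items():
        if name not in args:
            problems.append(f"missing required field: {name}")
        elif not isinstance(args[name], expected):
            problems.append(f"wrong type for {name}: expected {expected.__name__}")

    # A path that is not in the workspace listing is a likely hallucination.
    path = args.get("path")
    if isinstance(path, str) and path not in known_files:
        problems.append(f"file does not exist: {path}")
    return problems


workspace = {"src/auth/login.ts", "src/auth/session.ts"}
print(validate_tool_call('{"path": "src/auth/login.ts", "start_line": 47}', READ_FILE_SCHEMA, workspace))
print(validate_tool_call('{"path": "src/auth/user.ts", "start_line": "47"}', READ_FILE_SCHEMA, workspace))
```

Logging which of these checks fails most often gives a rough signal of whether the model is inventing paths, mangling types, or producing broken JSON.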

Model-Specific Sensitivity

Different models handle context pollution differently:

Kimi K2.5

In my experience, Kimi is extremely capable but highly sensitive to irrelevant context: when the context window fills with noise, its tool calling degrades noticeably. I suspect this comes down to how the model weighs context tokens, but I can't verify that from the outside.

Claude

Claude handles context pollution better but still suffers. Its large context window (200K tokens) is a double-edged sword: more room for useful context, but also more room for pollution.

GPT

GPT models generally hold focus well but still benefit significantly from clean context. They tend to be more conservative when uncertain, which reduces hallucination but can mean they skip a tool call that was actually needed.

The Fix: Clean Your Context

I fixed my tool call failures by making these changes:

1. Limit Open Tabs

In KiloCode, I reduced the maximum open tabs included in context:

{
  "kilocode.maxOpenTabs": 1
}

This forces the harness to only include the most relevant file rather than everything I happened to have open.

2. Configure Context Exclusions

I added rules to exclude irrelevant files from context:

{
  "kilocode.contextExclusions": [
    "**/node_modules/**",
    "**/dist/**",
    "**/.git/**",
    "**/tests/**",
    "**/__pycache__/**",
    "**/venv/**",
    "**/.env*"
  ]
}

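These patterns filter out dependencies, build artifacts, version-control metadata, test files, and virtual environments that rarely need to be in context for coding tasks. The .env* pattern also keeps secrets out of the prompt.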
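To check which files a pattern list like this would drop, here is a small sketch that approximates the matching. It is not KiloCode's matcher. It strips the leading `**/` and trailing `/**` from each pattern and tests the remainder against every path component, which is close enough for patterns of this shape but not a full glob implementation.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

EXCLUSIONS = [
    "**/node_modules/**",
    "**/dist/**",
    "**/.git/**",
    "**/tests/**",
    "**/__pycache__/**",
    "**/venv/**",
    "**/.env*",
]


def is_excluded(path: str, patterns: list[str] = EXCLUSIONS) -> bool:
    """Approximate check: does any path component match the core of any pattern?"""
    components = PurePosixPath(path).parts
    for pattern in patterns:
        # "**/dist/**" -> "dist", "**/.env*" -> ".env*"
        core = pattern.removeprefix("**/").removesuffix("/**")
        if any(fnmatchcase(component, core) for component in components):
            return True
    return False


candidates = [
    "src/auth/login.ts",
    "node_modules/react/index.js",
    "backend/__pycache__/app.cpython-312.pyc",
    ".env.local",
]
kept = [path for path in candidates if not is_excluded(path)]
print(kept)
```

Running a real file listing through a check like this before a long session is a quick way to spot directories that still leak into context.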

3. Close Unrelated Tabs Before Complex Tasks

This is a workflow change. Before asking the AI to perform complex refactoring or debugging, I close tabs that aren’t relevant to the task.

4. Use Focused Prompts

Instead of a vague request like “fix the authentication issue,” I now specify exactly what I want:

# Bad: Vague, pulls in too much context
"Fix the authentication issue"

# Good: Focused, clear intent
"In src/auth/login.ts, the handleLogin function fails
when the user object is null. Add a null check before
accessing user.email on line 47."

Why This Matters Beyond Tool Calls

Context pollution affects more than just tool calls:

Token Costs

Every irrelevant token you include costs money. At scale, this adds up significantly.

Output Quality

Polluted context leads to degraded outputs across the board—worse explanations, less accurate code suggestions, and more hallucinations.

Reliability

Inconsistent results damage trust in AI tools. When the same prompt produces different results based on what tabs you have open, developers lose confidence.

Productivity

Failed tool calls require manual intervention. Each failure costs time and mental energy.

Common Mistakes

I made these mistakes before understanding context pollution:

  1. Keeping many tabs open “just in case” - This is the worst offender. Each open tab potentially adds thousands of tokens to context.

  2. Not configuring exclusions - Most harnesses have reasonable defaults, but adding project-specific exclusions helps significantly.

  3. Sending entire codebases - Some developers dump their whole project into context for “comprehensive” analysis. This rarely works well.

  4. Assuming more context = better results - This is the key misconception to unlearn. More relevant context helps; more irrelevant context hurts.

How to Diagnose Context Pollution

If you suspect context pollution is causing your tool call failures:

  1. Check your harness settings - Look for max context tokens, included files, and exclusion rules.

  2. Count your open tabs - If you have 20+ files open, that’s likely part of the problem.

  3. Review the context sent - Many tools let you inspect what context is being sent to the model.

  4. Test with minimal context - Close all tabs except the file you’re working on. If tool calls work, you’ve confirmed context pollution.

  5. Compare across models - Send the same context to two models. If Kimi fails where Claude succeeds, the context is probably noisy enough that only the more sensitive model trips over it.
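The minimal-context test from step 4 is more convincing when it is repeated a few times rather than judged from a single run. Here is a sketch of that comparison. The `run_once` callable is hypothetical: it stands for whatever sends the prompt through your harness and reports whether the resulting tool call was valid.

```python
from typing import Callable


def success_rate(run_once: Callable[[str], bool], context: str, trials: int = 10) -> float:
    """Run the same request `trials` times and return the fraction of valid tool calls."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = sum(1 for _ in range(trials) if run_once(context))
    return successes / trials


def compare_contexts(
    run_once: Callable[[str], bool],
    full_context: str,
    minimal_context: str,
    trials: int = 10,
) -> dict[str, float]:
    """Compare tool-call success with everything open versus a single file."""
    full = success_rate(run_once, full_context, trials)
    minimal = success_rate(run_once, minimal_context, trials)
    return {"full": full, "minimal": minimal, "gap": minimal - full}


# Stand-in so the sketch runs on its own; it is NOT a measurement.
# Replace it with a real call into your harness.
def fake_run_once(context: str) -> bool:
    return len(context) < 1_000


result = compare_contexts(fake_run_once, "x" * 50_000, "x" * 500)
print(result)
```

A large positive gap with a real `run_once` would point toward context pollution, while a gap near zero suggests looking elsewhere.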

Why Attention Mechanisms Matter

Transformer models use attention to determine which parts of the input are most relevant to the current task. When context is polluted, the attention mechanism splits focus between:

  • The actual task at hand
  • Irrelevant code from other files
  • Outdated patterns from older contexts
  • Noise from large file trees

This dilution of attention causes models to make mistakes they wouldn’t make with clean, focused context.
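A toy calculation makes the dilution concrete. In a single softmax over attention scores, the weight on one relevant token shrinks as more distractor tokens compete with it, even when each distractor scores much lower. The scores below are invented, and real models have many heads and layers with learned scores, so treat this only as intuition.

```python
import math


def weight_on_relevant(relevant_score: float, distractor_score: float, n_distractors: int) -> float:
    """Softmax weight of one relevant token competing with n equally scored distractors."""
    relevant = math.exp(relevant_score)
    distractors = n_distractors * math.exp(distractor_score)
    return relevant / (relevant + distractors)


# Hypothetical scores: the relevant token scores 5.0, every distractor scores 0.0.
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} distractors -> weight {weight_on_relevant(5.0, 0.0, n):.3f}")
```

With these numbers, the weight on the relevant token falls from roughly 0.94 with 10 distractors to about 0.015 with 10,000, even though its own score never changed.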

Context Window vs. Effective Context

There’s a difference between how many tokens a model can accept (context window) and how many tokens it can effectively use (effective context). Research on long-context retrieval shows that models tend to perform best when relevant information appears at the beginning or end of the context and worse when it is buried in the middle of a long input.

The Future of Context Management

Some AI coding tools are beginning to implement smart context management:

  • Semantic search - Only include files semantically related to the current task
  • Relevance scoring - Rank files by relevance and filter low-scoring entries
  • Dynamic context - Adjust included context based on task complexity

Until these become standard, manual context management remains essential.
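To illustrate the relevance-scoring idea, here is a deliberately simple sketch that ranks files by keyword overlap with the prompt and drops low scorers. Keyword overlap is a crude stand-in for real semantic search (which would typically use embeddings), and the file contents and threshold are made up for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercased identifier-like words of three or more characters."""
    return {word.lower() for word in re.findall(r"[A-Za-z_]{3,}", text)}


def rank_files(prompt: str, files: dict[str, str], min_score: float = 0.1) -> list[tuple[str, float]]:
    """Score each file by the fraction of prompt words it contains; drop low scorers."""
    prompt_words = tokenize(prompt)
    if not prompt_words:
        return []
    scored = []
    for path, content in files.items():
        overlap = prompt_words & tokenize(path + " " + content)
        score = len(overlap) / len(prompt_words)
        if score >= min_score:
            scored.append((path, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical workspace contents.
files = {
    "src/auth/login.ts": "function handleLogin(user) { return user.email; }",
    "src/ui/theme.css": ".button { color: red; }",
}
prompt = "Add a null check in handleLogin before accessing user.email"
print(rank_files(prompt, files))
```

Here only the login file survives the filter, because it shares `handleLogin`, `user`, and `email` with the prompt while the stylesheet shares nothing.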

Summary

In this post, I explained why AI coding assistants fail tool calls when context windows contain too much irrelevant information. The key insight is that context pollution distracts the model’s attention mechanism, leading to failed tool calls, hallucinations, and poor outputs.

The fix is straightforward: limit what you include in context. Configure your harness to exclude irrelevant files, close unnecessary tabs before complex tasks, and use focused prompts that don’t require the model to sift through noise.

Models like Kimi K2.5 are particularly sensitive to context pollution, but all models benefit from clean context. The same model that fails with polluted context can perform excellently with a focused, relevant context window.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
