What Is Context Anxiety in AI Agents? (Causes & Fixes)

Mar 30, 2026

The Problem

Last month, I was running a long documentation generation task with Claude Sonnet 4.5. The agent had completed about 70% of the work when it suddenly announced:

I've completed the main sections of the documentation. Let me wrap up here with a summary...

But the task wasn’t done. Three major sections were missing. The agent had truncated its own work because it thought it was running out of context space.

When I checked the actual token usage:

Input tokens: 45,000
Output tokens: 12,000
Total: 57,000 out of 200,000 available
Usage: 28.5%
Remaining capacity: 143,000 tokens

The agent had 71% of its context window still available, yet it panicked and stopped working.
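The arithmetic behind those numbers is easy to automate. Here is a minimal sketch, assuming you can read input and output token counts from your API responses. `usage_report` is a hypothetical helper name of mine, not part of any SDK:

```python
def usage_report(input_tokens: int, output_tokens: int, context_window: int) -> dict:
    """Summarize how much of the context window a session has consumed."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    total = input_tokens + output_tokens
    return {
        "total": total,
        "usage_pct": round(100 * total / context_window, 1),
        "remaining": context_window - total,
    }


# The numbers from the run described above
report = usage_report(input_tokens=45_000, output_tokens=12_000, context_window=200_000)
print(report)  # {'total': 57000, 'usage_pct': 28.5, 'remaining': 143000}
```

Logging a report like this next to every agent turn makes it obvious when a "wrap up" message arrives long before the window is actually full.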

This is context anxiety.

What Is Context Anxiety?

Context anxiety is a phenomenon where AI agents prematurely wrap up their work as they approach what they believe is their context limit, even when significant capacity remains.

The agent behaves like someone who stops writing an essay halfway through because they’re worried about running out of paper, even when they have plenty of pages left.

Here’s what happens:

┌─────────────────────────────────────────────────────┐
│  Agent Task Progress                                │
├─────────────────────────────────────────────────────┤
│                                                     │
│  [====40%====]     Agent perceives: "I'm full!"    │
│                                                     │
│  Actual capacity: [====40%====.........60%.......] │
│                                                     │
│  Agent stops at 40% with 60% remaining             │
│                                                     │
└─────────────────────────────────────────────────────┘

From the Reddit discussion:

"Models tend to lose coherence on lengthy tasks as the context window fills"

"Some models also exhibit 'context anxiety,' in which they begin wrapping up
work prematurely as they approach what they believe is their context limit"

"Claude Sonnet 4.5 exhibited context anxiety strongly enough that compaction
alone wasn't sufficient"

Why Does This Happen?

The root causes are fascinating:

1. Self-Monitoring Uncertainty

LLMs don’t have direct access to their context window size. They estimate remaining capacity based on implicit cues, and these estimates can be overly conservative.

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Actual     │     │   Agent's    │     │   Anxiety    │
│   Context    │ ≠   │   Estimate   │ →   │   Triggered  │
│   (200k)     │     │   (~60k)     │     │   Early      │
└──────────────┘     └──────────────┘     └──────────────┘

The agent can’t see its own context window. It guesses based on conversation length, and those guesses are often wrong.

2. Training on Truncation

Models trained on truncated conversations may have learned to “finish early” as a survival strategy. If the training data often showed conversations ending abruptly at a context limit, the model may have learned to wrap up proactively before hitting that wall.

3. No Explicit Context Awareness

LLMs operate without direct knowledge of:

- Their maximum context window size
- Current token count
- How much space remains
- Whether truncation is imminent

They’re flying blind, relying on intuition rather than instrumentation.
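One way to swap intuition for instrumentation is to have the harness count tokens and tell the agent where it stands. This is a minimal sketch, assuming your harness already tracks token usage. `budget_notice` and its wording are hypothetical, and whether a notice like this reduces early wrap-ups is something to test against your own model:

```python
def budget_notice(used_tokens: int, max_tokens: int) -> str:
    """Build a short status line the harness can append to the next user turn."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    remaining = max(max_tokens - used_tokens, 0)
    pct_left = 100 * remaining // max_tokens  # integer percent, rounded down
    return (
        f"[Context status] {used_tokens:,} of {max_tokens:,} tokens used; "
        f"{remaining:,} remaining ({pct_left}% free). "
        "Do not wrap up early on account of context length."
    )


notice = budget_notice(57_000, 200_000)
print(notice)
```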

4. Long Task Complexity

Extended workflows trigger early termination instincts. When an agent has been working for a while, it starts looking for excuses to conclude, even if the task isn’t complete.

How Different Models Behave

Not all models exhibit context anxiety equally:

┌────────────────────┬──────────────────┬─────────────────────────────┐
│ Model              │ Anxiety Level    │ Recommended Strategy        │
├────────────────────┼──────────────────┼─────────────────────────────┤
│ Claude Sonnet 4.5  │ High             │ Context resets essential    │
│ Claude Opus 4.5    │ Low              │ Continuous session possible │
│ GPT-4 variants     │ Moderate         │ Compaction often sufficient │
└────────────────────┴──────────────────┴─────────────────────────────┘

From the Anthropic blog:

"Opus 4.5 largely removed that behavior on its own"

"For Sonnet 4.5, compaction alone wasn't sufficient, context resets essential"

"Opus 4.5 removed context anxiety behavior, could use continuous session"

This is crucial: model selection affects your architecture. If you’re using Sonnet 4.5, you must plan for context resets. With Opus 4.5, you might not need them.

Two Solutions: Reset vs Compaction

Context Compaction

Compaction summarizes and compresses existing context:

┌─────────────────────────────────────────────────────────┐
│  Before Compaction                                      │
│  [User message 1][Response 1][User 2][Response 2]...    │
│  ~80,000 tokens                                         │
├─────────────────────────────────────────────────────────┤
│  After Compaction                                       │
│  [Summary of previous conversation]                     │
│  ~24,000 tokens (30% of original)                      │
├─────────────────────────────────────────────────────────┤
│  Benefits                                               │
│  - Preserves conversation continuity                    │
│  - Less orchestration overhead                          │
│  - Works for models with low anxiety                    │
├─────────────────────────────────────────────────────────┤
│  Limitations                                            │
│  - May lose specific details                            │
│  - Not sufficient for high-anxiety models               │
└─────────────────────────────────────────────────────────┘

Context Resets

A reset gives the agent a completely fresh context window, with a structured handoff carrying state across:

┌─────────────────────────────────────────────────────────┐
│  Session 1 (Before Reset)                               │
│  [Working...] → [Create Handoff Artifact] → [Reset]     │
├─────────────────────────────────────────────────────────┤
│  Handoff Artifact                                       │
│  {                                                      │
│    session_id: "run-001",                               │
│    completed: ["research", "outline"],                  │
│    in_progress: "drafting",                             │
│    next_actions: ["edit", "publish"]                    │
│  }                                                      │
├─────────────────────────────────────────────────────────┤
│  Session 2 (After Reset)                                │
│  [Fresh context] + [Load Handoff] → [Continue work]     │
├─────────────────────────────────────────────────────────┤
│  Benefits                                               │
│  - Clean slate eliminates anxiety                       │
│  - Essential for Sonnet 4.5                             │
├─────────────────────────────────────────────────────────┤
│  Limitations                                            │
│  - Requires orchestration                               │
│  - Handoff must be carefully structured                 │
└─────────────────────────────────────────────────────────┘

Implementing Structured Handoffs

The key to successful resets is the handoff artifact. This carries state between sessions:

{
  "session_id": "agent-run-2026-03-30-001",
  "checkpoint_number": 3,
  "task_status": {
    "completed": ["research", "outline", "draft-intro"],
    "in_progress": "draft-body",
    "pending": ["edit", "review", "publish"]
  },
  "context_summary": {
    "topic": "context anxiety in AI agents",
    "sources_used": ["anthropic-blog", "reddit-thread"],
    "key_decisions": ["focus on prevention", "include model comparison"]
  },
  "output_so_far": {
    "files_modified": ["draft.md"],
    "tokens_generated": 4500,
    "last_position": "section-3-paragraph-2"
  },
  "next_actions": [
    "Continue drafting section 4",
    "Add code examples",
    "Run final review"
  ]
}
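An artifact like this is only useful if it survives the reset intact, so I find it worth validating on both sides. A minimal sketch using the standard `json` module. The required keys are taken from the example above, and `dump_handoff` and `parse_handoff` are hypothetical names:

```python
import json

# Top-level keys every handoff must carry (taken from the example artifact).
REQUIRED_KEYS = {"session_id", "task_status", "context_summary", "output_so_far", "next_actions"}


def _check(handoff: dict) -> None:
    """Raise if the artifact is missing any required top-level key."""
    missing = REQUIRED_KEYS - set(handoff)
    if missing:
        raise ValueError(f"handoff is missing keys: {sorted(missing)}")


def dump_handoff(handoff: dict) -> str:
    """Validate and serialize the artifact before the session is reset."""
    _check(handoff)
    return json.dumps(handoff, indent=2)


def parse_handoff(text: str) -> dict:
    """Parse and validate the artifact at the start of the next session."""
    handoff = json.loads(text)
    _check(handoff)
    return handoff
```

Failing loudly on a missing key is deliberate: a reset that silently drops `next_actions` is harder to debug than one that refuses to start.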

Here’s my implementation for managing context resets:

class ContextResetManager:
    def __init__(self, max_tokens: int, anxiety_threshold: float = 0.7):
        self.max_tokens = max_tokens
        self.anxiety_threshold = anxiety_threshold  # 70% of context
        self.checkpoints = []

    def should_reset(self, current_tokens: int) -> bool:
        """Determine if context anxiety requires a reset."""
        usage_ratio = current_tokens / self.max_tokens
        return usage_ratio >= self.anxiety_threshold

    def create_handoff(self, agent_state: dict) -> dict:
        """Generate structured handoff artifact before reset."""
        return {
            "session_id": agent_state["session_id"],
            "task_status": agent_state["tasks"],
            "context_summary": self._summarize_context(agent_state),
            "output_so_far": agent_state["outputs"],
            "next_actions": agent_state["pending_actions"]
        }

    def resume_from_handoff(self, handoff: dict) -> dict:
        """Initialize new session from handoff artifact."""
        return {
            "session_id": f"{handoff['session_id']}-continued",
            "initial_context": handoff["context_summary"],
            "resume_position": handoff["output_so_far"]["last_position"],
            "tasks": handoff["task_status"]
        }

    def _summarize_context(self, state: dict) -> dict:
        """Extract key information for handoff."""
        # Key names match the handoff artifact shown above
        return {
            "topic": state.get("topic"),
            "sources_used": state.get("sources_used", []),
            "key_decisions": state.get("key_decisions", [])
        }
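The new session still has to be told what the handoff says. One option is to render the artifact as the first message after the reset. This is a sketch under the assumption that the handoff uses the field names from the JSON example earlier. `build_resume_prompt` is a hypothetical helper, not part of the class above:

```python
def build_resume_prompt(handoff: dict) -> str:
    """Render a handoff artifact as the opening message of the next session."""
    status = handoff["task_status"]
    lines = [
        f"You are resuming session {handoff['session_id']}.",
        f"Completed: {', '.join(status['completed']) or 'nothing yet'}",
        f"In progress: {status['in_progress']}",
        f"Pending: {', '.join(status['pending']) or 'nothing'}",
        f"Resume at: {handoff['output_so_far']['last_position']}",
        "Next actions:",
    ]
    # Number the next actions so the agent can refer back to them
    lines += [f"{i}. {action}" for i, action in enumerate(handoff["next_actions"], start=1)]
    return "\n".join(lines)


handoff = {
    "session_id": "agent-run-2026-03-30-001",
    "task_status": {
        "completed": ["research", "outline", "draft-intro"],
        "in_progress": "draft-body",
        "pending": ["edit", "review", "publish"],
    },
    "output_so_far": {"last_position": "section-3-paragraph-2"},
    "next_actions": ["Continue drafting section 4", "Add code examples", "Run final review"],
}
prompt = build_resume_prompt(handoff)
print(prompt)
```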

Compaction Implementation

For models with low anxiety, compaction is often sufficient:

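def compact_context(messages: list, target_ratio: float = 0.3) -> list:
    """
    Compress conversation history while preserving key information.
    Caps the summary at target_ratio of the original history size,
    measured in characters as a rough stand-in for tokens.
    """
    if not messages:
        return []

    system_prompt = messages[0] if messages[0]["role"] == "system" else None
    history = messages[1:] if system_prompt else messages

    # Summarize conversation history, capped at the size budget
    original_chars = sum(len(msg["content"]) for msg in history)
    budget = int(original_chars * target_ratio)
    summary = summarize_messages(history)[:budget]

    # Rebuild with compacted context
    compacted = [system_prompt] if system_prompt else []
    compacted.append({
        "role": "user",
        "content": f"[Previous context summary]\n{summary}\n\n[Continue from here]"
    })

    return compacted


def summarize_messages(messages: list) -> str:
    """Generate concise summary of conversation."""
    # Extract key points (decisions, outputs, progress) from each response
    key_points = []
    for msg in messages:
        if msg["role"] == "assistant":
            key_points.extend(extract_key_points(msg["content"]))

    return format_summary(key_points)


def extract_key_points(content: str) -> list:
    """Placeholder: keep the first line of each response.
    Swap in an LLM-based summarizer for real use."""
    first_line = content.strip().split("\n", 1)[0]
    return [first_line] if first_line else []


def format_summary(key_points: list) -> str:
    """Join key points into a bulleted summary."""
    return "\n".join(f"- {point}" for point in key_points)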

My Prevention Checklist

After experiencing context anxiety repeatedly, I developed this checklist:

1. Choose the Right Model

- Long tasks with Sonnet 4.5? → Plan resets from the start
- Critical tasks? → Use Opus 4.5 for lower anxiety risk
- Budget constraints? → Compaction + resets with Sonnet

2. Implement Early Detection

Watch for signs of premature termination:

- Agent starts summarizing unexpectedly
- "Let me wrap up here" appears mid-task
- Output quality degrades as context fills
- Agent asks to conclude before completion criteria met
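The first two signs can be caught mechanically. This is a rough sketch. The phrase list is my own guess at common wrap-up wording and should be extended from your own transcripts, and `looks_like_early_wrap_up` is a hypothetical name:

```python
import re

# Hypothetical phrase list; tune it against real agent output.
WRAP_UP_PATTERNS = [
    r"\blet me wrap up\b",
    r"\bto summarize\b",
    r"\bi've (covered|completed) the main\b",
    r"\bin the interest of (space|brevity)\b",
]


def looks_like_early_wrap_up(output: str, task_complete: bool) -> bool:
    """Flag wrap-up language that appears before the task is actually done."""
    if task_complete:
        return False  # wrapping up is fine once the completion criteria are met
    text = output.lower().replace("’", "'")  # normalize curly apostrophes
    return any(re.search(pattern, text) for pattern in WRAP_UP_PATTERNS)


message = "I've completed the main sections of the documentation. Let me wrap up here with a summary..."
print(looks_like_early_wrap_up(message, task_complete=False))  # True
```

A match is a prompt for the harness to check progress, not proof of anxiety on its own. The `task_complete` flag has to come from your own completion check.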

3. Set Explicit Completion Criteria

Tell the agent exactly what “done” means:

Task: Write documentation for API endpoints

DONE means:
- All 15 endpoints documented
- Each endpoint has: description, parameters, response schema
- Examples included for each endpoint
- README updated with overview

NOT done means:
- "I've covered the main endpoints" (missing 5)
- "Here's a summary" (sections incomplete)
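Criteria this explicit can also be checked by the harness instead of trusting the agent's own "done". A minimal sketch, assuming the generated docs are available as a mapping from endpoint name to text. The substring check is deliberately crude, and `unmet_criteria` is a hypothetical helper:

```python
# Sections every endpoint page must mention (from the DONE list above).
REQUIRED_FIELDS = ("description", "parameters", "response schema", "example")


def unmet_criteria(docs: dict, expected_endpoints: list) -> list:
    """Return reasons the task is not done; an empty list means done."""
    problems = []
    for endpoint in expected_endpoints:
        text = docs.get(endpoint)
        if text is None:
            problems.append(f"{endpoint}: not documented")
            continue
        lowered = text.lower()
        for field in REQUIRED_FIELDS:
            if field not in lowered:
                problems.append(f"{endpoint}: missing {field}")
    return problems


docs = {"GET /users": "Description: list users. Parameters: none. Response schema: array. Example: ..."}
print(unmet_criteria(docs, ["GET /users", "POST /users"]))
# ['POST /users: not documented']
```

If the list is non-empty when the agent announces it is finished, the harness can feed the list back as the next instruction.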

4. Structure Your Workflows

Break long tasks into checkpointed phases:

Phase 1: Research (checkpoint after)
Phase 2: Outline (checkpoint after)
Phase 3: Draft sections 1-5 (reset + handoff)
Phase 4: Draft sections 6-10 (reset + handoff)
Phase 5: Review and finalize
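The phase plan above can be expressed as data, with the orchestrator applying the boundary action after each phase. This is a sketch only: `Phase`, `run_phases`, and the three callbacks are hypothetical, and the stand-in callbacks just record what would happen rather than calling a model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    after: str  # "checkpoint", "reset", or "finish"


PHASES = [
    Phase("Research", "checkpoint"),
    Phase("Outline", "checkpoint"),
    Phase("Draft sections 1-5", "reset"),
    Phase("Draft sections 6-10", "reset"),
    Phase("Review and finalize", "finish"),
]


def run_phases(phases, run_phase, on_checkpoint, on_reset):
    """Run each phase in order and apply its boundary action."""
    state = {}
    for phase in phases:
        state = run_phase(phase.name, state)
        if phase.after == "checkpoint":
            on_checkpoint(phase.name, state)
        elif phase.after == "reset":
            state = on_reset(phase.name, state)  # new session starts from the handoff
    return state


log = []


def run_phase(name, state):
    # Stand-in for real agent work: record that the phase ran.
    return {**state, "completed": state.get("completed", []) + [name]}


def on_checkpoint(name, state):
    log.append(f"checkpoint after {name}")


def on_reset(name, state):
    log.append(f"reset after {name}")
    # Fresh session: carry over only what the handoff needs.
    return {"completed": list(state["completed"])}


final = run_phases(PHASES, run_phase, on_checkpoint, on_reset)
print(log)
```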

5. Combine Compaction + Resets

Maximum reliability comes from using both:

class HybridContextManager:
    def __init__(self, model_type: str):
        self.model_type = model_type
        self.reset_threshold = {
            "sonnet": 0.6,  # Reset at 60% for Sonnet
            "opus": 0.85   # Reset at 85% for Opus
        }

    def manage_context(self, current_tokens: int, max_tokens: int):
        usage = current_tokens / max_tokens
        threshold = self.reset_threshold.get(self.model_type, 0.7)

        if usage >= threshold:
            return "reset"
        elif usage >= threshold * 0.8:
            return "compact"
        else:
            return "continue"
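To see what those ratios mean in practice, it helps to convert them to token counts. The sketch below uses integer percentages to avoid floating-point rounding and assumes a 200,000-token window, as in my example run. `thresholds_in_tokens` is a hypothetical helper:

```python
def thresholds_in_tokens(max_tokens: int, reset_pct: int) -> dict:
    """Convert a reset threshold (in percent) into absolute token counts."""
    reset_at = max_tokens * reset_pct // 100
    compact_at = reset_at * 80 // 100  # compaction starts at 80% of the reset point
    return {"compact_at": compact_at, "reset_at": reset_at}


for model, pct in {"sonnet": 60, "opus": 85}.items():
    print(model, thresholds_in_tokens(200_000, pct))
# sonnet {'compact_at': 96000, 'reset_at': 120000}
# opus {'compact_at': 136000, 'reset_at': 170000}
```

With these settings a Sonnet session compacts at 96,000 tokens and resets at 120,000, while an Opus session runs to 136,000 before the first compaction.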

Common Mistakes I Made

Mistake 1: Ignoring Token Counts

I assumed the agent knew its own context state:

"The agent will stop when it's actually full, not when it thinks it might be full."

Reality: Agents guess conservatively and stop early.

Mistake 2: Using Compaction Alone

With Sonnet 4.5, I tried:

1. Let context fill up
2. Compact at 70%
3. Continue in same session
4. Agent still exhibited anxiety, truncated work

The Anthropic blog confirmed this: “For Sonnet 4.5, compaction alone wasn’t sufficient.”

Mistake 3: No Structured Handoffs

My early resets lost state:

Session 1: [Research] [Draft intro] [Draft section 2] → Reset
Session 2: "What was I researching? What's the topic?" → Lost context

The handoff artifact is essential. Without it, each reset is starting from zero.

Why This Matters

Context anxiety affects real work:

- Incomplete outputs: Agents truncate before finishing
- Lost progress: Without handoffs, resets lose state
- Wasted tokens: Premature termination wastes capacity
- Reliability issues: Unpredictable stopping points
- User frustration: Tasks marked “complete” that aren’t done

As agent systems become more common, context anxiety will become a critical engineering concern. Understanding it now prevents costly failures later.

Summary

Context anxiety is when AI agents prematurely wrap up work because they believe they’re approaching their context limit, even when significant capacity remains. It’s caused by self-monitoring uncertainty, training artifacts, and lack of explicit context awareness.

Key points:

- Sonnet 4.5 exhibits high context anxiety, Opus 4.5 exhibits low
- Compaction helps but isn’t sufficient for high-anxiety models
- Context resets with structured handoffs are the reliable fix
- Handoff artifacts must carry: task status, outputs, next actions
- Prevention requires: model selection, early detection, explicit completion criteria

The emergence of models like Opus 4.5 with minimal context anxiety suggests this problem may become less significant as architectures improve. For now, implementing reset workflows with proper state management remains the most reliable strategy.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Anthropic Blog - Claude Code Harness
👨‍💻 Reddit Discussion - Context Anxiety in LLM Agents
👨‍💻 Claude Documentation - Context Windows

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!