Why Does My OpenClaw Main Agent Time Out After the Secondary Agent Completes?
The Problem
I was building a multi-agent system with OpenClaw where my main agent delegates tasks to a specialized secondary agent. Everything seemed to work fine until I noticed a strange pattern:

    [Main Agent] Delegating code review task to SecondaryAgent...
    [Secondary Agent] Starting task execution...
    [Secondary Agent] Task completed successfully. Result: All checks passed.
    [Main Agent] ERROR: Delegation timed out after 300 seconds
    [Main Agent] Falling back to execute task myself...

The secondary agent clearly finished its work, but the main agent still timed out and fell back to doing the task itself. This meant duplicated work and wasted resources.
Initial Investigation
I checked my openclaw.yaml configuration first:

    agents:
      main:
        type: orchestrator
        delegates_to:
          - code_reviewer
      code_reviewer:
        type: worker
        capabilities:
          - code_analysis
          - security_scan

Nothing special here. The timeout configuration was using defaults, which I later discovered was one of the root causes.
Understanding What Was Happening
I enabled debug logging to see what was actually going on:

    logging:
      level: debug
      format: detailed

The logs revealed the issue:

    [DEBUG] MainAgent -> SecondaryAgent: delegation request sent
    [DEBUG] SecondaryAgent: task started
    [DEBUG] SecondaryAgent: task completed, result stored
    [DEBUG] MainAgent: waiting for response (timeout: 300s)
    [DEBUG] MainAgent: no response received within timeout
    [WARN] MainAgent: falling back to self-execution

The secondary agent completed, but the main agent never received the completion signal. This pointed to a communication gap.
Root Causes I Discovered
1. Default Timeout Too Short
The default timeout of 300 seconds (5 minutes) was barely enough for my code review agent to complete complex analyses. When the secondary agent took 280-310 seconds, the outcome depended on which finished first: the task or the timeout.
2. Missing Completion Handler
I hadn’t defined what should happen when the secondary agent finishes:

    # WRONG: No completion handler
    agents:
      code_reviewer:
        type: worker
        # Missing: on_complete callback

Without a completion handler, the secondary agent stores its result but doesn’t notify the main agent.
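The difference between storing a result and notifying the waiter can be sketched in plain Python. This is a generic asyncio illustration, not OpenClaw's API: the `worker` and `delegate` functions, the shared `store` dict, and the scaled-down timings are all assumptions made for the example.

```python
import asyncio
from typing import Optional


async def worker(store: dict, done: Optional[asyncio.Event]) -> None:
    """Hypothetical secondary agent: stores its result, optionally notifies."""
    await asyncio.sleep(0.01)              # stand-in for the real work
    store["result"] = "All checks passed."
    if done is not None:
        done.set()                         # the "completion handler": wake the waiter


async def delegate(notify: bool, timeout: float = 0.1) -> str:
    """Hypothetical main agent: waits for a completion signal, not for the store."""
    store: dict = {}
    done = asyncio.Event()
    task = asyncio.create_task(worker(store, done if notify else None))
    try:
        await asyncio.wait_for(done.wait(), timeout)
        return f"received: {store['result']}"
    except asyncio.TimeoutError:
        # The result may already sit in the store, but nobody signalled it.
        return f"timed out (result stored: {'result' in store})"
    finally:
        await task


print(asyncio.run(delegate(notify=False)))  # timed out (result stored: True)
print(asyncio.run(delegate(notify=True)))   # received: All checks passed.
```

In the first call the work finishes and the result is stored, yet the waiter still times out, which is the same pattern the debug logs showed.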
3. Context Window Pressure
During complex tasks, both agents were consuming context window space rapidly. In my setup, communication between the agents appeared to become unreliable as this context pressure built up.
4. No Heartbeat Mechanism
Long-running tasks need heartbeats to confirm the secondary agent is still working. Without them, the main agent assumes the worst and triggers the timeout.
The Solution
I made several changes to fix this issue.
Increase Timeout Values
First, I adjusted the timeout settings:

    delegation:
      default_timeout_seconds: 900   # 15 minutes instead of 5
      max_timeout_seconds: 3600      # Allow up to 1 hour for complex tasks

    agents:
      main:
        type: orchestrator
        delegates_to:
          - code_reviewer
        delegation_config:
          timeout_per_capability:
            code_analysis: 600
            security_scan: 900

This gives complex tasks enough breathing room.
Add Completion Handlers
Next, I added explicit completion handlers:

    agents:
      code_reviewer:
        type: worker
        capabilities:
          - code_analysis
          - security_scan
        on_complete:
          action: notify_delegator
          include_result: true
          cleanup_context: true

The on_complete callback ensures the main agent receives the result immediately.
Enable Heartbeat Mechanism
For long-running tasks, I enabled heartbeats:

    delegation:
      default_timeout_seconds: 900
      heartbeat_enabled: true
      heartbeat_interval_seconds: 30
      heartbeat_timeout_seconds: 120

Now the main agent receives a “still working” signal every 30 seconds. If it misses 4 consecutive heartbeats (120 seconds), it knows something is wrong.
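To make the idea concrete, here is a minimal sketch of a heartbeat-aware wait, assuming the agents exchange messages over a queue. It does not reflect how OpenClaw implements heartbeats internally; the `worker` and `delegate` functions and the message tuples are hypothetical, and the timings are scaled down from 30 s / 120 s to 10 ms / 100 ms so the example runs quickly.

```python
import asyncio


async def worker(queue: asyncio.Queue, steps: int, step_s: float, beats: bool) -> None:
    """Hypothetical secondary agent that optionally emits heartbeats while working."""
    for _ in range(steps):
        await asyncio.sleep(step_s)        # one slice of the long-running task
        if beats:
            await queue.put(("heartbeat", None))
    await queue.put(("result", "All checks passed."))


async def delegate(beats: bool, heartbeat_timeout_s: float = 0.1) -> str:
    """Hypothetical main agent: every message received restarts the silence window."""
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue, steps=30, step_s=0.01, beats=beats))
    try:
        while True:
            # Time out only if nothing at all arrives within the window.
            kind, payload = await asyncio.wait_for(queue.get(), heartbeat_timeout_s)
            if kind == "result":
                return payload
    except asyncio.TimeoutError:
        return "timed out"
    finally:
        task.cancel()                      # no-op if the worker already finished


print(asyncio.run(delegate(beats=True)))   # All checks passed.
print(asyncio.run(delegate(beats=False)))  # timed out
```

The task takes about three times longer than the silence window in both runs. Only the run with heartbeats survives, because the window measures silence rather than total duration.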
Monitor Context Window Usage
I added context monitoring to catch memory pressure early:

    agents:
      code_reviewer:
        type: worker
        context_management:
          max_tokens: 8000
          warning_threshold: 0.75
          on_overflow:
            action: compact_history
            preserve_recent: 10

When context usage hits 75%, the agent automatically compacts its history.
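As a rough illustration of what threshold-based compaction could look like, the sketch below folds older messages into a placeholder once usage reaches 75% of the budget and keeps the 10 most recent messages. The `count_tokens` and `compact_history` functions are hypothetical, the whitespace-based token count is a crude stand-in for a real tokenizer, and OpenClaw's actual compaction strategy may differ.

```python
def count_tokens(message: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(message.split())


def compact_history(history, max_tokens=8000, warning_threshold=0.75, preserve_recent=10):
    """Return a copy of history if usage is below the threshold; otherwise
    replace the older messages with one placeholder and keep the recent ones."""
    used = sum(count_tokens(m) for m in history)
    if used < max_tokens * warning_threshold or len(history) <= preserve_recent:
        return list(history)
    split = len(history) - preserve_recent
    older, recent = history[:split], history[split:]
    summary = f"[compacted {len(older)} earlier messages]"
    return [summary] + recent


# 25 messages of 302 "tokens" each = 7550 tokens, above 75% of 8000 (6000).
history = [f"message {i} " + "word " * 300 for i in range(25)]
compacted = compact_history(history)
print(len(history), "->", len(compacted))  # 25 -> 11
```

A real implementation would summarize the older messages instead of dropping their content, but the trigger and the "keep the last N" rule are the same.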
Why This Matters
Before these changes, I was seeing:
- Duplicated work: Main agent re-doing completed tasks
- Unpredictable behavior: Sometimes it worked, sometimes it didn’t
- Failed handoffs: Tasks that should take minutes sometimes took hours
After fixing the configuration:

    [Main Agent] Delegating code review task to SecondaryAgent...
    [Secondary Agent] Starting task execution...
    [Heartbeat] Agent still working... (30s)
    [Heartbeat] Agent still working... (60s)
    [Secondary Agent] Task completed successfully. Result: All checks passed.
    [Main Agent] Received result from SecondaryAgent. Proceeding...

Common Mistakes to Avoid
1. Using Default Timeout for Complex Tasks
Don’t assume defaults work for your use case. Measure how long your secondary agents actually take:

    import time

    start = time.time()
    result = secondary_agent.execute(complex_task)
    duration = time.time() - start
    print(f"Task took {duration} seconds")

Set your timeout to at least 2x the measured duration.
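A single run can be misleading, so one option is to collect several measurements and derive the timeout from the slowest one. The helper below is a sketch of that rule of thumb; `recommend_timeout`, its parameters, and the sample durations are made up for illustration.

```python
import math


def recommend_timeout(durations_s, safety_factor=2.0, round_to_s=60):
    """Suggest a timeout: the slowest observed run times a safety factor,
    rounded up to a whole multiple of round_to_s seconds."""
    if not durations_s:
        raise ValueError("need at least one measured duration")
    raw = max(durations_s) * safety_factor
    return math.ceil(raw / round_to_s) * round_to_s


# Hypothetical measurements in the 280-310 second range described earlier.
measured = [282.4, 295.1, 309.8, 288.0]
print(recommend_timeout(measured))  # 660
```

With runs this close to 300 seconds, the 2x rule lands at 11 minutes, well clear of the 5-minute default.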
2. Not Defining Completion Callbacks
The on_complete handler is not optional for delegation:

    # CORRECT: Explicit completion handling
    agents:
      worker:
        on_complete:
          action: notify_delegator
          include_result: true

3. Ignoring Memory Pressure
Context window exhaustion silently breaks communication. Monitor it:

    agents:
      worker:
        context_management:
          monitoring_enabled: true
          alert_threshold: 0.8

4. No Error Handling for Timeouts
Always handle timeouts gracefully:

    delegation:
      on_timeout:
        action: fallback_or_retry
        max_retries: 2
        retry_delay_seconds: 30

Related Knowledge
If you’re working with multi-agent systems, you might also encounter:
- Agent discovery issues: How agents find each other in the system
- State synchronization: Keeping agent states consistent
- Load balancing: Distributing work across multiple secondary agents
- Fault tolerance: What happens when an agent crashes mid-task
The OpenClaw documentation has a dedicated section on Advanced Delegation Patterns that covers these scenarios.
Final Words + More Resources
I wrote this article to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can contact me by email.