How Do You Configure Custom Subagents in AI Coding Tools?
The Problem
I wanted to configure specialized AI subagents for different coding tasks. A code reviewer with high reasoning, a quick assistant with fast responses, a test generator that knows my patterns.
I found OpenCode CLI’s documentation showing AGENTS.md configuration with reasoning_effort settings. Perfect, I thought.
Then I spent half a day trying to make it work. I found open GitHub PRs that would solve the feature. The documentation showed the configuration, but the feature wasn’t actually implemented yet.
This post explains what actually works today for configuring custom subagents in AI coding tools, what’s broken, and practical workarounds.
What Are Custom Subagents?
Custom subagents are specialized AI agents designed for specific tasks. Instead of one general-purpose AI assistant, you configure multiple agents with different strengths:
- Code Reviewer - High reasoning, catches security issues and patterns
- Test Generator - Knows your testing framework, generates comprehensive tests
- Quick Assistant - Fast responses for simple queries, uses cheaper models
- Architect - Deep thinking for system design decisions
The idea is compelling. Different tasks need different model strengths. Code review needs thoroughness. Quick questions need speed. Cost optimization matters when you’re making thousands of requests.
The promise: configure each subagent with the right model, the right reasoning level, and the right prompts. Then invoke them as needed.
The reality: the configuration features documented may not actually work.
OpenCode CLI: What the Documentation Shows
OpenCode CLI uses an AGENTS.md file for subagent configuration. Here’s what the documentation suggests:
# OpenCode AGENTS.md Configuration
## Code Reviewer Agentname: code-reviewermodel: claude-3-opusreasoning_effort: highsystem_prompt: | You are a meticulous code reviewer. Check for: - Security vulnerabilities - Performance issues - Code style consistency - Test coverage - Documentation completeness
## Test Generator Agentname: test-generatormodel: claude-3-haikureasoning_effort: lowsystem_prompt: | Generate comprehensive unit tests for the provided code. Focus on edge cases and error handling.
## Quick Assistant Agentname: quick-assistmodel: claude-3-haikureasoning_effort: lowmax_tokens: 1000system_prompt: | Provide quick, concise answers for simple coding questions. Keep responses under 100 words unless more detail is needed.This looks perfect. Model selection, reasoning effort, custom prompts - everything I wanted.
OpenCode CLI also provides visibility features that Claude Code lacks:
- See what each subagent is doing in real-time
- Token consumption monitoring per agent
- Agent selection transparency
But here’s what I discovered about reasoning_effort: it may not actually work.
The Reasoning Effort Configuration Problem
A Reddit user raised this exact question:
“How do you configure reasoning effort for a subagent? Spent half a day trying, got nowhere, found open PRs that would solve it. Are you sure you got that to work?”
I checked the GitHub issues. There are open PRs addressing reasoning effort configuration. The feature appears documented but not fully implemented.
This is the key insight: documentation and implementation don’t always match in AI coding tools. The tools are evolving rapidly, and features get documented before they’re production-ready.
What Actually Works in OpenCode CLI
Testing confirmed:
- AGENTS.md file parsing works
- Model selection per agent works
- Custom system prompts work
- Agent visibility features work
What’s uncertain:
reasoning_effortconfiguration- Per-agent token budget control
- Extended thinking toggle
The recommendation: test each feature locally before relying on it in production.
Claude Code: A Different Approach
Claude Code takes a different approach. It doesn’t have native AGENTS.md support (as of March 2026). Instead, you configure behavior through CLAUDE.md files.
The limitation: you can’t easily switch between different agent configurations. One CLAUDE.md applies to your current session.
The solution: build a custom wrapper using the Agent SDK.
Building a Custom Wrapper
One developer shared their approach:
“Claude Code as agent harness is very good… they have the agent SDK so you can use it and build your own wrapper around it… I built my own desktop/web wrapper around the agent SDK and added all the features I need including other models and custom personas and workspaces.”
Here’s a practical implementation:
from dataclasses import dataclassfrom typing import Literal, Optionalfrom enum import Enumimport anthropic
class ModelType(Enum): OPUS = "claude-3-opus" SONNET = "claude-3-sonnet" HAIKU = "claude-3-haiku"
@dataclassclass AgentPersona: name: str model: ModelType system_prompt: str temperature: float = 0.7 max_tokens: int = 4096 extended_thinking: bool = False
class CustomClaudeWrapper: def __init__(self, api_key: str): self.client = anthropic.Anthropic(api_key=api_key) self.personas = self._load_personas() self.current_persona = "default"
def _load_personas(self) -> dict[str, AgentPersona]: return { "reviewer": AgentPersona( name="Code Reviewer", model=ModelType.OPUS, system_prompt="""You are a meticulous code reviewer. Check for:- Security vulnerabilities (OWASP top 10)- Performance bottlenecks- Test coverage gaps- Documentation completeness- Code style consistency""", extended_thinking=True ), "architect": AgentPersona( name="System Architect", model=ModelType.OPUS, system_prompt="Design scalable systems with clear separation of concerns...", extended_thinking=True ), "quick": AgentPersona( name="Quick Assistant", model=ModelType.HAIKU, system_prompt="Brief, helpful responses. Keep it under 100 words.", max_tokens=500 ), "test-writer": AgentPersona( name="Test Generator", model=ModelType.SONNET, system_prompt="Write comprehensive tests with edge cases and error handling..." ) }
def chat(self, message: str, persona: Optional[str] = None): selected = self.personas.get(persona, self.personas["quick"])
params = { "model": selected.model.value, "system": selected.system_prompt, "messages": [{"role": "user", "content": message}], "max_tokens": selected.max_tokens, "temperature": selected.temperature }
# Extended thinking support if selected.extended_thinking: params["thinking"] = { "type": "enabled", "budget_tokens": 8000 }
return self.client.messages.create(**params)This wrapper gives you control over:
- Model selection per persona
- Extended thinking toggle
- Token budgets
- Custom system prompts
Managing Multiple CLAUDE.md Files
A simpler approach for Claude Code users: maintain multiple CLAUDE.md files and switch between them.
# Code Review ModeYou are a code reviewer. Always check:- Security vulnerabilities (OWASP top 10)- Performance bottlenecks- Test coverage- Documentation
Use extended thinking for complex reviews.# Quick Query ModeYou are a quick assistant. Provide concise answers.Keep responses under 100 words unless detail is requested.Switch between these files as needed. This isn’t elegant, but it works without custom code.
A Practical Reasoning Effort Workaround
Since reasoning_effort configuration is unreliable, use model selection as a proxy:
def get_model_for_task(task_complexity: str) -> dict: """ Use model selection to approximate reasoning effort control. Opus for complex tasks, Haiku for simple ones. """ configs = { "high": { "model": "claude-3-opus", "thinking": {"type": "enabled", "budget_tokens": 10000} }, "medium": { "model": "claude-3-sonnet", "thinking": {"type": "enabled", "budget_tokens": 5000} }, "low": { "model": "claude-3-haiku", "thinking": {"type": "disabled"} } } return configs[task_complexity]Extended thinking with budget tokens is a more reliable control than the uncertain reasoning_effort parameter.
Feature Comparison: What Actually Works
| Feature | OpenCode CLI | Claude Code |
|---|---|---|
| AGENTS.md Support | Yes | No |
| Per-Agent Model Selection | Works | Requires wrapper |
| Reasoning Effort Config | Uncertain | Requires wrapper |
| Agent Visibility | Transparent | Limited |
| Custom Personas | Via AGENTS.md | Via SDK wrapper |
| Token Monitoring | Sidebar display | Not visible |
| SDK Available | No | Yes (Agent SDK) |
| Custom Wrapper Feasible | N/A | Yes, recommended |
What’s confirmed working:
- OpenCode: AGENTS.md file parsing, model selection, agent visibility
- Claude Code: Agent SDK for building wrappers, extended thinking parameter
What’s uncertain:
- OpenCode: Reasoning effort configuration
- Claude Code: Per-agent configuration without wrapper
Common Pitfalls
1. Assuming Features Work Because They’re Documented
I spent hours debugging configuration that was documented but not implemented. The fix: test every configuration locally before relying on it.
# Test if reasoning effort actually affects output# Run same prompt with different settings, compare quality and timeopencode --agent reviewer "Review this code"opencode --agent quick "Review this code"2. Over-Engineering Subagents
Creating too many specialized agents adds complexity without value. Start with 2-3 core agents:
CORE_AGENTS = { "reviewer": {"model": "opus", "thinking": True}, "quick": {"model": "haiku", "thinking": False}, "architect": {"model": "opus", "thinking": True}}3. Ignoring Model Costs
Using Opus for everything gets expensive. Default to Haiku/Sonnet, escalate to Opus only when needed:
def route_to_model(task: str, complexity_hint: str = None): """ Route simple tasks to cheaper models. Only use Opus when explicitly needed. """ if complexity_hint == "complex": return "claude-3-opus"
# Heuristics for task complexity complex_keywords = ["architect", "security", "review", "debug"] if any(kw in task.lower() for kw in complex_keywords): return "claude-3-opus"
return "claude-3-haiku" # Default to cheapest4. Not Measuring Effectiveness
Custom agents may not actually improve output. A/B test configurations:
def track_agent_effectiveness(agent_name: str, task: str, outcome: str): """ Track whether specialized agents actually help. """ metrics = { "agent": agent_name, "task_type": classify_task(task), "outcome": outcome, # "helpful", "neutral", "unhelpful" "timestamp": datetime.now() } # Store and analyze over time5. Visibility Blind Spots
Can’t debug what you can’t see. OpenCode’s agent visibility helps you understand what each subagent is doing. Claude Code’s opacity makes this harder.
If using Claude Code, add logging to your wrapper:
def chat_with_logging(self, message: str, persona: str): config = self.personas[persona] logger.info(f"Invoking {persona} with model={config.model.value}") start = time.time()
result = self._chat(message, config)
duration = time.time() - start logger.info(f"{persona} completed in {duration:.2f}s, tokens={result.usage}")
return resultPractical Implementation Guide
For OpenCode CLI Users
- Create AGENTS.md in your project root
- Define agent configurations with model and prompts
- Test reasoning effort settings - verify they work before relying on them
- Monitor token usage via sidebar
- Iterate on prompts based on output quality
# Verify configuration is loadedopencode --list-agents
# Test specific agentopencode --agent code-reviewer "Review src/auth.py"For Claude Code Users
You have two paths:
Path 1: Accept Current Limitations
- Use CLAUDE.md for global behavior configuration
- Switch between multiple CLAUDE.md files for different tasks
- No per-agent model selection without custom wrapper
Path 2: Build a Wrapper
- Clone the Claude Agent SDK
- Implement persona system (shown above)
- Add extended thinking controls
- Create workspace management
- Optional: build desktop or web UI
The wrapper approach gives you full control but requires upfront investment.
Why This Matters
The promise of configurable subagents is compelling. Different tasks need different model strengths. Code review should be thorough. Quick queries should be fast. Costs should scale with complexity.
But current tooling has gaps. Documentation shows features that aren’t fully implemented. You might spend hours debugging configuration that simply doesn’t work yet.
The practical approach:
- Verify features work before building workflows around them
- Use model selection as a proxy for reasoning control
- Prefer visibility - you need to see what your agents are doing
- Start minimal - 2-3 well-tuned agents beat 10 poorly-configured ones
Summary
In this post, I showed how to configure custom subagents in AI coding tools, and what actually works versus what’s documented. OpenCode CLI offers AGENTS.md configuration with good visibility, but reasoning effort settings may not work yet. Claude Code requires building a custom wrapper with the Agent SDK for similar functionality.
The key insight is that documentation can be aspirational. The reasoning_effort parameter appears in docs but may not be functional. Always test configuration locally before relying on it.
For now, model selection (Opus vs Haiku) remains the most reliable way to control response quality. Extended thinking with budget tokens provides more control than uncertain reasoning effort parameters.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Reddit: My brief (and bad) experience with Claude Code after OpenCode block
- 👨💻 Anthropic Agent SDK Documentation
- 👨💻 OpenCode CLI GitHub
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments