Skip to content

How Do You Configure Custom Subagents in AI Coding Tools?

The Problem

I wanted to configure specialized AI subagents for different coding tasks. A code reviewer with high reasoning, a quick assistant with fast responses, a test generator that knows my patterns.

I found OpenCode CLI’s documentation showing AGENTS.md configuration with reasoning_effort settings. Perfect, I thought.

Then I spent half a day trying to make it work. I found open GitHub PRs that would solve the feature. The documentation showed the configuration, but the feature wasn’t actually implemented yet.

This post explains what actually works today for configuring custom subagents in AI coding tools, what’s broken, and practical workarounds.

What Are Custom Subagents?

Custom subagents are specialized AI agents designed for specific tasks. Instead of one general-purpose AI assistant, you configure multiple agents with different strengths:

  • Code Reviewer - High reasoning, catches security issues and patterns
  • Test Generator - Knows your testing framework, generates comprehensive tests
  • Quick Assistant - Fast responses for simple queries, uses cheaper models
  • Architect - Deep thinking for system design decisions

The idea is compelling. Different tasks need different model strengths. Code review needs thoroughness. Quick questions need speed. Cost optimization matters when you’re making thousands of requests.

The promise: configure each subagent with the right model, the right reasoning level, and the right prompts. Then invoke them as needed.

The reality: the configuration features documented may not actually work.

OpenCode CLI: What the Documentation Shows

OpenCode CLI uses an AGENTS.md file for subagent configuration. Here’s what the documentation suggests:

AGENTS.md
# OpenCode AGENTS.md Configuration
## Code Reviewer Agent
name: code-reviewer
model: claude-3-opus
reasoning_effort: high
system_prompt: |
You are a meticulous code reviewer. Check for:
- Security vulnerabilities
- Performance issues
- Code style consistency
- Test coverage
- Documentation completeness
## Test Generator Agent
name: test-generator
model: claude-3-haiku
reasoning_effort: low
system_prompt: |
Generate comprehensive unit tests for the provided code.
Focus on edge cases and error handling.
## Quick Assistant Agent
name: quick-assist
model: claude-3-haiku
reasoning_effort: low
max_tokens: 1000
system_prompt: |
Provide quick, concise answers for simple coding questions.
Keep responses under 100 words unless more detail is needed.

This looks perfect. Model selection, reasoning effort, custom prompts - everything I wanted.

OpenCode CLI also provides visibility features that Claude Code lacks:

  • See what each subagent is doing in real-time
  • Token consumption monitoring per agent
  • Agent selection transparency

But here’s what I discovered about reasoning_effort: it may not actually work.

The Reasoning Effort Configuration Problem

A Reddit user raised this exact question:

“How do you configure reasoning effort for a subagent? Spent half a day trying, got nowhere, found open PRs that would solve it. Are you sure you got that to work?”

I checked the GitHub issues. There are open PRs addressing reasoning effort configuration. The feature appears documented but not fully implemented.

This is the key insight: documentation and implementation don’t always match in AI coding tools. The tools are evolving rapidly, and features get documented before they’re production-ready.

What Actually Works in OpenCode CLI

Testing confirmed:

  • AGENTS.md file parsing works
  • Model selection per agent works
  • Custom system prompts work
  • Agent visibility features work

What’s uncertain:

  • reasoning_effort configuration
  • Per-agent token budget control
  • Extended thinking toggle

The recommendation: test each feature locally before relying on it in production.

Claude Code: A Different Approach

Claude Code takes a different approach. It doesn’t have native AGENTS.md support (as of March 2026). Instead, you configure behavior through CLAUDE.md files.

The limitation: you can’t easily switch between different agent configurations. One CLAUDE.md applies to your current session.

The solution: build a custom wrapper using the Agent SDK.

Building a Custom Wrapper

One developer shared their approach:

“Claude Code as agent harness is very good… they have the agent SDK so you can use it and build your own wrapper around it… I built my own desktop/web wrapper around the agent SDK and added all the features I need including other models and custom personas and workspaces.”

Here’s a practical implementation:

custom_wrapper.py
from dataclasses import dataclass
from typing import Literal, Optional
from enum import Enum
import anthropic
class ModelType(Enum):
OPUS = "claude-3-opus"
SONNET = "claude-3-sonnet"
HAIKU = "claude-3-haiku"
@dataclass
class AgentPersona:
name: str
model: ModelType
system_prompt: str
temperature: float = 0.7
max_tokens: int = 4096
extended_thinking: bool = False
class CustomClaudeWrapper:
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
self.personas = self._load_personas()
self.current_persona = "default"
def _load_personas(self) -> dict[str, AgentPersona]:
return {
"reviewer": AgentPersona(
name="Code Reviewer",
model=ModelType.OPUS,
system_prompt="""You are a meticulous code reviewer. Check for:
- Security vulnerabilities (OWASP top 10)
- Performance bottlenecks
- Test coverage gaps
- Documentation completeness
- Code style consistency""",
extended_thinking=True
),
"architect": AgentPersona(
name="System Architect",
model=ModelType.OPUS,
system_prompt="Design scalable systems with clear separation of concerns...",
extended_thinking=True
),
"quick": AgentPersona(
name="Quick Assistant",
model=ModelType.HAIKU,
system_prompt="Brief, helpful responses. Keep it under 100 words.",
max_tokens=500
),
"test-writer": AgentPersona(
name="Test Generator",
model=ModelType.SONNET,
system_prompt="Write comprehensive tests with edge cases and error handling..."
)
}
def chat(self, message: str, persona: Optional[str] = None):
selected = self.personas.get(persona, self.personas["quick"])
params = {
"model": selected.model.value,
"system": selected.system_prompt,
"messages": [{"role": "user", "content": message}],
"max_tokens": selected.max_tokens,
"temperature": selected.temperature
}
# Extended thinking support
if selected.extended_thinking:
params["thinking"] = {
"type": "enabled",
"budget_tokens": 8000
}
return self.client.messages.create(**params)

This wrapper gives you control over:

  • Model selection per persona
  • Extended thinking toggle
  • Token budgets
  • Custom system prompts

Managing Multiple CLAUDE.md Files

A simpler approach for Claude Code users: maintain multiple CLAUDE.md files and switch between them.

CLAUDE.md-reviewer
# Code Review Mode
You are a code reviewer. Always check:
- Security vulnerabilities (OWASP top 10)
- Performance bottlenecks
- Test coverage
- Documentation
Use extended thinking for complex reviews.
CLAUDE.md-quick
# Quick Query Mode
You are a quick assistant. Provide concise answers.
Keep responses under 100 words unless detail is requested.

Switch between these files as needed. This isn’t elegant, but it works without custom code.

A Practical Reasoning Effort Workaround

Since reasoning_effort configuration is unreliable, use model selection as a proxy:

model_selection.py
def get_model_for_task(task_complexity: str) -> dict:
"""
Use model selection to approximate reasoning effort control.
Opus for complex tasks, Haiku for simple ones.
"""
configs = {
"high": {
"model": "claude-3-opus",
"thinking": {"type": "enabled", "budget_tokens": 10000}
},
"medium": {
"model": "claude-3-sonnet",
"thinking": {"type": "enabled", "budget_tokens": 5000}
},
"low": {
"model": "claude-3-haiku",
"thinking": {"type": "disabled"}
}
}
return configs[task_complexity]

Extended thinking with budget tokens is a more reliable control than the uncertain reasoning_effort parameter.

Feature Comparison: What Actually Works

FeatureOpenCode CLIClaude Code
AGENTS.md SupportYesNo
Per-Agent Model SelectionWorksRequires wrapper
Reasoning Effort ConfigUncertainRequires wrapper
Agent VisibilityTransparentLimited
Custom PersonasVia AGENTS.mdVia SDK wrapper
Token MonitoringSidebar displayNot visible
SDK AvailableNoYes (Agent SDK)
Custom Wrapper FeasibleN/AYes, recommended

What’s confirmed working:

  • OpenCode: AGENTS.md file parsing, model selection, agent visibility
  • Claude Code: Agent SDK for building wrappers, extended thinking parameter

What’s uncertain:

  • OpenCode: Reasoning effort configuration
  • Claude Code: Per-agent configuration without wrapper

Common Pitfalls

1. Assuming Features Work Because They’re Documented

I spent hours debugging configuration that was documented but not implemented. The fix: test every configuration locally before relying on it.

Terminal window
# Test if reasoning effort actually affects output
# Run same prompt with different settings, compare quality and time
opencode --agent reviewer "Review this code"
opencode --agent quick "Review this code"

2. Over-Engineering Subagents

Creating too many specialized agents adds complexity without value. Start with 2-3 core agents:

minimal_agents.py
CORE_AGENTS = {
"reviewer": {"model": "opus", "thinking": True},
"quick": {"model": "haiku", "thinking": False},
"architect": {"model": "opus", "thinking": True}
}

3. Ignoring Model Costs

Using Opus for everything gets expensive. Default to Haiku/Sonnet, escalate to Opus only when needed:

cost_aware_routing.py
def route_to_model(task: str, complexity_hint: str = None):
"""
Route simple tasks to cheaper models.
Only use Opus when explicitly needed.
"""
if complexity_hint == "complex":
return "claude-3-opus"
# Heuristics for task complexity
complex_keywords = ["architect", "security", "review", "debug"]
if any(kw in task.lower() for kw in complex_keywords):
return "claude-3-opus"
return "claude-3-haiku" # Default to cheapest

4. Not Measuring Effectiveness

Custom agents may not actually improve output. A/B test configurations:

measurement.py
def track_agent_effectiveness(agent_name: str, task: str, outcome: str):
"""
Track whether specialized agents actually help.
"""
metrics = {
"agent": agent_name,
"task_type": classify_task(task),
"outcome": outcome, # "helpful", "neutral", "unhelpful"
"timestamp": datetime.now()
}
# Store and analyze over time

5. Visibility Blind Spots

Can’t debug what you can’t see. OpenCode’s agent visibility helps you understand what each subagent is doing. Claude Code’s opacity makes this harder.

If using Claude Code, add logging to your wrapper:

logging.py
def chat_with_logging(self, message: str, persona: str):
config = self.personas[persona]
logger.info(f"Invoking {persona} with model={config.model.value}")
start = time.time()
result = self._chat(message, config)
duration = time.time() - start
logger.info(f"{persona} completed in {duration:.2f}s, tokens={result.usage}")
return result

Practical Implementation Guide

For OpenCode CLI Users

  1. Create AGENTS.md in your project root
  2. Define agent configurations with model and prompts
  3. Test reasoning effort settings - verify they work before relying on them
  4. Monitor token usage via sidebar
  5. Iterate on prompts based on output quality
Terminal window
# Verify configuration is loaded
opencode --list-agents
# Test specific agent
opencode --agent code-reviewer "Review src/auth.py"

For Claude Code Users

You have two paths:

Path 1: Accept Current Limitations

  • Use CLAUDE.md for global behavior configuration
  • Switch between multiple CLAUDE.md files for different tasks
  • No per-agent model selection without custom wrapper

Path 2: Build a Wrapper

  • Clone the Claude Agent SDK
  • Implement persona system (shown above)
  • Add extended thinking controls
  • Create workspace management
  • Optional: build desktop or web UI

The wrapper approach gives you full control but requires upfront investment.

Why This Matters

The promise of configurable subagents is compelling. Different tasks need different model strengths. Code review should be thorough. Quick queries should be fast. Costs should scale with complexity.

But current tooling has gaps. Documentation shows features that aren’t fully implemented. You might spend hours debugging configuration that simply doesn’t work yet.

The practical approach:

  1. Verify features work before building workflows around them
  2. Use model selection as a proxy for reasoning control
  3. Prefer visibility - you need to see what your agents are doing
  4. Start minimal - 2-3 well-tuned agents beat 10 poorly-configured ones

Summary

In this post, I showed how to configure custom subagents in AI coding tools, and what actually works versus what’s documented. OpenCode CLI offers AGENTS.md configuration with good visibility, but reasoning effort settings may not work yet. Claude Code requires building a custom wrapper with the Agent SDK for similar functionality.

The key insight is that documentation can be aspirational. The reasoning_effort parameter appears in docs but may not be functional. Always test configuration locally before relying on it.

For now, model selection (Opus vs Haiku) remains the most reliable way to control response quality. Extended thinking with budget tokens provides more control than uncertain reasoning effort parameters.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments