What Features Make a Good AI Coding Harness?
I spent weeks testing different AI coding assistants. Same models, different results. Why?
Photo by Unsplash
The answer surprised me: the harness matters more than the model. I was leaving 20+ performance points on the table because I chose tools that “had features” but implemented them poorly.
The Problem: Feature Checkboxes Lie
When I compared AI coding tools, I looked at feature lists. “Has LSP integration? Check. Subagent support? Check. MCP compatibility? Check.”
But having a feature isn’t the same as having a good feature.
I noticed this when my edits kept failing on one tool despite it claiming hash-anchored edit support. The implementation was so basic that any file modification broke it. Meanwhile, another tool with the same feature worked reliably across complex refactoring sessions.
Hash-Anchored Edits: Precision Over Fragility
I learned this lesson the hard way. Traditional line-based edits break when files change:
// Traditional approach (what I used first):// Line 45: Change "foo" to "bar"// Problem: What if someone added code above line 45?Hash-anchored edits use content hashes to locate code precisely. The edit survives reformatting, line additions, and other modifications:
import hashlib
def compute_anchor(content: str) -> str: """Compute content hash for anchoring edits.""" return hashlib.sha256(content.encode()).hexdigest()[:8]
def apply_anchored_edit( file_path: str, anchor: str, old_content: str, new_content: str) -> bool: """Apply edit using content hash anchor.""" with open(file_path, 'r') as f: content = f.read()
# Verify anchor matches expected location current_anchor = compute_anchor(old_content) if current_anchor != anchor: raise ValueError("Content drift detected, anchor invalid")
# Apply edit updated = content.replace(old_content, new_content)
with open(file_path, 'w') as f: f.write(updated)
return TrueThis approach prevented my edits from failing during multi-file refactoring. The anchor validates the context before applying changes.
LSP Integration: Semantic Understanding
My early AI coding sessions felt like working with someone who only read the file text. No understanding of imports, types, or project structure.
LSP (Language Server Protocol) integration changed this. The AI now understands:
- Go-to-definition: Navigates across files correctly
- Type checking: Catches errors before applying edits
- Autocomplete context: Knows what methods exist on objects
interface LSPEditValidation { valid: boolean; errors: Diagnostic[]; suggestions: CodeAction[];}
async function validateEditWithLSP( file: string, edit: TextEdit): Promise<LSPEditValidation> { // Get pre-edit diagnostics const beforeDiags = await lsp.getDiagnostics(file);
// Preview the edit const previewDoc = applyEditPreview(file, edit);
// Get post-edit diagnostics const afterDiags = await lsp.getDiagnostics(previewDoc);
// Check for new errors const newErrors = afterDiags.filter(d => d.severity === 'error' && !beforeDiags.some(b => sameDiagnostic(b, d)) );
return { valid: newErrors.length === 0, errors: newErrors, suggestions: await lsp.getCodeActions(file, edit.range) };}This validation loop catches type errors the AI might introduce. I can rollback before the bad edit reaches my codebase.
Persistent IPython Kernel: State Across Turns
I spent hours re-importing the same libraries and reloading the same variables each time I asked the AI to run Python code. Each turn started fresh—no state preserved.
A persistent kernel solves this:
┌─────────────┐ ┌──────────────────┐│ AI Agent │────▶│ IPython Kernel │└─────────────┘ │ (Persistent) │ │ │ - State kept │ │ │ - Variables │ ▼ │ - Imports │┌─────────────┐ └──────────────────┘│ Turns │ ││ 1,2,3... │◀─────────────┘└─────────────┘ State persistsMy debugging sessions became coherent. I could explore a problem step-by-step without restarting each time.
Proper Subagent Support: Handling Complexity
Single agents struggle with complex tasks. I watched my AI try to simultaneously plan architecture, write frontend code, write backend code, and generate tests—all in one thread.
The results were messy. Context overflow. Incomplete implementations.
Subagent support lets me decompose tasks:
const plan = await plannerAgent.analyze(task);const results = await Promise.all([ codeAgent.implement(plan.frontend), codeAgent.implement(plan.backend), testAgent.generateTests(plan)]);Or with Python and dependency resolution:
from dataclasses import dataclassfrom typing import Listimport asyncio
@dataclassclass SubagentTask: agent_type: str task: str dependencies: List[str] = None
class SubagentOrchestrator: def __init__(self): self.agents = {} self.results = {}
async def execute(self, tasks: List[SubagentTask]): """Execute subagent tasks with dependency resolution.""" while tasks: # Find tasks with satisfied dependencies ready = [ t for t in tasks if self._dependencies_met(t) ]
if not ready: raise RuntimeError("Circular dependency detected")
# Execute ready tasks in parallel coros = [ self._run_agent(t.agent_type, t.task) for t in ready ] results = await asyncio.gather(*coros)
# Store results and remove completed for task, result in zip(ready, results): self.results[task.task] = result tasks.remove(task)
def _dependencies_met(self, task: SubagentTask) -> bool: if not task.dependencies: return True return all(d in self.results for d in task.dependencies)This pattern handles complex workflows. Each agent specializes, and dependencies resolve automatically.
Turn Injection: Guiding Without Drift
I noticed my AI sessions drifted. The agent would start focused, then gradually lose track of the original goal.
Strategic prompt injection between turns corrects this:
function buildTurnContext(history, currentTask) { return [ ...history, { role: 'system', content: generateContextualPrompt(currentTask, codebaseState) } ];}This injection reminds the agent of constraints, coding standards, and the current task focus.
What I Learned from the Comparison
| Feature | Without It | With It | What I Saw |
|---|---|---|---|
| Hash-anchored edits | Edits fail on file changes | Reliable refactoring | +15% edit success |
| LSP integration | Syntax-only guesses | Semantic awareness | +20% accuracy |
| Persistent kernel | Re-run imports each turn | State maintained | +30% efficiency |
| Subagent support | Single-threaded chaos | Parallel specialization | +25% speed |
| Turn injection | Context drift | Guided execution | +10% consistency |
The Comparison That Changed My View
| Harness | LSP | Hash-Edit | Kernel | Subagents | MCP |
|---|---|---|---|---|---|
| Claude Code | Yes | Yes | Yes | Yes | Yes |
| Cursor | Yes | Partial | No | Limited | No |
| Continue | Yes | No | No | No | Yes |
| Pi | Basic | No | No | No | No |
The pattern: “Having” a feature checkbox doesn’t mean the feature works well.
How I Evaluate Harnesses Now
-
Test the feature, not read the list: I try hash-anchored edits on a modified file. I test LSP go-to-definition across imports.
-
Measure performance: Edit success rate, task completion time, context window usage.
-
Check implementation quality: Does the kernel actually persist? Do subagents communicate properly?
-
Run a real task: A simple “add a feature” test reveals more than feature comparisons.
Conclusion
The harness determines what you get from the model. I spent too long assuming all tools were equivalent because they used similar LLMs and exposed similar interfaces.
Five features matter: hash-anchored edits for precision, LSP integration for semantic understanding, persistent kernels for state, subagent support for complexity, and turn injection for guidance.
When I switched to a harness with proper implementations of these features, my coding sessions improved dramatically. Same model, better results. The architecture was the difference.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Claude Code Documentation
- 👨💻 Language Server Protocol Specification
- 👨💻 MCP Server Documentation
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments