
I Debugged My AI Agent for 6 Hours Before Realizing the Framework Was Hiding the Problem

I spent 6 hours debugging a LangChain agent that was stuck in an infinite loop. The agent kept calling the same tool repeatedly, but I couldn’t see why. Every time I added logging, the framework swallowed it. When I finally printed the internal state, it was a 2,000-line nested dictionary that told me nothing.

Then I opened the source file of a file-based agent a colleague had built. The state was a markdown file. I read it. I understood it in 30 seconds. The problem was obvious. I fixed it by editing the file manually.

That’s when I realized: frameworks abstract away complexity at the cost of visibility.

The Problem with Framework-Based Agent Architectures

I started with LangChain like everyone else. It promised rapid development, pre-built components, and a clean abstraction over LLM complexity. I could chain prompts, manage conversation history, and add tools with minimal code.

agent_framework.py
# Legacy LangChain API; newer releases deprecate initialize_agent
from langchain.agents import initialize_agent, Tool
from langchain.chat_models import ChatOpenAI

# search_web and calculate are plain Python functions defined elsewhere
llm = ChatOpenAI(model="gpt-4")
tools = [
    Tool(name="Search", func=search_web, description="Search the web"),
    Tool(name="Calculator", func=calculate, description="Do math"),
]
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
agent.run("What is the population of France?")

This looks clean. But when things go wrong, they go really wrong.

Issue 1: The Black Box State

When my agent entered that infinite loop, I tried to inspect its state:

debug_state.py
# What I expected: see what the agent is thinking
print(agent.agent.state) # Doesn't exist
# What I had to do
import json
print(json.dumps(agent.__dict__, indent=2, default=str))

The output was 500 lines of nested objects, closures, and internal framework references. I couldn’t tell what the agent remembered, what tools it had called, or why it made decisions.
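
One framework-agnostic way to get some of that visibility back is to wrap each tool function before handing it to the framework, so every call is appended to a plain-text log that lives outside the framework. The sketch below is only an illustration under that assumption: `logged_tool`, `repeated_calls`, the log path, and the JSON-lines format are hypothetical names, not part of LangChain.

```python
import functools
import json
from datetime import datetime
from pathlib import Path


def logged_tool(log_path):
    """Decorator factory: append one JSON line per tool call to log_path.

    Hypothetical helper for illustration; not part of any framework.
    """
    log_file = Path(log_path)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            entry = {
                "time": datetime.now().isoformat(),
                "tool": func.__name__,
                "args": [str(a) for a in args],
                "kwargs": {k: str(v) for k, v in kwargs.items()},
                "result": str(result)[:200],  # truncate long outputs
            }
            with log_file.open("a", encoding="utf-8") as f:
                f.write(json.dumps(entry) + "\n")
            return result

        return wrapper

    return decorator


def repeated_calls(log_path, threshold=3):
    """Return True if the last `threshold` logged calls are identical."""
    lines = Path(log_path).read_text(encoding="utf-8").splitlines()
    recent = [json.loads(line) for line in lines[-threshold:]]
    if len(recent) < threshold:
        return False
    # Two calls count as identical when tool name and arguments match
    keys = {
        (e["tool"], tuple(e["args"]), json.dumps(e["kwargs"], sort_keys=True))
        for e in recent
    }
    return len(keys) == 1
```

Wrapping `search_web` as `logged_tool("tool_calls.log")(search_web)` before building the tool list would leave a readable trail of calls, and `repeated_calls("tool_calls.log")` gives a crude signal that the agent is stuck on the same call.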

Issue 2: Version Control Hell

I wanted to track how my agent’s behavior changed over time. But with LangChain, the “state” lived in memory or framework-specific databases. I couldn’t diff it, branch it, or roll back to yesterday’s configuration.

Terminal
$ git diff
# Shows nothing useful for agent state
# All the interesting stuff is in .json files or databases
# Not version controlled

Issue 3: The Learning Curve Cliff

When I onboarded a junior developer, I spent two days explaining LangChain concepts: chains, agents, tools, callbacks, memory types, prompt templates. They could write code, but they didn’t understand what was happening under the hood.

When a bug appeared, they were helpless. The framework’s abstractions were too many layers deep.

The File-Based Alternative

Then I saw a different approach. A simple agent that stored everything in markdown files:

Directory Structure
agents/
  research-agent/
    state.md
    memory/
      facts.md
      preferences.md
    tasks/
      active.md
      completed/
    context/
      session.md

At first, this seemed primitive. Where were the complex chains? Where was the abstraction?

But then I needed to debug it. I opened state.md:

state.md
# Agent State
## Current Task
Research competitor pricing for SaaS tools
## Status
In Progress
## Last Action
Called search_web tool with query: "SaaS pricing comparison 2024"
## Next Steps
- Analyze search results
- Compare pricing tiers
- Generate summary report
## Context
User requested competitive analysis for pricing page redesign.
Budget: 2 hours of research.

I understood the entire state in 10 seconds. No framework knowledge required. No debugging session. Just open the file and read.

Why Files Beat Frameworks

1. Debuggability

With LangChain, debugging meant:

  1. Add print statements
  2. Enable debug logging
  3. Parse framework-specific log formats
  4. Trace through multiple abstraction layers
  5. Hope the error message makes sense

With files:

Terminal
$ cat agents/research-agent/state.md
# Done. That's it.

2. Version Control

Git works with files. I can see every change my agent ever made:

Terminal
$ git log --oneline agents/research-agent/state.md
a3b2c1d Updated task status to completed
d4e5f6a Added new research task
7a8b9c0 Initial agent setup
$ git diff HEAD~1 agents/research-agent/state.md
 ## Status
-In Progress
+Completed

I can roll back to any point:

Terminal
$ git checkout HEAD~5 -- agents/research-agent/
# Files in the agent directory now have their contents from 5 commits ago
# (files added since then are left in place)

Try doing that with a framework’s in-memory state.
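
Because the state is plain text, the same kind of comparison can also be produced from Python without calling Git. This is a minimal sketch using the standard library's `difflib`; the `diff_states` helper and the two sample snapshots are illustrative and not part of the agent described here.

```python
import difflib


def diff_states(old: str, new: str, name: str = "state.md") -> str:
    """Return a unified diff between two snapshots of a state file."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


# Two hypothetical snapshots of the same state file
before = "# Agent State\n## Status\nIn Progress\n"
after = "# Agent State\n## Status\nCompleted\n"
print(diff_states(before, after))
```

Identical snapshots produce an empty string, which makes the helper usable as a cheap "did anything change?" check before committing.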

3. Recoverability

When my LangChain agent got corrupted (somehow the conversation history grew to 50,000 tokens), I had to restart from scratch. There was no way to “edit” the state.

With files, I can manually fix problems:

state.md
# I can literally just delete the bad memory
# and the agent continues working
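
The same repair can be scripted when a history file has simply grown too large. The sketch below assumes, purely for illustration, that each history entry starts with a `## ` heading; `trim_history` and that layout are hypothetical, so adapt the split rule to however your files are organized. Committing the file first keeps the trimmed entries recoverable.

```python
from pathlib import Path


def trim_history(path, keep_last=20):
    """Keep the file header plus the last `keep_last` '## ' entries.

    Assumes each entry starts with a '## ' heading (hypothetical layout).
    Returns the number of entries removed.
    """
    file = Path(path)
    header, entries = [], []
    for line in file.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            entries.append([line])       # start a new entry
        elif entries:
            entries[-1].append(line)     # body line of the current entry
        else:
            header.append(line)          # anything before the first entry
    kept = entries[-keep_last:] if keep_last > 0 else []
    body = [line for entry in kept for line in entry]
    file.write_text("\n".join(header + body) + "\n", encoding="utf-8")
    return len(entries) - len(kept)
```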

4. Accessibility for Non-Programmers

A product manager wanted to understand how our agent made decisions. With LangChain, I would have needed to explain embeddings, prompt templates, and chain composition.

With files, I sent them a link to the state.md file. They read it and asked informed questions. No technical explanation needed.

A Practical File-Based Agent Implementation

Here’s a simple file-based agent that demonstrates the concept:

file_agent.py
from pathlib import Path
from datetime import datetime
from typing import Optional


class FileBasedAgent:
    def __init__(self, agent_dir: str):
        self.base_path = Path(agent_dir)
        self.state_file = self.base_path / "state.md"
        self.memory_dir = self.base_path / "memory"
        self.tasks_dir = self.base_path / "tasks"
        # Create directories if they don't exist
        self.memory_dir.mkdir(parents=True, exist_ok=True)
        self.tasks_dir.mkdir(parents=True, exist_ok=True)

    def get_state(self) -> dict:
        """Read agent state from markdown file."""
        if not self.state_file.exists():
            return {"status": "idle", "context": {}}
        content = self.state_file.read_text()
        return self._parse_markdown_state(content)

    def update_state(self, updates: dict) -> None:
        """Merge updates into the current state and rewrite state.md."""
        current = self.get_state()
        new_state = {
            **current,
            **updates,
            "updated_at": datetime.now().isoformat()
        }
        self.state_file.write_text(self._to_markdown(new_state))

    def remember(self, key: str, value: str) -> None:
        """Store in long-term memory."""
        memory_file = self.memory_dir / f"{key}.md"
        memory_file.write_text(f"# {key}\n\n{value}\n")

    def recall(self, key: str) -> Optional[str]:
        """Retrieve from long-term memory."""
        memory_file = self.memory_dir / f"{key}.md"
        if memory_file.exists():
            return memory_file.read_text()
        return None

    def add_task(self, task: str) -> None:
        """Add a new task."""
        task_file = self.tasks_dir / f"task-{datetime.now().strftime('%Y%m%d-%H%M%S')}.md"
        task_file.write_text(f"# Task\n\n{task}\n\n## Status\n- [ ] Pending")

    def _parse_markdown_state(self, content: str) -> dict:
        """Simple markdown parser: each '## ' section becomes a list of lines."""
        state = {}
        current_section = None
        for line in content.split('\n'):
            if line.startswith('## '):
                current_section = line[3:].lower().replace(' ', '_')
                state[current_section] = []
            elif current_section and line.strip():
                item = line.strip()
                # Drop the "- " bullet that _to_markdown adds, so values
                # survive a write/read round trip without gaining extra bullets
                state[current_section].append(item[2:] if item.startswith('- ') else item)
        return state

    def _to_markdown(self, state: dict) -> str:
        """Convert state dict to markdown."""
        lines = ["# Agent State", ""]
        for key, value in state.items():
            lines.append(f"## {key.replace('_', ' ').title()}")
            if isinstance(value, list):
                lines.extend([f"- {v}" for v in value])
            else:
                lines.append(str(value))
            lines.append("")
        return '\n'.join(lines)

This is about 60 lines of code. It does everything I need for basic agent state management. And when something goes wrong, I open a file.

Adding Git Integration for History

The real power comes when you add version control:

git_agent.py
import subprocess

from file_agent import FileBasedAgent  # the class defined in file_agent.py


class GitBackedAgent(FileBasedAgent):
    # All commands assume the process runs inside an existing git repository

    def checkpoint(self, message: str) -> str:
        """Create a git checkpoint of current state."""
        subprocess.run(["git", "add", str(self.base_path)], check=True)
        # git commit exits non-zero when nothing changed, so check=True raises
        subprocess.run(["git", "commit", "-m", message], check=True)
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True
        )
        return result.stdout.strip()

    def rollback(self, commit_sha: str) -> None:
        """Rollback to previous state."""
        subprocess.run(
            ["git", "checkout", commit_sha, "--", str(self.base_path)],
            check=True
        )

    def history(self) -> list:
        """View state history."""
        result = subprocess.run(
            ["git", "log", "--oneline", "--", str(self.state_file)],
            capture_output=True,
            text=True,
            check=True
        )
        output = result.stdout.strip()
        # Return an empty list (not ['']) when there is no history yet
        return output.split('\n') if output else []

Now every state change is tracked. I can see exactly what the agent did, when, and why:

Terminal
$ git log --oneline agents/research-agent/
abc1234 Completed pricing research
def5678 Added competitor analysis task
fed9012 Initial setup
$ git show abc1234:agents/research-agent/state.md
# See the exact state at that commit
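
The `history()` method returns raw `git log --oneline` lines. If the agent or a script needs the hash and the message separately, they can be split with a few lines of standard-library Python. `Commit` and `parse_oneline` below are hypothetical helper names used only for this sketch.

```python
from typing import List, NamedTuple


class Commit(NamedTuple):
    sha: str
    subject: str


def parse_oneline(log_output: str) -> List[Commit]:
    """Split `git log --oneline` output into (sha, subject) pairs."""
    commits = []
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The abbreviated hash is everything before the first space
        sha, _, subject = line.partition(" ")
        commits.append(Commit(sha, subject))
    return commits
```

With this in place, `parse_oneline(result.stdout)[0].sha` is the most recent checkpoint, which is the value `rollback()` expects.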

Unix Commands as Agent Tools

AI agents are surprisingly good at Unix commands. These tools have been battle-tested since the 1970s. Instead of building complex tool abstractions, I can use Unix directly:

unix_tools.py
import subprocess

from file_agent import FileBasedAgent  # the class defined in file_agent.py


class UnixNativeAgent(FileBasedAgent):
    """Agent that leverages Unix commands for operations."""

    def search_memory(self, query: str) -> list:
        """Use grep to list the memory files that match query."""
        # -e marks query as the pattern even if it starts with "-"
        result = subprocess.run(
            ["grep", "-r", "-l", "-e", query, str(self.memory_dir)],
            capture_output=True,
            text=True
        )
        return result.stdout.strip().split('\n') if result.stdout.strip() else []

    def count_tasks(self) -> dict:
        """Count task files using find."""
        result = subprocess.run(
            ["find", str(self.tasks_dir), "-name", "*.md", "-type", "f"],
            capture_output=True,
            text=True
        )
        files = result.stdout.strip().split('\n') if result.stdout.strip() else []
        return {"total_tasks": len(files)}

    def get_recent_state_changes(self) -> str:
        """See recent changes using git log."""
        result = subprocess.run(
            ["git", "log", "-5", "--oneline", "--", str(self.state_file)],
            capture_output=True,
            text=True
        )
        return result.stdout

No framework needed. Just standard, reliable Unix commands that any developer already knows.
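
If the command string itself comes from the model rather than from your own code, it is worth restricting which programs can run. The following is a minimal sketch under that assumption: `ALLOWED_COMMANDS`, `validate_command`, and `run_tool` are hypothetical names, and checking only the program name is a first line of defense, not a sandbox (for example, `find -exec` can still launch other programs).

```python
import shlex
import subprocess

# Hypothetical allowlist; adjust to the commands your agent actually needs
ALLOWED_COMMANDS = {"grep", "find", "wc", "cat", "ls", "git"}


def validate_command(command: str) -> list:
    """Split a command string and reject programs not on the allowlist."""
    args = shlex.split(command)
    if not args:
        raise ValueError("empty command")
    if args[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {args[0]}")
    return args


def run_tool(command: str, timeout: float = 10.0) -> str:
    """Run an allowlisted command without a shell and return its stdout."""
    args = validate_command(command)
    # Passing a list (no shell=True) avoids shell interpretation of the input
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return result.stdout
```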

Common Mistakes to Avoid

Mistake 1: Over-Engineering File Operations

bad_example.py
from pathlib import Path


# WRONG: Adding unnecessary abstraction
class StateManager:
    def __init__(self, file_path, serializer, validator, cache):
        self.file_path = file_path
        self.serializer = serializer
        self.validator = validator
        self.cache = cache

    def read(self):
        if self.cache.has(self.file_path):
            return self.cache.get(self.file_path)
        content = Path(self.file_path).read_text()
        validated = self.validator.validate(content)
        return self.serializer.deserialize(validated)


# RIGHT: Keep it simple
def read_state(path):
    return Path(path).read_text()

The file system is already a database. Don’t rebuild a database on top of it.

Mistake 2: Ignoring Concurrency

Files can have race conditions. If multiple processes write to the same file, you’ll get corruption.

concurrency_fix.py
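import fcntl


def safe_write(path, content):
    """Write with an exclusive advisory lock (Unix only)."""
    # Open in append mode: unlike 'w', this does not truncate the file
    # before the lock is held
    with open(path, 'a') as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            f.truncate(0)  # safe to clear the old content now
            f.write(content)
            f.flush()
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)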
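
Locking only helps when every reader and writer takes the lock. A complementary technique, sketched here as an assumption rather than as part of the original agent, is to write the new content to a temporary file in the same directory and then rename it over the target, so readers see either the old file or the new one and never a half-written one. `atomic_write` is a hypothetical helper name, and it does not prevent lost updates: the last writer still wins.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path, content: str) -> None:
    """Write content to a temp file next to path, then rename it over path."""
    target = Path(path)
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        # Atomic on POSIX when source and target are on the same filesystem
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```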

Mistake 3: Using Files When You Need a Database

File-based works great for single-agent, moderate-scale scenarios. If you need:

  • High-throughput writes (1000s per second)
  • Complex queries across states
  • Multi-agent coordination with transactions
  • Real-time state sync across machines

Then use a database. Files aren’t the solution to everything.
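
For a sense of what that switch looks like at the small end, here is a minimal sketch using SQLite from the Python standard library. The table layout and the helper names (`open_state_db`, `set_state`, `agents_with`) are assumptions made for illustration; a system with real multi-machine sync would need a server database rather than SQLite.

```python
import sqlite3


def open_state_db(path=":memory:"):
    """Open (or create) a small key/value state table per agent."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state ("
        " agent TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (agent, key))"
    )
    return conn


def set_state(conn, agent, updates: dict) -> None:
    """Apply all updates in one transaction: either all land or none do."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT OR REPLACE INTO agent_state (agent, key, value) VALUES (?, ?, ?)",
            [(agent, k, str(v)) for k, v in updates.items()],
        )


def agents_with(conn, key, value) -> list:
    """Query across agents, something that is awkward with one file per agent."""
    rows = conn.execute(
        "SELECT agent FROM agent_state WHERE key = ? AND value = ? ORDER BY agent",
        (key, value),
    )
    return [row[0] for row in rows]
```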

Mistake 4: Deep Nesting

bad_structure.txt
# WRONG: Too deep
agents/
  research-agent/
    state/
      current/
        active/
          tasks/
            pending/
              task-1.md
# RIGHT: Flat and navigable
agents/
  research-agent/
    state.md
    tasks/
      task-1.md

When to Use Frameworks vs Files

Use frameworks (LangChain, CrewAI, AutoGen) when:

  • Rapid prototyping and time-to-market matters
  • You need pre-built integrations with many tools
  • Your team already knows the framework
  • You’re building complex multi-agent systems
  • You need features like streaming, callbacks, middleware

Use file-based architectures when:

  • Debuggability is critical
  • You need version control over agent state
  • Non-programmers need to understand agent behavior
  • You want to leverage Unix tools and Git workflows
  • Simplicity and transparency matter more than features

What I’d Do Differently

If I were starting fresh, I would:

  1. Start with files for any single-agent use case
  2. Add Git integration from day one for history and rollback
  3. Use a framework only when I need complex multi-agent coordination
  4. Keep the human in the loop with readable state files

The framework would be a last resort, not a default choice.

The Bottom Line

After years of debugging framework-based agents, I’ve learned that visibility is a feature. When I can see what my agent is doing by opening a file, debugging becomes trivial. When I can git diff my agent’s decisions, I can understand and improve its behavior over time.

Frameworks have their place. But for many agent architectures, the Unix philosophy wins: simple, composable, transparent tools that have worked reliably for 50 years. Markdown and folders might not be as exciting as the latest AI framework, but they solve the problem without creating new ones.

Next time you’re debugging an agent at 2 AM, ask yourself: would I rather be tracing through framework internals, or opening a markdown file?

