I Debugged My AI Agent for 6 Hours Before Realizing the Framework Was Hiding the Problem
I spent 6 hours debugging a LangChain agent that was stuck in an infinite loop. The agent kept calling the same tool repeatedly, but I couldn’t see why. Every time I added logging, the framework swallowed it. When I finally printed the internal state, it was a 2,000-line nested dictionary that told me nothing.
Then I opened the source file of a file-based agent a colleague had built. The state was a markdown file. I read it. I understood it in 30 seconds. The problem was obvious. I fixed it by editing the file manually.
That’s when I realized: frameworks abstract away complexity at the cost of visibility.
The Problem with Framework-Based Agent Architectures
I started with LangChain like everyone else. It promised rapid development, pre-built components, and a clean abstraction over LLM complexity. I could chain prompts, manage conversation history, and add tools with minimal code.
    from langchain.agents import initialize_agent, Tool
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4")

    tools = [
        Tool(name="Search", func=search_web, description="Search the web"),
        Tool(name="Calculator", func=calculate, description="Do math"),
    ]

    agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
    agent.run("What is the population of France?")

This looks clean. But when things go wrong, they go really wrong.
Issue 1: The Black Box State
When my agent entered that infinite loop, I tried to inspect its state:
    # What I expected: see what the agent is thinking
    print(agent.agent.state)  # Doesn't exist

    # What I had to do
    import json
    print(json.dumps(agent.__dict__, indent=2, default=str))

The output was 500 lines of nested objects, closures, and internal framework references. I couldn’t tell what the agent remembered, what tools it had called, or why it made decisions.
Issue 2: Version Control Hell
I wanted to track how my agent’s behavior changed over time. But with LangChain, the “state” lived in memory or in framework-specific databases. I couldn’t diff it, branch it, or roll back to yesterday’s configuration.
    $ git diff
    # Shows nothing useful for agent state
    # All the interesting stuff is in .json files or databases
    # Not version controlled

Issue 3: The Learning Curve Cliff
When I onboarded a junior developer, I spent two days explaining LangChain concepts: chains, agents, tools, callbacks, memory types, prompt templates. They could write code, but they didn’t understand what was happening under the hood.
When a bug appeared, they were helpless. The framework’s abstractions were too many layers deep.
The File-Based Alternative
Then I saw a different approach. A simple agent that stored everything in markdown files:
    agents/
      research-agent/
        state.md
        memory/
          facts.md
          preferences.md
        tasks/
          active.md
          completed/
        context/
          session.md

At first, this seemed primitive. Where were the complex chains? Where was the abstraction?
But then I needed to debug it. I opened state.md:
    # Agent State

    ## Current Task
    Research competitor pricing for SaaS tools

    ## Status
    In Progress

    ## Last Action
    Called search_web tool with query: "SaaS pricing comparison 2024"

    ## Next Steps
    - Analyze search results
    - Compare pricing tiers
    - Generate summary report

    ## Context
    User requested competitive analysis for pricing page redesign.
    Budget: 2 hours of research.

I understood the entire state in 10 seconds. No framework knowledge required. No debugging session. Just open the file and read.
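Because each action is plain text, a repeated tool call like the one in my opening story can also be caught mechanically. Here is a minimal sketch, assuming the agent additionally appends one line per action to a hypothetical actions.log file (not part of the layout above). The function name and the threshold are illustrative, not taken from any library.

```python
from pathlib import Path
from typing import Optional


def repeated_action(log_path: Path, threshold: int = 3) -> Optional[str]:
    """Return the action if the last `threshold` log lines are identical, else None."""
    if not log_path.exists():
        return None
    # One action per line; blank lines are ignored.
    actions = [line.strip() for line in log_path.read_text().splitlines() if line.strip()]
    if len(actions) < threshold:
        return None
    tail = actions[-threshold:]
    # A single distinct value in the tail means the agent is repeating itself.
    return tail[0] if len(set(tail)) == 1 else None
```

A check like this could run before each tool call and stop the agent, or flag the file for a human to read.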
Why Files Beat Frameworks
1. Debuggability
With LangChain, debugging meant:
- Add print statements
- Enable debug logging
- Parse framework-specific log formats
- Trace through multiple abstraction layers
- Hope the error message makes sense
With files:
    $ cat agents/research-agent/state.md
    # Done. That's it.

2. Version Control
Git works with files. I can see every change my agent ever made:
    $ git log --oneline agents/research-agent/state.md
    a3b2c1d Updated task status to completed
    d4e5f6g Added new research task
    g7h8i9j Initial agent setup

    $ git diff HEAD~1 agents/research-agent/state.md
    - Status: In Progress
    + Status: Completed

I can roll back to any point:

    $ git checkout HEAD~5 -- agents/research-agent/
    # Agent is now in the state it was 5 commits ago

Try doing that with a framework’s in-memory state.
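The same line-by-line comparison works outside Git, for example inside a test or a review script. This is a small sketch using Python's standard difflib module. The diff_states helper and the two state snippets are made up for illustration.

```python
import difflib


def diff_states(old: str, new: str) -> str:
    """Return a unified diff between two versions of a state file."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="state.md (before)",
        tofile="state.md (after)",
        lineterm="",
    )
    return "\n".join(diff)


before = "# Agent State\n\n## Status\nIn Progress\n"
after = "# Agent State\n\n## Status\nCompleted\n"
print(diff_states(before, after))
```

Line-oriented markdown is what makes this useful: one changed field shows up as one removed line and one added line.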
3. Recoverability
When my LangChain agent got corrupted (somehow the conversation history grew to 50,000 tokens), I had to restart from scratch. There was no way to “edit” the state.
With files, I can manually fix problems:
    # I can literally just delete the bad memory
    # and the agent continues working

4. Accessibility for Non-Programmers
A product manager wanted to understand how our agent made decisions. With LangChain, I would have needed to explain embeddings, prompt templates, and chain composition.
With files, I sent them a link to the state.md file. They read it and asked informed questions. No technical explanation needed.
A Practical File-Based Agent Implementation
Here’s a simple file-based agent that demonstrates the concept:
    from pathlib import Path
    from datetime import datetime
    from typing import Optional


    class FileBasedAgent:
        def __init__(self, agent_dir: str):
            self.base_path = Path(agent_dir)
            self.state_file = self.base_path / "state.md"
            self.memory_dir = self.base_path / "memory"
            self.tasks_dir = self.base_path / "tasks"

            # Create directories if they don't exist
            self.memory_dir.mkdir(parents=True, exist_ok=True)
            self.tasks_dir.mkdir(parents=True, exist_ok=True)

        def get_state(self) -> dict:
            """Read agent state from markdown file."""
            if not self.state_file.exists():
                return {"status": "idle", "context": {}}

            content = self.state_file.read_text()
            return self._parse_markdown_state(content)

        def update_state(self, updates: dict) -> None:
            """Merge updates into the current state and rewrite state.md."""
            current = self.get_state()
            new_state = {
                **current,
                **updates,
                "updated_at": datetime.now().isoformat()
            }
            self.state_file.write_text(self._to_markdown(new_state))

        def remember(self, key: str, value: str) -> None:
            """Store in long-term memory."""
            memory_file = self.memory_dir / f"{key}.md"
            memory_file.write_text(f"# {key}\n\n{value}\n")

        def recall(self, key: str) -> Optional[str]:
            """Retrieve from long-term memory."""
            memory_file = self.memory_dir / f"{key}.md"
            if memory_file.exists():
                return memory_file.read_text()
            return None

        def add_task(self, task: str) -> None:
            """Add a new task."""
            task_file = self.tasks_dir / f"task-{datetime.now().strftime('%Y%m%d-%H%M%S')}.md"
            task_file.write_text(f"# Task\n\n{task}\n\n## Status\n- [ ] Pending")

        def _parse_markdown_state(self, content: str) -> dict:
            """Simple markdown parser: each '## ' section becomes a list of lines."""
            state = {}
            current_section = None

            for line in content.split('\n'):
                if line.startswith('## '):
                    current_section = line[3:].lower().replace(' ', '_')
                    state[current_section] = []
                elif current_section and line.strip():
                    # Drop the "- " bullet prefix so repeated read/write
                    # cycles don't accumulate dashes.
                    item = line.strip()
                    state[current_section].append(item[2:] if item.startswith('- ') else item)

            return state

        def _to_markdown(self, state: dict) -> str:
            """Convert state dict to markdown."""
            lines = ["# Agent State", ""]
            for key, value in state.items():
                lines.append(f"## {key.replace('_', ' ').title()}")
                if isinstance(value, list):
                    lines.extend([f"- {v}" for v in value])
                else:
                    lines.append(str(value))
                lines.append("")
            return '\n'.join(lines)

This is about 60 lines of code. It does everything I need for basic agent state management. And when something goes wrong, I open a file.
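One caveat: update_state rewrites state.md in place, so a crash in the middle of a write could leave a truncated file. A common way to guard against that is to write to a temporary file and then rename it over the target. The sketch below assumes a hypothetical atomic_write_text helper. It is not part of the class above, and it does not replace locking when several processes write at once.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, content: str) -> None:
    """Write content to path so readers never see a partially written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_name, path)  # the rename itself is atomic on POSIX
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Swapping this in for the write_text call in update_state would be a one-line change.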
Adding Git Integration for History
The real power comes when you add version control:
    import subprocess
    from pathlib import Path


    class GitBackedAgent(FileBasedAgent):
        def checkpoint(self, message: str) -> str:
            """Create a git checkpoint of current state."""
            subprocess.run(["git", "add", str(self.base_path)], check=True)
            # Note: git commit exits non-zero when there is nothing to commit.
            subprocess.run(["git", "commit", "-m", message], check=True)
            result = subprocess.run(
                ["git", "rev-parse", "HEAD"],
                capture_output=True, text=True, check=True
            )
            return result.stdout.strip()

        def rollback(self, commit_sha: str) -> None:
            """Rollback to previous state."""
            subprocess.run(
                ["git", "checkout", commit_sha, "--", str(self.base_path)],
                check=True
            )

        def history(self) -> list:
            """View state history."""
            result = subprocess.run(
                ["git", "log", "--oneline", "--", str(self.state_file)],
                capture_output=True, text=True, check=True
            )
            # splitlines() returns [] when there is no history yet.
            return result.stdout.splitlines()

Now every checkpointed state change is tracked. I can see exactly what the agent did, when, and why:

    $ git log --oneline agents/research-agent/
    abc1234 Completed pricing research
    def5678 Added competitor analysis task
    ghi9012 Initial setup

    $ git show abc1234:agents/research-agent/state.md
    # See the exact state at that commit

Unix Commands as Agent Tools
AI agents are surprisingly good at Unix commands. These tools have been battle-tested since the 1970s. Instead of building complex tool abstractions, I can use Unix directly:
    import subprocess
    from pathlib import Path


    class UnixNativeAgent(FileBasedAgent):
        """Agent that leverages Unix commands for operations."""

        def search_memory(self, query: str) -> list:
            """Use grep to search memory files."""
            # "-e" keeps a query that starts with "-" from being read as an option.
            result = subprocess.run(
                ["grep", "-r", "-l", "-e", query, str(self.memory_dir)],
                capture_output=True, text=True
            )
            return result.stdout.strip().split('\n') if result.stdout.strip() else []

        def count_tasks(self) -> dict:
            """Count task files using find."""
            result = subprocess.run(
                ["find", str(self.tasks_dir), "-name", "*.md", "-type", "f"],
                capture_output=True, text=True
            )
            files = result.stdout.strip().split('\n') if result.stdout.strip() else []
            return {"total_tasks": len(files)}

        def get_recent_state_changes(self) -> str:
            """See recent changes using git log."""
            result = subprocess.run(
                ["git", "log", "-5", "--oneline", "--", str(self.state_file)],
                capture_output=True, text=True
            )
            return result.stdout

No framework needed. Just standard, reliable Unix commands that any developer already knows.
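If the model itself picks which commands to run, it is worth restricting what it can reach. The following is a minimal sketch, assuming a hypothetical run_unix_tool helper and an illustrative allowlist. Neither comes from the agent classes above, and an allowlist like this is only a first line of defense (cat can still read any file the process can).

```python
import shlex
import subprocess

# Hypothetical allowlist; adjust it to the commands the agent should be able to run.
ALLOWED_COMMANDS = {"grep", "wc", "cat", "ls", "echo"}


def run_unix_tool(command: str, timeout: float = 10.0) -> str:
    """Run an allowlisted command without a shell and return its stdout."""
    args = shlex.split(command)
    if not args or args[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    # Passing a list (no shell=True) means pipes, redirects, and ";" are not interpreted.
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return result.stdout
```

Running without a shell is the key design choice here: the agent gets individual tools, not arbitrary command lines.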
Common Mistakes to Avoid
Mistake 1: Over-Engineering File Operations
    from pathlib import Path

    # WRONG: Adding unnecessary abstraction
    class StateManager:
        def __init__(self, file_path, serializer, validator, cache):
            self.file_path = file_path
            self.serializer = serializer
            self.validator = validator
            self.cache = cache

        def read(self):
            if self.cache.has(self.file_path):
                return self.cache.get(self.file_path)
            content = Path(self.file_path).read_text()
            validated = self.validator.validate(content)
            return self.serializer.deserialize(validated)

    # RIGHT: Keep it simple
    def read_state(path):
        return Path(path).read_text()

The file system is already a database. Don’t rebuild a database on top of it.
Mistake 2: Ignoring Concurrency
Files can have race conditions. If multiple processes write to the same file, you’ll get corruption.
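    import fcntl

    def safe_write(path, content):
        """Write with an exclusive advisory lock (fcntl is Unix-only)."""
        # Open in append mode so the file is not truncated before the lock is held.
        with open(path, 'a') as f:
            fcntl.flock(f.fileno(), fcntl.LOCK_EX)
            try:
                f.truncate(0)
                f.write(content)
                f.flush()  # make sure the data is written while the lock is held
            finally:
                fcntl.flock(f.fileno(), fcntl.LOCK_UN)

Mistake 3: Using Files When You Need a Database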
A file-based approach works great for single-agent, moderate-scale scenarios. If you need:
- High-throughput writes (1000s per second)
- Complex queries across states
- Multi-agent coordination with transactions
- Real-time state sync across machines
Then use a database. Files aren’t the solution to everything.
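To make the "complex queries" case concrete, here is a sketch using Python's built-in sqlite3 module. The table layout and the function names are hypothetical and chosen only to mirror the fields in state.md. A real multi-agent setup would likely need more than this.

```python
import sqlite3
from typing import List


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small store with one row per agent."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state ("
        "agent TEXT PRIMARY KEY, status TEXT NOT NULL, current_task TEXT)"
    )
    return conn


def set_state(conn: sqlite3.Connection, agent: str, status: str, current_task: str) -> None:
    """Insert or overwrite the row for one agent inside a transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO agent_state (agent, status, current_task) "
            "VALUES (?, ?, ?)",
            (agent, status, current_task),
        )


def agents_with_status(conn: sqlite3.Connection, status: str) -> List[str]:
    """Query across all agents, which is awkward to do with one file per agent."""
    rows = conn.execute(
        "SELECT agent FROM agent_state WHERE status = ? ORDER BY agent", (status,)
    )
    return [row[0] for row in rows]
```

The trade-off is the one this article is about: the query becomes easy, but the state is no longer something you can open and read in a text editor.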
Mistake 4: Deep Nesting
    # WRONG: Too deep
    agents/
      research-agent/
        state/
          current/
            active/
              tasks/
                pending/
                  task-1.md

    # RIGHT: Flat and navigable
    agents/
      research-agent/
        state.md
        tasks/
          task-1.md

When to Use Frameworks vs Files
Use frameworks (LangChain, CrewAI, AutoGen) when:
- Rapid prototyping and time-to-market matter
- You need pre-built integrations with many tools
- Your team already knows the framework
- You’re building complex multi-agent systems
- You need features like streaming, callbacks, middleware
Use file-based architectures when:
- Debuggability is critical
- You need version control over agent state
- Non-programmers need to understand agent behavior
- You want to leverage Unix tools and Git workflows
- Simplicity and transparency matter more than features
What I’d Do Differently
If I were starting fresh, I would:
- Start with files for any single-agent use case
- Add Git integration from day one for history and rollback
- Use a framework only when I need complex multi-agent coordination
- Keep the human in the loop with readable state files
The framework would be a last resort, not a default choice.
The Bottom Line
After years of debugging framework-based agents, I’ve learned that visibility is a feature. When I can see what my agent is doing by opening a file, debugging becomes trivial. When I can git diff my agent’s decisions, I can understand and improve its behavior over time.
Frameworks have their place. But for many agent architectures, the Unix philosophy wins: simple, composable, transparent tools that have worked reliably for 50 years. Markdown and folders might not be as exciting as the latest AI framework, but they solve the problem without creating new ones.
Next time you’re debugging an agent at 2 AM, ask yourself: would I rather be tracing through framework internals, or opening a markdown file?
Final Words + More Resources
My intention with this article was to share my knowledge and experience in a way that helps others.