Why I Abandoned Pure Frameworks for Hybrid AI Agent Architecture
The LangChain Fatigue
I spent three months building AI agents with LangChain. Every time something broke, I had to dig through layers of abstractions just to find the actual problem.
One day my agent kept failing silently. I traced the issue through six wrapper classes before finding the root cause: a simple string formatting error that the framework swallowed and re-wrapped into a generic “AgentExecutionException.”
That’s when I realized the framework was hiding problems instead of helping me solve them.
I tried the opposite approach: pure file-based agents. Just markdown files and prompts. That was simpler, but I lost execution guarantees. Tasks would start but never complete. No retry logic. No error handling. No state management.
Then I found the middle ground that actually works.
What I Mean by “Hybrid Architecture”
The hybrid approach splits responsibilities:
- Files store knowledge, skills, and context
- Code handles orchestration, error handling, and state management
This isn’t about choosing between frameworks and files. It’s about using each for what it does best.

    ┌─────────────────────────────────────────────────────────────┐
    │                       AI Agent System                       │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │  ┌─────────────────────┐    ┌─────────────────────────┐     │
    │  │  FILE-BASED LAYER   │    │    CODE-BASED LAYER     │     │
    │  │  (Human-Readable)   │    │ (Execution Guarantees)  │     │
    │  ├─────────────────────┤    ├─────────────────────────┤     │
    │  │                     │    │                         │     │
    │  │  skills/            │    │  State Management       │     │
    │  │    research.md      │◄──►│  Retry Logic            │     │
    │  │    code-review.md   │    │  Error Handling         │     │
    │  │    publish.md       │    │  Logging/Monitoring     │     │
    │  │                     │    │  Validation             │     │
    │  │  context/           │    │                         │     │
    │  │    project-goals.md │    └─────────────────────────┘     │
    │  │    constraints.md   │                                    │
    │  │                     │                                    │
    │  │  state/             │                                    │
    │  │    current-task.md  │                                    │
    │  │                     │                                    │
    │  └─────────────────────┘                                    │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

The key insight: files are for humans to read and edit. Code is for machines to execute reliably. The intersection is where this architecture shines.
The File-Based Memory Layer
Files give you three things frameworks can’t:
1. Immediate Readability
When my agent failed last week, I opened skills/research.md and saw exactly what the agent was supposed to do. No framework to decode.

    # Skill: Research

    ## Purpose
    Gather information from multiple sources and synthesize findings.

    ## Trigger Conditions
    - User asks for research on a topic
    - Project planning phase
    - Knowledge gap identified

    ## Steps
    1. Search web for primary sources
    2. Extract key claims and evidence
    3. Cross-reference with context7 docs
    4. Synthesize into structured report

    ## Output Format
    ## Summary
    [2-3 sentence overview]

    ## Key Findings
    - Finding 1 with evidence
    - Finding 2 with evidence

    ## Sources
    - [Source 1](url)
    - [Source 2](url)

2. Version Control and Collaboration
I can diff my agent’s “brain” across commits. When a behavior changes unexpectedly, I check git history.

    git diff HEAD~1 skills/research.md
    # Shows exactly what changed in agent behavior
    # Try doing that with a database blob

3. Easy Modification Without Code Changes
Want to change how your agent approaches research? Edit the markdown. No redeployment needed.
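A minimal sketch of why no redeploy is needed, assuming the orchestrator reads the skill file from disk on every call instead of caching it at startup. The helper names `load_skill_text` and `demo_live_edit` and the temporary directory are illustrative, not part of my actual agent code.

```python
import tempfile
from pathlib import Path


def load_skill_text(skills_dir: Path, skill_name: str) -> str:
    """Read the skill file fresh from disk on every call (no caching)."""
    skill_file = skills_dir / f"{skill_name}.md"
    if not skill_file.exists():
        raise FileNotFoundError(f"Skill not found: {skill_name}")
    return skill_file.read_text(encoding="utf-8")


def demo_live_edit() -> tuple[str, str]:
    """Edit a skill between two calls and return what each call saw."""
    with tempfile.TemporaryDirectory() as tmp:
        skills_dir = Path(tmp)
        skill_file = skills_dir / "research.md"

        skill_file.write_text("## Steps\n1. Search web\n", encoding="utf-8")
        before = load_skill_text(skills_dir, "research")

        # Simulate a human editing the markdown while the agent process runs.
        skill_file.write_text(
            "## Steps\n1. Search internal docs first\n", encoding="utf-8"
        )
        after = load_skill_text(skills_dir, "research")

    return before, after


if __name__ == "__main__":
    before, after = demo_live_edit()
    print(before)
    print(after)
```

The trade-off is one extra file read per skill call. If the loader cached skill text at startup instead, edits would only take effect after a restart.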
The Code-Based Orchestration Layer
Files alone fail at execution guarantees. That’s where code comes in.
What Code Handles Better

    Files:                  Code:
    ────────────────────────────────────────────
    Knowledge storage       State machine management
    Skill definitions       Retry with backoff
    Context/prompts         Error classification
    Human readability       Structured logging
    Git versioning          Input validation
                            Output parsing
                            Timeout handling

A Practical Orchestration Example
I built a simple orchestrator that loads skills from files but executes them with proper error handling:

    from pathlib import Path
    from typing import TypedDict
    import time


    class AgentState(TypedDict):
        task: str
        current_skill: str
        context: dict
        results: list
        errors: list


    class HybridOrchestrator:
        def __init__(self, skills_dir: str):
            self.skills_dir = Path(skills_dir)
            self.max_retries = 3
            self.retry_delay = 1.0  # seconds; doubled after each failed attempt

        def load_skill(self, skill_name: str) -> dict:
            """Load skill definition from markdown file."""
            skill_path = self.skills_dir / f"{skill_name}.md"

            if not skill_path.exists():
                raise FileNotFoundError(f"Skill not found: {skill_name}")

            content = skill_path.read_text()
            return self._parse_skill_markdown(content)

        def _parse_skill_markdown(self, content: str) -> dict:
            """Extract structured data from markdown skill file."""
            # Simple parser - every '## ' heading starts a new section
            sections = {}
            current_section = None

            for line in content.split('\n'):
                if line.startswith('## '):
                    current_section = line[3:].strip().lower().replace(' ', '_')
                    sections[current_section] = []
                elif current_section and line.strip():
                    sections[current_section].append(line.strip())

            return {
                'purpose': '\n'.join(sections.get('purpose', [])),
                'steps': sections.get('steps', []),
                'output_format': '\n'.join(sections.get('output_format', []))
            }

        def execute_skill(self, state: AgentState) -> AgentState:
            """Execute skill with retry logic and error handling."""
            skill = self.load_skill(state['current_skill'])

            for attempt in range(self.max_retries):
                try:
                    result = self._run_skill(skill, state['context'])
                    return {
                        **state,
                        'results': state['results'] + [result],
                        'errors': state['errors']
                    }
                except Exception as e:
                    if attempt == self.max_retries - 1:
                        # Out of retries: record the failure in state instead of raising
                        return {
                            **state,
                            'errors': state['errors'] + [{
                                'skill': state['current_skill'],
                                'error': str(e),
                                'attempts': attempt + 1
                            }]
                        }
                    # Exponential backoff: 1s, 2s, 4s, ...
                    time.sleep(self.retry_delay * (2 ** attempt))

            # Only reached if max_retries is 0
            return state

        def _run_skill(self, skill: dict, context: dict) -> dict:
            """Actual skill execution - calls LLM here."""
            # Implementation details depend on your LLM client
            raise NotImplementedError("Wire up your LLM client here")

This gives me retry logic with exponential backoff, errors recorded in state instead of swallowed, and structured state management. But the skill itself is defined in a markdown file I can edit anytime.
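The retry loop above treats every exception the same way. One way to add the error classification listed in the comparison earlier is to retry only failures that look transient and fail fast on everything else. This is a minimal sketch: `RetryableError`, `is_retryable`, and `run_with_retry` are hypothetical names, and which exceptions count as transient depends on your LLM client.

```python
import time
from typing import Callable


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. rate limits)."""


def is_retryable(error: Exception) -> bool:
    """Classify an error: transient failures retry, everything else fails fast."""
    return isinstance(error, (RetryableError, TimeoutError, ConnectionError))


def run_with_retry(
    func: Callable[[], dict],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call func, retrying only transient errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as error:
            last_attempt = attempt == max_retries - 1
            if last_attempt or not is_retryable(error):
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_retries must be at least 1")
```

Passing `sleep` as a parameter is only there so the backoff can be tested without waiting. A bad prompt template raising `KeyError` now surfaces on the first attempt instead of after three slow retries.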
Why This Beats Pure Frameworks
I tried building the same agent three ways:
Attempt 1: Pure LangChain

    from langchain.agents import initialize_agent, Tool
    from langchain_openai import ChatOpenAI
    from langchain.memory import ConversationBufferMemory
    from langchain.chains import LLMChain

    # Define tools (search and summarize are plain Python functions defined elsewhere)
    tools = [
        Tool(name="search", func=search, description="Search the web"),
        Tool(name="summarize", func=summarize, description="Summarize text"),
    ]

    # Create agent with memory
    llm = ChatOpenAI(model="gpt-4")
    memory = ConversationBufferMemory(memory_key="chat_history")

    agent = initialize_agent(
        tools,
        llm,
        agent="conversational-react-description",
        memory=memory,
        verbose=True
    )

    # When something goes wrong, good luck debugging
    result = agent.run("Research AI agent architectures")

Problems I faced:
- Error messages referenced internal classes I’d never heard of
- Adding a simple step required understanding the chain abstraction
- Memory management was opaque
- Customizing behavior meant overriding framework methods
Attempt 2: Pure File-Based

    agent/
    ├── system-prompt.md
    ├── tools/
    │   ├── search.md
    │   └── summarize.md
    └── memory/
        └── session.md

Problems I faced:
- No retry logic when API calls failed
- No way to track which step was running
- Debugging meant reading raw markdown
- No structured error handling
Attempt 3: Hybrid

    agent/
    ├── skills/              # File-based (human readable)
    │   ├── research.md
    │   ├── summarize.md
    │   └── publish.md
    ├── context/             # File-based (project context)
    │   ├── goals.md
    │   └── constraints.md
    └── orchestrator.py      # Code-based (execution guarantees)

What worked:
- Debugging meant checking both files and logs
- Adding skills meant creating markdown files
- Error handling was explicit in code
- State management was transparent
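To make "transparent" concrete, here is a minimal sketch of state that lives in a file I can open at any time. It assumes the state is stored as JSON in a hypothetical `state/current-task.json` (the diagram earlier shows a markdown state file; JSON just keeps parsing trivial). `save_state` and `load_state` are illustrative names.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(state_dir: Path, state: dict) -> Path:
    """Write state as pretty-printed JSON via a temp file, then swap it in."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / "current-task.json"  # hypothetical file name
    fd, tmp_name = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
    # Replace in one step so a crash never leaves a half-written state file.
    os.replace(tmp_name, target)
    return target


def load_state(state_dir: Path) -> dict:
    """Return the last saved state, or an empty state if none exists yet."""
    target = state_dir / "current-task.json"
    if not target.exists():
        return {"task": "", "current_skill": "", "results": [], "errors": []}
    return json.loads(target.read_text(encoding="utf-8"))
```

Because the file is sorted and indented, it diffs cleanly in git alongside the skill files.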
Real World Implementation
Here’s how I structure my agents now:

    import json
    import logging
    import time
    from pathlib import Path

    logger = logging.getLogger(__name__)


    class HybridAgent:
        def __init__(self, agent_dir: str):
            self.base_path = Path(agent_dir)
            self.skills_path = self.base_path / "skills"
            self.context_path = self.base_path / "context"
            self.state: dict = {}

        def load_context(self) -> dict:
            """Load all context files into memory."""
            context = {}
            for file in self.context_path.glob("*.md"):
                context[file.stem] = file.read_text()
            return context

        def get_skill(self, skill_name: str) -> str:
            """Read skill definition from file."""
            skill_file = self.skills_path / f"{skill_name}.md"
            if not skill_file.exists():
                raise ValueError(f"Unknown skill: {skill_name}")
            return skill_file.read_text()

        def run_skill(self, skill_name: str, **kwargs) -> dict:
            """Execute a skill with orchestration."""
            skill_prompt = self.get_skill(skill_name)
            context = self.load_context()

            # Merge context into skill execution (kwargs override file context)
            full_context = {**context, **kwargs}

            # Execute with retry logic
            result = self._execute_with_retry(
                skill_prompt,
                full_context,
                max_retries=3
            )

            # Update state
            self.state[f"last_{skill_name}"] = result
            return result

        def _execute_with_retry(
            self,
            prompt: str,
            context: dict,
            max_retries: int
        ) -> dict:
            """Code-based execution guarantees."""
            for attempt in range(max_retries):
                try:
                    return self._call_llm(prompt, context)
                except Exception as e:
                    if attempt == max_retries - 1:
                        # Log error with context, then let the caller see it
                        self._log_error(e, prompt, context, attempt)
                        raise
                    time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...

            # Only reached if max_retries is 0
            return {}

        def _call_llm(self, prompt: str, context: dict) -> dict:
            """Call your LLM of choice here."""
            # Implementation depends on your LLM client
            raise NotImplementedError("Wire up your LLM client here")

        def _log_error(self, error, prompt, context, attempt):
            """Structured error logging."""
            logger.error(json.dumps({
                "error": str(error),
                "attempt": attempt + 1,  # 1-based count of attempts made
                "context_keys": list(context.keys()),
                "prompt_length": len(prompt)
            }))

When to Use More Code vs. More Files
I’ve found a simple heuristic:
Put it in a FILE when:
- A human needs to read or edit it
- You want version control history
- It describes WHAT the agent should do
- It changes frequently during development
Put it in CODE when:
- The agent needs to execute it reliably
- You need error handling and retry logic
- It describes HOW the agent should behave
- Multiple skills share the same logic
Common Mistakes I Made
Mistake 1: Over-Abstracting the File Format
I initially tried to create a complex YAML schema for skill files. Then I realized: the point is human readability. Markdown is enough.
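Keeping the format as plain markdown doesn't rule out a sanity check in code. A minimal sketch, assuming the same `## ` section convention as the orchestrator's parser; the required section list and the name `missing_sections` are my illustration, not a fixed schema.

```python
REQUIRED_SECTIONS = ("purpose", "steps", "output_format")  # assumed minimum


def parse_sections(markdown: str) -> dict[str, list[str]]:
    """Group non-empty lines under their '## ' heading, keyed in snake_case."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower().replace(" ", "_")
            sections[current] = []
        elif current and line.strip():
            sections[current].append(line.strip())
    return sections


def missing_sections(markdown: str) -> list[str]:
    """Return required sections that are absent or have no content."""
    sections = parse_sections(markdown)
    return [name for name in REQUIRED_SECTIONS if not sections.get(name)]
```

A check like this would flag the research skill shown earlier: its Output Format section is followed directly by nested `## ` headings, which a line-based parser treats as new sections, so `output_format` reads as empty. Demoting the nested headings to `###` is one way around that.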
Mistake 2: Putting Too Much in Code
I wrote a state machine with 47 states. It was impossible to understand. Now I keep orchestration simple and put complexity in the skill definitions.
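For contrast, a sketch of what "simple" can look like: four states and an explicit transition table. The state names and the `advance` helper are assumptions for illustration, not the states I actually use.

```python
from enum import Enum


class Phase(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Allowed moves; anything not listed here is rejected.
TRANSITIONS = {
    Phase.PENDING: {Phase.RUNNING},
    Phase.RUNNING: {Phase.DONE, Phase.FAILED},
    Phase.FAILED: {Phase.RUNNING},  # a retry puts the task back in RUNNING
    Phase.DONE: set(),
}


def advance(current: Phase, target: Phase) -> Phase:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(
            f"Illegal transition: {current.value} -> {target.value}"
        )
    return target
```

The whole machine fits on one screen, and everything skill-specific stays in the markdown.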
Mistake 3: No Separation Between Skill and Execution
My first hybrid attempt mixed skill prompts with execution code. The files became unreadable. Now files are pure skill definitions, code is pure orchestration.
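One way to keep that separation is to assemble the final prompt in code, so the skill file never contains template logic. A minimal sketch; the prompt layout and the `build_prompt` name are assumptions, and your LLM client may want a different structure.

```python
def build_prompt(skill_text: str, context: dict[str, str], task: str) -> str:
    """Compose the final prompt in code; the skill file stays plain markdown."""
    # Sort context entries so the prompt is stable across runs.
    context_block = "\n\n".join(
        f"### {name}\n{body.strip()}" for name, body in sorted(context.items())
    )
    return (
        f"{skill_text.strip()}\n\n"
        f"## Project Context\n{context_block}\n\n"
        f"## Task\n{task.strip()}\n"
    )
```

The skill author only ever edits markdown. How context and task get attached is a code decision that applies to every skill at once.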
The Hybrid Mindset
The debate between frameworks and file trees creates a false choice. The best architecture uses both:

    Files for:              Code for:
    ────────────────────────────────────────────
    Knowledge storage       State management
    Skill definitions       Error handling
    Project context         Retry logic
    Human collaboration     Structured logging
    Git history             Validation
    Prompt templates        Output parsing

This hybrid approach gives you:
- Simplicity: Files are easy to read and edit
- Reliability: Code handles execution edge cases
- Flexibility: Change skills without touching code
- Debuggability: See both the plan (files) and execution (logs)
- Maintainability: Non-developers can modify agent behavior
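Output parsing sits in the code column but isn't shown in the listings in this article. A minimal sketch, assuming the skill asks the model to reply with a JSON object and that `summary` is a required key; both are illustrative choices, and `parse_llm_json` is a hypothetical helper.

```python
import json


def parse_llm_json(raw: str, required_keys: tuple[str, ...] = ("summary",)) -> dict:
    """Parse the span from the first '{' to the last '}' and check its keys."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError as error:
        raise ValueError(f"Model output is not valid JSON: {error}") from error
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Model output is missing keys: {missing}")
    return data
```

Raising a single exception type for every parse failure lets the orchestrator handle a malformed reply the same way it handles any other failed attempt.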
Summary
I abandoned pure frameworks because they hide too much. I abandoned pure file-based approaches because they guarantee too little. The hybrid architecture gives me both transparency and reliability.
Start with this principle: if a human needs to read or edit it, put it in a file. If an agent needs to execute or manage it, put it in code. The intersection is where AI agent architecture works best.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Reddit discussion on AI agent architecture
- LangGraph Documentation
- Claude Code Skills System
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!