Why I Abandoned Pure Frameworks for Hybrid AI Agent Architecture

Mar 21, 2026

The LangChain Fatigue

I spent three months building AI agents with LangChain. Every time something broke, I had to dig through layers of abstractions just to find the actual problem.

One day my agent kept failing silently. I traced the issue through six wrapper classes before finding the root cause: a simple string formatting error that the framework swallowed and re-wrapped into a generic “AgentExecutionException.”

That’s when I realized the framework was hiding problems instead of helping me solve them.

I tried the opposite approach: pure file-based agents. Just markdown files and prompts. That was simpler, but I lost execution guarantees. Tasks would start but never complete. There was no retry logic, no error handling, and no state management.

Then I found the middle ground that actually works.

What I Mean by “Hybrid Architecture”

The hybrid approach splits responsibilities:

Files store knowledge, skills, and context
Code handles orchestration, error handling, and state management

This isn’t about choosing between frameworks and files. It’s about using each for what it does best.

┌─────────────────────────────────────────────────────────────┐
│                    AI Agent System                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────┐    ┌─────────────────────────┐    │
│  │  FILE-BASED LAYER   │    │  CODE-BASED LAYER       │    │
│  │  (Human-Readable)   │    │  (Execution Guarantees) │    │
│  ├─────────────────────┤    ├─────────────────────────┤    │
│  │                     │    │                         │    │
│  │  skills/            │    │  State Management       │    │
│  │    research.md      │◄──►│  Retry Logic            │    │
│  │    code-review.md   │    │  Error Handling         │    │
│  │    publish.md       │    │  Logging/Monitoring     │    │
│  │                     │    │  Validation             │    │
│  │  context/           │    │                         │    │
│  │    project-goals.md │    └─────────────────────────┘    │
│  │    constraints.md   │                                    │
│  │                     │                                    │
│  │  state/             │                                    │
│  │    current-task.md  │                                    │
│  │                     │                                    │
│  └─────────────────────┘                                    │
│                                                             │
└─────────────────────────────────────────────────────────────┘

The key insight: files are for humans to read and edit. Code is for machines to execute reliably. The intersection is where this architecture shines.

The File-Based Memory Layer

Files give you three things that frameworks rarely do:

1. Immediate Readability

When my agent failed last week, I opened skills/research.md and saw exactly what the agent was supposed to do. No framework to decode.

# Skill: Research

## Purpose
Gather information from multiple sources and synthesize findings.

## Trigger Conditions
- User asks for research on a topic
- Project planning phase
- Knowledge gap identified

## Steps
1. Search web for primary sources
2. Extract key claims and evidence
3. Cross-reference with context7 docs
4. Synthesize into structured report

## Output Format
## Summary
[2-3 sentence overview]

## Key Findings
- Finding 1 with evidence
- Finding 2 with evidence

## Sources
- [Source 1](url)
- [Source 2](url)

2. Version Control and Collaboration

I can diff my agent’s “brain” across commits. When a behavior changes unexpectedly, I check git history.

git diff HEAD~1 skills/research.md

# Shows exactly what changed in agent behavior
# Try doing that with a database blob

3. Easy Modification Without Code Changes

Want to change how your agent approaches research? Edit the markdown. As long as the orchestrator reads skill files at run time, no redeployment is needed.
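A minimal sketch of why an edit takes effect without a redeploy, assuming the skill text is read from disk on every call rather than cached at startup. The names `read_skill` and `demo_edit_without_redeploy` are illustrative, not part of any library.

```python
import tempfile
from pathlib import Path


def read_skill(skills_dir: Path, skill_name: str) -> str:
    """Read the skill text from disk on every call (no caching)."""
    return (skills_dir / f"{skill_name}.md").read_text(encoding="utf-8")


def demo_edit_without_redeploy() -> tuple[str, str]:
    """Write a skill, read it, edit it, and read it again in the same process."""
    with tempfile.TemporaryDirectory() as tmp:
        skills_dir = Path(tmp)
        skill_file = skills_dir / "research.md"

        skill_file.write_text(
            "## Steps\n1. Search web for primary sources\n", encoding="utf-8"
        )
        before = read_skill(skills_dir, "research")

        # Simulate a human editing the markdown while the process keeps running.
        skill_file.write_text(
            "## Steps\n1. Search internal docs first\n", encoding="utf-8"
        )
        after = read_skill(skills_dir, "research")

    return before, after


before, after = demo_edit_without_redeploy()
print(before)
print(after)
```

The trade-off is one file read per skill execution. If the skill text were cached at startup instead, edits would only apply after a restart.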

The Code-Based Orchestration Layer

Files alone fail at execution guarantees. That’s where code comes in.

What Code Handles Better

Files:               Code:
────────────────────────────────────────────
Knowledge storage    State machine management
Skill definitions    Retry with backoff
Context/prompts      Error classification
Human readability    Structured logging
Git versioning       Input validation
                     Output parsing
                     Timeout handling
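To make the "Output parsing" and "Input validation" rows concrete, here is a minimal sketch that checks a model response against the sections the research skill asks for. It assumes the response is markdown with `## ` headings; `REQUIRED_SECTIONS`, `parse_report`, and `validate_report` are hypothetical names chosen for illustration.

```python
# Assumption: required headings mirror the "Output Format" of the research skill.
REQUIRED_SECTIONS = ("Summary", "Key Findings", "Sources")


def parse_report(text: str) -> dict[str, str]:
    """Split a markdown report into {heading: body} on '## ' headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def validate_report(text: str) -> dict[str, str]:
    """Return the parsed report, or raise ValueError naming missing or empty sections."""
    parsed = parse_report(text)
    missing = [name for name in REQUIRED_SECTIONS if not parsed.get(name)]
    if missing:
        raise ValueError("Report is missing sections: " + ", ".join(missing))
    return parsed


good = (
    "## Summary\nHybrid works.\n\n"
    "## Key Findings\n- Files are readable\n\n"
    "## Sources\n- [Source 1](url)\n"
)
print(validate_report(good)["Summary"])

try:
    validate_report("## Summary\nOnly a summary.\n")
except ValueError as error:
    print(error)
```

Raising on a malformed response lets the retry logic treat bad output like any other failure instead of passing it downstream.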

A Practical Orchestration Example

I built a simple orchestrator that loads skills from files but executes them with proper error handling:

from typing import TypedDict
from pathlib import Path
import time

class AgentState(TypedDict):
    task: str
    current_skill: str
    context: dict
    results: list
    errors: list

class HybridOrchestrator:
    def __init__(self, skills_dir: str):
        self.skills_dir = Path(skills_dir)
        self.max_retries = 3
        self.retry_delay = 1.0

    def load_skill(self, skill_name: str) -> dict:
        """Load skill definition from markdown file."""
        skill_path = self.skills_dir / f"{skill_name}.md"

        if not skill_path.exists():
            raise FileNotFoundError(f"Skill not found: {skill_name}")

        content = skill_path.read_text()
        return self._parse_skill_markdown(content)

    def _parse_skill_markdown(self, content: str) -> dict:
        """Extract structured data from markdown skill file."""
        # Simple parser - extract sections
        sections = {}
        current_section = None

        for line in content.split('\n'):
            if line.startswith('## '):
                current_section = line[3:].strip().lower().replace(' ', '_')
                sections[current_section] = []
            elif current_section and line.strip():
                sections[current_section].append(line.strip())

        return {
            'purpose': '\n'.join(sections.get('purpose', [])),
            'steps': sections.get('steps', []),
            'output_format': '\n'.join(sections.get('output_format', []))
        }

    def execute_skill(self, state: AgentState) -> AgentState:
        """Execute skill with retry logic and error handling."""
        skill = self.load_skill(state['current_skill'])

        for attempt in range(self.max_retries):
            try:
                result = self._run_skill(skill, state['context'])
                return {
                    **state,
                    'results': state['results'] + [result],
                    'errors': state['errors']
                }
            except Exception as e:
                if attempt == self.max_retries - 1:
                    return {
                        **state,
                        'errors': state['errors'] + [{
                            'skill': state['current_skill'],
                            'error': str(e),
                            'attempts': attempt + 1
                        }]
                    }
                time.sleep(self.retry_delay * (2 ** attempt))

        return state

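    def _run_skill(self, skill: dict, context: dict) -> dict:
        """Actual skill execution - calls LLM here."""
        # Implementation details depend on your LLM client.
        # Raise instead of returning None so a missing implementation
        # shows up in state['errors'] rather than as an empty result.
        raise NotImplementedError("Connect _run_skill to your LLM client")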

This gives me retries with exponential backoff, a structured record of the final failure, and explicit state management. The skill itself is still defined in a markdown file I can edit anytime.
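The orchestrator above retries every exception the same way. A sketch of one way to add error classification, so that transient failures are retried and everything else fails fast. Treating `TimeoutError` and `ConnectionError` as the transient set is an assumption; a real LLM client will raise its own exception types. The helper names are illustrative.

```python
import time
from typing import Any, Callable

# Assumption: these built-in exception types stand in for transient failures.
RETRYABLE = (TimeoutError, ConnectionError)


def classify_error(error: Exception) -> str:
    """Label an error as 'retryable' or 'fatal'."""
    return "retryable" if isinstance(error, RETRYABLE) else "fatal"


def run_with_classified_retry(
    fn: Callable[[], Any],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn, retrying only retryable errors with exponential backoff."""
    errors = []
    for attempt in range(max_retries):
        try:
            return {"result": fn(), "errors": errors}
        except Exception as e:
            kind = classify_error(e)
            errors.append({"error": str(e), "kind": kind, "attempt": attempt + 1})
            # Stop on a fatal error or once the retry budget is spent.
            if kind == "fatal" or attempt == max_retries - 1:
                break
            sleep(base_delay * (2 ** attempt))
    return {"result": None, "errors": errors}


def bad_template() -> str:
    raise ValueError("bad prompt template")


# A fatal error is recorded once and never retried.
print(run_with_classified_retry(bad_template, sleep=lambda _: None))
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.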

Why This Beats Pure Frameworks

I tried building the same agent three ways:

Attempt 1: Pure LangChain

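# Legacy LangChain agent API (deprecated in newer releases)
from langchain.agents import initialize_agent, Tool
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory

# Placeholder tool functions (hypothetical; swap in real implementations)
def search(query: str) -> str:
    return f"Search results for: {query}"

def summarize(text: str) -> str:
    return text[:200]

# Define tools
tools = [
    Tool(name="search", func=search, description="Search the web"),
    Tool(name="summarize", func=summarize, description="Summarize text"),
]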

# Create agent with memory
llm = ChatOpenAI(model="gpt-4")
memory = ConversationBufferMemory(memory_key="chat_history")

agent = initialize_agent(
    tools,
    llm,
    agent="conversational-react-description",
    memory=memory,
    verbose=True
)

# When something goes wrong, good luck debugging
result = agent.run("Research AI agent architectures")

Problems I faced:

Error messages referenced internal classes I’d never heard of
Adding a simple step required understanding the chain abstraction
Memory management was opaque
Customizing behavior meant overriding framework methods

Attempt 2: Pure File-Based

agent/
├── system-prompt.md
├── tools/
│   ├── search.md
│   └── summarize.md
└── memory/
    └── session.md

Problems I faced:

No retry logic when API calls failed
No way to track which step was running
Debugging meant reading raw markdown
No structured error handling

Attempt 3: Hybrid

agent/
├── skills/              # File-based (human readable)
│   ├── research.md
│   ├── summarize.md
│   └── publish.md
├── context/             # File-based (project context)
│   ├── goals.md
│   └── constraints.md
└── orchestrator.py      # Code-based (execution guarantees)

What worked:

Debugging meant comparing the plan in the files with the execution in the logs
Adding skills meant creating markdown files
Error handling was explicit in code
State management was transparent
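One way to keep state transparent is to have the code write it to a human-readable file such as `state/current-task.md`. The sketch below assumes flat string values stored as markdown bullets; `save_state` and `load_state` are hypothetical helpers, not part of the orchestrator shown earlier.

```python
import os
import tempfile
from pathlib import Path


def save_state(state_file: Path, state: dict[str, str]) -> None:
    """Write state as a markdown bullet list, then swap it into place."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    lines = ["# Current Task", ""]
    lines += [f"- {key}: {value}" for key, value in state.items()]
    tmp_file = state_file.with_suffix(".tmp")
    tmp_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    # Replace in one step so a reader does not see a half-written file.
    os.replace(tmp_file, state_file)


def load_state(state_file: Path) -> dict[str, str]:
    """Read '- key: value' bullets back into a dict."""
    state: dict[str, str] = {}
    for line in state_file.read_text(encoding="utf-8").splitlines():
        if line.startswith("- ") and ": " in line:
            key, value = line[2:].split(": ", 1)
            state[key] = value
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state" / "current-task.md"
    save_state(path, {"task": "Research AI agent architectures", "status": "running"})
    print(path.read_text(encoding="utf-8"))
    print(load_state(path))
```

Because the state is plain markdown, it can be opened, diffed, and corrected by hand like any skill file.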

Real World Implementation

Here’s how I structure my agents now:

from pathlib import Path
import json

class HybridAgent:
    def __init__(self, agent_dir: str):
        self.base_path = Path(agent_dir)
        self.skills_path = self.base_path / "skills"
        self.context_path = self.base_path / "context"
        self.state: dict = {}

    def load_context(self) -> dict:
        """Load all context files into memory."""
        context = {}
        for file in self.context_path.glob("*.md"):
            context[file.stem] = file.read_text()
        return context

    def get_skill(self, skill_name: str) -> str:
        """Read skill definition from file."""
        skill_file = self.skills_path / f"{skill_name}.md"
        if not skill_file.exists():
            raise ValueError(f"Unknown skill: {skill_name}")
        return skill_file.read_text()

    def run_skill(self, skill_name: str, **kwargs) -> dict:
        """Execute a skill with orchestration."""
        skill_prompt = self.get_skill(skill_name)
        context = self.load_context()

        # Merge context into skill execution
        full_context = {**context, **kwargs}

        # Execute with retry logic
        result = self._execute_with_retry(
            skill_prompt,
            full_context,
            max_retries=3
        )

        # Update state
        self.state[f"last_{skill_name}"] = result
        return result

    def _execute_with_retry(
        self,
        prompt: str,
        context: dict,
        max_retries: int
    ) -> dict:
        """Code-based execution guarantees."""
        import time

        for attempt in range(max_retries):
            try:
                return self._call_llm(prompt, context)
            except Exception as e:
                if attempt == max_retries - 1:
                    # Log error with context
                    self._log_error(e, prompt, context, attempt)
                    raise
                time.sleep(2 ** attempt)

        return {}

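    def _call_llm(self, prompt: str, context: dict) -> dict:
        """Call your LLM of choice here."""
        # Implementation depends on your LLM client.
        # Raise instead of silently returning None.
        raise NotImplementedError("Connect _call_llm to your LLM client")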

    def _log_error(self, error, prompt, context, attempt):
        """Structured error logging."""
        import logging
        logger = logging.getLogger(__name__)
        logger.error(json.dumps({
            "error": str(error),
            # attempt is zero-based in the retry loop; log the human-readable count
            "attempts": attempt + 1,
            "context_keys": list(context.keys()),
            "prompt_length": len(prompt)
        }))

When to Use More Code vs. More Files

I’ve found a simple heuristic:

Put it in a FILE when:
- A human needs to read or edit it
- You want version control history
- It describes WHAT the agent should do
- It changes frequently during development

Put it in CODE when:
- The agent needs to execute it reliably
- You need error handling and retry logic
- It describes HOW the agent should behave
- Multiple skills share the same logic

Common Mistakes I Made

Mistake 1: Over-Abstracting the File Format

I initially tried to create a complex YAML schema for skill files. Then I realized: the point is human readability. Markdown is enough.

Mistake 2: Putting Too Much in Code

I wrote a state machine with 47 states. It was impossible to understand. Now I keep orchestration simple and put complexity in the skill definitions.
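For contrast, a minimal sketch of what "simple orchestration" can look like: four states and an explicit table of allowed transitions. The state names and the `transition` helper are assumptions for illustration, not the states from my original machine.

```python
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Every legal move is listed here; anything else is rejected.
ALLOWED = {
    TaskStatus.PENDING: {TaskStatus.RUNNING},
    TaskStatus.RUNNING: {TaskStatus.DONE, TaskStatus.FAILED},
    TaskStatus.FAILED: {TaskStatus.RUNNING},  # a retry re-enters RUNNING
    TaskStatus.DONE: set(),
}


def transition(current: TaskStatus, target: TaskStatus) -> TaskStatus:
    """Return the new status, or raise ValueError on an illegal move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target


status = TaskStatus.PENDING
status = transition(status, TaskStatus.RUNNING)
status = transition(status, TaskStatus.FAILED)
status = transition(status, TaskStatus.RUNNING)
status = transition(status, TaskStatus.DONE)
print(status.value)
```

The whole machine fits on one screen, and which skill runs in the `RUNNING` state is decided by the markdown files rather than by extra states.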

Mistake 3: No Separation Between Skill and Execution

My first hybrid attempt mixed skill prompts with execution code. The files became unreadable. Now files are pure skill definitions, and code is pure orchestration.

The Hybrid Mindset

The debate between frameworks and file trees creates a false choice. The best architecture uses both:

Files for:                    Code for:
────────────────────────────────────────────
Knowledge storage             State management
Skill definitions             Error handling
Project context               Retry logic
Human collaboration           Structured logging
Git history                   Validation
Prompt templates              Output parsing

This hybrid approach gives you:

Simplicity: Files are easy to read and edit
Reliability: Code handles execution edge cases
Flexibility: Change skills without touching code
Debuggability: See both the plan (files) and execution (logs)
Maintainability: Non-developers can modify agent behavior

Summary

I abandoned pure frameworks because they hide too much. I abandoned pure file-based approaches because they guarantee too little. The hybrid architecture gives me both transparency and reliability.

Start with this principle: if a human needs to read or edit it, put it in a file. If an agent needs to execute or manage it, put it in code. The intersection is where AI agent architecture works best.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit discussion on AI agent architecture
👨‍💻 LangGraph Documentation
👨‍💻 Claude Code Skills System

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!