
ReAct Framework: The Reasoning Pattern Behind Modern AI Agents

I spent three weeks building an AI agent that could answer questions about my codebase. It could think, reason, even generate beautiful Chain-of-Thought explanations. But when I asked it “What’s the most complex function in this project?”, it confidently told me about a function… that didn’t exist.

The problem? My agent could only think. It couldn’t act.

The Missing Piece

I had built my agent using pure Chain-of-Thought (CoT) prompting - the technique where you ask an LLM to “think step by step.” It worked beautifully for math problems and logic puzzles. But for real-world tasks? Useless.

agent-cot.js
// My naive Chain-of-Thought implementation
const answer = await llm.generate(`
Think step by step: ${question}
Let me think through this carefully...
`);
// Output: Confident hallucinations

The agent would generate impressive-sounding reasoning chains that had no connection to reality. It was like asking someone to describe a room they’d never entered - detailed, confident, and completely wrong.

I needed my agent to interact with the world. To read files. To run tests. To grep code. But how?

Discovering ReAct

Then I stumbled across a paper from Yao et al. titled “ReAct: Synergizing Reasoning and Acting in Language Models.” The title clicked: Reasoning + Acting.

┌─────────────────────────────────────────────────────┐
│  Pure Chain-of-Thought                              │
│                                                     │
│  Question → Think → Think → Think → Answer          │
│  (all internal, no grounding)                       │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│  ReAct                                              │
│                                                     │
│  Question → Think → Act → Observe → Think → ...     │
│              ↑_________________________|            │
│  (feedback loop with real world)                    │
└─────────────────────────────────────────────────────┘

The insight was deceptively simple: alternate between thinking and acting. Don’t just reason in isolation - let your reasoning guide actions, and let observations from those actions inform further reasoning.

The ReAct Loop in Practice

I rewrote my agent to follow the ReAct pattern:

react-loop.js
const reactLoop = async (agent, task) => {
  let state = {
    task,
    thoughts: [],
    actions: [],
    observations: []
  };

  while (!state.complete) {
    // Thought phase - reason about current state
    const thought = await agent.think(state);
    state.thoughts.push(thought);
    console.log(`Thought: ${thought.content}`);

    if (thought.concludes) {
      state.complete = true;
      return thought.answer;
    }

    // Action phase - decide what to do
    const action = await agent.decideAction(thought);
    state.actions.push(action);
    console.log(`Action: ${action.tool}(${action.input})`);

    // Observation phase - execute and observe
    const observation = await executeAction(action);
    state.observations.push(observation);
    console.log(`Observation: ${observation.result}`);

    // Update state for next iteration
    state = { ...state, lastObservation: observation };
  }
};

Now when I asked “What’s the most complex function?”, the agent would:

  1. Think: “I need to find the most complex function. First, I should list all files.”
  2. Act: Call listFiles("./src")
  3. Observe: See the file list
  4. Think: “Now I need to analyze each file for complexity…”
  5. Act: Call grepPattern("function", each file)
  6. Observe: Get function definitions
  7. …and so on until it found the real answer.
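The two tools used in the walkthrough above can be sketched in a few lines of Python. This is a minimal sketch and not the agent's actual tooling: `list_files` and `grep_pattern` are hypothetical stand-ins for `listFiles` and `grepPattern`, and the default `.js` suffix filter is an assumption.

```python
import re
from pathlib import Path


def list_files(root: str, suffix: str = ".js") -> list[str]:
    """Return every file under `root` that ends with `suffix`, sorted by path."""
    return sorted(str(p) for p in Path(root).rglob(f"*{suffix}") if p.is_file())


def grep_pattern(pattern: str, path: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for the lines in `path` matching `pattern`."""
    regex = re.compile(pattern)
    matches = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches
```

Both functions return plain data, so whatever they produce can be pasted straight into the next Observation.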

Why ReAct Works

The magic is in the feedback loop. Let me show you what happened when I traced through both approaches:

Pure Chain-of-Thought (what I had before):

User: What's the most complex function?
LLM: Let me think...
1. Complex functions have many lines
2. Complex functions have many conditionals
3. Looking at my training data, the most complex function is likely a sorting algorithm
4. Based on common patterns, I'd say `quicksort` is probably the most complex
5. Answer: The `quicksort` function is the most complex
Reality: No quicksort function exists in this codebase.

ReAct Pattern (what I built after):

User: What's the most complex function?
LLM:
Thought: I need to find all functions in the codebase first.
Action: findFunctions("./src")
Observation: Found 47 functions across 12 files.
Thought: Now I need to analyze complexity. Let me check cyclomatic complexity.
Action: analyzeComplexity(functions)
Observation: Top 3 by complexity:
- processPayment(): 23 cyclomatic complexity
- validateForm(): 18 cyclomatic complexity
- parseConfig(): 15 cyclomatic complexity
Thought: processPayment has highest complexity. Let me verify its details.
Action: readFile("payment.js")
Observation: function processPayment(order) { ... } // 156 lines
Thought: I have sufficient information to answer.
Answer: processPayment() with 23 cyclomatic complexity and 156 lines.
Reality: Correct - verified against actual codebase.

The difference? ReAct grounds reasoning in reality. Every thought can be validated through action. Every observation informs the next thought.

The Numbers Don’t Lie

The ReAct paper reported impressive benchmarks. On ALFWorld (a household task simulation), ReAct outperformed imitation and reinforcement learning baselines by an absolute 34 percentage points in success rate. On WebShop (online shopping tasks), it beat them by 10 points.

But here’s what really caught my attention: ReAct achieved this using only 1-2 in-context examples. No fine-tuning. No massive training runs. Just show the model a couple of examples and let it reason-act-observe.

Performance on ALFWorld benchmark:
┌────────────────────────────┬────────────┐
│ Method                     │ Success %  │
├────────────────────────────┼────────────┤
│ Random                     │ 7.3%       │
│ Imitation Learning         │ 28.1%      │
│ Reinforcement Learning     │ 37.1%      │
│ ReAct (1-2 examples)       │ 71.1%      │
└────────────────────────────┴────────────┘
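What "1-2 in-context examples" means in practice is roughly this: you prepend one or two worked Thought/Action/Observation trajectories to the new question. The sketch below is my own illustration, not the paper's prompt. `build_few_shot_prompt` and the instruction line are hypothetical, and the example trajectory reuses the illustrative numbers from the trace earlier in this article.

```python
EXAMPLE_TRAJECTORY = """\
Question: How many functions does ./src contain?
Thought: I should list the functions before counting them.
Action: findFunctions("./src")
Observation: Found 47 functions across 12 files.
Thought: The observation already contains the count.
Answer: 47 functions."""


def build_few_shot_prompt(question: str, examples: list[str]) -> str:
    """Prefix the new question with worked Thought/Action/Observation examples."""
    parts = ["Solve the task by interleaving Thought, Action, and Observation steps."]
    parts.extend(examples)
    # End on an open "Thought:" so the model continues in the same format.
    parts.append(f"Question: {question}\nThought:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt("What's the most complex function?", [EXAMPLE_TRAJECTORY])
```

Ending the prompt on an open `Thought:` is a design choice: it nudges the model to reason first rather than jump straight to an answer.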

How Modern Agents Use ReAct

Once I understood ReAct, I started seeing it everywhere. Claude Code, Cursor, GitHub Copilot: from the outside, each appears to run some variant of the same think-act-observe loop. The differences between these products aren’t in the core pattern - they’re in the engineering:

Tool Selection: What actions can the agent take? File operations, terminal commands, web searches?

Context Management: How do you maintain state across hundreds of thought-action-observation cycles?

Error Recovery: What happens when an action fails? Does the agent retry? Try a different approach?

Termination: When does the agent stop? How does it know it’s done?

modern-react-impl.py
class ModernReActAgent:
    """How production systems implement ReAct"""

    def __init__(self):
        self.tools = {
            "read_file": self.read_file,
            "write_file": self.write_file,
            "run_command": self.run_command,
            "grep": self.grep,
            # ... more tools
        }
        self.max_iterations = 100
        self.context_window = 200000  # tokens

    async def run(self, task: str) -> str:
        state = AgentState(task=task)

        for i in range(self.max_iterations):
            # Manage context - don't overflow
            if state.token_count > self.context_window * 0.8:
                state = self.summarize_state(state)

            # Thought
            thought = await self.think(state)

            # Check if we're done
            if thought.type == "conclusion":
                return thought.answer

            # Action
            action = thought.action
            if action.tool not in self.tools:
                # Error recovery - tell agent to try different approach
                observation = Observation(
                    error=f"Unknown tool: {action.tool}",
                    available_tools=list(self.tools.keys())
                )
            else:
                # Execute with error handling
                try:
                    observation = await self.tools[action.tool](action.input)
                except Exception as e:
                    observation = Observation(
                        error=str(e),
                        suggestion="Try a different approach"
                    )

            # Update state for next iteration
            state.add(thought, action, observation)

        return "Max iterations reached without conclusion"
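The class above leans on `AgentState`, `Observation`, and `summarize_state` without defining them. Here is one possible shape for them, as a sketch under loud assumptions: the token count is a crude four-characters-per-token estimate, and "summarizing" simply drops older steps instead of asking an LLM to condense them. Field names beyond those used above are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Observation:
    """What the agent sees after an action: a result or an error."""
    result: Any = None
    error: Optional[str] = None
    suggestion: Optional[str] = None
    available_tools: list[str] = field(default_factory=list)


@dataclass
class AgentState:
    task: str
    steps: list[tuple[Any, Any, Any]] = field(default_factory=list)
    summary: str = ""

    def add(self, thought: Any, action: Any, observation: Any) -> None:
        """Record one full Thought-Action-Observation cycle."""
        self.steps.append((thought, action, observation))

    @property
    def token_count(self) -> int:
        # Crude assumption: about four characters per token.
        text = self.task + self.summary + "".join(str(step) for step in self.steps)
        return len(text) // 4


def summarize_state(state: AgentState, keep_last: int = 5) -> AgentState:
    """Replace older steps with a short note and keep the most recent ones."""
    dropped = len(state.steps) - keep_last
    if dropped <= 0:
        return state  # nothing to trim
    note = f"{state.summary}[{dropped} earlier steps omitted] "
    return AgentState(task=state.task, steps=state.steps[-keep_last:], summary=note)
```

A real system would likely summarize the dropped steps rather than discard them, but even this naive version keeps the loop from overflowing the context window.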

Common Misconceptions

After working with ReAct for months, I’ve seen several misconceptions repeatedly:

1. “ReAct is just tool calling”

No. Tool calling is the Action part. The Thought-Observation cycle is what makes it ReAct. Without the reasoning component, you just have a function-calling wrapper.

2. “ReAct is experimental”

Quite the opposite. ReAct-style loops have become the de facto pattern for production AI agents. Many of the “new” agent frameworks you hear about? They’re ReAct with different tooling.

3. “ReAct is too slow for production”

It’s true that ReAct requires multiple LLM calls. But for complex tasks, it’s actually faster because it avoids hallucination-induced failures. A correct answer in 10 seconds beats a wrong answer in 1 second.

4. “My agent doesn’t need reasoning, just actions”

Try debugging why your agent took a wrong action. Without the thought traces, you’re flying blind. ReAct gives you visibility into the decision process.
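One way to keep those thought traces around for debugging is to record every cycle as a small structured record. The sketch below is an illustration, not a standard format: `TraceStep`, `dump_trace`, and `first_failed_step` are hypothetical names, and treating any observation that starts with "Error" as a failure is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TraceStep:
    """One Thought-Action-Observation cycle, kept for later inspection."""
    turn: int
    thought: str
    action: str
    observation: str


def dump_trace(steps: list[TraceStep]) -> str:
    """Serialize a run as JSON Lines, one step per line."""
    return "\n".join(json.dumps(asdict(step)) for step in steps)


def first_failed_step(steps: list[TraceStep]) -> Optional[TraceStep]:
    """Return the first step whose observation reports an error, if any."""
    for step in steps:
        if step.observation.startswith("Error"):
            return step
    return None
```

When a run goes wrong, the thought on the failed step (and the one just before it) usually tells you why the agent chose that action.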

The Simplest ReAct Implementation

Here’s a minimal working example you can try:

minimal-react.py
import json


def react_loop(prompt: str, llm, tools: dict, max_turns: int = 10):
    """
    Minimal ReAct implementation.

    Args:
        prompt: The task/question
        llm: Function that takes a list of chat messages and returns the LLM response text
        tools: Dict mapping tool names to functions
        max_turns: Maximum iterations
    """
    messages = [
        {"role": "system", "content": """You are a ReAct agent. Follow this format:
Thought: [your reasoning about what to do next]
Action: {"tool": "tool_name", "input": "tool_input"}
OR
Thought: [your final reasoning]
Answer: [your final answer]
Available tools: """ + ", ".join(tools.keys())},
        {"role": "user", "content": prompt}
    ]
    for _ in range(max_turns):
        response = llm(messages)
        messages.append({"role": "assistant", "content": response})
        # Check for final answer
        if "Answer:" in response:
            return response.split("Answer:", 1)[1].strip()
        # Extract and execute action
        if "Action:" in response:
            try:
                action_str = response.split("Action:")[1].split("\n")[0].strip()
                action = json.loads(action_str)
                result = tools[action["tool"]](action["input"])
                messages.append({
                    "role": "user",
                    "content": f"Observation: {result}"
                })
            except Exception as e:
                # Feed the error back so the model can correct itself
                messages.append({
                    "role": "user",
                    "content": f"Observation: Error - {str(e)}"
                })
        else:
            # Neither an action nor an answer: remind the model of the format
            messages.append({
                "role": "user",
                "content": "Observation: Error - reply with an Action or an Answer"
            })
    return "Max turns reached without answer"


# Usage example
def my_llm(messages):
    # Placeholder: replace with your LLM call.
    # Returns a canned reply so the example runs end to end.
    return "Thought: This is a stub.\nAnswer: (replace my_llm with a real LLM call)"


def search(query):
    # Replace with your search implementation
    return f"Results for: {query}"


result = react_loop(
    "What is the capital of France?",
    my_llm,
    {"search": search}
)
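The string splitting in the loop above is deliberately simple, and it is brittle: the word "Answer:" in the middle of a thought ends the run early, and malformed JSON only surfaces as a generic exception. A slightly stricter parser might look like the sketch below. `parse_step` is a hypothetical helper, not part of any library, and it assumes the same `Thought:` / `Action:` / `Answer:` line format as the system prompt above.

```python
import json
import re

# Both markers must start a line, so "Answer:" inside a thought is ignored.
_ACTION = re.compile(r"^Action:\s*(\{.*\})\s*$", re.MULTILINE)
_ANSWER = re.compile(r"^Answer:\s*(.+)", re.MULTILINE | re.DOTALL)


def parse_step(response: str) -> tuple[str, object]:
    """Classify a response as ("action", dict), ("answer", str) or ("invalid", reason)."""
    action_match = _ACTION.search(response)
    if action_match:
        try:
            action = json.loads(action_match.group(1))
        except json.JSONDecodeError as exc:
            return "invalid", f"Action is not valid JSON: {exc.msg}"
        if not isinstance(action, dict) or "tool" not in action:
            return "invalid", 'Action must be a JSON object with a "tool" key'
        return "action", action
    answer_match = _ANSWER.search(response)
    if answer_match:
        return "answer", answer_match.group(1).strip()
    return "invalid", "Expected a line starting with Action: or Answer:"
```

Checking for an action before an answer is a design choice: if the model writes both in one response, the answer was produced before it saw any observation, so it seems safer to run the action first.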

When ReAct Shines (And When It Doesn’t)

ReAct excels at:

  • Multi-step tasks requiring external data
  • Tasks where verification is possible
  • Debugging and exploration
  • Complex decision trees

ReAct struggles with:

  • Simple, single-step queries (overhead isn’t worth it)
  • Tasks requiring creative insight (reasoning can’t substitute for creativity)
  • Extremely long chains (context window limits)

The Pattern Behind the Pattern

Understanding ReAct changed how I think about AI agents. It’s not about the tools or the prompts - it’s about the loop. The cycle of reasoning, acting, and observing creates a system that can correct itself, adapt to unexpected results, and maintain focus on the goal.

Every time you see an AI agent “thinking” before it acts, that’s ReAct. Every time you see it read a file, process it, and decide what to do next - ReAct. It’s become so ubiquitous that we forget it’s a pattern at all. It’s just… how agents work.

But understanding why it works gives you the power to debug it, improve it, and build better systems on top of it. The Thought-Action-Observation loop isn’t just clever engineering - it’s the fundamental structure that makes AI agents useful in the real world.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
