How to Implement Deterministic Routing for AI Agents
Problem
After 6 months of running AI agents in production, I kept hitting the same issue: unpredictable routing. My agent would sometimes call the wrong tool, or call tools in the wrong order, and I couldn’t figure out why.
Here’s what my agent code looked like:
from langchain.agents import create_react_agent

agent = create_react_agent(
    llm=llm,
    tools=[search_tool, calculator_tool, database_tool],
    prompt="You are a helpful assistant. Use tools as needed.",
)

result = agent.invoke({"input": user_query})

When something went wrong, I was stuck reading model outputs trying to reconstruct what happened. Same input could produce different routing paths across runs. I had no visibility into WHY the model chose a specific tool.
Environment
- Python 3.12
- LangGraph for state machines
- PostgreSQL for execution traces
- Pytest for testing routing logic
Understanding the Root Cause
The problem wasn’t the model. The problem was asking the LLM to handle control flow.
Most agent frameworks default to LLM-driven routing. The model receives available tools and must decide:
- Which tool to call next
- What arguments to pass
- Whether to call another tool or return to the user
This creates several production issues:
- Non-determinism: Same input produces different routing paths
- Opaque debugging: Reading model outputs to reconstruct decisions
- Hidden failures: Model calls wrong tools in wrong order
- Temperature sensitivity: Even low temperature introduces variance
- Prompt coupling: Routing logic lives in natural language, not code
A Reddit post crystallized this for me: “Pull routing out of the LLM entirely. Tool selection by structured rules before the LLM is ever consulted. The model handles reasoning. It does not handle control flow.”
Solution: Deterministic Routing
Deterministic routing separates concerns:
ROUTING LOGIC (deterministic, explicit) -> TOOL EXECUTION -> LLM REASONING (within tool)

Instead of asking “which tool should I use?”, my code explicitly defines:
- State machine transitions: What states exist and valid transitions
- Routing rules: Code-based conditions that determine next tool
- Execution order: Predefined sequences for common workflows
- Fallback paths: Explicit handling when conditions aren’t met
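To make the first item concrete, here is a minimal sketch of a transition table with a path check. The table and the helper name `is_valid_path` are illustrative assumptions, not part of any framework; the node names simply mirror the tools used in this post.

```python
# Hypothetical transition table: which node may follow which.
# This is an illustrative assumption, not a LangGraph structure.
VALID_TRANSITIONS: dict[str, set[str]] = {
    "router": {"calculator", "search", "database", "llm_fallback"},
    "calculator": {"synthesize"},
    "search": {"synthesize"},
    "database": {"synthesize"},
    "llm_fallback": {"synthesize"},
    "synthesize": {"END"},
}


def is_valid_path(path: list[str]) -> bool:
    """Return True if every consecutive pair in `path` is an allowed transition."""
    return all(
        nxt in VALID_TRANSITIONS.get(current, set())
        for current, nxt in zip(path, path[1:])
    )


print(is_valid_path(["router", "calculator", "synthesize", "END"]))  # True
print(is_valid_path(["router", "synthesize"]))  # False
```

A check like this can run in a test or against a recorded trace, so a path the table does not allow shows up as a failure instead of going unnoticed.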
Step 1: Define Explicit State
from typing import TypedDict

class AgentState(TypedDict):
    query: str
    query_type: str | None
    tool_result: dict | None
    final_answer: str | None
    execution_trace: list

Step 2: Create Deterministic Router
def route_query(state: AgentState) -> str:
    """Deterministic routing based on query analysis."""
    query = state["query"].lower()

    # Explicit rules - version controlled, testable
    if any(word in query for word in ["calculate", "math", "sum", "average"]):
        return "calculator"
    elif any(word in query for word in ["search", "find", "look up"]):
        return "search"
    elif any(word in query for word in ["database", "record", "user"]):
        return "database"
    else:
        return "llm_fallback"

This is pure Python. I can unit test it. I can version control it. I can see exactly why a query routes to a specific tool.
Step 3: Build State Machine with LangGraph
from langgraph.graph import StateGraph, END
# Build state machine
graph = StateGraph(AgentState)

# Add nodes (tools)
# route_query_node is assumed to wrap route_query and write its result to state["query_type"]
graph.add_node("router", route_query_node)
graph.add_node("calculator", calculator_tool)
graph.add_node("search", search_tool)
graph.add_node("database", database_tool)
graph.add_node("llm_fallback", llm_reasoning_node)
graph.add_node("synthesize", synthesize_answer)

# Every run starts at the router
graph.set_entry_point("router")

# Define explicit edges: branch on the query_type the router wrote to state.
# The keys must match the strings route_query returns, including "llm_fallback".
graph.add_conditional_edges(
    "router",
    lambda s: s["query_type"],
    {
        "calculator": "calculator",
        "search": "search",
        "database": "database",
        "llm_fallback": "llm_fallback",
    },
)

# All tools flow to synthesis
graph.add_edge("calculator", "synthesize")
graph.add_edge("search", "synthesize")
graph.add_edge("database", "synthesize")
graph.add_edge("llm_fallback", "synthesize")
graph.add_edge("synthesize", END)

app = graph.compile()

Now I run the agent:
result = app.invoke({"query": "Calculate 25% of 400", "execution_trace": []})
# Trace: router -> calculator -> synthesize -> END
# I know EXACTLY what path was taken and why

Step 4: Add Execution Tracing
The biggest win was observability. I went from reading prompt text hoping to reverse-engineer what happened, to having a complete execution trace on every run.
import json
from datetime import datetime

def traced_node(node_name):
    """Decorator to add tracing to any node."""
    def decorator(func):
        def wrapper(state: AgentState) -> AgentState:
            trace_entry = {
                "node": node_name,
                # Note: utcnow() is deprecated since Python 3.12;
                # datetime.now(timezone.utc) is the timezone-aware replacement.
                "timestamp": datetime.utcnow().isoformat(),
                "input": {k: v for k, v in state.items() if k != "execution_trace"},
            }

            result = func(state)

            trace_entry["output"] = {k: v for k, v in result.items() if k != "execution_trace"}

            # Append to the trace carried in the incoming state, so earlier
            # entries survive even when the node returns only a partial update.
            result["execution_trace"] = state.get("execution_trace", []) + [trace_entry]
            return result
        return wrapper
    return decorator

Running with tracing:
result = app.invoke({"query": "Search for Python tutorials", "execution_trace": []})
print(json.dumps(result["execution_trace"], indent=2))

Output:
[
  {
    "node": "router",
    "timestamp": "2026-03-20T10:30:00.000000",
    "input": {"query": "Search for Python tutorials"},
    "output": {"query_type": "search"}
  },
  {
    "node": "search",
    "timestamp": "2026-03-20T10:30:00.100000",
    "input": {"query": "Search for Python tutorials", "query_type": "search"},
    "output": {"tool_result": ["result1", "result2"]}
  },
  {
    "node": "synthesize",
    "timestamp": "2026-03-20T10:30:00.200000",
    "input": {"query": "Search for Python tutorials", "query_type": "search", "tool_result": ["result1", "result2"]},
    "output": {"final_answer": "Here are Python tutorials..."}
  }
]

Testing Routing Logic
Now I can unit test routing without calling the LLM:
import pytest
from router import route_query

class TestRouter:
    def test_calculator_routing(self):
        # The query must contain one of the calculator keywords to match
        state = {"query": "Calculate 25% of 400"}
        assert route_query(state) == "calculator"

    def test_search_routing(self):
        state = {"query": "Find Python tutorials"}
        assert route_query(state) == "search"

    def test_database_routing(self):
        state = {"query": "Get user record 123"}
        assert route_query(state) == "database"

    def test_fallback_routing(self):
        state = {"query": "Write a poem about cats"}
        assert route_query(state) == "llm_fallback"

Running tests:
pytest test_router.py -v
# Output
test_router.py::TestRouter::test_calculator_routing PASSED
test_router.py::TestRouter::test_search_routing PASSED
test_router.py::TestRouter::test_database_routing PASSED
test_router.py::TestRouter::test_fallback_routing PASSED

What Changed in Production
Before deterministic routing, debugging was a nightmare:
User: "Agent called wrong tool"
Me: "Let me check the logs... model output says it chose tool X"
User: "Why?"
Me: "I don't know, the prompt seemed clear"
User: "Can you fix it?"
Me: "I'll try tweaking the prompt..."

After deterministic routing:
User: "Agent called wrong tool"
Me: "Let me check the execution trace..."
Me: "Router classified query as type X, called tool Y"
Me: "The routing rule matched keyword Z in the query"
User: "That makes sense, can we add an exception for this case?"
Me: "Sure, I'll add a condition to the routing function"

Common Mistakes I Made
- Mixing routing in prompts: I tried putting if-then logic in system prompts instead of code. This made debugging harder, not easier.
- Over-delegating to LLM: I asked the model to choose from 20+ tools. Deterministic routing works best when you narrow options with rules first.
- Missing state tracking: I didn’t maintain explicit state, letting the context window get cluttered. Clear state objects fixed this.
- No execution traces: I relied on model output logs instead of structured trace data. Adding execution_trace to state changed everything.
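The second mistake, narrowing options with rules first, can be sketched roughly as follows. Everything here is hypothetical and for illustration only: `ToolSpec`, `REGISTRY`, `candidate_tools`, `choose_tool`, and the `llm_choice:` marker are not from my production code or from any library. The idea is that rules shrink the tool list, and a model would only ever be asked to pick among what is left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool descriptor; `keywords` drive the rule-based filter."""
    name: str
    keywords: tuple[str, ...]


# Illustrative registry, not an actual production tool list.
REGISTRY = [
    ToolSpec("calculator", ("calculate", "sum", "average")),
    ToolSpec("search", ("search", "find", "look up")),
    ToolSpec("database", ("database", "record", "user")),
]


def candidate_tools(query: str, registry: list[ToolSpec]) -> list[str]:
    """Return the names of tools whose keywords appear in the query."""
    lowered = query.lower()
    return [t.name for t in registry if any(k in lowered for k in t.keywords)]


def choose_tool(query: str) -> str:
    """Pick a tool by rule; defer to a model only when the rules are ambiguous."""
    candidates = candidate_tools(query, REGISTRY)
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        return "llm_fallback"
    # Several rules matched: a model could choose here, but only among
    # `candidates`, never among the full registry.
    return "llm_choice:" + ",".join(candidates)


print(choose_tool("Find the user record for Alice"))
# llm_choice:search,database
```

With 20+ tools, a filter like this keeps the ambiguous case small and explicit, and the unambiguous cases never reach the model at all.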
Summary
In this post, I showed how to implement deterministic routing for AI agents. The key point is pulling routing logic out of LLM prompts and into explicit code - state machines, routing functions, or rule engines.
The model handles reasoning within tools. Your code handles control flow between them. This separation gives you:
- Complete execution traces for debugging
- Reproducible behavior across runs
- Unit testable routing logic
- Version-controlled routing rules
For production AI systems, this pattern is essential. Start with LangGraph or similar frameworks that make state machines and explicit routing natural.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Reddit: After 6 months of agent failures in production
- LangGraph Documentation
- Circuit Breaker Pattern
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!