
Koog Workflow Strategies: Functional vs Graph vs Planning Compared

Purpose

When I first tried building an AI agent in production, I let the LLM call whatever tools it wanted, whenever it wanted. The result was chaos. Sometimes it would call a database write before reading. Sometimes it would loop endlessly. Sometimes it would just… do nothing useful.

Koog solves this with three workflow strategies: Functional, Graph, and Planning. I spent time understanding each one, and here’s what I learned about when to use which.

The Problem with Unstructured Tool Calling

The default approach—just giving an LLM a pile of tools and hoping it figures out the order—works for demos. It falls apart in production.

I ran into three specific issues:

  1. No verification loops: The agent would call tools in the wrong order, and I couldn’t insert validation steps
  2. No persistence: If the process crashed mid-execution, everything was lost
  3. No debugging: I couldn’t see why the agent made certain decisions

I needed structure. Koog provides three different kinds.

Functional Strategy: Code-Based Orchestration

The Functional strategy is exactly what it sounds like—you write code that orchestrates the agent’s steps. You control what tools are available at each step and what outputs you expect.

Here’s what I built for a simple problem-solving workflow:

FunctionalStrategyExample.java
var functionalAgent = AIAgent.builder()
    .promptExecutor(promptExecutor)
    .functionalStrategy("my-strategy", (ctx, userInput) -> {
        // Step 1: Identify the problem with read-only and communication tools.
        // Java does not interpolate "$userInput" inside a string literal,
        // so the prompt is built with concatenation.
        ProblemDescription problem = ctx
            .subtask("Identify the problem: " + userInput)
            .withOutput(ProblemDescription.class)
            .withTools(communicationTools, databaseReadTools)
            .run();
        // Step 2: Solve with expanded tools (write access is added here).
        // The prompt relies on ProblemDescription.toString().
        ProblemSolution solution = ctx
            .subtask("Solve the problem: " + problem)
            .withOutput(ProblemSolution.class)
            .withTools(databaseReadTools, databaseWriteTools)
            .run();
        return solution;
    })
    .build();

The key insight here is withTools(). In step 1, I only give the agent read access and communication tools. It can’t accidentally modify data. In step 2, I add write tools. This kind of staged permission control prevents a lot of production accidents.
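To make the staged-permission idea concrete outside of any framework, here is a minimal plain-Java sketch. It does not use Koog's API: Step, REGISTRY, and the tool names "db.read" and "db.write" are hypothetical names invented for illustration. Each step carries an allowlist, and a call to a tool outside that list is rejected before it runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: none of these types come from Koog.
final class StagedToolsSketch {

    /** A step may only invoke tools named in its allowlist. */
    static final class Step {
        private final String name;
        private final Set<String> allowedTools;

        Step(String name, Set<String> allowedTools) {
            this.name = name;
            this.allowedTools = Set.copyOf(allowedTools);
        }

        String call(Map<String, Function<String, String>> registry, String tool, String arg) {
            if (!allowedTools.contains(tool)) {
                throw new SecurityException(
                    "Step '" + name + "' may not call tool '" + tool + "'");
            }
            return registry.get(tool).apply(arg);
        }
    }

    /** Stand-in tool implementations keyed by tool name. */
    static final Map<String, Function<String, String>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("db.read", key -> "row for " + key);
        REGISTRY.put("db.write", value -> "wrote " + value);
    }

    static final Step IDENTIFY = new Step("identify", Set.of("db.read"));
    static final Step SOLVE = new Step("solve", Set.of("db.read", "db.write"));

    public static void main(String[] args) {
        System.out.println(IDENTIFY.call(REGISTRY, "db.read", "order-42"));
        System.out.println(SOLVE.call(REGISTRY, "db.write", "fix"));
        try {
            IDENTIFY.call(REGISTRY, "db.write", "oops"); // not on the allowlist
        } catch (SecurityException e) {
            System.out.println("blocked: " + e.getMessage());
        }
    }
}
```

The point of the design is that the restriction is enforced by the orchestration code, not by asking the model nicely in the prompt.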

When would I use this? When I know the general flow of the workflow, but I want the LLM to handle the specifics at each step. It’s straightforward, testable, and feels like regular Java code.

Graph Strategy: Finite State Machine Workflows

The Graph strategy treats the workflow as a finite state machine. You define nodes (steps) and edges (transitions between steps). This adds overhead, but it buys you two things I really needed:

  1. Persistence: Execution state can be persisted at each node, so if the process crashes, you can resume instead of starting over
  2. Visualization: The graph can be exported and shared with ML colleagues who don’t read code

Here’s how the same problem-solving workflow looks as a graph:

GraphStrategyExample.java
var graphAgent = AIAgent.builder()
    .graphStrategy(builder -> {
        var graph = builder
            .withInput(String.class)
            .withOutput(ProblemSolution.class);
        // Define nodes
        var identifyProblem = AIAgentSubgraph.builder()
            .withInput(String.class)
            .withOutput(ProblemDescription.class)
            .limitedTools(communicationTools, databaseReadTools)
            .withTask(input -> "Identify the problem")
            .build();
        var solveProblem = AIAgentSubgraph.builder()
            .withInput(ProblemDescription.class)
            .withOutput(ProblemSolution.class)
            .limitedTools(databaseReadTools, databaseWriteTools)
            .withTask(input -> "Solve the problem")
            .build();
        var verifySolution = AIAgentSubgraph.builder()
            .withInput(ProblemSolution.class)
            .withOutput(CriticResult.class)
            .limitedTools(communicationTools)
            .withTask(input -> "Verify the solution is correct")
            .build();
        // Connect nodes with edges
        graph.edge(graph.nodeStart, identifyProblem);
        graph.edge(identifyProblem, solveProblem);
        graph.edge(solveProblem, verifySolution);
        // Conditional edge - loop back if verification fails
        graph.edge(AIAgentEdge.builder()
            .from(verifySolution)
            .to(solveProblem)
            .onCondition(result -> !result.isSuccessful())
            .build());
        // Finish when verification passes
        graph.edge(AIAgentEdge.builder()
            .from(verifySolution)
            .to(graph.nodeFinish)
            .onCondition(CriticResult::isSuccessful)
            .build());
        return graph.build();
    })
    .build();

Notice the conditional edges. If verification fails, the workflow loops back to solve again. This kind of control flow is harder to express in the Functional strategy.
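If the mechanics of conditional edges feel abstract, the following plain-Java sketch may help. It is not Koog code: State, Edge, visit, and run are hypothetical names, and the "verification" is faked by a counter. It shows a runner that follows the first edge whose condition holds, loops from verify back to solve until verification passes, and stops after a step cap so a failing check cannot loop forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a state machine with conditional edges; not Koog's API.
final class GraphLoopSketch {

    /** Mutable workflow state shared by all nodes. */
    static final class State {
        int attempts = 0;
        boolean verified = false;
    }

    /** A transition that is taken only when its condition holds. */
    static final class Edge {
        final String from;
        final String to;
        final Predicate<State> condition;

        Edge(String from, String to, Predicate<State> condition) {
            this.from = from;
            this.to = to;
            this.condition = condition;
        }
    }

    static final List<Edge> EDGES = List.of(
        new Edge("start", "solve", s -> true),
        new Edge("solve", "verify", s -> true),
        new Edge("verify", "solve", s -> !s.verified),  // loop back on failure
        new Edge("verify", "finish", s -> s.verified)); // exit on success

    /** Node side effects; verification is faked to pass on a given attempt. */
    static void visit(String node, State state, int passesOnAttempt) {
        if (node.equals("solve")) state.attempts++;
        if (node.equals("verify")) state.verified = state.attempts >= passesOnAttempt;
    }

    /** Walks the graph from "start" and returns the visited nodes in order. */
    static List<String> run(int passesOnAttempt, int maxSteps) {
        State state = new State();
        List<String> path = new ArrayList<>();
        String current = "start";
        path.add(current);
        for (int step = 0; step < maxSteps && !current.equals("finish"); step++) {
            String here = current;
            current = EDGES.stream()
                .filter(e -> e.from.equals(here) && e.condition.test(state))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("No edge from " + here))
                .to;
            visit(current, state, passesOnAttempt);
            path.add(current);
        }
        return path;
    }

    public static void main(String[] args) {
        // Verification fails once, then passes on the second attempt.
        System.out.println(run(2, 20));
    }
}
```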

I’d use Graph when I need crash recovery or when non-engineers need to understand the workflow. The trade-off is more boilerplate.
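Crash recovery is easier to reason about with a toy version in front of you. The sketch below is plain Java under stated assumptions, not Koog's persistence feature: CheckpointStore is a hypothetical in-memory stand-in for durable storage, and the crash is simulated with an exception. After each node completes, the index of the next node is saved, so a second run picks up where the first one stopped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-node checkpointing; not Koog's API.
final class CheckpointSketch {

    static final List<String> NODES = List.of("identify", "solve", "verify");

    /** Stand-in for durable storage; a real system would write to disk or a database. */
    static final class CheckpointStore {
        private int nextNodeIndex = 0;

        void save(int index) { nextNodeIndex = index; }

        int load() { return nextNodeIndex; }
    }

    /**
     * Runs nodes starting at the last checkpoint and returns the nodes executed
     * in this run. Throws before node index crashAt to simulate a crash
     * (pass -1 to never crash).
     */
    static List<String> run(CheckpointStore store, int crashAt) {
        List<String> executed = new ArrayList<>();
        for (int i = store.load(); i < NODES.size(); i++) {
            if (i == crashAt) {
                throw new IllegalStateException("simulated crash before " + NODES.get(i));
            }
            executed.add(NODES.get(i));
            store.save(i + 1); // checkpoint only after the node has completed
        }
        return executed;
    }

    public static void main(String[] args) {
        CheckpointStore store = new CheckpointStore();
        try {
            run(store, 1); // "identify" completes, then the process "crashes"
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("resumed: " + run(store, -1));
    }
}
```

Note that a node interrupted midway is re-run from its beginning on resume, so in a real system node side effects would need to be safe to repeat.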

Planning Strategy: Goal-Oriented Action Planning

The Planning strategy is different. Instead of defining the execution order yourself, you define actions with preconditions, effects, and goals. The agent (using GOAP—Goal-Oriented Action Planning—or LLM-based planning) figures out the order.

I haven’t used this one as much, but I understand its purpose. When there are multiple valid paths to a goal, and the best path depends on the current state, Planning makes sense.

Planning Strategy Structure:
- Define actions (each with preconditions and effects)
- Define goal conditions
- Agent plans optimal execution order at runtime

This would be useful for something like a troubleshooting agent where the order of diagnostic steps depends on what previous steps find. Or a game AI where the strategy adapts to the situation.
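To show what "actions with preconditions and effects" can look like in practice, here is a small planner sketch in plain Java. It is not Koog's planning API: Action, ACTIONS, plan, and the troubleshooting facts are hypothetical, and the search is a simple breadth-first search over sets of facts, which is one common way to implement GOAP-style planning. As a simplification, effects only add facts and never remove them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical GOAP-style sketch; not Koog's API.
final class GoapSketch {

    /** An action is applicable when all preconditions hold; it then adds its effects. */
    static final class Action {
        final String name;
        final Set<String> preconditions;
        final Set<String> effects;

        Action(String name, Set<String> preconditions, Set<String> effects) {
            this.name = name;
            this.preconditions = Set.copyOf(preconditions);
            this.effects = Set.copyOf(effects);
        }
    }

    static final List<Action> ACTIONS = List.of(
        new Action("checkLogs", Set.of(), Set.of("logsRead")),
        new Action("restartService", Set.of("logsRead"), Set.of("serviceRestarted")),
        new Action("verifyHealth", Set.of("serviceRestarted"), Set.of("healthy")),
        new Action("pingHost", Set.of(), Set.of("hostReachable")));

    /** Breadth-first search for a shortest action sequence that reaches the goal. */
    static Optional<List<String>> plan(Set<String> start, Set<String> goal, List<Action> actions) {
        Map<Set<String>, List<String>> planFor = new HashMap<>();
        Deque<Set<String>> frontier = new ArrayDeque<>();
        Set<String> initial = new HashSet<>(start);
        planFor.put(initial, List.of());
        frontier.add(initial);
        while (!frontier.isEmpty()) {
            Set<String> state = frontier.remove();
            List<String> soFar = planFor.get(state);
            if (state.containsAll(goal)) return Optional.of(soFar);
            for (Action action : actions) {
                if (!state.containsAll(action.preconditions)) continue;
                Set<String> next = new HashSet<>(state);
                next.addAll(action.effects);
                if (planFor.containsKey(next)) continue; // state already reached
                List<String> extended = new ArrayList<>(soFar);
                extended.add(action.name);
                planFor.put(next, extended);
                frontier.add(next);
            }
        }
        return Optional.empty(); // no sequence of actions reaches the goal
    }

    public static void main(String[] args) {
        System.out.println(plan(Set.of(), Set.of("healthy"), ACTIONS));
    }
}
```

Nothing in the code fixes the order of the steps. The order falls out of the preconditions, and the irrelevant pingHost action is simply left out of the plan.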

How I Choose Between Them

After working with these, here’s my decision process:

Use Functional when:

  • The workflow has a known structure with some conditional branching
  • You want code that’s easy to read and test
  • You don’t need persistence or visualization
  • You want to control tool access at each step

Use Graph when:

  • The workflow might crash and needs to resume
  • You need to share the workflow with non-developers
  • You want fine-grained persistence at each step
  • The workflow has complex conditional transitions (loops, branches)

Use Planning when:

  • The execution order isn’t known ahead of time
  • Multiple paths could achieve the goal
  • The agent needs to adapt to the current state
  • You’re building something game-like or exploratory

A Mistake I Made

At first, I reached for Graph because it seemed “more powerful.” But for a simple, mostly linear workflow with two verification loops, it was over-engineering. The Functional strategy would have been cleaner, since a loop or two is easy to write as ordinary code. I’ve since learned to start with Functional and only move to Graph when I actually need persistence or visualization.
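For comparison, a verification loop in code-based orchestration needs little more than a bounded for loop. The sketch below is plain Java, not Koog's API: solveWithVerification is a hypothetical helper, and the solve and verify steps are passed in as ordinary functions where a real agent would call the LLM.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch of a solve/verify retry loop; not Koog's API.
final class VerifyLoopSketch {

    /**
     * Calls solve until verify accepts the result or maxAttempts is exhausted.
     * Returns the first accepted candidate, or empty if none passed.
     */
    static <T> Optional<T> solveWithVerification(
            Supplier<T> solve, Predicate<T> verify, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T candidate = solve.get();
            if (verify.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The fake "solver" returns 1, 2, 3, ...; the fake "critic" accepts 3 or more.
        Optional<Integer> result = VerifyLoopSketch.<Integer>solveWithVerification(
            () -> ++calls[0], n -> n >= 3, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The attempt cap matters as much as the loop itself: without it, a critic that never passes would keep the agent running indefinitely.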

Another mistake: not using subtask-specific tool restrictions. Early on, I gave every step all the tools. That defeated the purpose. Restricting tools per step is a real safety feature—I should use it.

Why This Matters

Production AI agents aren’t chatbots. They integrate with databases, APIs, and external services. They need to be predictable, debuggable, and sometimes recoverable.

Choosing the right workflow strategy isn’t just about code style. It determines whether you can:

  • Recover from crashes mid-execution
  • Understand why the agent did something
  • Share the system with ML colleagues
  • Prevent the agent from taking dangerous actions

Start simple. Use Functional until you hit a wall. Then consider Graph. Planning is for when even Graph is too rigid.

Final Words + More Resources

I wrote this article to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can reach me by email.

