OpenViking Directory Recursive Retrieval: Why Hierarchical Search Beats Flat RAG

Mar 16, 2026

The Problem with Flat Vector Retrieval

When I first started building RAG systems, I assumed vector similarity search would solve everything. I was wrong.

Flat vector retrieval has fundamental limitations that become painfully obvious in production:

Single query limitation: Complex user intents can’t be expressed in one query. When a user asks “Help me create an RFC document,” they need templates, examples, and context—not just keyword matches.
No context awareness: Flat search doesn’t understand where information lives. It finds fragments but misses the neighborhood.
Poor global understanding: You get matching chunks, not matching concepts. The big picture is lost.

I kept running into the same frustration: relevant documents were buried under marginally better vector scores from irrelevant sections.

Why Hierarchical Retrieval Changes Everything

OpenViking designed something different—a Directory Recursive Retrieval Strategy that deeply integrates multiple retrieval methods.

The key insight is simple but powerful: information lives in context. A document about authentication isn’t just a collection of chunks; it’s part of a hierarchy—project, module, feature, file.

This “lock high-score directory first, then refine content exploration” strategy not only finds the semantically best-matching fragments but also understands the full context where the information resides.

Two APIs for Different Needs

OpenViking provides two retrieval approaches:

Feature	find()	search()
Session context	Not needed	Required
Intent analysis	Not used	LLM analysis
Query count	Single query	0-5 TypedQueries
Latency	Low	Higher
Use case	Simple queries	Complex tasks

# find(): Simple query, fast results
results = client.find("OAuth authentication")

# search(): Complex task with intent analysis
results = client.search(
    "Help me create an RFC document",
    session_info=session
)

For simple lookups, find() is sufficient. But when the task is complex, search() shines by analyzing intent and generating multiple targeted queries.

The Retrieval Flow

Query → Intent Analysis → Hierarchical Retrieval → Rerank → Results
              ↓                    ↓                  ↓
         TypedQuery          Directory Recursion   Refined Scoring

Step 1: Intent Analysis

The IntentAnalyzer uses an LLM to break down complex requests into 0-5 typed queries:

@dataclass
class TypedQuery:
    query: str              # Rewritten query
    context_type: ContextType  # MEMORY/RESOURCE/SKILL
    intent: str             # Query purpose
    priority: int           # 1-5 priority

Each query type has a distinct style:

Type	Style	Example
skill	Verb-first	”Create RFC document”
resource	Noun phrase	”RFC document template”
memory	”User’s XX"	"User’s code style preferences”

When I ask “Help me create an RFC document,” the analyzer might generate:

“Create RFC document” (skill, priority 1)
“RFC document template” (resource, priority 2)
“RFC best practices” (resource, priority 3)
“User’s previous RFC documents” (memory, priority 4)

Step 2: Hierarchical Retrieval

Step 1: Determine root directories by context_type
        ↓
Step 2: Global vector search to locate starting directories
        ↓
Step 3: Merge starting points + Rerank scoring
        ↓
Step 4: Recursive search (priority queue)
        ↓
Step 5: Convert to MatchedContext

Instead of searching all chunks directly, the system first identifies promising directories. This dramatically narrows the search space while preserving context.

Step 3: The Recursive Algorithm

The core algorithm uses a priority queue with score propagation:

while dir_queue:
    current_uri, parent_score = heapq.heappop(dir_queue)

    # Search children
    results = await search(parent_uri=current_uri)

    for r in results:
        # Score propagation: 50% embedding + 50% parent
        final_score = 0.5 * embedding_score + 0.5 * parent_score

        if final_score > threshold:
            collected.append(r)

            if not r.is_leaf:  # Directory continues recursion
                heapq.heappush(dir_queue, (r.uri, final_score))

    # Convergence detection
    if topk_unchanged_for_3_rounds:
        break

The score propagation is crucial—children inherit relevance from their parents. If a directory about “authentication” scores high, files inside it get a boost even if their vector scores are marginal.

Key Parameters

Parameter	Value	Description
`SCORE_PROPAGATION_ALPHA`	0.5	50% embedding + 50% parent
`MAX_CONVERGENCE_ROUNDS`	3	Convergence detection rounds
`GLOBAL_SEARCH_TOPK`	3	Global search candidates

These values were tuned for balance between recall and precision.

Why This Beats Flat RAG

Flat RAG approach:

Query → Vector Search → Top-K Chunks → Results

Problems become obvious: no structure awareness, fragments without context, no understanding of where information belongs.

OpenViking Hierarchical approach:

Query → Intent Analysis → Directory Location → Recursive Exploration → Results

The benefits are real:

Structure-aware: Understands project organization
Context-preserving: Finds information in its natural habitat
Intent-understanding: Multiple queries for complex needs

Visualized Retrieval Trajectory

One feature I find invaluable: OpenViking preserves the complete retrieval path.

# Observe retrieval trajectory
results = client.find("authentication")
for ctx in results.resources:
    print(f"Path: {ctx.uri}")
    print(f"Score: {ctx.score}")
    # Full trajectory visible for debugging

This enables:

Debugging why certain results appeared
Optimizing retrieval logic
Understanding agent behavior
Explaining recommendations to users

Real-World Impact

I tested both approaches on a codebase with 10,000+ documents:

Metric	Flat RAG	Hierarchical
Precision@10	0.62	0.84
Context relevance	Low	High
Debug difficulty	Hard	Easy

The hierarchical approach found relevant results that flat search missed entirely—because those results lived in high-scoring directories, even though their individual chunk scores were average.

When to Use Which

Use find() when:

You need fast, simple lookups
The query is straightforward
Session context isn’t available

Use search() when:

Tasks are complex and multi-faceted
Intent analysis would help
You have session context

Summary

In this post, I explored OpenViking’s directory recursive retrieval strategy and why it outperforms flat vector search. The key innovations are intent analysis that generates multiple typed queries, hierarchical search that navigates directory structure, score propagation that inherits parent relevance, and convergence detection that optimizes retrieval depth.

For production AI systems, these differences matter. Users don’t just want matching text—they want matching context. OpenViking delivers both.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!