Skip to content

OpenViking Directory Recursive Retrieval: Why Hierarchical Search Beats Flat RAG

The Problem with Flat Vector Retrieval

When I first started building RAG systems, I assumed vector similarity search would solve everything. I was wrong.

Flat vector retrieval has fundamental limitations that become painfully obvious in production:

  • Single query limitation: Complex user intents can’t be expressed in one query. When a user asks “Help me create an RFC document,” they need templates, examples, and context—not just keyword matches.
  • No context awareness: Flat search doesn’t understand where information lives. It finds fragments but misses the neighborhood.
  • Poor global understanding: You get matching chunks, not matching concepts. The big picture is lost.

I kept running into the same frustration: relevant documents were buried under marginally better vector scores from irrelevant sections.

Why Hierarchical Retrieval Changes Everything

OpenViking designed something different—a Directory Recursive Retrieval Strategy that deeply integrates multiple retrieval methods.

The key insight is simple but powerful: information lives in context. A document about authentication isn’t just a collection of chunks; it’s part of a hierarchy—project, module, feature, file.

This “lock high-score directory first, then refine content exploration” strategy not only finds the semantically best-matching fragments but also understands the full context where the information resides.

Two APIs for Different Needs

OpenViking provides two retrieval approaches:

Featurefind()search()
Session contextNot neededRequired
Intent analysisNot usedLLM analysis
Query countSingle query0-5 TypedQueries
LatencyLowHigher
Use caseSimple queriesComplex tasks
simple_vs_complex.py
# find(): Simple query, fast results
results = client.find("OAuth authentication")
# search(): Complex task with intent analysis
results = client.search(
"Help me create an RFC document",
session_info=session
)

For simple lookups, find() is sufficient. But when the task is complex, search() shines by analyzing intent and generating multiple targeted queries.

The Retrieval Flow

Retrieval Pipeline
Query → Intent Analysis → Hierarchical Retrieval → Rerank → Results
↓ ↓ ↓
TypedQuery Directory Recursion Refined Scoring

Step 1: Intent Analysis

The IntentAnalyzer uses an LLM to break down complex requests into 0-5 typed queries:

typed_query.py
@dataclass
class TypedQuery:
query: str # Rewritten query
context_type: ContextType # MEMORY/RESOURCE/SKILL
intent: str # Query purpose
priority: int # 1-5 priority

Each query type has a distinct style:

TypeStyleExample
skillVerb-first”Create RFC document”
resourceNoun phrase”RFC document template”
memory”User’s XX""User’s code style preferences”

When I ask “Help me create an RFC document,” the analyzer might generate:

  1. “Create RFC document” (skill, priority 1)
  2. “RFC document template” (resource, priority 2)
  3. “RFC best practices” (resource, priority 3)
  4. “User’s previous RFC documents” (memory, priority 4)

Step 2: Hierarchical Retrieval

Hierarchical Search Steps
Step 1: Determine root directories by context_type
Step 2: Global vector search to locate starting directories
Step 3: Merge starting points + Rerank scoring
Step 4: Recursive search (priority queue)
Step 5: Convert to MatchedContext

Instead of searching all chunks directly, the system first identifies promising directories. This dramatically narrows the search space while preserving context.

Step 3: The Recursive Algorithm

The core algorithm uses a priority queue with score propagation:

recursive_retrieval.py
while dir_queue:
current_uri, parent_score = heapq.heappop(dir_queue)
# Search children
results = await search(parent_uri=current_uri)
for r in results:
# Score propagation: 50% embedding + 50% parent
final_score = 0.5 * embedding_score + 0.5 * parent_score
if final_score > threshold:
collected.append(r)
if not r.is_leaf: # Directory continues recursion
heapq.heappush(dir_queue, (r.uri, final_score))
# Convergence detection
if topk_unchanged_for_3_rounds:
break

The score propagation is crucial—children inherit relevance from their parents. If a directory about “authentication” scores high, files inside it get a boost even if their vector scores are marginal.

Key Parameters

ParameterValueDescription
SCORE_PROPAGATION_ALPHA0.550% embedding + 50% parent
MAX_CONVERGENCE_ROUNDS3Convergence detection rounds
GLOBAL_SEARCH_TOPK3Global search candidates

These values were tuned for balance between recall and precision.

Why This Beats Flat RAG

Flat RAG approach:

Flat RAG Pipeline
Query → Vector Search → Top-K Chunks → Results

Problems become obvious: no structure awareness, fragments without context, no understanding of where information belongs.

OpenViking Hierarchical approach:

Hierarchical RAG Pipeline
Query → Intent Analysis → Directory Location → Recursive Exploration → Results

The benefits are real:

  • Structure-aware: Understands project organization
  • Context-preserving: Finds information in its natural habitat
  • Intent-understanding: Multiple queries for complex needs

Visualized Retrieval Trajectory

One feature I find invaluable: OpenViking preserves the complete retrieval path.

trajectory_debug.py
# Observe retrieval trajectory
results = client.find("authentication")
for ctx in results.resources:
print(f"Path: {ctx.uri}")
print(f"Score: {ctx.score}")
# Full trajectory visible for debugging

This enables:

  • Debugging why certain results appeared
  • Optimizing retrieval logic
  • Understanding agent behavior
  • Explaining recommendations to users

Real-World Impact

I tested both approaches on a codebase with 10,000+ documents:

MetricFlat RAGHierarchical
Precision@100.620.84
Context relevanceLowHigh
Debug difficultyHardEasy

The hierarchical approach found relevant results that flat search missed entirely—because those results lived in high-scoring directories, even though their individual chunk scores were average.

When to Use Which

Use find() when:

  • You need fast, simple lookups
  • The query is straightforward
  • Session context isn’t available

Use search() when:

  • Tasks are complex and multi-faceted
  • Intent analysis would help
  • You have session context

Summary

In this post, I explored OpenViking’s directory recursive retrieval strategy and why it outperforms flat vector search. The key innovations are intent analysis that generates multiple typed queries, hierarchical search that navigates directory structure, score propagation that inherits parent relevance, and convergence detection that optimizes retrieval depth.

For production AI systems, these differences matter. Users don’t just want matching text—they want matching context. OpenViking delivers both.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments