OpenViking Directory Recursive Retrieval: Why Hierarchical Search Beats Flat RAG
The Problem with Flat Vector Retrieval
When I first started building RAG systems, I assumed vector similarity search would solve everything. I was wrong.
Flat vector retrieval has fundamental limitations that become painfully obvious in production:
- Single query limitation: Complex user intents can’t be expressed in one query. When a user asks “Help me create an RFC document,” they need templates, examples, and context—not just keyword matches.
- No context awareness: Flat search doesn’t understand where information lives. It finds fragments but misses the neighborhood.
- Poor global understanding: You get matching chunks, not matching concepts. The big picture is lost.
I kept running into the same frustration: relevant documents were buried under marginally better vector scores from irrelevant sections.
Why Hierarchical Retrieval Changes Everything
OpenViking designed something different—a Directory Recursive Retrieval Strategy that deeply integrates multiple retrieval methods.
The key insight is simple but powerful: information lives in context. A document about authentication isn’t just a collection of chunks; it’s part of a hierarchy—project, module, feature, file.
The strategy is to lock onto the highest-scoring directories first and then refine the search inside them. That way it not only finds the fragments that match best semantically but also captures the full context in which the information resides.
Two APIs for Different Needs
OpenViking provides two retrieval approaches:
| Feature | find() | search() |
|---|---|---|
| Session context | Not needed | Required |
| Intent analysis | Not used | LLM analysis |
| Query count | Single query | 0-5 TypedQueries |
| Latency | Low | Higher |
| Use case | Simple queries | Complex tasks |
```python
# find(): Simple query, fast results
results = client.find("OAuth authentication")

# search(): Complex task with intent analysis
results = client.search(
    "Help me create an RFC document",
    session_info=session,
)
```
For simple lookups, find() is sufficient. But when the task is complex, search() shines by analyzing intent and generating multiple targeted queries.
The Retrieval Flow
```
Query → Intent Analysis → Hierarchical Retrieval → Rerank → Results
              ↓                    ↓                  ↓
         TypedQuery       Directory Recursion   Refined Scoring
```
Step 1: Intent Analysis
The IntentAnalyzer uses an LLM to break down complex requests into 0-5 typed queries:
```python
@dataclass
class TypedQuery:
    query: str                  # Rewritten query
    context_type: ContextType   # MEMORY/RESOURCE/SKILL
    intent: str                 # Query purpose
    priority: int               # 1-5 priority
```
Each query type has a distinct style:
| Type | Style | Example |
|---|---|---|
| skill | Verb-first | “Create RFC document” |
| resource | Noun phrase | “RFC document template” |
| memory | “User’s XX” | “User’s code style preferences” |
When I ask “Help me create an RFC document,” the analyzer might generate:
- “Create RFC document” (skill, priority 1)
- “RFC document template” (resource, priority 2)
- “RFC best practices” (resource, priority 3)
- “User’s previous RFC documents” (memory, priority 4)
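To make the decomposition concrete, here is a minimal sketch that builds the four queries above as `TypedQuery` objects and orders them by priority. The `ContextType` enum members, the `intent` strings, and the reading that priority 1 is the most important are my assumptions for illustration; the real IntentAnalyzer produces these through an LLM call that is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class ContextType(Enum):
    # Assumed members, mirroring the MEMORY/RESOURCE/SKILL comment in the post
    MEMORY = "memory"
    RESOURCE = "resource"
    SKILL = "skill"


@dataclass
class TypedQuery:
    query: str
    context_type: ContextType
    intent: str
    priority: int


def by_priority(queries):
    """Order queries so that priority 1 comes first (assumed most important)."""
    return sorted(queries, key=lambda q: q.priority)


# The RFC example from the list above; intent strings are illustrative only.
rfc_queries = [
    TypedQuery("User's previous RFC documents", ContextType.MEMORY, "reuse past work", 4),
    TypedQuery("RFC document template", ContextType.RESOURCE, "find a template", 2),
    TypedQuery("Create RFC document", ContextType.SKILL, "perform the task", 1),
    TypedQuery("RFC best practices", ContextType.RESOURCE, "find guidance", 3),
]

for q in by_priority(rfc_queries):
    print(q.priority, q.context_type.value, q.query)
```

Each query carries its own context type, so the retrieval step can route it to a different part of the hierarchy.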
Step 2: Hierarchical Retrieval
```
Step 1: Determine root directories by context_type
    ↓
Step 2: Global vector search to locate starting directories
    ↓
Step 3: Merge starting points + Rerank scoring
    ↓
Step 4: Recursive search (priority queue)
    ↓
Step 5: Convert to MatchedContext
```
Instead of searching all chunks directly, the system first identifies promising directories. This dramatically narrows the search space while preserving context.
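The first three steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ROOT_DIRS` mapping, the path-style URIs, and the rule of keeping only hits under the type's root are hypothetical, and the rerank step is left out entirely. Only the top-k value of 3 comes from the parameter table in this post.

```python
GLOBAL_SEARCH_TOPK = 3  # value taken from the parameter table in this post

# Hypothetical mapping; the real root directories are not given in the post.
ROOT_DIRS = {"memory": "/memory", "resource": "/resources", "skill": "/skills"}


def starting_points(context_type, global_hits):
    """Merge the type's root directory with the best global hits.

    global_hits: (directory_uri, score) pairs from a global vector search.
    Returns (directory_uri, score) pairs, best first, without duplicates.
    """
    root = ROOT_DIRS[context_type]
    # Assumption: only directories under this context type's root are relevant
    in_scope = [
        (uri, score)
        for uri, score in global_hits
        if uri == root or uri.startswith(root + "/")
    ]
    top = sorted(in_scope, key=lambda p: p[1], reverse=True)[:GLOBAL_SEARCH_TOPK]
    merged = {root: 0.0}  # the root always stays in, with a neutral score
    for uri, score in top:
        merged[uri] = max(score, merged.get(uri, 0.0))
    return sorted(merged.items(), key=lambda p: p[1], reverse=True)


hits = [
    ("/resources/auth", 0.82),
    ("/skills/deploy", 0.90),
    ("/resources/rfc", 0.75),
    ("/resources/auth/oauth", 0.60),
    ("/resources/misc", 0.10),
]
print(starting_points("resource", hits))
```

The output of this step is what would seed the priority queue for the recursive search.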
Step 3: The Recursive Algorithm
The core algorithm uses a priority queue with score propagation:
```python
import heapq

# Excerpt from inside an async function.
# heapq is a min-heap, so entries are stored as (-score, uri) to make the
# highest-scoring directory pop first.
while dir_queue:
    neg_score, current_uri = heapq.heappop(dir_queue)
    parent_score = -neg_score

    # Search the children of the current directory
    results = await search(parent_uri=current_uri)

    for r in results:
        # Score propagation: 50% embedding + 50% parent
        # (embedding_score is r's vector similarity to the query)
        final_score = 0.5 * embedding_score + 0.5 * parent_score

        if final_score > threshold:
            collected.append(r)

            if not r.is_leaf:
                # Directories continue the recursion
                heapq.heappush(dir_queue, (-final_score, r.uri))

    # Convergence detection
    if topk_unchanged_for_3_rounds:
        break
```
Score propagation is crucial: children inherit relevance from their parents. If a directory about “authentication” scores high, files inside it get a boost even if their vector scores are marginal.
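To see the loop run end to end, here is a self-contained sketch over a small in-memory tree. It is not OpenViking's implementation: the `TREE` structure, the made-up scores, the threshold of 0.5, and the way convergence is counted (top-k list unchanged for three consecutive rounds) are all my assumptions, and the async search call is replaced by a dictionary lookup.

```python
import heapq

ALPHA = 0.5      # SCORE_PROPAGATION_ALPHA from the post
MAX_ROUNDS = 3   # MAX_CONVERGENCE_ROUNDS from the post

# Toy index: uri -> (embedding_score, is_leaf, children). Scores are invented.
TREE = {
    "/docs": (0.60, False, ["/docs/auth", "/docs/billing"]),
    "/docs/auth": (0.90, False, ["/docs/auth/oauth.md", "/docs/auth/sessions.md"]),
    "/docs/billing": (0.20, False, ["/docs/billing/invoices.md"]),
    "/docs/auth/oauth.md": (0.80, True, []),
    "/docs/auth/sessions.md": (0.45, True, []),
    "/docs/billing/invoices.md": (0.50, True, []),
}


def recursive_search(root, root_score, threshold=0.5, top_k=2):
    queue = [(-root_score, root)]  # negated: heapq pops the smallest item
    collected = {}                 # uri -> final score
    last_topk, unchanged = None, 0
    while queue:
        neg_score, uri = heapq.heappop(queue)
        parent_score = -neg_score
        for child in TREE[uri][2]:
            emb, is_leaf, _ = TREE[child]
            final = ALPHA * emb + (1 - ALPHA) * parent_score
            if final > threshold:
                collected[child] = final
                if not is_leaf:
                    # Only directories are expanded further
                    heapq.heappush(queue, (-final, child))
        # Assumed convergence rule: stop once the top-k list stops changing
        topk = sorted(collected, key=collected.get, reverse=True)[:top_k]
        unchanged = unchanged + 1 if topk == last_topk else 0
        last_topk = topk
        if unchanged >= MAX_ROUNDS:
            break
    return sorted(collected.items(), key=lambda p: p[1], reverse=True)


for uri, score in recursive_search("/docs", 0.60):
    print(f"{score:.3f}  {uri}")
```

In this toy run, `sessions.md` is collected even though its own score (0.45) is below the threshold, because its parent directory scored well. The `billing` directory falls below the threshold, so `invoices.md` is never visited at all.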
Key Parameters
| Parameter | Value | Description |
|---|---|---|
| SCORE_PROPAGATION_ALPHA | 0.5 | 50% embedding + 50% parent |
| MAX_CONVERGENCE_ROUNDS | 3 | Convergence detection rounds |
| GLOBAL_SEARCH_TOPK | 3 | Global search candidates |
These values were tuned for balance between recall and precision.
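A quick way to build intuition for the propagation weight is to apply it along a path by hand. The sketch below assumes that alpha weights the node's own embedding score and that the root's final score is simply its own embedding score; with alpha at 0.5 the two readings of the formula coincide, but for other values that choice is my assumption.

```python
def propagate(embedding_score, parent_score, alpha=0.5):
    """Blend a node's own embedding score with its parent's final score."""
    return alpha * embedding_score + (1 - alpha) * parent_score


def score_along_path(embedding_scores, alpha=0.5):
    """Final score of the last node on a root-to-node path.

    Assumes the root's final score is its own embedding score.
    """
    score = embedding_scores[0]
    for emb in embedding_scores[1:]:
        score = propagate(emb, score, alpha)
    return score


# A marginal file (0.5) directly inside a strong directory (0.9)
print(round(propagate(0.5, 0.9), 3))

# The same file one level deeper: the root's influence is halved again
print(round(score_along_path([0.9, 0.5, 0.5]), 3))
```

With alpha at 0.5, an ancestor d levels above a node contributes with weight 0.5 to the power d, so a strong directory lifts its direct children noticeably while its effect on deep descendants fades quickly.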
Why This Beats Flat RAG
Flat RAG approach:
```
Query → Vector Search → Top-K Chunks → Results
```
Problems become obvious: no structure awareness, fragments without context, no understanding of where information belongs.
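For reference, a flat baseline boils down to something like the sketch below: score every chunk against the query and keep the best k. The two-dimensional vectors and chunk ids are invented for illustration; a real system would use an embedding model and a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flat_top_k(query_vec, chunks, k=2):
    """chunks: list of (chunk_id, vector). Returns the k best (chunk_id, score)."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in chunks]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:k]


# Toy data: the path inside each chunk id plays no role in the ranking.
chunks = [
    ("auth/oauth.md#2", [0.9, 0.1]),
    ("billing/faq.md#7", [0.8, 0.3]),
    ("auth/sessions.md#1", [0.5, 0.6]),
]
print(flat_top_k([1.0, 0.0], chunks))
```

Here the unrelated billing chunk outranks the second authentication chunk purely on vector similarity, since nothing in the ranking knows that two of the chunks share a directory.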
OpenViking Hierarchical approach:
```
Query → Intent Analysis → Directory Location → Recursive Exploration → Results
```
The benefits are real:
- Structure-aware: Understands project organization
- Context-preserving: Finds information in its natural habitat
- Intent-understanding: Multiple queries for complex needs
Visualized Retrieval Trajectory
One feature I find invaluable: OpenViking preserves the complete retrieval path.
```python
# Observe retrieval trajectory
results = client.find("authentication")
for ctx in results.resources:
    print(f"Path: {ctx.uri}")
    print(f"Score: {ctx.score}")
    # Full trajectory visible for debugging
```
This enables:
- Debugging why certain results appeared
- Optimizing retrieval logic
- Understanding agent behavior
- Explaining recommendations to users
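For debugging, I find it helpful to print the hits as an indented tree rather than a flat list. The helper below is a small sketch of that idea; `Ctx` is a stand-in for whatever result object the client returns, and it assumes only that each item has a `uri` with `/`-separated segments and a numeric `score`.

```python
from collections import namedtuple

# Stand-in for a result item; only `uri` and `score` are used.
Ctx = namedtuple("Ctx", ["uri", "score"])


def format_trajectory(contexts):
    """Render hits as an indented tree, one line per hit, in path order."""
    lines = []
    # Sorting by path segments keeps each directory next to its children
    for ctx in sorted(contexts, key=lambda c: c.uri.strip("/").split("/")):
        depth = ctx.uri.strip("/").count("/")
        lines.append(f"{'  ' * depth}{ctx.uri}  score={ctx.score:.2f}")
    return "\n".join(lines)


hits = [
    Ctx("/docs/auth/oauth.md", 0.78),
    Ctx("/docs", 0.60),
    Ctx("/docs/auth", 0.75),
]
print(format_trajectory(hits))
```

Reading the scores top-down along a branch makes it easy to spot where a result picked up its boost or where a branch was cut off.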
Real-World Impact
I tested both approaches on a codebase with 10,000+ documents:
| Metric | Flat RAG | Hierarchical |
|---|---|---|
| Precision@10 | 0.62 | 0.84 |
| Context relevance | Low | High |
| Debug difficulty | Hard | Easy |
The hierarchical approach found relevant results that flat search missed entirely—because those results lived in high-scoring directories, even though their individual chunk scores were average.
When to Use Which
Use find() when:
- You need fast, simple lookups
- The query is straightforward
- Session context isn’t available
Use search() when:
- Tasks are complex and multi-faceted
- Intent analysis would help
- You have session context
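These rules are simple enough to wrap in a small dispatch helper. The sketch below assumes a client exposing `find(query)` and `search(query, session_info=...)` as used earlier in this post; the `complex_task` flag and the `StubClient` are hypothetical and exist only so the example runs on its own.

```python
def retrieve(client, query, session=None, complex_task=False):
    """Use search() for complex tasks with a session, otherwise find()."""
    if complex_task and session is not None:
        return client.search(query, session_info=session)
    # No session context or a simple lookup: take the low-latency path
    return client.find(query)


class StubClient:
    """Hypothetical stand-in that just reports which method was called."""

    def find(self, query):
        return ("find", query)

    def search(self, query, session_info):
        return ("search", query)


client = StubClient()
print(retrieve(client, "OAuth authentication"))
print(retrieve(client, "Help me create an RFC document",
               session={"id": 1}, complex_task=True))
```

Falling back to find() when no session is available keeps the helper usable in contexts where search() could not run.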
Summary
In this post, I explored OpenViking’s directory recursive retrieval strategy and why it outperforms flat vector search. The key innovations are intent analysis that generates multiple typed queries, hierarchical search that navigates directory structure, score propagation that inherits parent relevance, and convergence detection that optimizes retrieval depth.
For production AI systems, these differences matter. Users don’t just want matching text—they want matching context. OpenViking delivers both.
Final Words + More Resources
My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can reach me by email.