How World Monitor Uses AI for Real-Time Intelligence Analysis
The Problem with Cloud-Dependent Intelligence Tools
I wanted an intelligence monitoring tool that could analyze news without sending everything to third-party servers. Most platforms I looked at required cloud API subscriptions and transmitted sensitive data externally. That’s a non-starter for anyone working with sensitive geopolitical information.
What I needed was a local-first approach: run AI models on my machine when possible, fall back to cloud APIs when necessary, and degrade gracefully to browser-based inference when nothing else works. World Monitor implements exactly this architecture.
The 4-Tier AI Fallback Chain
The core insight behind World Monitor’s AI system is that reliability comes from redundancy. Instead of betting on a single provider, it chains four tiers together:
Tier 1: Ollama / LM Studio (Local endpoint, no cloud)
  ↓ timeout/error
Tier 2: Groq (Llama 3.1 8B, temp 0.3, fast cloud inference)
  ↓ timeout/error
Tier 3: OpenRouter (Multi-model fallback)
  ↓ timeout/error
Tier 4: Browser T5 (Transformers.js ONNX, no network required)

When I request a news summary, the system tries local inference first. If Ollama isn’t running or times out, it falls through to Groq’s fast cloud API. If that fails, OpenRouter provides model diversity. Finally, if all network options fail, Transformers.js runs a T5 model directly in the browser.
This architecture means I always get a result. The quality varies by tier, but the system never leaves me waiting indefinitely.
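The fall-through behavior can be sketched in a few lines. This is a minimal illustration and not World Monitor's actual code: the `Tier` type, `withTimeout`, and `summarizeWithFallback` are hypothetical names, and each tier's `run` function stands in for whatever call reaches Ollama, Groq, OpenRouter, or the browser model.

```typescript
// Hypothetical shape of one tier; names are illustrative, not World Monitor's API.
type Tier = {
  name: string;
  timeoutMs: number;
  run: (prompt: string) => Promise<string>;
};

// Reject if the work does not settle within its time budget.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Try each tier in order; return the first success along with the tier that produced it.
async function summarizeWithFallback(
  tiers: Tier[],
  prompt: string,
): Promise<{ tier: string; text: string }> {
  const failures: string[] = [];
  for (const tier of tiers) {
    try {
      const text = await withTimeout(tier.run(prompt), tier.timeoutMs, tier.name);
      return { tier: tier.name, text };
    } catch (error) {
      // Record why this tier failed, then fall through to the next one.
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${tier.name}: ${reason}`);
    }
  }
  throw new Error(`All tiers failed (${failures.join("; ")})`);
}
```

Returning the tier name alongside the text makes it possible to show which tier answered, which matters because quality differs between tiers. Giving the last tier a generous timeout keeps the "always get a result" property as long as the browser model itself loads.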
How Threat Classification Works
Every news item that enters World Monitor passes through a three-stage classification pipeline. I designed this to balance speed with accuracy:
Incoming News Item
  ▼
Stage 1: Keyword Classifier (instant)
  - Pattern matches ~120 threat keywords
  - Organized by 5 severity tiers
  - Zero latency, immediate results
  ▼
Stage 2: Browser-Side ML (async)
  - Transformers.js runs in Web Worker
  - NER for entities (countries, orgs, people)
  - Sentiment analysis
  - Topic classification
  ▼
Stage 3: LLM Classifier (batched async)
  - Groq Llama 3.1 8B at temperature 0
  - Refined classification with reasoning
  - Cross-references with existing data

The keyword classifier gives me instant feedback. I can see threat indicators immediately while the ML pipeline refines the analysis in the background. This hybrid approach means the UI never blocks waiting for AI processing.
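To make Stage 1 concrete, here is a minimal sketch of a tiered keyword classifier. The tier names and the handful of keywords are placeholders invented for this example; the real list of roughly 120 keywords is not reproduced here, and whole-word matching is an assumption.

```typescript
// Severity tiers, ordered from most to least severe. Tier names are assumptions.
const SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Tiny illustrative keyword table (placeholder entries only).
const KEYWORDS: Record<Severity, string[]> = {
  critical: ["nuclear", "invasion"],
  high: ["airstrike", "coup"],
  medium: ["sanctions", "protest"],
  low: ["election", "summit"],
  info: ["statement"],
};

type KeywordResult = { severity: Severity; matched: string[] } | null;

// Return the most severe tier with at least one whole-word match, or null if none match.
function classifyByKeyword(headline: string): KeywordResult {
  const words = new Set<string>(headline.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  for (const severity of SEVERITY_ORDER) {
    const matched = KEYWORDS[severity].filter((keyword) => words.has(keyword));
    if (matched.length > 0) return { severity, matched };
  }
  return null;
}
```

Because this runs synchronously on a lowercase word set, it can label a headline before the Web Worker or the LLM has returned anything, and the later stages can then refine or replace that label.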
Client-Side Vector Memory (RAG)
One feature I’m particularly proud of is the browser-local Retrieval-Augmented Generation system. Instead of sending all my headline history to a cloud API, I store embeddings locally:
RSS Feed Parse → ML Worker (Web Worker)
  ↓
ONNX Embeddings (all-MiniLM-L6, 384-dim float32)
  ↓
IndexedDB Store (5,000 vectors, LRU by ingestAt)

The system uses all-MiniLM-L6-v2 to generate 384-dimensional embeddings. These get stored in IndexedDB with a 5,000-vector cap and LRU eviction. When I search for related headlines, the similarity search happens entirely in my browser.
This approach has concrete benefits:
- Privacy: My reading patterns never leave my device
- Offline: Semantic search works without internet
- Cost: No API calls for embedding generation
- Speed: Zero network latency for similarity queries
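The storage and search side of this can be sketched without any browser APIs. The example below is an in-memory stand-in for the IndexedDB store, not the vector-db.ts module itself: `VectorMemory` and `StoredVector` are hypothetical names, cosine similarity is an assumed metric, and eviction simply drops the entries with the oldest `ingestAt`, which is one reading of "LRU by ingestAt".

```typescript
type StoredVector = { id: string; ingestAt: number; vector: Float32Array };

// Cosine similarity between two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// In-memory stand-in for the IndexedDB store: capped, evicting the oldest ingestAt first.
class VectorMemory {
  private items: StoredVector[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 5000) {
    this.capacity = capacity;
  }

  add(item: StoredVector): void {
    this.items.push(item);
    if (this.items.length > this.capacity) {
      this.items.sort((x, y) => x.ingestAt - y.ingestAt);
      this.items.splice(0, this.items.length - this.capacity);
    }
  }

  // Brute-force top-k search over every stored vector.
  search(query: Float32Array, k: number): { id: string; score: number }[] {
    return this.items
      .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }

  get size(): number {
    return this.items.length;
  }
}
```

At a 5,000-vector cap, a brute-force scan touches about 5,000 × 384 ≈ 1.9 million multiply-adds per query, which is why no approximate index is needed at this scale.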
AI-Powered News Briefings
When I want a summary of current events, World Monitor generates briefings with several smart features:
Headline Deduplication: Using Jaccard similarity with a 0.6 threshold, the system removes near-duplicate headlines before summarization. This prevents the AI from repeating the same story.
Variant-Aware Prompting: The system adjusts its focus based on what I’m monitoring. Geopolitical feeds get different prompts than tech or finance feeds.
Language Awareness: If my UI is set to Spanish, the briefing comes back in Spanish. The AI responds in the language I’m using.
Redis Caching: Briefings cache for an hour, so repeated queries don’t burn through API credits.
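The deduplication step is simple enough to show in full. This sketch assumes word-level tokens and treats a score at or above 0.6 as a duplicate; both choices are assumptions, since the tokenization and whether the threshold is inclusive are not specified above. The function names are illustrative.

```typescript
// Lowercased word set for a headline; word-level tokens are an assumption.
function tokenize(headline: string): Set<string> {
  return new Set<string>(headline.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  a.forEach((token) => {
    if (b.has(token)) intersection++;
  });
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Keep the first headline of each near-duplicate group, preserving input order.
function dedupeHeadlines(headlines: string[], threshold: number = 0.6): string[] {
  const kept: { text: string; tokens: Set<string> }[] = [];
  for (const text of headlines) {
    const tokens = tokenize(text);
    const isDuplicate = kept.some((k) => jaccard(k.tokens, tokens) >= threshold);
    if (!isDuplicate) kept.push({ text, tokens });
  }
  return kept.map((k) => k.text);
}
```

For example, "Oil prices surge after OPEC cuts output" and "Oil prices surge after OPEC cuts production" share 6 of 8 distinct words, giving a score of 0.75, so only the first is kept.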
Country Intelligence Briefs
For each country I track, World Monitor generates a structured intelligence brief:
Country Intelligence Brief
Instability Index │ 0-100 score based on recent events
Signals           │ Key indicators requiring attention
Timeline          │ Chronological event sequence
Infrastructure    │ Critical assets and vulnerabilities

The AI pulls from the 15 most recent headlines for that country and generates analysis with inline citation anchors. I can click any claim to see the source headline.
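One way to picture the brief is as a typed object plus a prompt that numbers its sources. Everything named below is hypothetical: the `CountryBrief` fields mirror the four sections of the brief, and the bracketed `[n]` citation format and the prompt wording are assumptions made for illustration, not the prompt World Monitor sends.

```typescript
type Headline = { title: string; publishedAt: string }; // ISO 8601 timestamps assumed

// Shape suggested by the four sections of the brief; field names are illustrative.
interface CountryBrief {
  instabilityIndex: number; // 0-100
  signals: string[];
  timeline: string[];
  infrastructure: string[];
}

// Pick the most recent headlines and number them so the model can cite them as [1], [2], ...
function buildBriefPrompt(country: string, headlines: Headline[], limit: number = 15): string {
  const recent = headlines
    .slice()
    .sort((a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt))
    .slice(0, limit);
  const sources = recent.map((h, i) => `[${i + 1}] ${h.title}`).join("\n");
  return [
    `Write an intelligence brief for ${country}.`,
    "Cite sources inline using the bracketed numbers below.",
    "Sources:",
    sources,
  ].join("\n");
}

// Clamp a model-reported score into the 0-100 range of the Instability Index.
function normalizeBrief(raw: CountryBrief): CountryBrief {
  const clamped = Math.min(100, Math.max(0, Math.round(raw.instabilityIndex)));
  return { ...raw, instabilityIndex: clamped };
}
```

Numbering the sources in the prompt is what makes clickable citations possible: each `[n]` in the model's output maps back to one headline by position.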
AI Deduction and Forecasting
The interactive deduction tool lets me ask questions about geopolitical scenarios. The AI grounds its analysis in live data rather than training data alone:
- Near-term timeline forecasts based on current events
- Cross-panel integration via custom events
- 1-hour Redis cache for repeated queries
- Structured reasoning with source citations
I’ve found this useful for exploring “what if” scenarios without manually cross-referencing dozens of headlines.
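The one-hour cache mentioned in the list above follows a standard cache-aside pattern. The sketch below uses an in-memory map as a stand-in for Redis so it runs anywhere; `TtlCache` and `cachedAnswer` are hypothetical names, and normalizing the question into a lowercase key is an assumption about how repeated queries are matched.

```typescript
// In-memory stand-in for the Redis cache; the clock is injectable so expiry can be tested.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 60 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the stored value, or undefined if it is missing or has expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Return the cached answer when present; otherwise compute, store, and return it.
async function cachedAnswer(
  cache: TtlCache<string>,
  question: string,
  compute: (q: string) => Promise<string>,
): Promise<string> {
  const key = question.trim().toLowerCase(); // key normalization is an assumption
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const answer = await compute(question);
  cache.set(key, answer);
  return answer;
}
```

With this pattern, only the first request in any one-hour window reaches the model; later requests with the same key are served from the cache.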
Why Local-First Matters
After using World Monitor for several months, the local-first architecture has proven its value:
- Privacy: When I’m researching sensitive topics, I can run entirely on local models. No data leaves my machine.
- Offline Capability: On flights or in areas with poor connectivity, I still get full intelligence analysis through browser-based inference.
- Cost Control: Local inference costs nothing. Cloud APIs only kick in when I need higher quality or faster results.
- Speed: Local models have zero network latency. For quick classification tasks, this makes a noticeable difference.
- Reliability: The fallback chain means I always get results. I’ve never had the system completely fail on me.
Technical Implementation Details
The ML Worker runs as a Web Worker, keeping the main thread responsive. Here’s what it handles:
ML Worker (Web Worker)
Embeddings    │ all-MiniLM-L6-v2 via ONNX Runtime
Sentiment     │ DistilBERT fine-tuned for news
Summarization │ T5-small for headline summaries
NER           │ Named entity recognition for extraction
Topic Class   │ Zero-shot classification pipeline

The analysis.worker.ts handles higher-level tasks like news clustering using Jaccard similarity and cross-domain correlation detection. The vector-db.ts module manages the IndexedDB-backed vector store for semantic search.
In This Post
In this post, I showed how World Monitor implements a 4-tier AI pipeline for intelligence analysis. The key insight is that local-first AI with cloud fallback provides privacy, offline capability, and reliability, with output quality that varies by tier but always yields a result. The three-stage threat classification pipeline balances instant keyword results with ML refinement and LLM override. The client-side vector memory enables semantic search without sending data to external servers.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can email me.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Ollama
- Groq
- Transformers.js
- ONNX Runtime
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!