
How to Add Real-Time Web Search to AI Agents with xAI Responses API

Problem

My AI agent couldn’t answer questions about current events:

User: What's Apple's stock price today?
Agent: I don't have access to real-time data. My training data cutoff is...
User: What happened in the news this morning?
Agent: I can only provide information from my pre-training period...

I tried building a RAG (Retrieval-Augmented Generation) system to solve this. But after weeks of work, I realized:

RAG Pipeline Complexity:
1. Crawl websites → 2. Extract text → 3. Clean data
4. Create embeddings → 5. Store in vector DB
6. Build retrieval logic → 7. Handle updates
8. Maintain freshness → 9. Scale infrastructure
10. Debug when results are stale
Time invested: 3 weeks
Result: Still couldn't answer "What's the weather now?"

I needed a simpler approach. That’s when I found xAI’s Responses API with native x_search support.

Environment

  • Python 3.12
  • OpenClaw agent framework (2026.3.28 release)
  • xAI API key
  • Grok model with web-search plugin

What happened?

I was building an AI assistant that needed real-time information. Stock prices, news, weather—all things that change daily. My first attempt with RAG failed because:

  1. Data freshness problem: I spent more time updating my index than building features
  2. Infrastructure overhead: Vector databases, embedding models, crawl pipelines
  3. Latency: Each query went through multiple hops before reaching the LLM

Here’s my failed RAG architecture:

rag-architecture.txt
[Traditional RAG Flow]
User Query
Query Embedding (~50ms)
Vector Search (~100ms)
Retrieve Documents (~80ms)
Context Assembly (~20ms)
LLM Generation (~2000ms)
Response
Total latency: ~2250ms + maintenance nightmare

Then I discovered xAI’s Responses API with x_search. Here’s the difference:

xsearch-architecture.txt
[x_search Flow]
User Query
AI needs current info → x_search call (~300ms)
Live web results returned
LLM processes + responds (~1500ms)
Response with sources
Total latency: ~1800ms + zero maintenance
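
The totals in the two diagrams can be checked with a few lines of arithmetic. This is a minimal sketch that copies the per-stage numbers from the diagrams and assumes the stages run strictly one after another; the dictionary and function names are illustrative only.

```python
# Stage latencies in milliseconds, copied from the two flow diagrams above.
RAG_STAGES_MS = {
    "query_embedding": 50,
    "vector_search": 100,
    "retrieve_documents": 80,
    "context_assembly": 20,
    "llm_generation": 2000,
}
XSEARCH_STAGES_MS = {
    "x_search_call": 300,
    "llm_processing": 1500,
}


def total_latency_ms(stages: dict[str, int]) -> int:
    """Sum sequential stage latencies (assumes no stage runs in parallel)."""
    return sum(stages.values())


rag_total = total_latency_ms(RAG_STAGES_MS)
xsearch_total = total_latency_ms(XSEARCH_STAGES_MS)
print(f"RAG: {rag_total} ms, x_search: {xsearch_total} ms, "
      f"difference: {rag_total - xsearch_total} ms")
# prints: RAG: 2250 ms, x_search: 1800 ms, difference: 450 ms
```

Most of the time in both flows is spent in LLM generation, so the roughly 450 ms gap matters less than the maintenance work that the RAG pipeline requires.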

The key insight: x_search works like opening a browser mid-conversation. The AI queries live search results and immediately incorporates findings into responses.

Step 1: Get xAI API Key

First, I needed an xAI API key:

Terminal
# Sign up at x.ai and get your API key
export XAI_API_KEY="your-api-key-here"
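
It can help to fail early when the variable is missing instead of discovering it on the first request. The following is a small sketch using only the standard library; `load_xai_api_key` is a hypothetical helper name, not part of OpenClaw or the xAI SDK.

```python
import os
from collections.abc import Mapping


def load_xai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the xAI API key from the environment, or raise a clear error."""
    key = env.get("XAI_API_KEY", "").strip()
    # Treat the placeholder from the export example as "not configured".
    if not key or key == "your-api-key-here":
        raise RuntimeError(
            "XAI_API_KEY is not set; export it before starting the agent."
        )
    return key


# Demo with an explicit mapping so the example does not depend on your shell.
print(load_xai_api_key({"XAI_API_KEY": "demo-key"}))  # prints demo-key
```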

Step 2: Configure OpenClaw

OpenClaw 2026.3.28 has built-in x_search support. I ran the setup wizard:

Terminal
openclaw init
# Wizard asks:
? Do you want to configure x_search? (Y/n)
? Which model for search? grok-2-latest
? Enter your xAI API key: [provided]
# Output:
x_search configured successfully
Single API key handles everything

The configuration file:

config.yaml
xai:
  api_key: ${XAI_API_KEY}
  x_search:
    enabled: true
    model: grok-2-latest
Step 3: Test with a Simple Query

I tested with a real-time question:

test_xsearch.py
from openclaw import Agent
agent = Agent(model="grok-2-latest", x_search=True)
response = agent.run("What is Apple's stock price today?")
print(response)

Output:

output.txt
Based on current market data (March 29, 2026):
- Apple (AAPL) stock price: $178.42
- Change: +2.15% from yesterday
- Market status: Open
Sources:
- Yahoo Finance: AAPL quote page
- MarketWatch: Real-time pricing

The response included real-time data with sources—something my RAG system never achieved.

Step 4: Compare with RAG Approach

I ran both approaches side-by-side:

comparison.txt
[Question: "What's the latest news about NVIDIA?"]
RAG System (my old approach):
- Response: "NVIDIA announced new chips in January..." (outdated)
- Sources: My crawled documents from 2 weeks ago
- Time to build: 3 weeks
- Maintenance: Daily crawls, weekly re-indexing
x_search (new approach):
- Response: "NVIDIA just announced partnerships with..." (current)
- Sources: Live news articles from this morning
- Time to build: 10 minutes
- Maintenance: None (xAI handles it)

The difference was stark. x_search gave me current information with zero infrastructure.

Why x_search is different from RAG

I initially confused x_search with RAG. They serve different purposes:

rag-vs-xsearch.txt
[Traditional RAG]
Use case: Search YOUR documents
- Company internal docs
- Personal knowledge base
- Code repositories
- Pre-indexed, static content
Pros:
- You control the data
- Private information stays private
- Custom ranking
Cons:
- Requires infrastructure
- Data must be pre-indexed
- Updates are manual/expensive
- Can't search the web
[x_search]
Use case: Search THE WEB
- Current news
- Real-time prices
- Latest documentation
- Public information
Pros:
- Zero infrastructure
- Always current
- Built-in source attribution
- Simple setup
Cons:
- Only public information
- Adds latency (~300ms)
- Requires internet access
- Can't search private docs

Think of it this way:

analogy.txt
RAG is like:
"Go to library, photocopy books, bring home, read, write summary"
- Many steps
- Information is frozen at photocopy time
- Need to revisit library for updates
x_search is like:
"Open browser, search Google, read top results, write summary"
- Fewer steps
- Information is always fresh
- No library visits needed

When I use each approach

After testing, I found the right use cases:

Use RAG for:

  • Internal company documents
  • Private databases
  • Code search within repositories
  • Historical data that doesn’t change

Use x_search for:

  • Current stock prices
  • Latest news headlines
  • Recent software releases
  • Weather and real-time events
  • Competitive research

Combine both:

  • Search internal docs with RAG
  • Supplement with current web info via x_search
  • Merge results for comprehensive answers
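
One way to merge the two result sets is to interleave them and label each entry with its origin, so the model can tell private context from web context. This is a rough sketch with made-up data; `Hit`, `merge_hits`, and `build_context` are hypothetical names, and real retrievers would supply the lists.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Hit:
    origin: str   # "internal" (RAG) or "web" (x_search)
    title: str
    snippet: str


def merge_hits(internal: list[Hit], web: list[Hit], limit: int = 4) -> list[Hit]:
    """Interleave internal and web hits, skipping repeated titles."""
    merged: list[Hit] = []
    seen: set[str] = set()
    for pair in zip_longest(internal, web):
        for hit in pair:
            if hit is None or hit.title.lower() in seen:
                continue
            seen.add(hit.title.lower())
            merged.append(hit)
    return merged[:limit]


def build_context(hits: list[Hit]) -> str:
    """Render merged hits as labeled lines to place in the prompt."""
    return "\n".join(f"[{h.origin}] {h.title}: {h.snippet}" for h in hits)


internal = [Hit("internal", "Q1 roadmap", "Ship search feature in April.")]
web = [
    Hit("web", "Competitor launch", "Rival shipped search this morning."),
    Hit("web", "Q1 Roadmap", "Same title as the internal hit, so it is dropped."),
]
print(build_context(merge_hits(internal, web)))
```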

Common mistakes I made

Mistake 1: Thinking x_search replaces RAG entirely

I initially tried to use x_search for everything:

mistake1.py
# WRONG: Trying to search private docs with x_search
agent.run("What's in our internal quarterly report?")
# Result: Can't find it (not on the public web)

The fix: Use RAG for private data, x_search for public web.
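
A simple router can make that split explicit before any backend is called. The sketch below uses keyword hints only, which is crude and easy to fool; the word lists and the `choose_backend` name are assumptions for illustration, and a production system would more likely let the model or a classifier decide.

```python
import re

# Hypothetical hint words; tune these for your own domain.
PRIVATE_WORDS = {"internal", "our", "confidential", "private"}
FRESH_WORDS = {"today", "latest", "current", "now", "recent"}


def choose_backend(question: str) -> str:
    """Return "rag", "x_search", or "both" based on whole-word hints."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    wants_private = bool(words & PRIVATE_WORDS)
    wants_fresh = bool(words & FRESH_WORDS)
    if wants_private and wants_fresh:
        return "both"
    if wants_private:
        return "rag"
    # Default to the public web when nothing suggests private data.
    return "x_search"


print(choose_backend("What's in our internal quarterly report?"))  # prints rag
print(choose_backend("What is Apple's stock price today?"))        # prints x_search
```

Matching whole words rather than substrings avoids false positives such as "now" inside "know".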

Mistake 2: Not accounting for latency

x_search adds ~300ms for the web query, and when the web is slow a call can take much longer. I didn’t handle this:

mistake2.py
# WRONG: No timeout handling
response = agent.run("Search for latest Python releases")
# Sometimes hangs when web is slow

The fix: Add timeout handling:

fix2.py
import asyncio

async def search_with_timeout(agent, query, timeout=10):
    # Assumes agent.run() returns an awaitable, as in the complete example below
    try:
        response = await asyncio.wait_for(
            agent.run(query),
            timeout=timeout
        )
        return response
    except asyncio.TimeoutError:
        return "Search timed out. Please try again."
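
If `agent.run()` is a blocking call, as in the earlier synchronous examples, it cannot be awaited directly. One option is to push it onto a worker thread with `asyncio.to_thread` and bound the wait with `asyncio.wait_for`. The sketch below uses a hypothetical `SlowSyncAgent` stand-in so it runs without any agent framework installed.

```python
import asyncio
import time


class SlowSyncAgent:
    """Hypothetical stand-in for an agent whose run() blocks."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    def run(self, query: str) -> str:
        time.sleep(self.delay)  # simulates a slow web search
        return f"answer to: {query}"


async def run_with_timeout(agent, query: str, timeout: float = 10.0) -> str:
    """Run a blocking agent.run() in a worker thread, bounded by a timeout."""
    try:
        return await asyncio.wait_for(asyncio.to_thread(agent.run, query), timeout)
    except asyncio.TimeoutError:
        # The worker thread is not interrupted; only the wait is abandoned.
        return "Search timed out. Please try again."


fast = asyncio.run(run_with_timeout(SlowSyncAgent(0.01), "python releases", timeout=1.0))
slow = asyncio.run(run_with_timeout(SlowSyncAgent(0.3), "python releases", timeout=0.05))
print(fast)  # prints: answer to: python releases
print(slow)  # prints: Search timed out. Please try again.
```

Because the thread keeps running after the timeout, this pattern limits how long the caller waits but does not cancel the underlying request.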

Mistake 3: Forgetting to cite sources

x_search returns sources, but I wasn’t showing them:

mistake3.py
# WRONG: Not showing sources
response = agent.run("What's the latest TypeScript version?")
print(response.text) # Lost source attribution

The fix: Extract and display sources:

fix3.py
response = agent.run("What's the latest TypeScript version?")
print(response.text)
print("\nSources:")
for source in response.sources:
    print(f"- {source.title}: {source.url}")
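
When the same page is cited more than once, the list gets noisy. Here is a small sketch of a formatter that numbers sources and drops repeated URLs. `Source` is a stand-in dataclass with the same `title` and `url` fields used above, and `format_sources` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Stand-in for the source objects used above (title and url fields)."""
    title: str
    url: str


def format_sources(sources) -> str:
    """Render a numbered source list, dropping repeated URLs."""
    seen: set[str] = set()
    lines: list[str] = []
    for s in sources:
        if s.url in seen:
            continue
        seen.add(s.url)
        lines.append(f"[{len(lines) + 1}] {s.title}: {s.url}")
    return "\n".join(lines) if lines else "No sources returned."


demo = [
    Source("Release notes", "https://example.com/releases"),
    Source("Release notes (mirror link)", "https://example.com/releases"),
    Source("Blog announcement", "https://example.com/blog"),
]
print(format_sources(demo))
```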

Complete working example

Here’s my final implementation:

agent_with_xsearch.py
import asyncio

from openclaw import Agent, Config

class RealTimeAgent:
    def __init__(self):
        # Load xAI configuration
        self.config = Config.from_yaml("config.yaml")
        # Create agent with x_search enabled
        self.agent = Agent(
            model="grok-2-latest",
            x_search=self.config.xai.x_search.enabled,
            timeout=15.0
        )

    async def query(self, question: str) -> dict:
        """Query with real-time web search"""
        response = await self.agent.run(question)
        return {
            "answer": response.text,
            "sources": [
                {"title": s.title, "url": s.url}
                for s in response.sources
            ],
            "timestamp": response.timestamp,
            "search_used": response.x_search_used
        }

# Usage: await is only valid inside a coroutine, so wrap the call in main()
async def main():
    agent = RealTimeAgent()
    result = await agent.query("What are the top AI news stories today?")
    print(f"Answer: {result['answer']}")
    print(f"\nSources ({len(result['sources'])}):")
    for s in result['sources']:
        print(f" - {s['title']}")

asyncio.run(main())

The reason x_search works so well

The technical difference is in how the API is designed:

architecture-deep.txt
[xAI Responses API with x_search]
1. User sends query to xAI API
2. API detects current-info requirement
3. x_search plugin activates automatically
4. Web search executes (like browser search)
5. Results returned to Grok model
6. Model synthesizes answer with sources
7. Response sent back to user
Key design decisions:
- Plugin is native, not bolted-on
- Search happens inside the API call
- No separate search service needed
- Source attribution is automatic
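
The steps above can be pictured as a single function that decides whether to search, runs the search, and hands the results to the model. This is a toy model of the flow as described in this post, not xAI's implementation: the detection, search, and model steps are injected as plain callables, and every name in it is hypothetical.

```python
from typing import Callable

SearchFn = Callable[[str], list[dict]]
ModelFn = Callable[[str, list[dict]], str]


def answer(query: str, needs_search: Callable[[str], bool],
           search: SearchFn, model: ModelFn) -> dict:
    """Mirror steps 2 to 6: detect, search, then let the model synthesize."""
    used = needs_search(query)                 # step 2: current info needed?
    results = search(query) if used else []    # steps 3 and 4: run the search
    return {
        "answer": model(query, results),       # steps 5 and 6: synthesize
        "sources": [r["url"] for r in results],
        "search_used": used,
    }


# Stub implementations so the sketch runs offline.
def stub_needs_search(q: str) -> bool:
    return "today" in q.lower().split()


def stub_search(q: str) -> list[dict]:
    return [{"title": "Example result", "url": "https://example.com/news"}]


def stub_model(q: str, results: list[dict]) -> str:
    return f"{q} ({len(results)} web result(s) used)"


print(answer("What happened today", stub_needs_search, stub_search, stub_model))
```

From the caller's side, everything happens inside one call, which is why no separate search service appears in the application code.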

OpenClaw’s auto-detection makes setup seamless:

auto-detection.txt
When you configure Grok web-search:
1. OpenClaw detects the plugin in your config
2. Automatically enables xAI plugin support
3. x_search calls happen transparently
4. You just use agent.run() normally

Single API key handles everything—no multi-service configuration:

single-key.yaml
# This one key handles:
# - Model inference (Grok)
# - Web search (x_search)
# - Source retrieval
# - Response generation
xai:
api_key: ${XAI_API_KEY}

Summary

In this post, I explained how to add real-time web search to AI agents using xAI’s Responses API with x_search. The key point is that x_search eliminates the complexity of traditional RAG pipelines by embedding search directly into the AI conversation.

My failed RAG approach required 3 weeks of work, vector databases, crawl pipelines, and constant maintenance—yet couldn’t answer “What’s the stock price now?” With x_search, I achieved real-time web access in 10 minutes with zero infrastructure.

Use RAG for searching your private documents. Use x_search for public web information. They complement each other rather than replace one another.

For developers building AI that needs current information, x_search is the simplest path forward.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can reach me by email.

