
How to Build a Unified Search Across Slack, Telegram, and Discord for AI Agents

Problem

My AI agent couldn’t find critical context. I asked it to prepare for a meeting, and it had no idea about a decision made three months ago in a Telegram thread. That thread contained the exact reasoning behind a project direction change.

I realized the problem: decisions don’t happen in documents. They happen in Slack threads, Telegram DMs, and Discord channels. My agent could only see my files, not my conversations.

Here’s what I was dealing with:

My Conversation Distribution
Platform    Messages/Month    Time to Search
--------    --------------    ----------------------
Slack       ~2,000            30 seconds per search
Telegram    ~500              20 seconds per search
Discord     ~800              25 seconds per search
--------    --------------    ----------------------
Total       ~3,300            ~1.5 minutes per query

Every time I needed to recall something, I had to search three different apps. My AI agent was useless for this - it had zero access to any of it.

What I Tried First

My first attempt was to use each platform’s API directly during agent sessions:

naive_search.py
# Call Slack API
slack_results = slack_client.conversations_history(channel="C12345")
# Call Telegram API
telegram_results = telegram_client.get_messages(chat_id=12345)
# Call Discord API
discord_results = discord_client.get_channel_messages(channel_id=12345)
# Combine and search
all_messages = slack_results + telegram_results + discord_results
results = [m for m in all_messages if keyword in m.text]

This failed immediately:

API Call Results
Error: Rate limit exceeded (Slack)
Wait time: 60 seconds
Retry count: 3
Total time for one search: 3+ minutes

Each platform throttles differently: Slack rate-limits per method (roughly 1 request per second in my case), Telegram enforces flood-wait limits, and Discord applies its own per-route limits. Searching through the live APIs on every query was impractical.

The Solution

I switched to a local indexing approach: sync all messages to a local database, then search locally. No API rate limits during search.

Here’s the architecture:

Architecture Overview
+---------------+ +---------------+ +---------------+
| Slack | | Telegram | | Discord |
| API | | API | | API |
+-------+-------+ +-------+-------+ +-------+-------+
| | |
+---------------------+---------------------+
|
v
+-----------------------------+
| Sync Service (CLI) |
| - Rate limit handling |
| - Incremental sync |
| - Error recovery |
+-------------+---------------+
|
v
+-----------------------------+
| SQLite Database |
| +-----------------------+ |
| | FTS5 Virtual Table | | <- Keyword search
| +-----------------------+ |
| +-----------------------+ |
| | Vector Store | | <- Semantic search
| | (Ollama embeddings) | |
| +-----------------------+ |
+-------------+---------------+
|
v
+-----------------------------+
| Search CLI / MCP |
+-------------+---------------+
|
v
+-----------------------------+
| AI Agent |
+-----------------------------+

Step 1: Create the Database Schema

I used SQLite with FTS5 (Full-Text Search) for keyword search:

schema.py
import sqlite3

conn = sqlite3.connect('messages.db')

# Create FTS5 virtual table for full-text search.
# platform is one of: slack, telegram, discord.
# UNINDEXED columns are stored but not searched, so a query like "slack"
# does not match every Slack message through its platform column.
conn.execute('''
    CREATE VIRTUAL TABLE IF NOT EXISTS messages USING fts5(
        id UNINDEXED,
        platform UNINDEXED,
        channel UNINDEXED,
        sender,
        content,
        timestamp UNINDEXED,
        tokenize='porter unicode61'
    )
''')

# Create regular table for message metadata
conn.execute('''
    CREATE TABLE IF NOT EXISTS message_meta (
        id TEXT PRIMARY KEY,
        platform TEXT NOT NULL,
        channel TEXT NOT NULL,
        sender TEXT NOT NULL,
        content TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        thread_id TEXT,
        reply_to TEXT,
        has_attachments INTEGER DEFAULT 0,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
''')

# Create index for platform queries
conn.execute('''
    CREATE INDEX IF NOT EXISTS idx_platform_timestamp
    ON message_meta(platform, timestamp)
''')
conn.commit()

The FTS5 virtual table enables fast text search. The porter unicode61 tokenizer handles stemming and unicode characters.
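
To make the tokenizer's effect concrete, here is a minimal sketch, assuming your Python build ships SQLite with FTS5 enabled (most do). The `demo_messages` table and `demo_search` helper are made up for this illustration and are not part of the schema above.

```python
import sqlite3

# In-memory database so the demo leaves no file behind
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE VIRTUAL TABLE demo_messages USING fts5(content, tokenize='porter unicode61')"
)
demo.execute(
    "INSERT INTO demo_messages(content) VALUES (?)",
    ("We decided to use PostgreSQL for the new service",),
)


def demo_search(term: str) -> list:
    """Return the content of every demo row matching the FTS5 query."""
    rows = demo.execute(
        "SELECT content FROM demo_messages WHERE demo_messages MATCH ?", (term,)
    )
    return [row[0] for row in rows]


stemmed_hits = demo_search("decide")     # a different word form than "decided"
case_hits = demo_search("postgresql")    # lowercase query, mixed-case text
missing_hits = demo_search("mysql")      # term that never appears

print(len(stemmed_hits), len(case_hits), len(missing_hits))  # 1 1 0
```

Stemming is what lets "decide" find a message that says "decided"; without `porter`, only the exact token would match.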

Step 2: Index Messages

I created a function to index messages from all platforms:

indexer.py
import hashlib
import sqlite3
from typing import Optional

conn = sqlite3.connect('messages.db')

def index_message(
    platform: str,
    channel: str,
    sender: str,
    content: str,
    timestamp: str,
    message_id: Optional[str] = None,
    thread_id: Optional[str] = None,
    reply_to: Optional[str] = None
) -> str:
    """Index a message and return its ID."""
    # Generate a stable ID if not provided. The built-in hash() is salted
    # per process for strings, so it would yield a different ID on every run.
    if not message_id:
        digest = hashlib.sha256((content + timestamp).encode("utf-8")).hexdigest()
        message_id = f"{platform}_{digest[:16]}"
    # Check whether this message was indexed before (cheap primary-key lookup)
    already_indexed = conn.execute(
        'SELECT 1 FROM message_meta WHERE id = ?', (message_id,)
    ).fetchone()
    # Insert into metadata table
    conn.execute('''
        INSERT OR REPLACE INTO message_meta
        (id, platform, channel, sender, content, timestamp, thread_id, reply_to)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    ''', (message_id, platform, channel, sender, content, timestamp, thread_id, reply_to))
    # FTS5 has no primary key, so drop the old row first or a re-sync
    # would leave duplicate search hits
    if already_indexed:
        conn.execute('DELETE FROM messages WHERE id = ?', (message_id,))
    # Insert into FTS5 table
    conn.execute('''
        INSERT INTO messages(id, platform, channel, sender, content, timestamp)
        VALUES (?, ?, ?, ?, ?, ?)
    ''', (message_id, platform, channel, sender, content, timestamp))
    conn.commit()
    return message_id

# Example usage
index_message(
    platform="slack",
    channel="engineering",
    sender="alice",
    content="We decided to use PostgreSQL instead of MySQL for the new service",
    timestamp="2025-12-15T10:30:00Z"
)

Step 3: Sync from Each Platform

Each platform needs its own sync worker:

slack_sync.py
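import os
import time
from datetime import datetime, timedelta, timezone
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
from indexer import index_message

client = WebClient(token=os.environ["SLACK_TOKEN"])

def sync_slack_channel(channel_id: str, days: int = 30):
    """Sync messages from a Slack channel."""
    oldest = (datetime.now() - timedelta(days=days)).timestamp()
    cursor = None
    while True:
        try:
            # Note: conversations.history returns top-level messages only;
            # thread replies need a separate conversations.replies call.
            response = client.conversations_history(
                channel=channel_id,
                oldest=str(oldest),
                cursor=cursor,
                limit=200
            )
        except SlackApiError as e:
            # On HTTP 429 Slack sends a Retry-After header: wait, then retry this page
            if e.response.status_code == 429:
                time.sleep(int(e.response.headers.get("Retry-After", 60)))
                continue
            print(f"Error syncing Slack: {e}")
            break
        for msg in response["messages"]:
            # Slack's ts is epoch seconds; store ISO 8601 like the other platforms
            sent_at = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc).isoformat()
            index_message(
                platform="slack",
                channel=channel_id,
                sender=msg.get("user", "unknown"),
                content=msg.get("text", ""),
                timestamp=sent_at,
                thread_id=msg.get("thread_ts"),
                message_id=f"slack_{channel_id}_{msg['ts']}"
            )
        # Handle pagination
        cursor = response.get("response_metadata", {}).get("next_cursor")
        if not cursor:
            break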

telegram_sync.py
import os
from telethon.sync import TelegramClient
from indexer import index_message

api_id = int(os.environ["TELEGRAM_API_ID"])
api_hash = os.environ["TELEGRAM_API_HASH"]

def sync_telegram_chat(chat_id: int, limit: int = 1000):
    """Sync messages from a Telegram chat."""
    with TelegramClient('sync_session', api_id, api_hash) as client:
        # iter_messages pages through the history for us; a single raw
        # GetHistoryRequest returns only one page (about 100 messages),
        # whatever limit is requested
        for msg in client.iter_messages(chat_id, limit=limit):
            # Skip service messages and media without text
            if not getattr(msg, 'message', None):
                continue
            index_message(
                platform="telegram",
                channel=str(chat_id),
                # sender_id also covers channel posts, where from_id has no user_id
                sender=str(msg.sender_id) if msg.sender_id else "unknown",
                content=msg.message,
                timestamp=msg.date.isoformat(),
                # Message IDs can repeat across chats, so include the chat ID
                message_id=f"telegram_{chat_id}_{msg.id}"
            )

discord_sync.py
import os
import discord
from indexer import index_message

intents = discord.Intents.default()
intents.message_content = True
client = discord.Client(intents=intents)

@client.event
async def on_ready():
    for guild in client.guilds:
        for channel in guild.text_channels:
            try:
                await sync_discord_channel(channel)
            except discord.Forbidden:
                # The bot cannot read this channel's history; skip it
                continue
    # Disconnect after the one-off sync so client.run() returns
    await client.close()

async def sync_discord_channel(channel):
    """Sync messages from a Discord channel."""
    async for msg in channel.history(limit=1000):
        index_message(
            platform="discord",
            channel=channel.name,
            sender=msg.author.name,
            content=msg.content,
            timestamp=msg.created_at.isoformat(),
            message_id=f"discord_{msg.id}"
        )

client.run(os.environ["DISCORD_TOKEN"])
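
The architecture diagram lists incremental sync, which the workers above do not implement. One way to sketch it, assuming a hypothetical `sync_state` table (not part of the schema in Step 1) and ISO 8601 timestamps in a single timezone so that string comparison matches time order:

```python
import sqlite3
from typing import Optional

# In-memory for the demo; a real worker would reuse messages.db
state_db = sqlite3.connect(":memory:")
state_db.execute('''
    CREATE TABLE IF NOT EXISTS sync_state (
        platform TEXT NOT NULL,
        channel TEXT NOT NULL,
        last_synced TEXT NOT NULL,
        PRIMARY KEY (platform, channel)
    )
''')


def get_last_synced(platform: str, channel: str) -> Optional[str]:
    """Return the newest timestamp already synced, or None on first run."""
    row = state_db.execute(
        'SELECT last_synced FROM sync_state WHERE platform = ? AND channel = ?',
        (platform, channel),
    ).fetchone()
    return row[0] if row else None


def set_last_synced(platform: str, channel: str, timestamp: str) -> None:
    """Advance the checkpoint; never move it backwards."""
    current = get_last_synced(platform, channel)
    if current is None or timestamp > current:
        state_db.execute(
            'INSERT OR REPLACE INTO sync_state VALUES (?, ?, ?)',
            (platform, channel, timestamp),
        )
        state_db.commit()


set_last_synced("slack", "engineering", "2025-12-15T10:30:00+00:00")
set_last_synced("slack", "engineering", "2025-12-01T08:00:00+00:00")  # older, ignored
print(get_last_synced("slack", "engineering"))  # 2025-12-15T10:30:00+00:00
```

A worker would read the checkpoint before calling the platform API, request only newer messages, and update the checkpoint after each batch is indexed.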

Step 4: Keyword Search with FTS5

Now I could search across all platforms instantly:

search.py
import sqlite3
from typing import Optional

conn = sqlite3.connect('messages.db')

def search_messages(query: str, limit: int = 10, platform: Optional[str] = None):
    """Search messages across all platforms."""
    # Build the FTS5 query; the platform filter is optional
    sql = '''
        SELECT m.platform, m.channel, m.sender, m.content, m.timestamp, mm.thread_id
        FROM messages m
        JOIN message_meta mm ON m.id = mm.id
        WHERE messages MATCH ?
    '''
    params = [query]
    if platform:
        sql += ' AND m.platform = ?'
        params.append(platform)
    # rank is FTS5's built-in relevance score (best matches first)
    sql += ' ORDER BY rank LIMIT ?'
    params.append(limit)
    return conn.execute(sql, params).fetchall()

# Example searches
results = search_messages("API migration")
# Returns matches from Slack, Telegram, Discord combined
results = search_messages("decision", platform="slack")
# Returns only Slack messages containing "decision"

The search is instant because FTS5 uses inverted indexes:

Search Performance
Query: "API migration"
Results: 15 matches (Slack: 8, Telegram: 4, Discord: 3)
Time: 0.002 seconds
Query: "postgres decision"
Results: 7 matches (Slack: 5, Telegram: 2, Discord: 0)
Time: 0.001 seconds
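
One caveat: the string passed to MATCH is parsed as FTS5 query syntax, so user or agent input containing characters such as hyphens or quotes can raise a syntax error or be read as operators. A minimal sketch of one way to guard against that, using a hypothetical helper `to_fts5_query` that quotes every term:

```python
def to_fts5_query(user_text: str) -> str:
    """Quote each whitespace-separated term so FTS5 treats it as a literal string."""
    terms = user_text.split()
    # Inside an FTS5 string, a double quote is escaped by doubling it
    quoted = ['"' + term.replace('"', '""') + '"' for term in terms]
    # Space-separated strings are combined with an implicit AND
    return " ".join(quoted)


print(to_fts5_query("API migration"))       # "API" "migration"
print(to_fts5_query("postgres-decision"))   # "postgres-decision"
```

The result can be passed as the query argument of the search function. Empty input produces an empty string, which MATCH rejects, so callers should check for that first. Quoting also disables operators such as OR and NEAR, which is usually what you want for free-form agent input.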

Step 5: Add Semantic Search with Ollama

Keyword search has limits. If someone said “database choice” but I search for “postgres decision”, I might miss it. I added vector embeddings for semantic search:

semantic_search.py
import ollama
import sqlite3
import numpy as np

# Create vector table
conn = sqlite3.connect('messages.db')
conn.execute('''
    CREATE TABLE IF NOT EXISTS message_vectors (
        id TEXT PRIMARY KEY,
        embedding BLOB NOT NULL
    )
''')

def get_embedding(text: str) -> list:
    """Get embedding vector from Ollama (requires a running Ollama server)."""
    response = ollama.embeddings(
        model='nomic-embed-text',
        prompt=text
    )
    return response['embedding']

def index_message_vector(message_id: str, content: str):
    """Store embedding for a message."""
    embedding = get_embedding(content)
    embedding_blob = np.array(embedding, dtype=np.float32).tobytes()
    conn.execute('''
        INSERT OR REPLACE INTO message_vectors(id, embedding)
        VALUES (?, ?)
    ''', (message_id, embedding_blob))
    conn.commit()

def semantic_search(query: str, limit: int = 10, threshold: float = 0.7):
    """Search by meaning, not just keywords."""
    query_embedding = get_embedding(query)
    query_vec = np.array(query_embedding, dtype=np.float32)
    query_norm = np.linalg.norm(query_vec)
    results = []
    # Brute-force scan: compares the query against every stored vector
    cursor = conn.execute('SELECT id, embedding FROM message_vectors')
    for row in cursor:
        msg_id = row[0]
        stored_vec = np.frombuffer(row[1], dtype=np.float32)
        denominator = query_norm * np.linalg.norm(stored_vec)
        if denominator == 0:
            continue  # skip zero vectors to avoid dividing by zero
        # Cosine similarity
        similarity = float(np.dot(query_vec, stored_vec) / denominator)
        if similarity >= threshold:
            results.append((msg_id, similarity))
    # Sort by similarity
    results.sort(key=lambda x: x[1], reverse=True)
    # Get message details
    final_results = []
    for msg_id, score in results[:limit]:
        msg = conn.execute(
            'SELECT platform, channel, sender, content, timestamp FROM message_meta WHERE id = ?',
            (msg_id,)
        ).fetchone()
        if msg:
            final_results.append((*msg, score))
    return final_results

Now I can search by meaning:

semantic_example.py
# Search for database decisions
results = semantic_search("why did we choose our database technology")
# Finds: "we went with postgres for the new service"
# Finds: "the database choice was driven by..."
# Even if the word "database" isn't in the original message
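
To see what the similarity threshold does, here is a small worked example with made-up two-dimensional vectors. Real embeddings have hundreds of dimensions, and the names and values below are purely illustrative. It uses plain Python instead of NumPy so it runs without extra packages.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query_vec = [1.0, 0.0]
toy_messages = {
    "same_direction": [2.0, 0.0],  # longer vector, same direction -> 1.0
    "related": [1.0, 1.0],         # 45 degrees apart -> about 0.707
    "unrelated": [0.0, 1.0],       # orthogonal -> 0.0
}
threshold = 0.7

scores = {name: cosine_similarity(query_vec, vec) for name, vec in toy_messages.items()}
kept = sorted(
    (name for name, score in scores.items() if score >= threshold),
    key=lambda name: scores[name],
    reverse=True,
)
print(kept)  # ['same_direction', 'related']
```

Only direction matters, not vector length, which is why "same_direction" scores 1.0. A threshold of 0.7 is a tuning choice: lowering it returns more loosely related messages.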

Step 6: Expose to AI Agent via CLI

I created a CLI tool that my AI agent can call:

cli.py
#!/usr/bin/env python3
import argparse
import json
from search import search_messages
from semantic_search import semantic_search

def main():
    parser = argparse.ArgumentParser(description='Search cross-channel messages')
    parser.add_argument('query', help='Search query')
    parser.add_argument('--semantic', action='store_true', help='Use semantic search')
    parser.add_argument('--platform', choices=['slack', 'telegram', 'discord'])
    parser.add_argument('--limit', type=int, default=10)
    parser.add_argument('--json', action='store_true', help='Output as JSON')
    args = parser.parse_args()

    if args.semantic:
        results = semantic_search(args.query, args.limit)
        # semantic_search has no platform argument, so filter afterwards
        if args.platform:
            results = [r for r in results if r[0] == args.platform]
    else:
        results = search_messages(args.query, args.limit, args.platform)

    if args.json:
        output = []
        for r in results:
            output.append({
                'platform': r[0],
                'channel': r[1],
                'sender': r[2],
                'content': r[3],
                'timestamp': r[4]
            })
        print(json.dumps(output, indent=2))
    else:
        for r in results:
            print(f"[{r[0]}] {r[1]} - {r[2]}: {r[3][:100]}...")

if __name__ == '__main__':
    main()

My AI agent can now use this CLI:

Terminal
# Keyword search
python cli.py "API migration" --json
# Semantic search
python cli.py "database architecture decisions" --semantic --json
# Platform-specific search
python cli.py "deploy" --platform slack --limit 20
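
On the agent side, the JSON output usually needs to be condensed before it goes into a prompt. A minimal sketch, assuming a hypothetical helper `results_to_context` that takes the text printed by `--json` and returns one line per message:

```python
import json


def results_to_context(cli_json: str, max_chars: int = 200) -> str:
    """Turn the CLI's --json output into one compact line per message."""
    lines = []
    for item in json.loads(cli_json):
        text = item["content"]
        if len(text) > max_chars:
            # Trim long messages so one result cannot crowd out the others
            text = text[:max_chars].rstrip() + "..."
        lines.append(
            f'[{item["platform"]}] {item["channel"]} - {item["sender"]} '
            f'({item["timestamp"]}): {text}'
        )
    return "\n".join(lines)


# Sample shaped like the CLI's JSON output
sample = json.dumps([{
    "platform": "slack",
    "channel": "engineering",
    "sender": "alice",
    "content": "We decided to use PostgreSQL instead of MySQL for the new service",
    "timestamp": "2025-12-15T10:30:00Z",
}])
context = results_to_context(sample)
print(context)
```

The `max_chars` cutoff of 200 is arbitrary; tune it to the context budget of your agent.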

Common Mistakes I Made

Mistake 1: Indexing everything without filtering

I initially synced every message from every channel. The noise drowned the signal:

Before Filtering
Total messages: 50,000
Relevant messages: ~500
Noise ratio: 99%
After searching for "API": 2,000 results, most were "API is down" alerts

I added filters:

filtering.py
def should_index(content: str, channel: str) -> bool:
    """Filter out noise before indexing."""
    noise_patterns = [
        "joined the channel",
        "left the channel",
        "changed the channel topic",
        "api is down",
        "service degraded",
        "@here urgent"
    ]
    content_lower = content.lower()
    for pattern in noise_patterns:
        if pattern in content_lower:
            return False
    return True

Mistake 2: Not handling rate limits

My initial sync script got banned from Slack’s API:

Rate Limit Error
Error: ratelimited
Retry-After: 3600
Cause: 500 requests in 10 seconds

I added proper rate limiting:

rate_limited_sync.py
import time
from functools import wraps

def rate_limit(calls_per_second: float):
    """Rate limit decorator."""
    min_interval = 1.0 / calls_per_second
    last_call = [0.0]  # one-element list so the wrapper can update it

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Sleep just long enough to keep calls min_interval apart
            elapsed = time.time() - last_call[0]
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)
            last_call[0] = time.time()
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(1.0)  # 1 call per second
def slack_api_call(method, **kwargs):
    # client is the slack_sdk WebClient from slack_sync.py;
    # api_call takes the request arguments through params
    return client.api_call(method, params=kwargs)
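
Pacing requests avoids most rate-limit errors, but a server can still answer with a retry-after hint. Here is a minimal sketch of honoring it, using a hypothetical `RateLimited` exception as a stand-in for whatever error your client library raises. The `sleep` parameter exists only so the demo can run without actually waiting.

```python
import time


class RateLimited(Exception):
    """Hypothetical error carrying the server's requested wait in seconds."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_retries: int = 3, sleep=time.sleep):
    """Call func(); on RateLimited, wait as instructed and try again."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(exc.retry_after)


# Demo: a fake API call that is throttled twice, then succeeds
attempts = {"count": 0}
waits = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimited(retry_after=1.5)
    return "ok"


result = call_with_retry(flaky_call, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)  # ok [1.5, 1.5]
```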

Mistake 3: Storing credentials insecurely

I initially hard-coded API tokens. Then I moved to environment variables:

.env
SLACK_TOKEN=xoxb-your-token-here
TELEGRAM_API_ID=12345
TELEGRAM_API_HASH=abcdef123456
DISCORD_TOKEN=your-bot-token

And loaded them properly:

config.py
import os
from dotenv import load_dotenv

load_dotenv()

SLACK_TOKEN = os.environ.get("SLACK_TOKEN")
TELEGRAM_API_ID = os.environ.get("TELEGRAM_API_ID")
TELEGRAM_API_HASH = os.environ.get("TELEGRAM_API_HASH")
DISCORD_TOKEN = os.environ.get("DISCORD_TOKEN")

# Validate all tokens exist
for name, value in [
    ("SLACK_TOKEN", SLACK_TOKEN),
    ("TELEGRAM_API_ID", TELEGRAM_API_ID),
    ("TELEGRAM_API_HASH", TELEGRAM_API_HASH),
    ("DISCORD_TOKEN", DISCORD_TOKEN)
]:
    if not value:
        raise ValueError(f"Missing environment variable: {name}")

Mistake 4: Ignoring thread relationships

Messages in threads have context. I initially treated them as standalone:

Thread Context Lost
Message: "Yes, let's do that"
Context: ???

I added thread tracking:

thread_context.py
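def get_thread_context(message_id: str) -> list:
    """Get the full thread context for a message."""
    # conn is the sqlite3 connection to messages.db
    # Get the message's platform, channel, and thread_id
    msg = conn.execute(
        'SELECT platform, channel, thread_id FROM message_meta WHERE id = ?',
        (message_id,)
    ).fetchone()
    if not msg or not msg[2]:
        return []
    platform, channel, thread_id = msg
    # Get all messages in the thread. Thread IDs are only unique within a
    # channel (Slack's thread_ts, for example), so scope by platform and channel too.
    thread = conn.execute('''
        SELECT platform, channel, sender, content, timestamp
        FROM message_meta
        WHERE platform = ? AND channel = ? AND thread_id = ?
        ORDER BY timestamp
    ''', (platform, channel, thread_id)).fetchall()
    return thread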
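
To show how the thread lookup resolves a reply like "Yes, let's do that", here is a self-contained sketch on an in-memory table. The rows, IDs, and the `thread_for` helper are invented for the demo, and the table is a trimmed-down version of `message_meta`.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute('''
    CREATE TABLE message_meta (
        id TEXT PRIMARY KEY, platform TEXT, channel TEXT, sender TEXT,
        content TEXT, timestamp TEXT, thread_id TEXT
    )
''')
demo.executemany("INSERT INTO message_meta VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("m1", "slack", "engineering", "alice",
     "Should we switch the new service to PostgreSQL?", "2025-12-15T10:30:00Z", "t1"),
    ("m2", "slack", "engineering", "bob",
     "Yes, let's do that", "2025-12-15T10:31:00Z", "t1"),
    ("m3", "slack", "random", "carol", "Lunch?", "2025-12-15T10:32:00Z", None),
])


def thread_for(message_id: str) -> list:
    """Return (sender, content) for every message in the same thread, oldest first."""
    hit = demo.execute(
        "SELECT platform, channel, thread_id FROM message_meta WHERE id = ?",
        (message_id,),
    ).fetchone()
    if not hit or not hit[2]:
        return []  # unknown message, or a message outside any thread
    return demo.execute(
        "SELECT sender, content FROM message_meta "
        "WHERE platform = ? AND channel = ? AND thread_id = ? ORDER BY timestamp",
        hit,
    ).fetchall()


context = thread_for("m2")
for sender, content in context:
    print(f"{sender}: {content}")
```

Bob's reply now comes back together with Alice's question, which is the context the agent needs to interpret it.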

The Result

After implementing this, my AI agent could answer questions like:

Example Interaction
User: "What was the reasoning behind switching to PostgreSQL?"
Agent: Let me search your messages...
[Searching messages.db for "PostgreSQL reasoning decision"]
Found 3 relevant discussions:
1. [slack] #engineering - alice (2025-12-15):
"We decided to use PostgreSQL instead of MySQL for the new service
because we need JSONB queries and better concurrency"
2. [telegram] Dev Team - bob (2025-12-14):
"The MySQL locking issues we had last month pushed me toward Postgres"
3. [discord] #architecture - charlie (2025-12-16):
"Postgres won because of the extension ecosystem, especially pgvector"
The team switched to PostgreSQL primarily for:
- Better JSON support (JSONB queries)
- Concurrency improvements after MySQL locking issues
- pgvector extension for vector search

Instead of me spending minutes searching three apps, the agent had the answer in about 2 seconds.

Summary

In this post, I showed how to build cross-channel search for AI agents. The key components are:

  1. SQLite FTS5 for fast keyword search across all platforms
  2. Ollama embeddings for semantic search when keywords aren’t enough
  3. Sync workers for each platform with proper rate limiting
  4. CLI interface for AI agent access

The main mistakes to avoid are: indexing everything without filtering, ignoring rate limits, storing credentials insecurely, and losing thread context.

Now my AI agent has access to where decisions actually happen - in Slack threads, Telegram DMs, and Discord channels. No more context-switching between apps, no more lost decisions.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
