How to Build an AI Agent That Monitors Multiple Social Platforms Automatically
The Problem
I spent two hours every morning checking Reddit, Twitter, YouTube, and Hacker News for trending topics in my niche. That time could have been spent coding.
I tried building separate scrapers for each platform. The code was messy. Each scraper had its own authentication logic, rate limiting rules, and output format. When I needed to add Polymarket, I had to write another entirely new module.
Then I found last30days-skill on GitHub. It gained 2,685 stars in one day. The reason? One agent handles all platforms with a unified approach.
This post shows how to build your own multi-platform monitoring agent. The key point is using a unified agent pattern instead of fragmented scrapers.
My Failed Approach
I started with separate scrapers:
# Reddit scraperdef scrape_reddit(): # Custom Reddit logic pass
# Twitter scraperdef scrape_twitter(): # Custom Twitter logic pass
# YouTube scraperdef scrape_youtube(): # Custom YouTube logic pass
# Hacker News scraperdef scrape_hn(): # Custom HN logic pass
# Call them separatelyreddit_data = scrape_reddit()twitter_data = scrape_twitter()youtube_data = scrape_youtube()hn_data = scrape_hn()This approach had three problems:
- Each scraper needed different authentication
- Rate limits were different for each platform
- Output formats were inconsistent
I was maintaining four different codebases for the same goal.
The Unified Agent Pattern
I switched to a unified approach. One agent handles all platforms through a standardized interface.
The pattern works like this:
+------------------+ | Unified Agent | +------------------+ | +--------------+--------------+ | | | +-------+------+ +----+-----+ +----+-----+ | PlatformTool | | PlatformTool | | PlatformTool | +-------+------+ +----+-----+ +----+-----+ | | | +-------+------+ +----+-----+ +----+-----+ | Reddit API | | Twitter API | | YouTube API | +--------------+ +-------------+ +-------------+Each platform becomes a “tool” that the agent can call. The agent decides when to use each tool based on what you ask it to do.
Build the Base Agent
I started with a LangGraph agent structure:
from langgraph.graph import StateGraph, ENDfrom typing import TypedDict, List, Dictimport asyncio
class AgentState(TypedDict): platforms: List[str] results: Dict[str, List[dict]] summary: str
def create_agent(): workflow = StateGraph(AgentState)
# Define nodes workflow.add_node("fetch_reddit", fetch_reddit_node) workflow.add_node("fetch_twitter", fetch_twitter_node) workflow.add_node("fetch_youtube", fetch_youtube_node) workflow.add_node("fetch_hn", fetch_hn_node) workflow.add_node("summarize", summarize_node)
# Define edges - all fetches happen in parallel workflow.add_edge("fetch_reddit", "summarize") workflow.add_edge("fetch_twitter", "summarize") workflow.add_edge("fetch_youtube", "summarize") workflow.add_edge("fetch_hn", "summarize") workflow.add_edge("summarize", END)
# Set entry point workflow.set_entry_point("fetch_reddit")
return workflow.compile()The key is parallel execution. All platforms fetch simultaneously, then merge results for summarization.
Create Platform Tools
Each platform needs a standardized tool interface:
from abc import ABC, abstractmethodfrom typing import List, Dict
class PlatformTool(ABC): """Base class for all platform tools"""
@abstractmethod async def fetch_trending(self, limit: int = 10) -> List[Dict]: """Fetch trending content from platform""" pass
@abstractmethod def de_watermark(self, content: str) -> str: """Remove watermarks and clean content""" pass
@property @abstractmethod def platform_name(self) -> str: """Return platform name""" pass
class RedditTool(PlatformTool): platform_name = "reddit"
async def fetch_trending(self, limit: int = 10) -> List[Dict]: import aiohttp
url = "https://www.reddit.com/r/all/hot.json" headers = {"User-Agent": "MultiPlatformAgent/1.0"}
async with aiohttp.ClientSession() as session: async with session.get(url, headers=headers) as resp: data = await resp.json()
posts = data["data"]["children"][:limit] return [ { "title": post["data"]["title"], "url": post["data"]["url"], "score": post["data"]["score"], "subreddit": post["data"]["subreddit"], "source": self.platform_name } for post in posts ]
def de_watermark(self, content: str) -> str: # Reddit posts don't have watermarks return content.strip()
class HackerNewsTool(PlatformTool): platform_name = "hacker_news"
async def fetch_trending(self, limit: int = 10) -> List[Dict]: import aiohttp
async with aiohttp.ClientSession() as session: # Get top stories IDs async with session.get( "https://hacker-news.firebaseio.com/v0/topstories.json" ) as resp: ids = await resp.json()
# Fetch top N stories stories = [] for id in ids[:limit]: async with session.get( f"https://hacker-news.firebaseio.com/v0/item/{id}.json" ) as resp: story = await resp.json() stories.append({ "title": story["title"], "url": story.get("url", ""), "score": story["score"], "source": self.platform_name })
return stories
def de_watermark(self, content: str) -> str: return content.strip()The abstract base class forces consistency. Each tool implements fetch_trending and de_watermark the same way.
Add YouTube Integration
YouTube was tricky because it needs API credentials:
class YouTubeTool(PlatformTool): platform_name = "youtube"
async def fetch_trending(self, limit: int = 10) -> List[Dict]: from googleapiclient.discovery import build import os
api_key = os.environ.get("YOUTUBE_API_KEY") if not api_key: raise ValueError("YOUTUBE_API_KEY not set")
youtube = build("youtube", "v3", developerKey=api_key)
request = youtube.videos().list( part="snippet,statistics", chart="mostPopular", maxResults=limit, regionCode="US" )
response = request.execute()
return [ { "title": item["snippet"]["title"], "url": f"https://youtube.com/watch?v={item['id']}", "score": int(item["statistics"].get("viewCount", 0)), "source": self.platform_name } for item in response["items"] ]
def de_watermark(self, content: str) -> str: # Remove common YouTube title patterns import re content = re.sub(r'\s*\[HD\]\s*', '', content) content = re.sub(r'\s*\[Official\]\s*', '', content) return content.strip()I made the mistake of hardcoding the API key initially. The agent failed when I tried to deploy it. Environment variables fixed this.
The Summarization Node
After fetching, the agent summarizes everything:
from anthropic import Anthropic
def summarize_node(state: AgentState) -> AgentState: client = Anthropic()
# Combine all results all_content = [] for platform, items in state["results"].items(): for item in items: all_content.append(f"[{platform}] {item['title']} (score: {item['score']})")
prompt = f"""Summarize these trending topics across platforms.Group similar topics together. Highlight the most important ones.
Content:{chr(10).join(all_content)}
Provide a concise summary with top 5 trending topics."""
response = client.messages.create( model="claude-3-5-sonnet-20241022", max_tokens=1024, messages=[{"role": "user", "content": prompt}] )
return {**state, "summary": response.content[0].text}The LLM does the heavy lifting. It groups topics across platforms automatically.
Run the Complete Agent
I put everything together:
import asynciofrom typing import Dict, List
async def run_multi_platform_agent(): # Initialize tools tools = { "reddit": RedditTool(), "hacker_news": HackerNewsTool(), "youtube": YouTubeTool() }
# Fetch from all platforms simultaneously results: Dict[str, List[dict]] = {}
async def fetch_platform(name: str, tool: PlatformTool): try: data = await tool.fetch_trending(limit=10) results[name] = [ {**item, "cleaned": tool.de_watermark(item["title"])} for item in data ] except Exception as e: results[name] = [] print(f"Error fetching {name}: {e}")
# Run all fetches in parallel await asyncio.gather(*[ fetch_platform(name, tool) for name, tool in tools.items() ])
# Summarize state = {"results": results, "summary": ""} final_state = summarize_node(state)
return final_state["summary"]
if __name__ == "__main__": summary = asyncio.run(run_multi_platform_agent()) print(summary)Running this gives me a unified summary from all platforms in about 10 seconds.
Add Polymarket for Trends
I added Polymarket to track prediction market trends:
class PolymarketTool(PlatformTool): platform_name = "polymarket"
async def fetch_trending(self, limit: int = 10) -> List[Dict]: import aiohttp
url = "https://clob.polymarket.com/markets" params = {"limit": limit, "closed": False}
async with aiohttp.ClientSession() as session: async with session.get(url, params=params) as resp: markets = await resp.json()
return [ { "title": market["question"], "url": f"https://polymarket.com/event/{market['slug']}", "score": float(market.get("volume", 0)), "source": self.platform_name } for market in markets[:limit] ]
def de_watermark(self, content: str) -> str: return content.strip()Prediction markets reveal what topics people care about before they hit mainstream news.
The De-Watermarking Problem
Some platforms add watermarks to content titles. YouTube videos often have “[HD]” or “[Official]” appended. Twitter posts sometimes include attribution text.
I built a cleaning function:
import re
def clean_title(title: str, platform: str) -> str: patterns = { "youtube": [ r'\s*\[HD\]\s*', r'\s*\[Official\]\s*', r'\s*\[Official Video\]\s*', r'\s*-\s*YouTube\s*$' ], "twitter": [ r'\s*via\s+@\w+\s*$', r'\s*—\s+\w+\s*$' ], "reddit": [], # Usually clean "hacker_news": [], # Usually clean "polymarket": [] # Usually clean }
cleaned = title for pattern in patterns.get(platform, []): cleaned = re.sub(pattern, '', cleaned, flags=re.IGNORECASE)
return cleaned.strip()This makes the summary cleaner and more readable.
Deployment Issues I Encountered
When I deployed the agent, I hit three errors:
Error 1: Rate Limits
Reddit blocked my requests after a few runs. I added caching:
import timefrom functools import lru_cache
@lru_cache(maxsize=100)def cached_reddit_fetch(timestamp: int): # Cache for 5 minutes return asyncio.run(RedditTool().fetch_trending())
def get_reddit_with_cache(): cache_key = int(time.time() // 300) # 5 minute buckets return cached_reddit_fetch(cache_key)Error 2: Missing API Keys
YouTube failed without the API key. I added validation:
def validate_environment(): required_keys = ["YOUTUBE_API_KEY", "ANTHROPIC_API_KEY"] missing = [k for k in required_keys if not os.environ.get(k)] if missing: raise ValueError(f"Missing environment variables: {missing}")Error 3: Network Failures
Some platforms occasionally timed out. I added retries:
import asynciofrom tenacity import retry, stop_after_attempt, wait_exponential
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=2, max=10))async def fetch_with_retry(tool: PlatformTool): return await tool.fetch_trending()These fixes made the agent reliable enough for daily use.
Schedule Automatic Runs
I set up a cron job to run the agent every morning:
# Run at 6 AM daily0 6 * * * cd /path/to/agent && python complete_agent.py >> /var/log/agent.log 2>&1Or use a Python scheduler:
import scheduleimport time
def daily_monitor(): summary = asyncio.run(run_multi_platform_agent()) # Send to Slack or save to file save_summary(summary)
schedule.every().day.at("06:00").do(daily_monitor)
while True: schedule.run_pending() time.sleep(60)Now I wake up to a summary instead of spending two hours manually checking platforms.
What I Learned
Building this agent taught me:
- Unified patterns beat fragmented scrapers
- Parallel execution saves time
- Environment variables prevent deployment failures
- Caching handles rate limits
- Retry logic handles transient failures
The agent runs in 10 seconds. It used to take me two hours manually.
Summary
This post showed how to build an AI agent that monitors Reddit, Twitter, YouTube, Hacker News, and Polymarket simultaneously. The key point is using a unified agent pattern instead of fragmented scrapers.
The architecture uses platform tools that implement a standard interface. The agent fetches all platforms in parallel, cleans the content with de-watermarking, and summarizes everything with an LLM.
I added caching for rate limits, retry logic for failures, and scheduling for daily automation. The result: 10 seconds instead of 2 hours of manual checking.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments