How to Create an AI Agent That Monitors Websites for Changes
Problem
I spent hours every day checking the same websites:
- A keyboard forum for ISO/NORDEUK keycap group buys
- An out-of-stock product page waiting for availability
- Ticket sites for concert price drops
- Fuel prices for my daily commute
Each site needed manual checking multiple times per day. I missed a limited keycap group buy that sold out in 2 hours. I paid $50 more for tickets because I checked too late.
I needed a way to automate this monitoring without building separate scripts for each site.
What I tried first
My first attempt was a simple polling script:
import time

import requests

while True:
    response = requests.get("https://example-shop.com/product")
    if "out of stock" not in response.text:
        print("IN STOCK!")
    time.sleep(300)  # Check every 5 minutes

This worked for one site. But when I tried to scale to multiple sites, problems appeared:
1. Every site has different HTML structure
2. Some sites block repeated requests
3. I got alerts for irrelevant changes (navigation updates)
4. No way to track what changed across checks
5. Notification fatigue from too many alerts
I realized I needed a smarter approach - not just detecting changes, but detecting relevant changes.
The solution: AI-powered monitoring
I discovered that other developers were building AI agents for this exact problem. A Reddit thread showed real-world examples:
- One user monitored 640 forum threads and got 5 relevant results (99.2% noise filtering)
- Another tracked ticket prices across multiple sites with custom scoring
- A third automated daily fuel price checks for commute decisions
The pattern was consistent: fetch -> extract -> compare -> filter -> notify.
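The five stages can be wired together as plain functions. What follows is a minimal sketch, not code from the Reddit thread: `run_pipeline`, the callback names, the `State` dictionary, and the in-memory `pages` "site" are all hypothetical and exist only so the example runs without network access.

```python
import hashlib
from typing import Callable, Dict, List

State = Dict[str, str]  # url -> hash of the last extracted content


def run_pipeline(
    urls: List[str],
    fetch: Callable[[str], str],
    extract: Callable[[str], str],
    is_relevant: Callable[[str], bool],
    notify: Callable[[str, str], None],
    state: State,
) -> None:
    """Run fetch -> extract -> compare -> filter -> notify once per URL."""
    for url in urls:
        content = extract(fetch(url))  # fetch, then extract
        digest = hashlib.sha256(content.encode()).hexdigest()
        previous = state.get(url)
        state[url] = digest
        if previous is None or previous == digest:  # compare
            continue  # first sighting or unchanged: nothing to report
        if is_relevant(content):  # filter
            notify(url, content)  # notify


# Hypothetical in-memory "site" standing in for real HTTP requests.
pages = {"https://example.com/item": "Status: out of stock"}
sent: List[str] = []
state: State = {}


def check_once() -> None:
    run_pipeline(
        urls=list(pages),
        fetch=pages.__getitem__,
        extract=str.strip,
        is_relevant=lambda text: "in stock" in text.lower(),
        notify=lambda url, text: sent.append(f"{url}: {text}"),
        state=state,
    )


check_once()  # first run only records a baseline
pages["https://example.com/item"] = "Status: in stock"
check_once()  # content changed and is relevant, so one alert is sent
```

Passing each stage in as a callable keeps site-specific logic (selectors, scoring rules) out of the loop, so the same loop can serve every target.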
Core monitoring workflow
+-----------------+     +------------------+     +------------------+
| Fetch Pages     | --> | Extract Data     | --> | Compare State    |
| (scheduled)     |     | (price, stock)   |     | (hash/diff)      |
+-----------------+     +------------------+     +------------------+
                                                           |
                                                           v
+-----------------+     +------------------+     +------------------+
| Take Action     | <-- | Filter Alerts    | <-- | Detect Change    |
| (notify/buy)    |     | (relevance)      |     | (meaningful?)    |
+-----------------+     +------------------+     +------------------+
Building the monitor
I started with a base class that handles the common logic:
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

import requests
from bs4 import BeautifulSoup


@dataclass
class MonitorResult:
    url: str
    changed: bool
    previous_hash: Optional[str]
    current_hash: str
    extracted_data: dict
    timestamp: datetime


class WebsiteMonitor:
    def __init__(self, url: str, check_interval_minutes: int = 60):
        self.url = url
        self.check_interval = check_interval_minutes
        self.previous_hash: Optional[str] = None

    def fetch_page(self) -> str:
        """Fetch page content with proper headers."""
        headers = {
            'User-Agent': 'Mozilla/5.0 (compatible; AIMonitor/1.0)',
        }
        response = requests.get(self.url, headers=headers, timeout=30)
        response.raise_for_status()
        return response.text

    def extract_relevant_content(self, html: str) -> str:
        """Extract only content you care about."""
        soup = BeautifulSoup(html, 'html.parser')

        # Remove noise: scripts, styles, navigation
        for element in soup(['script', 'style', 'nav', 'header', 'footer']):
            element.decompose()

        return soup.get_text(separator=' ', strip=True)

    def compute_hash(self, content: str) -> str:
        """Generate hash for change detection."""
        return hashlib.sha256(content.encode()).hexdigest()

    def check(self) -> MonitorResult:
        """Perform a single check."""
        html = self.fetch_page()
        content = self.extract_relevant_content(html)
        current_hash = self.compute_hash(content)

        # The first check only records a baseline, so it never reports a change
        changed = (
            self.previous_hash is not None
            and current_hash != self.previous_hash
        )

        result = MonitorResult(
            url=self.url,
            changed=changed,
            previous_hash=self.previous_hash,
            current_hash=current_hash,
            extracted_data={'content_preview': content[:500]},
            timestamp=datetime.now(),
        )

        self.previous_hash = current_hash
        return result

This base class handles:
- Fetching with proper headers (avoids some blocking)
- Extracting relevant content (filters noise)
- Hash-based change detection
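A hash only says that something changed, not what. One way to address the earlier "no way to track what changed" problem is to keep the previous text snapshot next to its hash and diff the two with the standard library's `difflib`. This is a minimal sketch under assumptions: `changed_lines` is a hypothetical helper, and it assumes snapshots are line-oriented (for example, extracted with `separator='\n'` rather than `' '`).

```python
import difflib
from typing import List


def changed_lines(previous: str, current: str) -> List[str]:
    """Return lines removed ('- ') or added ('+ ') between two snapshots."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff also emits '  ' (unchanged) and '? ' (hint) lines; drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical snapshots of the same product page from two checks.
before = "Keyboard X\nPrice: $120\nOut of stock"
after = "Keyboard X\nPrice: $99\nIn stock"

changes = changed_lines(before, after)
for line in changes:
    print(line)
```

The diff lines can go straight into the notification text, so an alert shows the old and new price instead of only reporting that the page changed.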
Specialized monitors
Different monitoring needs require different extraction logic.
Forum thread monitor
I needed to filter 640+ threads to find relevant ones:
from dataclasses import dataclass
from typing import List
from urllib.parse import urljoin

from bs4 import BeautifulSoup


@dataclass
class ThreadMatch:
    title: str
    url: str
    score: float
    matched_keywords: List[str]


class ForumMonitor(WebsiteMonitor):
    """Monitor forums for specific keywords in thread titles."""

    def __init__(self, url: str, keywords: List[str], min_score: float = 1.0):
        super().__init__(url)
        self.keywords = [kw.lower() for kw in keywords]
        self.min_score = min_score
        self.seen_threads: set = set()

    def extract_threads(self, html: str) -> List[dict]:
        """Extract thread titles and URLs from forum page."""
        soup = BeautifulSoup(html, 'html.parser')
        threads = []

        # '.thread-title' is a placeholder selector; adjust it per forum
        for thread_elem in soup.select('.thread-title'):
            title = thread_elem.get_text(strip=True)
            url = thread_elem.get('href', '')
            if url:
                # urljoin resolves relative links against the board URL and
                # leaves absolute links untouched (plain string concatenation
                # breaks when the board URL contains a query string)
                url = urljoin(self.url, url)
            threads.append({'title': title, 'url': url})

        return threads

    def score_thread(self, title: str) -> tuple:
        """Score thread relevance based on keywords."""
        title_lower = title.lower()
        # Substring match: 'iso' also matches words such as 'isolation'
        matched = [kw for kw in self.keywords if kw in title_lower]

        score = len(matched)
        # Bonus for combined criteria
        if 'iso' in title_lower and 'nordeuk' in title_lower:
            score += 2

        return score, matched

    def check(self) -> List[ThreadMatch]:
        """Check for new matching threads."""
        html = self.fetch_page()
        threads = self.extract_threads(html)

        matches = []
        for thread in threads:
            if thread['url'] in self.seen_threads:
                continue

            score, matched_keywords = self.score_thread(thread['title'])

            if score >= self.min_score:
                matches.append(ThreadMatch(
                    title=thread['title'],
                    url=thread['url'],
                    score=score,
                    matched_keywords=matched_keywords,
                ))

            self.seen_threads.add(thread['url'])

        return matches

Usage:
# Monitor for specific keycap types
monitor = ForumMonitor(
    url="https://geekhack.org/index.php?board=70.0",
    keywords=["iso", "nordeuk", "keycap", "gmk"],
    min_score=2.0,  # Require at least 2 keyword matches
)

matches = monitor.check()
for match in matches:
    print(f"Score {match.score}: {match.title}")
    print(f"  Keywords: {match.matched_keywords}")

Product price/stock monitor
For tracking products:
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

from bs4 import BeautifulSoup


@dataclass
class ProductState:
    url: str
    name: str
    price: Optional[float]
    in_stock: bool
    timestamp: datetime


class ProductMonitor(WebsiteMonitor):
    """Monitor product pages for price and availability changes."""

    def __init__(self, url: str, target_price: Optional[float] = None):
        super().__init__(url)
        self.target_price = target_price
        self.previous_state: Optional[ProductState] = None

    def extract_product_info(self, html: str) -> dict:
        """Extract product details - customize for specific site."""
        soup = BeautifulSoup(html, 'html.parser')

        # Adjust selectors per site
        name_elem = soup.select_one('.product-title')
        price_elem = soup.select_one('.price')
        stock_elem = soup.select_one('.stock-status')

        name = name_elem.get_text(strip=True) if name_elem else 'Unknown'

        price = None
        if price_elem:
            # Strip thousands separators, e.g. "$1,299.00" -> "$1299.00"
            price_text = price_elem.get_text(strip=True).replace(',', '')
            price_match = re.search(r'\d+(?:\.\d+)?', price_text)
            if price_match:
                price = float(price_match.group())

        # Always a bool: a missing stock element counts as not in stock
        in_stock = (
            stock_elem is not None
            and 'out of stock' not in stock_elem.get_text().lower()
        )

        return {'name': name, 'price': price, 'in_stock': in_stock}

    def check(self) -> dict:
        """Check for price/stock changes."""
        html = self.fetch_page()
        info = self.extract_product_info(html)

        current_state = ProductState(
            url=self.url,
            name=info['name'],
            price=info['price'],
            in_stock=info['in_stock'],
            timestamp=datetime.now(),
        )

        alerts = []

        if self.previous_state:
            # Stock change detection
            if not self.previous_state.in_stock and current_state.in_stock:
                alerts.append('BACK_IN_STOCK')

            # Price drop detection
            if (current_state.price is not None
                    and self.previous_state.price is not None
                    and current_state.price < self.previous_state.price):
                alerts.append(
                    f'PRICE_DROP: {self.previous_state.price} -> {current_state.price}'
                )

        # Target price reached
        if self.target_price is not None and current_state.price is not None:
            if current_state.price <= self.target_price:
                alerts.append(f'TARGET_PRICE_REACHED: {current_state.price}')

        self.previous_state = current_state
        return {'state': current_state, 'alerts': alerts}

Notification integration
Alerts are useless if you don’t see them:
from typing import List

import requests


def send_notification(message: str, webhook_url: str):
    """Send notification via webhook (Slack, Discord, etc.)."""
    # Slack incoming webhooks read the 'text' field;
    # Discord webhooks expect 'content' instead
    payload = {'text': message}
    response = requests.post(webhook_url, json=payload, timeout=10)
    response.raise_for_status()  # Surface failed deliveries instead of ignoring them


def send_thread_alert(match: ThreadMatch, webhook_url: str):
    """Send forum thread alert."""
    message = (
        f"New match found!\n"
        f"Title: {match.title}\n"
        f"Score: {match.score}\n"
        f"Keywords: {', '.join(match.matched_keywords)}\n"
        f"URL: {match.url}"
    )
    send_notification(message, webhook_url)


def send_product_alert(state: ProductState, alerts: List[str], webhook_url: str):
    """Send product availability/price alert."""
    alert_text = '\n'.join(f"- {a}" for a in alerts)
    message = (
        f"Product Alert: {state.name}\n"
        f"{alert_text}\n"
        f"Price: {state.price}\n"
        f"URL: {state.url}"
    )
    send_notification(message, webhook_url)

Lessons learned
Running this for a few weeks exposed several mistakes I had made:
Mistake 1: Checking too frequently
Check interval: 30 seconds
Result: IP blocked after 2 hours
Fix:
# Respect rate limits (intervals in seconds)
CHECK_INTERVALS = {
    'forum': 300,       # 5 minutes
    'product': 600,     # 10 minutes
    'price_api': 3600,  # 1 hour for APIs
}

Mistake 2: Alerting on every change
Day 1: 47 alerts (most were navigation changes)
Day 2: I muted the channel
Day 3: Missed the actual important change
Fix: Filter by relevance score before alerting.
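One possible shape for that filter combines a score threshold with a per-item cooldown, so the same thread or product cannot trigger repeated alerts. This is a minimal sketch under assumptions: `keyword_score`, `AlertFilter`, and the cooldown idea are hypothetical additions, not part of the monitors above. The scorer matches whole words only, which avoids counting "iso" inside "isolation".

```python
import re
import time
from typing import Callable, Dict, List


def keyword_score(title: str, keywords: List[str]) -> int:
    """Count keywords that appear as whole words (case-insensitive)."""
    return sum(
        1 for kw in keywords
        if re.search(rf"\b{re.escape(kw)}\b", title, flags=re.IGNORECASE)
    )


class AlertFilter:
    """Allow an alert only if it scores high enough and is not a recent repeat."""

    def __init__(self, min_score: int, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.min_score = min_score
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the filter can be tested without waiting
        self._last_sent: Dict[str, float] = {}

    def should_alert(self, key: str, score: int) -> bool:
        if score < self.min_score:
            return False
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # same item alerted recently: stay quiet
        self._last_sent[key] = now
        return True


keywords = ["iso", "gmk", "nordeuk"]
alert_filter = AlertFilter(min_score=2, cooldown_seconds=3600)

title = "[GB] GMK ISO kit"
if alert_filter.should_alert("thread-123", keyword_score(title, keywords)):
    print(f"ALERT: {title}")
```

The `key` can be a thread URL or product URL; anything that identifies the item being alerted on works.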
Mistake 3: Brittle selectors
Site updated their HTML
Selector ".product-price" no longer exists
Monitor silently fails
Fix: Add multiple fallback selectors and logging:
import logging

logger = logging.getLogger(__name__)

# Method of ProductMonitor; assumes a parse_price helper that turns text into a float
def extract_price(self, soup):
    selectors = ['.price', '.product-price', '[data-price]', '.amount']
    for selector in selectors:
        elem = soup.select_one(selector)
        if elem:
            return self.parse_price(elem.get_text())
    logger.warning(f"No price found with any selector for {self.url}")
    return None

Results
After implementing these fixes:
Manual checking:
- 640 forum threads manually reviewed: ~3 hours
- Product pages checked 5x/day: ~30 minutes
- Total daily effort: ~3.5 hours

AI Monitoring:
- 640 threads analyzed: 5 relevant results (2 minutes to review)
- Product pages auto-checked: 0 manual time
- Total daily effort: ~5 minutes

Time saved: 3+ hours/day
Missed opportunities: 0 (was missing ~2/week before)
Summary
In this post, I showed how to build an AI agent that monitors websites for changes. The key insight is that monitoring is not about detecting any change - it’s about detecting relevant changes.
The pattern works across use cases:
- Fetch pages at reasonable intervals
- Extract only relevant data (price, stock, keywords)
- Compare against previous state
- Filter for meaningful changes
- Notify only when criteria are met
Start with one high-value target (that out-of-stock item you’re waiting for), prove the value, then expand. The modular design means your forum monitor, price tracker, and inventory checker all share the same core infrastructure.
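To run several monitors from one process, each at its own interval, a small priority queue keyed on the next due time is enough. This is a minimal sketch under assumptions: `run_scheduler` and the two stand-in check functions are hypothetical, and the demo uses a fake clock so it finishes instantly. In real use you would pass bound methods such as a monitor's `check` and leave `clock` and `sleep` at their defaults.

```python
import heapq
import time
from typing import Callable, List, Tuple


def run_scheduler(
    jobs: List[Tuple[float, Callable[[], None]]],
    run_for_seconds: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run each (interval_seconds, check) job repeatedly until the deadline."""
    start = clock()
    deadline = start + run_for_seconds
    # Heap entries: (due_time, job_index, interval, check).
    # The unique job_index breaks ties so callables are never compared.
    queue = [(start, i, interval, check)
             for i, (interval, check) in enumerate(jobs)]
    heapq.heapify(queue)
    while queue:
        due, index, interval, check = heapq.heappop(queue)
        if due > deadline:
            break
        wait = due - clock()
        if wait > 0:
            sleep(wait)
        try:
            check()
        except Exception as exc:  # one failing site must not stop the others
            print(f"job {index} failed: {exc}")
        heapq.heappush(queue, (due + interval, index, interval, check))


# Demo with a fake clock: "sleeping" just advances the simulated time.
now = [0.0]
calls: List[str] = []


def fake_sleep(seconds: float) -> None:
    now[0] += seconds


run_scheduler(
    jobs=[
        (300, lambda: calls.append("forum")),    # every 5 minutes
        (600, lambda: calls.append("product")),  # every 10 minutes
    ],
    run_for_seconds=1200,
    clock=lambda: now[0],
    sleep=fake_sleep,
)
print(calls)
```

Catching exceptions per job matters here: a single site that starts returning errors should show up in the log without halting checks for every other target.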
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Reddit: Most useful real-world automation with OpenClaw
- 👨💻 Beautiful Soup Documentation
- 👨💻 Python Requests Library
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!