
How to Create an AI Agent That Monitors Websites for Changes

Problem

I spent hours every day checking the same websites:

  • A keyboard forum for ISO/NORDEUK keycap group buys
  • An out-of-stock product page waiting for availability
  • Ticket sites for concert price drops
  • Fuel prices for my daily commute

Each site needed manual checking multiple times per day. I missed a limited keycap group buy that sold out in 2 hours. I paid $50 more for tickets because I checked too late.

I needed a way to automate this monitoring without building separate scripts for each site.

What I tried first

My first attempt was a simple polling script:

first_attempt.py
import requests
import time

while True:
    response = requests.get("https://example-shop.com/product")
    if "out of stock" not in response.text:
        print("IN STOCK!")
    time.sleep(300)  # Check every 5 minutes

This worked for one site. But when I tried to scale to multiple sites, problems appeared:

Problems encountered
1. Every site has different HTML structure
2. Some sites block repeated requests
3. I got alerts for irrelevant changes (navigation updates)
4. No way to track what changed across checks
5. Notification fatigue from too many alerts

I realized I needed a smarter approach - not just detecting changes, but detecting relevant changes.

The solution: AI-powered monitoring

I discovered that other developers were building AI agents for this exact problem. A Reddit thread showed real-world examples:

  • One user monitored 640 forum threads and got 5 relevant results (99.2% noise filtering)
  • Another tracked ticket prices across multiple sites with custom scoring
  • A third automated daily fuel price checks for commute decisions

The pattern was consistent: fetch -> extract -> compare -> filter -> notify.

Core monitoring workflow

AI Agent Workflow
+-----------------+     +------------------+     +------------------+
|   Fetch Pages   | --> |   Extract Data   | --> |  Compare State   |
|   (scheduled)   |     |  (price, stock)  |     |   (hash/diff)    |
+-----------------+     +------------------+     +------------------+
                                                          |
                                                          v
+-----------------+     +------------------+     +------------------+
|   Take Action   | <-- |  Filter Alerts   | <-- |  Detect Change   |
|  (notify/buy)   |     |   (relevance)    |     |  (meaningful?)   |
+-----------------+     +------------------+     +------------------+
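
To make the order of the stages concrete, here is a minimal sketch that wires them together as plain functions. The names `run_pipeline`, `fake_fetch`, `extract_price`, and `price_dropped` are hypothetical and are not part of the monitors built later; the stubs exist only so the example runs without network access.

```python
from typing import Callable, Dict


def run_pipeline(
    url: str,
    fetch: Callable[[str], str],
    extract: Callable[[str], dict],
    is_relevant: Callable[[dict, dict], bool],
    notify: Callable[[str], None],
    state: Dict[str, dict],
) -> bool:
    """Run one fetch -> extract -> compare -> filter -> notify cycle.

    Returns True if a notification was sent.
    """
    raw = fetch(url)                 # 1. fetch
    current = extract(raw)           # 2. extract
    previous = state.get(url)        # 3. compare against the stored state
    state[url] = current
    if previous is None or current == previous:
        return False                 # first run (baseline) or nothing changed
    if not is_relevant(previous, current):
        return False                 # 4. filter out changes we do not care about
    notify(f"{url}: {previous} -> {current}")  # 5. notify
    return True


# Stub stages (hypothetical) standing in for real fetching and parsing.
url = "https://example-shop.com/product"
pages = {url: "price=120"}
sent = []
state: Dict[str, dict] = {}


def fake_fetch(page_url: str) -> str:
    return pages[page_url]


def extract_price(raw: str) -> dict:
    return {"price": float(raw.split("=")[1])}


def price_dropped(previous: dict, current: dict) -> bool:
    return current["price"] < previous["price"]


first = run_pipeline(url, fake_fetch, extract_price, price_dropped, sent.append, state)
pages[url] = "price=99"  # simulate a price drop between two checks
second = run_pipeline(url, fake_fetch, extract_price, price_dropped, sent.append, state)
print(first, second, sent)
```

The first call only records a baseline, so nothing is sent; the second call sees a lower price and notifies.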

Building the monitor

I started with a base class that handles the common logic:

monitor.py
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

import requests
from bs4 import BeautifulSoup


@dataclass
class MonitorResult:
    url: str
    changed: bool
    previous_hash: Optional[str]
    current_hash: str
    extracted_data: dict
    timestamp: datetime


class WebsiteMonitor:
    def __init__(self, url: str, check_interval_minutes: int = 60):
        self.url = url
        self.check_interval = check_interval_minutes
        self.previous_hash: Optional[str] = None

    def fetch_page(self) -> str:
        """Fetch page content with proper headers."""
        headers = {
            'User-Agent': 'Mozilla/5.0 (compatible; AIMonitor/1.0)',
        }
        response = requests.get(self.url, headers=headers, timeout=30)
        response.raise_for_status()
        return response.text

    def extract_relevant_content(self, html: str) -> str:
        """Extract only content you care about."""
        soup = BeautifulSoup(html, 'html.parser')
        # Remove noise: scripts, styles, navigation
        for element in soup(['script', 'style', 'nav', 'header', 'footer']):
            element.decompose()
        return soup.get_text(separator=' ', strip=True)

    def compute_hash(self, content: str) -> str:
        """Generate hash for change detection."""
        return hashlib.sha256(content.encode()).hexdigest()

    def check(self) -> MonitorResult:
        """Perform a single check."""
        html = self.fetch_page()
        content = self.extract_relevant_content(html)
        current_hash = self.compute_hash(content)
        # The first check only records a baseline, so it never reports a change
        changed = (
            self.previous_hash is not None and
            current_hash != self.previous_hash
        )
        result = MonitorResult(
            url=self.url,
            changed=changed,
            previous_hash=self.previous_hash,
            current_hash=current_hash,
            extracted_data={'content_preview': content[:500]},
            timestamp=datetime.now()
        )
        self.previous_hash = current_hash
        return result

This base class handles:

  • Fetching with proper headers (avoids some blocking)
  • Extracting relevant content (filters noise)
  • Hash-based change detection
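
One limitation worth noting: `WebsiteMonitor` keeps `previous_hash` in memory only, so a restart loses the baseline. A minimal sketch of one way to persist it, assuming a small JSON file is acceptable storage. The `HashStore` class and the `hashes.json` file name are hypothetical and not part of the code above.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


class HashStore:
    """Persist the last seen content hash per URL in a JSON file (hypothetical helper)."""

    def __init__(self, path: Path):
        self.path = path
        self._hashes: Dict[str, str] = {}
        if path.exists():
            self._hashes = json.loads(path.read_text(encoding="utf-8"))

    def get(self, url: str) -> Optional[str]:
        return self._hashes.get(url)

    def update(self, url: str, content: str) -> bool:
        """Store the hash of `content`; return True if it differs from a known previous hash."""
        current = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._hashes.get(url)
        self._hashes[url] = current
        self.path.write_text(json.dumps(self._hashes, indent=2), encoding="utf-8")
        return previous is not None and previous != current


# Usage: the second store simulates a restart and still detects the change.
with tempfile.TemporaryDirectory() as tmp:
    store_path = Path(tmp) / "hashes.json"
    store = HashStore(store_path)
    baseline = store.update("https://example-shop.com/product", "Out of stock")
    restarted = HashStore(store_path)
    changed = restarted.update("https://example-shop.com/product", "In stock")
    print(baseline, changed)
```

The same first-check rule as in `check()` applies: an unknown URL only records a baseline and never reports a change.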

Specialized monitors

Different monitoring needs require different extraction logic.

Forum thread monitor

I needed to filter 640+ threads to find relevant ones:

forum_monitor.py
from dataclasses import dataclass
from typing import List, Tuple
from urllib.parse import urljoin

from bs4 import BeautifulSoup

from monitor import WebsiteMonitor


@dataclass
class ThreadMatch:
    title: str
    url: str
    score: float
    matched_keywords: List[str]


class ForumMonitor(WebsiteMonitor):
    """Monitor forums for specific keywords in thread titles."""

    def __init__(self, url: str, keywords: List[str], min_score: float = 1.0):
        super().__init__(url)
        self.keywords = [kw.lower() for kw in keywords]
        self.min_score = min_score
        self.seen_threads: set = set()

    def extract_threads(self, html: str) -> List[dict]:
        """Extract thread titles and URLs from forum page."""
        soup = BeautifulSoup(html, 'html.parser')
        threads = []
        # Assumes each '.thread-title' element is a link; adjust the selector per forum
        for thread_elem in soup.select('.thread-title'):
            title = thread_elem.get_text(strip=True)
            url = thread_elem.get('href', '')
            if url:
                # Resolve relative links against the forum page URL
                url = urljoin(self.url, url)
            threads.append({'title': title, 'url': url})
        return threads

    def score_thread(self, title: str) -> Tuple[int, List[str]]:
        """Score thread relevance based on keywords."""
        title_lower = title.lower()
        matched = [kw for kw in self.keywords if kw in title_lower]
        score = len(matched)
        # Bonus for combined criteria
        if 'iso' in title_lower and 'nordeuk' in title_lower:
            score += 2
        return score, matched

    def check(self) -> List[ThreadMatch]:
        """Check for new matching threads."""
        html = self.fetch_page()
        threads = self.extract_threads(html)
        matches = []
        for thread in threads:
            if thread['url'] in self.seen_threads:
                continue
            score, matched_keywords = self.score_thread(thread['title'])
            if score >= self.min_score:
                matches.append(ThreadMatch(
                    title=thread['title'],
                    url=thread['url'],
                    score=score,
                    matched_keywords=matched_keywords
                ))
            # Remember every thread so it is scored only once
            self.seen_threads.add(thread['url'])
        return matches

Usage:

Usage example
# Monitor for specific keycap types
monitor = ForumMonitor(
    url="https://geekhack.org/index.php?board=70.0",
    keywords=["iso", "nordeuk", "keycap", "gmk"],
    min_score=2.0  # Require at least 2 keyword matches
)
matches = monitor.check()
for match in matches:
    print(f"Score {match.score}: {match.title}")
    print(f"  Keywords: {match.matched_keywords}")
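
One caveat with the substring test in `score_thread`: `kw in title_lower` also matches inside longer words, so "iso" hits a title containing "comparison". A minimal sketch of whole-word matching with regular expressions follows. The function name `match_keywords` and the sample titles are made up for illustration; this is one possible refinement, not the scoring used above.

```python
import re
from typing import List


def match_keywords(title: str, keywords: List[str]) -> List[str]:
    """Return the keywords that appear in `title` as whole words (case-insensitive)."""
    matched = []
    for kw in keywords:
        # \b anchors the keyword at word boundaries; re.escape guards special characters
        if re.search(rf"\b{re.escape(kw)}\b", title, flags=re.IGNORECASE):
            matched.append(kw)
    return matched


keywords = ["iso", "nordeuk", "keycap", "gmk"]
print(match_keywords("[GB] GMK Example set with ISO and NorDeUK kits", keywords))
# ['iso', 'nordeuk', 'gmk']
print(match_keywords("Switch comparison thread", keywords))
# []
```

The trade-off is that plurals no longer match: "keycaps" will not count as "keycap" unless you add it to the keyword list.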

Product price/stock monitor

For tracking products:

product_monitor.py
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

from bs4 import BeautifulSoup

from monitor import WebsiteMonitor


@dataclass
class ProductState:
    url: str
    name: str
    price: Optional[float]
    in_stock: bool
    timestamp: datetime


class ProductMonitor(WebsiteMonitor):
    """Monitor product pages for price and availability changes."""

    def __init__(self, url: str, target_price: Optional[float] = None):
        super().__init__(url)
        self.target_price = target_price
        self.previous_state: Optional[ProductState] = None

    def extract_product_info(self, html: str) -> dict:
        """Extract product details - customize for specific site."""
        soup = BeautifulSoup(html, 'html.parser')
        # Adjust selectors per site
        name_elem = soup.select_one('.product-title')
        price_elem = soup.select_one('.price')
        stock_elem = soup.select_one('.stock-status')
        name = name_elem.get_text(strip=True) if name_elem else 'Unknown'
        price = None
        if price_elem is not None:
            # Drop thousands separators, then take the first number
            price_text = price_elem.get_text(strip=True).replace(',', '')
            price_match = re.search(r'\d+(?:\.\d+)?', price_text)
            if price_match:
                price = float(price_match.group())
        # A missing stock element counts as not in stock; always yields a bool
        in_stock = (
            stock_elem is not None
            and 'out of stock' not in stock_elem.get_text().lower()
        )
        return {'name': name, 'price': price, 'in_stock': in_stock}

    def check(self) -> dict:
        """Check for price/stock changes."""
        html = self.fetch_page()
        info = self.extract_product_info(html)
        current_state = ProductState(
            url=self.url,
            name=info['name'],
            price=info['price'],
            in_stock=info['in_stock'],
            timestamp=datetime.now()
        )
        alerts = []
        if self.previous_state:
            # Stock change detection
            if not self.previous_state.in_stock and current_state.in_stock:
                alerts.append('BACK_IN_STOCK')
            # Price drop detection
            if (current_state.price is not None
                    and self.previous_state.price is not None
                    and current_state.price < self.previous_state.price):
                alerts.append(
                    f'PRICE_DROP: {self.previous_state.price} -> {current_state.price}'
                )
        # Target price reached (checked on every run, including the first)
        if self.target_price is not None and current_state.price is not None:
            if current_state.price <= self.target_price:
                alerts.append(f'TARGET_PRICE_REACHED: {current_state.price}')
        self.previous_state = current_state
        return {'state': current_state, 'alerts': alerts}

Notification integration

Alerts are useless if you don’t see them:

notifications.py
from typing import List

import requests

from forum_monitor import ThreadMatch
from product_monitor import ProductState


def send_notification(message: str, webhook_url: str) -> None:
    """Send notification via webhook (Slack, Discord, etc.)."""
    # Slack incoming webhooks read the 'text' field; Discord expects 'content' instead
    payload = {'text': message}
    response = requests.post(webhook_url, json=payload, timeout=10)
    response.raise_for_status()  # Surface failed deliveries instead of ignoring them


def send_thread_alert(match: ThreadMatch, webhook_url: str) -> None:
    """Send forum thread alert."""
    message = (
        f"New match found!\n"
        f"Title: {match.title}\n"
        f"Score: {match.score}\n"
        f"Keywords: {', '.join(match.matched_keywords)}\n"
        f"URL: {match.url}"
    )
    send_notification(message, webhook_url)


def send_product_alert(state: ProductState, alerts: List[str], webhook_url: str) -> None:
    """Send product availability/price alert."""
    alert_text = '\n'.join(f"- {a}" for a in alerts)
    message = (
        f"Product Alert: {state.name}\n"
        f"{alert_text}\n"
        f"Price: {state.price}\n"
        f"URL: {state.url}"
    )
    send_notification(message, webhook_url)

Lessons learned

Over the first few weeks of running this, I made several mistakes:

Mistake 1: Checking too frequently

What I did wrong
Check interval: 30 seconds
Result: IP blocked after 2 hours

Fix:

Proper intervals
# Respect rate limits (values are in seconds)
CHECK_INTERVALS = {
    'forum': 300,       # 5 minutes
    'product': 600,     # 10 minutes
    'price_api': 3600,  # 1 hour for APIs
}
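
A minimal sketch of how these intervals could drive a polling loop. The random jitter is my own assumption (it avoids hitting a site on an exact fixed period) and was not part of the original setup; `next_delay` and `run_forever` are hypothetical names.

```python
import random
import time
from typing import Callable, Dict, Optional

CHECK_INTERVALS = {"forum": 300, "product": 600, "price_api": 3600}  # seconds


def next_delay(kind: str, jitter: float = 0.1, rng: Optional[random.Random] = None) -> float:
    """Return the base interval for `kind`, varied by up to +/- `jitter` (a fraction)."""
    source = rng if rng is not None else random
    base = CHECK_INTERVALS[kind]
    return base * (1 + source.uniform(-jitter, jitter))


def run_forever(jobs: Dict[str, Callable[[], None]]) -> None:
    """Call each job when it is due. `jobs` maps an interval kind to a zero-argument check."""
    due = {kind: float("-inf") for kind in jobs}  # everything is due immediately
    while True:
        now = time.monotonic()
        for kind, job in jobs.items():
            if now >= due[kind]:
                job()
                due[kind] = now + next_delay(kind)
        time.sleep(1)  # coarse tick; the checks themselves are minutes apart


print(round(next_delay("forum", jitter=0.0)))  # 300
```

`run_forever` is not called here because it never returns; in practice you would pass it something like `{"forum": forum_job, "product": product_job}`.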

Mistake 2: Alerting on every change

Notification fatigue
Day 1: 47 alerts (most were navigation changes)
Day 2: I muted the channel
Day 3: Missed the actual important change

Fix: Filter by relevance score before alerting.
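
A minimal sketch of such a filter, assuming a score threshold plus a per-item cooldown so the same alert is not repeated. The `AlertFilter` class, its default values, and the cooldown idea are illustrative assumptions rather than the exact filter I ran.

```python
import time
from typing import Dict, Optional


class AlertFilter:
    """Let an alert through only if it scores high enough and was not sent recently."""

    def __init__(self, min_score: float = 2.0, cooldown_seconds: float = 3600):
        self.min_score = min_score
        self.cooldown_seconds = cooldown_seconds
        self._last_sent: Dict[str, float] = {}

    def should_send(self, key: str, score: float, now: Optional[float] = None) -> bool:
        """`key` identifies the alert (for example a thread URL); `now` is injectable for tests."""
        now = time.monotonic() if now is None else now
        if score < self.min_score:
            return False  # not relevant enough
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # the same alert went out too recently
        self._last_sent[key] = now
        return True


alerts = AlertFilter(min_score=2.0, cooldown_seconds=3600)
print(alerts.should_send("thread-123", score=1.0, now=0))     # False: below threshold
print(alerts.should_send("thread-123", score=3.0, now=0))     # True: sent
print(alerts.should_send("thread-123", score=3.0, now=60))    # False: cooling down
print(alerts.should_send("thread-123", score=3.0, now=4000))  # True: cooldown elapsed
```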

Mistake 3: Brittle selectors

What broke
Site updated their HTML
Selector ".product-price" no longer exists
Monitor silently fails

Fix: Add multiple fallback selectors and logging:

Robust extraction
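import logging

logger = logging.getLogger(__name__)


# Method on ProductMonitor; the parse_price helper is not shown in this post
def extract_price(self, soup):
    # Try selectors in order and stop at the first one that matches
    selectors = ['.price', '.product-price', '[data-price]', '.amount']
    for selector in selectors:
        elem = soup.select_one(selector)
        if elem:
            return self.parse_price(elem.get_text())
    logger.warning("No price found with any selector for %s", self.url)
    return None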
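
The `parse_price` helper called above is not shown. A minimal sketch of what it might look like, written as a standalone function and assuming US-style formatting where "," groups thousands and "." marks decimals. Under that assumption a European price such as "12,50" would be misread as 1250, so adjust it for your sites.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Parse the first number in `text`, assuming ',' groups thousands and '.' marks decimals."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None  # no digits at all, e.g. "Sold out"
    return float(match.group().replace(",", ""))


print(parse_price("$1,299.99"))  # 1299.99
print(parse_price("Price: 49"))  # 49.0
print(parse_price("Sold out"))   # None
```

Returning `None` instead of raising keeps the caller's fallback logic simple: a missing price can be logged and skipped.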

Results

After implementing these fixes:

Before vs After
Manual checking:
- 640 forum threads manually reviewed: ~3 hours
- Product pages checked 5x/day: ~30 minutes
- Total daily effort: ~3.5 hours
AI Monitoring:
- 640 threads analyzed: 5 relevant results (2 minutes to review)
- Product pages auto-checked: 0 manual time
- Total daily effort: ~5 minutes
Time saved: 3+ hours/day
Missed opportunities: 0 (was missing ~2/week before)

Summary

In this post, I showed how to build an AI agent that monitors websites for changes. The key insight is that monitoring is not about detecting any change - it’s about detecting relevant changes.

The pattern works across use cases:

  1. Fetch pages at reasonable intervals
  2. Extract only relevant data (price, stock, keywords)
  3. Compare against previous state
  4. Filter for meaningful changes
  5. Notify only when criteria are met

Start with one high-value target (that out-of-stock item you’re waiting for), prove the value, then expand. The modular design means your forum monitor, price tracker, and inventory checker all share the same core infrastructure.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

