How Can AI Agents Handle Website Logins and Dynamic Interactions?
I spent two hours last week debugging why my AI agent kept failing to interact with a dynamic dropdown on a logged-in page. The selector worked in my browser, but when the agent tried it—nothing. The element wasn’t there yet, or it was in an iframe, or the page structure had changed since I wrote the code.
This is the core problem with building AI agents that need to interact with websites: traditional web scraping tools break on modern applications. They can’t handle authentication flows, dynamic content, or the interactive elements that define today’s web.
Here’s what I learned about building AI agents that can actually interact with websites reliably.
The Problem with Traditional Approaches
I started with Playwright, thinking it would be straightforward. Log in, navigate, extract data. But then reality hit:
    from playwright.sync_api import sync_playwright

    def scrape_dashboard():
        with sync_playwright() as p:
            browser = p.chromium.launch()
            page = browser.new_page()

            # This works... until it doesn't
            page.goto("https://example.com/login")
            page.fill('input[name="password"]', "my-password")
            page.click('button[type="submit"]')

            # Wait for dashboard - but how long?
            page.wait_for_selector('.dashboard-content', timeout=10000)

            # Extract data - hope the selector is still valid
            data = page.query_selector_all('.data-item')

            return data

Every run was a gamble. Sometimes the login succeeded, sometimes it timed out. Dynamic dropdowns were a nightmare: I’d click them and wait for options to appear, but the timing was never consistent.
The real issues:
- Authentication flows: OAuth, SSO, MFA, CAPTCHAs—traditional tools weren’t built for these
- Dynamic content: Modern SPAs load data asynchronously; your scraper sees empty pages
- Rate limiting and bot detection: Websites throttle or block traffic that looks automated
- UI fragility: A minor design update changes a selector, and everything breaks
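The "but how long?" question in the snippet above usually comes down to polling a condition against a deadline instead of sleeping for a fixed time. Here is a minimal, library-agnostic sketch of that idea. The `wait_until` helper, its parameters, and the `FakeDropdown` stand-in are my own illustration and are not part of Playwright:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> T:
    """Call `probe` repeatedly until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


class FakeDropdown:
    """Stand-in for a page whose dropdown options only appear on the third look."""

    def __init__(self) -> None:
        self.polls = 0

    def options(self) -> list:
        self.polls += 1
        return ["Option A", "Option B"] if self.polls >= 3 else []


dropdown = FakeDropdown()
options = wait_until(dropdown.options, timeout=2.0, interval=0.01)
print(options)
```

The point of the design is that the wait ends as soon as the content exists, and a missing element fails with a clear timeout error instead of an empty result.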
Solution 1: Web Agent Infrastructure with AgentQL
I discovered AgentQL through a Reddit thread about MCP servers. It’s purpose-built infrastructure for AI agents to interact with websites.
Instead of targeting specific DOM elements, you describe what you want in natural language:
    import agentql

    session = agentql.start_session("https://example.com/login")

    # AgentQL handles the authentication flow
    page = session.page
    page.fill('input[type="password"]', "my-password")
    page.click('button:has-text("Sign in")')

    # Query using natural language instead of brittle selectors
    query = """
    {
        dashboard_data[] {
            title
            value
            trend
        }
    }
    """

    result = session.query(query)
    print(result.data)

The key insight from ninadpathak on Reddit: add proper headers and wait parameters to avoid flakiness:
    from agentql.sync_api import Session

    # Proper configuration prevents rate limit issues
    session = Session(
        url="https://target-site.com",
        user_agent="Mozilla/5.0 (compatible; AgentQL)",
        wait_for=5000  # 5 seconds for dynamic content
    )

    # AgentQL manages authentication state automatically
    # No need to handle cookies, tokens, or session persistence
    result = session.query("""
    {
        authenticated_content {
            user_name
            dashboard_metrics[] {
                name
                value
            }
        }
    }
    """)

AgentQL handles:
- Browser session management
- Authentication state persistence
- Dynamic content loading with intelligent waiting
- Retry logic and error recovery
Solution 2: MCP Server for Authenticated API Calls
For applications like Slack, Jira, and Datadog, there’s an even better approach. Instead of automating the UI at all, build an MCP (Model Context Protocol) server that calls internal APIs directly through the browser’s authenticated session.
opentabs-dev shared this insight: “Instead of DOM-based automation for things like Slack, Jira, and Datadog, I built an MCP server that calls the app’s internal APIs through the browser’s authenticated session. The agent gets structured tools like slack_send_message instead of trying to click around.”
Here’s how I implemented this pattern:
    import { Server } from "@modelcontextprotocol/sdk/server/index.js";
    import {
      ListToolsRequestSchema,
      CallToolRequestSchema,
    } from "@modelcontextprotocol/sdk/types.js";
    import { extractSessionFromBrowser } from "./browser-session.js";

    const server = new Server({
      name: "internal-api-mcp",
      version: "1.0.0",
    }, {
      capabilities: { tools: {} }
    });

    // Extract authenticated session from browser
    const session = await extractSessionFromBrowser("chrome", "slack.com");

    // Define tool that uses internal API
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [{
        name: "slack_send_message",
        description: "Send a message to a Slack channel",
        inputSchema: {
          type: "object",
          properties: {
            channel: { type: "string" },
            message: { type: "string" }
          },
          required: ["channel", "message"]
        }
      }]
    }));

    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === "slack_send_message") {
        const { channel, message } = request.params.arguments;

        // Call Slack's internal API with authenticated session
        const response = await fetch("https://slack.com/api/chat.postMessage", {
          method: "POST",
          headers: {
            "Authorization": `Bearer ${session.token}`,
            "Content-Type": "application/json",
            "Cookie": session.cookies.join("; ")
          },
          body: JSON.stringify({ channel: channel, text: message })
        });

        return {
          content: [{ type: "text", text: JSON.stringify(await response.json()) }]
        };
      }

      // Fail loudly instead of silently returning undefined
      throw new Error(`Unknown tool: ${request.params.name}`);
    });

The session extraction captures cookies and tokens from your already-authenticated browser:
    import browser_cookie3
    import requests

    def extract_slack_session():
        """Extract authenticated Slack session from browser cookies."""
        cookies = browser_cookie3.chrome(domain_name='slack.com')

        session = requests.Session()
        session.cookies.update(cookies)

        # Get the API token from browser's localStorage or cookies
        # This varies by application; get_token_from_browser_storage is a
        # placeholder for app-specific code that is not shown here
        token = get_token_from_browser_storage("slack.com", "api_token")

        return {
            "cookies": cookies,
            "token": token,
            "session": session
        }

This approach has several advantages:
- Stability: Internal APIs change far less frequently than UI elements
- Speed: Direct API calls are faster than browser automation
- Reliability: No timing issues with dynamic content loading
- Maintainability: When the UI changes, your MCP server keeps working
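To make "calling the API through the browser's authenticated session" a bit more concrete, here is a small sketch of the step that turns captured cookies and a token into request headers. It only builds the request and never sends it. The `BrowserSession` class, the `build_auth_headers` and `build_request` helpers, the URL, and the cookie values are all hypothetical placeholders, and a real application may expect different headers:

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class BrowserSession:
    """Hypothetical container for state captured from a logged-in browser."""
    token: str
    cookies: Dict[str, str] = field(default_factory=dict)


def build_auth_headers(session: BrowserSession) -> Dict[str, str]:
    """Combine the bearer token and cookies into HTTP request headers."""
    headers = {
        "Authorization": f"Bearer {session.token}",
        "Content-Type": "application/json",
    }
    if session.cookies:
        # A Cookie header is one line of name=value pairs joined by "; ".
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in session.cookies.items())
    return headers


def build_request(
    session: BrowserSession, url: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=build_auth_headers(session),
        method="POST",
    )


# Placeholder values; real ones would come from the browser profile.
session = BrowserSession(token="example-token", cookies={"d": "abc123", "lang": "en"})
request = build_request(session, "https://example.com/api/messages", {"text": "hello"})
print(request.get_method(), request.full_url)
print(request.get_header("Cookie"))
```

Keeping header construction in one function makes it easy to see, and test, exactly which pieces of the browser state each call depends on.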
Why This Matters for AI Agents
The real power comes from combining these approaches with AI agents. Your agent gets structured tools:
    User: "Check my Jira tickets and post a summary to Slack"

    Agent uses MCP tools:
    1. jira_list_tickets(status="in-progress") → [ticket data]
    2. slack_send_message(channel="#standup", message="Summary: ...")

    Instead of:
    1. Navigate to Jira
    2. Click login
    3. Wait for dashboard
    4. Find ticket list selector
    5. Extract text
    6. Navigate to Slack
    7. Click channel
    8. Type message
    9. Click send

The agent doesn’t need to know about DOM selectors or authentication flows. It just calls tools that work.
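From the agent's side, calling a tool means picking a name and passing arguments that match a schema. A rough sketch of that dispatch step follows, with in-memory stand-ins for the Jira and Slack calls. The `TOOLS` registry, the `tool` decorator, `call_tool`, and the ticket data are hypothetical and only illustrate the idea:

```python
from typing import Any, Callable, Dict, List

# Registry mapping tool names to their handler and required arguments.
TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(name: str, required: List[str]) -> Callable:
    """Register a function as a named tool with a list of required arguments."""
    def decorator(func: Callable) -> Callable:
        TOOLS[name] = {"handler": func, "required": required}
        return func
    return decorator


@tool("jira_list_tickets", required=["status"])
def jira_list_tickets(status: str) -> List[str]:
    # Stand-in data; a real tool would call the Jira API here.
    tickets = {"in-progress": ["PROJ-1: Fix login", "PROJ-2: Update docs"]}
    return tickets.get(status, [])


SENT: List[Dict[str, str]] = []


@tool("slack_send_message", required=["channel", "message"])
def slack_send_message(channel: str, message: str) -> str:
    # Stand-in: record the message instead of sending it.
    SENT.append({"channel": channel, "message": message})
    return "ok"


def call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Validate the call against the registry, then run the handler."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    missing = [arg for arg in TOOLS[name]["required"] if arg not in arguments]
    if missing:
        raise ValueError(f"Missing arguments for {name}: {missing}")
    return TOOLS[name]["handler"](**arguments)


tickets = call_tool("jira_list_tickets", {"status": "in-progress"})
summary = "Summary: " + "; ".join(tickets)
print(call_tool("slack_send_message", {"channel": "#standup", "message": summary}))
```

Rejecting a call with missing arguments before it reaches the handler gives the agent a clear error to react to, which is much easier to recover from than a half-completed UI interaction.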
Common Mistakes I Made
- Over-relying on DOM selectors: Every UI update broke my automation. I learned to prefer API-based approaches whenever possible.
- Ignoring rate limits: My first attempts got blocked. Always use appropriate user-agent headers and wait parameters:
    import time
    import requests

    # WRONG: Gets blocked quickly
    response = requests.get("https://api.example.com/data")

    # RIGHT: Respects rate limits
    def fetch_with_backoff(url, max_retries=3):
        for attempt in range(max_retries):
            response = requests.get(
                url,
                headers={"User-Agent": "Mozilla/5.0 (compatible; AgentQL)"}
            )

            if response.status_code == 429:  # Rate limited
                # Retry-After is assumed to be a number of seconds here
                wait_time = int(response.headers.get("Retry-After", 5))
                time.sleep(wait_time)
                continue

            return response

        # Every attempt was rate limited
        raise RuntimeError(f"Still rate limited after {max_retries} attempts")

- Handling authentication manually: I wasted time reimplementing OAuth flows. Using AgentQL or MCP servers with existing sessions is far more reliable.
- Skipping dynamic content waits: Modern SPAs need time. Explicit waits for specific content prevent timing failures:
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        page.goto("https://spa-example.com/dashboard")

        # WRONG: Content might not be loaded yet
        # data = page.query_selector(".data-item")

        # RIGHT: Wait for specific content to appear
        page.wait_for_selector(".data-item", state="visible", timeout=10000)
        data = page.query_selector_all(".data-item")

- Not extracting session tokens properly: For MCP servers, you need to capture the full authenticated state: cookies, tokens, and headers.
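One detail in the rate-limit advice is easy to miss: a `Retry-After` header can hold either a number of seconds or an HTTP date, and it may be absent entirely. Below is a sketch of computing the wait time for all three cases. The `retry_delay` function and its defaults (`base`, `cap`) are my own choices and do not come from any particular library:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(
    retry_after: Optional[str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    now: Optional[datetime] = None,
) -> float:
    """Return the number of seconds to wait before the next retry.

    Uses the Retry-After value when it can be parsed; otherwise falls back
    to exponential backoff (base * 2**attempt). The result is clamped to
    the range [0, cap].
    """
    now = now or datetime.now(timezone.utc)
    delay: Optional[float] = None
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            # Form 1: delay in whole seconds, e.g. "120".
            delay = float(value)
        else:
            # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            try:
                when = parsedate_to_datetime(value)
                if when.tzinfo is None:
                    when = when.replace(tzinfo=timezone.utc)
                delay = (when - now).total_seconds()
            except (TypeError, ValueError):
                delay = None  # Unparseable header: use the fallback below.
    if delay is None:
        delay = base * (2 ** attempt)
    return max(0.0, min(delay, cap))


print(retry_delay("5", attempt=0))
print(retry_delay(None, attempt=3))
```

The cap matters in practice: without it, a misconfigured server or a long retry chain can leave an agent sleeping for minutes in the middle of a task.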
The MCP Server Ecosystem
The Reddit thread mentioned “10 MCP servers that together give your AI agent an actual brain.” This is the key insight: these tools are composable.
┌───────────────────────────────────────────────────┐
│                     AI Agent                      │
└───────────────────────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ AgentQL MCP   │ │ Slack MCP     │ │ Jira MCP      │
│               │ │               │ │               │
│ Web scraping  │ │ API calls     │ │ API calls     │
│ Auth flows    │ │ with session  │ │ with session  │
└───────────────┘ └───────────────┘ └───────────────┘
        │                 │                 │
        ▼                 ▼                 ▼
    Websites          Slack API         Jira API

Each MCP server provides a focused set of tools. The agent combines them to accomplish complex tasks.
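Composability here mostly means the agent sees one flat list of tools, while each call is routed back to the server that owns it. A sketch of that routing layer is below. The `ToolServer` and `ToolRouter` classes are hypothetical and not part of any MCP SDK; real MCP clients handle this in their own way:

```python
from typing import Any, Callable, Dict, List


class ToolServer:
    """Hypothetical stand-in for one MCP server exposing named tools."""

    def __init__(self, name: str, tools: Dict[str, Callable[..., Any]]) -> None:
        self.name = name
        self.tools = tools


class ToolRouter:
    """Merges tools from several servers and routes each call to its owner."""

    def __init__(self, servers: List[ToolServer]) -> None:
        self._owners: Dict[str, ToolServer] = {}
        for server in servers:
            for tool_name in server.tools:
                if tool_name in self._owners:
                    other = self._owners[tool_name].name
                    raise ValueError(
                        f"{tool_name} is defined by both {other} and {server.name}"
                    )
                self._owners[tool_name] = server

    def list_tools(self) -> List[str]:
        """Return every tool name the agent can call, in sorted order."""
        return sorted(self._owners)

    def call(self, tool_name: str, **arguments: Any) -> Any:
        """Run the named tool on whichever server registered it."""
        if tool_name not in self._owners:
            raise KeyError(f"Unknown tool: {tool_name}")
        return self._owners[tool_name].tools[tool_name](**arguments)


# Lambdas stand in for real API calls.
slack = ToolServer("slack", {"slack_send_message": lambda channel, message: f"sent to {channel}"})
jira = ToolServer("jira", {"jira_list_tickets": lambda status: [f"PROJ-1 ({status})"]})

router = ToolRouter([slack, jira])
print(router.list_tools())
print(router.call("jira_list_tickets", status="in-progress"))
```

Prefixing tool names with the service (`slack_`, `jira_`) is what keeps the merged list free of collisions, which is why the sketch treats a duplicate name as an error.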
Practical Implementation: Building an MCP Server for Internal APIs
Let me walk through building an MCP server for a typical internal API. This example uses Datadog:
    import { Server } from "@modelcontextprotocol/sdk/server/index.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import {
      ListToolsRequestSchema,
      CallToolRequestSchema,
    } from "@modelcontextprotocol/sdk/types.js";

    // Define the MCP server
    const server = new Server({
      name: "datadog-mcp",
      version: "1.0.0",
    }, {
      capabilities: { tools: {} }
    });

    // Extract session from authenticated browser
    async function getDatadogSession(): Promise<{ apiKey: string; appKey: string; cookies: string }> {
      // Implementation depends on how Datadog stores auth
      // This might read from browser storage, environment, or a config file

      return {
        apiKey: process.env.DATADOG_API_KEY || "",
        appKey: process.env.DATADOG_APP_KEY || "",
        cookies: "" // For browser-based auth
      };
    }

    // List available tools
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: "datadog_query_logs",
          description: "Query Datadog logs with a filter",
          inputSchema: {
            type: "object",
            properties: {
              query: { type: "string", description: "Datadog log query syntax" },
              timeframe: { type: "string", description: "Time range (e.g., '1h', '24h')" }
            },
            required: ["query"]
          }
        },
        {
          name: "datadog_list_monitors",
          description: "List all Datadog monitors",
          inputSchema: {
            type: "object",
            properties: {
              name_filter: { type: "string", description: "Filter monitors by name" }
            }
          }
        }
      ]
    }));

    // Handle tool execution
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      const session = await getDatadogSession();

      if (request.params.name === "datadog_query_logs") {
        const { query, timeframe = "1h" } = request.params.arguments as { query: string; timeframe?: string };

        const response = await fetch("https://api.datadoghq.com/api/v1/logs-queries/list", {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            "DD-API-KEY": session.apiKey,
            "DD-APPLICATION-KEY": session.appKey
          },
          body: JSON.stringify({
            query: query,
            time: { from: `now-${timeframe}`, to: "now" }
          })
        });

        const data = await response.json();
        return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
      }

      if (request.params.name === "datadog_list_monitors") {
        const { name_filter } = request.params.arguments as { name_filter?: string };

        const url = new URL("https://api.datadoghq.com/api/v1/monitor");
        if (name_filter) {
          url.searchParams.set("name", name_filter);
        }

        const response = await fetch(url.toString(), {
          headers: {
            "DD-API-KEY": session.apiKey,
            "DD-APPLICATION-KEY": session.appKey
          }
        });

        const data = await response.json();
        return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
      }

      throw new Error(`Unknown tool: ${request.params.name}`);
    });

    // Start the server
    async function main() {
      const transport = new StdioServerTransport();
      await server.connect(transport);
    }

    main().catch(console.error);

Now your AI agent can query Datadog without any DOM automation:
    User: "Check for any error logs in the payment service in the last hour"

    Agent: I'll query Datadog for payment service errors.
    [Calls datadog_query_logs with query="service:payment status:error"]

    Agent: I found 23 error logs in the payment service. The most common error is
    "connection timeout" occurring 18 times. Would you like me to investigate further
    or create an alert?

Key Takeaways
- Use AgentQL for general web interactions: It handles authentication, dynamic content, and provides natural language querying.
- Build MCP servers for frequently-used services: Slack, Jira, Datadog, and similar tools have stable internal APIs that are easier to work with than their UIs.
- Extract sessions from authenticated browsers: Don’t reimplement login flows; use existing sessions.
- Prefer APIs over DOM manipulation: APIs change less frequently and are more reliable.
- Add proper wait parameters: Dynamic content needs time to load; always include reasonable timeouts.
- Respect rate limits: Use appropriate headers and implement backoff logic.
The two-hour debugging session that started this journey taught me something important: building reliable AI agents for web interaction isn’t about perfecting DOM selectors—it’s about using the right infrastructure. AgentQL and MCP servers are that infrastructure.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.