How to Use Lightpanda MCP Server for AI Agent Web Browsing

Mar 19, 2026

The Problem: AI Agents Can’t Browse

I wanted my AI agents to browse the web, but every solution felt heavyweight. Puppeteer requires a driver library. Playwright needs installation. Selenium is… well, Selenium. Then I found Lightpanda’s built-in MCP server.

When building AI workflows, I kept hitting the same wall. My agents needed to:

Read web pages
Extract data
Fill forms
Click buttons

But every browser automation tool required me to write orchestration code. The agent couldn’t directly control the browser - it needed me as a middleman.

Then I discovered Lightpanda ships with an MCP server that exposes browser capabilities as tools AI agents can invoke directly.

What is MCP and Why It Matters

The Model Context Protocol (MCP) is a standard for AI models to interact with external tools. Think of it as a universal plugin system for AI agents.

Instead of this flow:

AI Agent -> My Code -> Puppeteer -> Browser

MCP enables this:

AI Agent -> MCP Tool -> Browser

The agent decides what to do. No orchestration code required.

Setting Up Lightpanda MCP Server

First, I needed to understand what tools were available. Looking at the source code in src/mcp/tools.zig, I found 10 tools exposed through the MCP server:

Tool	Purpose
`goto`	Navigate to a URL
`markdown`	Extract page content as markdown
`links`	Get all links from the page
`evaluate`	Run JavaScript in page context
`semantic_tree`	Get simplified DOM structure
`interactiveElements`	Find clickable/fillable elements
`structuredData`	Extract JSON-LD and OpenGraph data
`click`	Click an element
`fill`	Fill text into input fields
`scroll`	Scroll the page

These tools are designed to be called by AI agents rather than scripted by human developers.
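Reading the source is one way to find the tools. MCP also defines a standard tools/list method that returns each tool's name, description, and input schema, so a client can discover them at runtime. Here is a minimal Python sketch that builds that request and summarizes a response. The sample response is hypothetical: its descriptions are copied from the table above, not captured from the server.

```python
import json


def list_tools_request(request_id):
    """Build an MCP `tools/list` request (a standard MCP method)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def summarize_tools(response):
    """Map tool name -> description from a `tools/list` response."""
    tools = response.get("result", {}).get("tools", [])
    return {tool["name"]: tool.get("description", "") for tool in tools}


# Hypothetical response in the shape the MCP spec describes.
# The descriptions come from the table above, not from a real server.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {"name": "goto", "description": "Navigate to a URL",
             "inputSchema": {"type": "object"}},
            {"name": "markdown", "description": "Extract page content as markdown",
             "inputSchema": {"type": "object"}},
        ]
    },
}

print(json.dumps(list_tools_request(1)))
for name, description in summarize_tools(sample_response).items():
    print(f"{name}: {description}")
```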

I tried navigating to a URL and extracting content:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "goto",
    "arguments": {
      "url": "https://example.com"
    }
  }
}

The tool returned a success response. Then I called:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "markdown"
  }
}

I got clean, formatted markdown back. No HTML parsing. No boilerplate removal. Just content ready for an LLM to process.
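For poking at the server by hand, the two requests above can be generated instead of typed. This Python sketch builds tools/call requests and pulls the text out of a reply. It assumes the reply follows the standard MCP result shape (a content list of text items plus an optional isError flag). The sample reply is made up for illustration, and how the messages reach the server (stdio, HTTP) is left out.

```python
import json


def tool_call(request_id, name, arguments=None):
    """Build a JSON-RPC 2.0 `tools/call` request like the two above."""
    params = {"name": name}
    if arguments:
        params["arguments"] = arguments
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "tools/call", "params": params}


def result_text(response):
    """Join the text items of a `tools/call` result, raising on errors."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool call reported an error")
    return "\n".join(
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    )


# Hypothetical reply to the `markdown` call; the text is invented.
sample_reply = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"content": [{"type": "text", "text": "# Example Domain"}]},
}

print(json.dumps(tool_call(1, "goto", {"url": "https://example.com"})))
print(json.dumps(tool_call(2, "markdown")))
print(result_text(sample_reply))
```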

The Semantic Tree Advantage

What impressed me most was the semantic_tree tool. Traditional web scraping gives you raw HTML - a mess of divs, spans, and nested elements.

The semantic tree returns a simplified structure:

{
  "type": "document",
  "children": [
    {
      "type": "heading",
      "level": 1,
      "text": "Main Title"
    },
    {
      "type": "paragraph",
      "text": "Some content here..."
    },
    {
      "type": "button",
      "text": "Submit",
      "nodeId": 42
    }
  ]
}

This is optimized for AI reasoning. The agent sees structure, not syntax.
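To see why this shape is easy to work with, here is a small Python sketch that walks the example tree above and lists the nodes that carry a nodeId. It assumes the real output uses the same keys as the example (type, children, text, nodeId), which may not match the server exactly.

```python
def iter_nodes(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def actionable_nodes(tree):
    """Return (nodeId, type, text) for every node that has a nodeId."""
    return [
        (node["nodeId"], node["type"], node.get("text", ""))
        for node in iter_nodes(tree)
        if "nodeId" in node
    ]


# The example tree from above, as a Python dict.
tree = {
    "type": "document",
    "children": [
        {"type": "heading", "level": 1, "text": "Main Title"},
        {"type": "paragraph", "text": "Some content here..."},
        {"type": "button", "text": "Submit", "nodeId": 42},
    ],
}

for node_id, node_type, text in actionable_nodes(tree):
    print(f"{node_type} {text!r} -> nodeId {node_id}")
```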

Finding Interactive Elements

I needed to fill a search form. First, I called interactiveElements to list the elements on the page:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "interactiveElements"
  }
}

The response listed all clickable and fillable elements with their backend node IDs. With those IDs, I could fill the input and click the button:

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "fill",
    "arguments": {
      "backendNodeId": 15,
      "value": "my search query"
    }
  }
}

{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "click",
    "arguments": {
      "backendNodeId": 22
    }
  }
}

The agent navigated the form without me writing a single line of browser automation code.

MCP vs CDP: When to Use Each

Chrome DevTools Protocol (CDP) is powerful but low-level. You need to understand:

Domains (DOM, Page, Network, etc.)
Events and listeners
Session management

MCP abstracts this away. The tools are:

Self-documenting
High-level
Designed for agent workflows

Use MCP when:

Building AI agents that browse the web
You want declarative tool calls
The agent controls the flow

Use CDP when:

You need fine-grained control
Building developer tools
Writing custom browser automation

Real Use Case: Research Agent

I built an agent that researches topics across multiple websites. The workflow:

goto a search engine
fill the search query
click search
markdown to extract results
goto each promising link
structuredData to get metadata
Synthesize findings

No Python. No Node.js. Just MCP tool calls from the agent.
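The agent issues these calls itself, but when testing the server without an agent in the loop it can help to write the same sequence out as data. The Python sketch below does that under a few assumptions. The fill and click argument names follow the earlier examples. The node IDs would really come from an interactiveElements response, and the result URLs would be picked after reading the markdown output. The search URL is a placeholder.

```python
import json


def research_steps(search_url, query, input_id, button_id, result_urls):
    """Return (tool name, arguments) pairs in the order listed above."""
    steps = [
        ("goto", {"url": search_url}),
        ("fill", {"backendNodeId": input_id, "value": query}),
        ("click", {"backendNodeId": button_id}),
        ("markdown", None),
    ]
    for url in result_urls:
        steps.append(("goto", {"url": url}))
        steps.append(("structuredData", None))
    return steps


def to_requests(steps, first_id=1):
    """Wrap each step in a JSON-RPC `tools/call` request with its own id."""
    requests = []
    for request_id, (name, arguments) in enumerate(steps, start=first_id):
        params = {"name": name}
        if arguments is not None:
            params["arguments"] = arguments
        requests.append({"jsonrpc": "2.0", "id": request_id,
                         "method": "tools/call", "params": params})
    return requests


# Placeholder values: IDs 15 and 22 are reused from the earlier examples.
steps = research_steps("https://example.com/search", "my search query",
                       15, 22, ["https://example.com/article"])
for request in to_requests(steps):
    print(json.dumps(request))
```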

Extracting Structured Data

The structuredData tool extracts JSON-LD, OpenGraph, and other metadata:

{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "tools/call",
  "params": {
    "name": "structuredData"
  }
}

This returns schema.org data, social media metadata, and other structured information that helps agents understand page content semantically.

Common Pitfalls

I made these mistakes:

Not waiting for navigation - After goto, the page may still be loading. I learned to check for expected content before proceeding.
Ignoring node IDs - The click and fill tools need node IDs (passed as backendNodeId) from interactiveElements or semantic_tree. I initially tried using CSS selectors, which don’t work.
Over-extracting - markdown returns everything. For research tasks, I found semantic_tree gives cleaner context for LLMs.
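The first pitfall comes down to polling: keep fetching the page content until the expected text shows up or a timeout expires. Here is a minimal Python sketch of that check. The fetch_markdown callable is a stand-in for whatever sends the markdown tool call and returns its text. It is not part of Lightpanda, and the fake fetcher below only simulates a page that finishes loading on the third poll.

```python
import time


def wait_for_content(fetch_markdown, expected, timeout=10.0, interval=0.5):
    """Poll fetch_markdown() until `expected` appears in its result.

    Returns the matching text, or raises TimeoutError once `timeout`
    seconds have passed without a match.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = fetch_markdown()
        if expected in text:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{expected!r} did not appear within {timeout}s")
        time.sleep(interval)


# Fake fetcher that simulates a page still loading on the first two polls.
snapshots = iter(["", "Loading...", "# Results for my search query"])
page = wait_for_content(lambda: next(snapshots), "Results", timeout=5.0, interval=0.0)
print(page)
```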

Summary

In this post, I showed how Lightpanda’s MCP server bridges AI agents with web browsing capabilities. The key points are:

MCP tools are designed for AI consumption - No orchestration code needed
Semantic trees provide cleaner context than raw HTML
Use node IDs from interactiveElements for click and fill operations
Start with goto + markdown for simple extraction, add semantic_tree for complex reasoning

Lightpanda’s MCP server changed how I think about AI agent web browsing. Instead of writing orchestration code, I configure tools and let the agent decide.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are the most important links from this article, along with some further resources on the topic:

👨‍💻 Model Context Protocol
👨‍💻 Lightpanda GitHub

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!