Browser Automation MCP Server

Public
Raghu6798/Browser_scrape_mcp

Enables intelligent web scraping through a browser automation tool that can search Google, navigate to webpages, and extract content from various websites including GitHub, Stack Overflow, and documentation sites.

Verified
python
0 tools
May 30, 2025
Updated May 30, 2025

🤖 Browser Automation Agent

A browser automation tool built on MCP (Model Context Protocol) that combines web scraping with LLM-powered processing. The agent can search Google, navigate to webpages, and scrape content from websites including GitHub, Stack Overflow, and documentation sites.

🚀 Features

  • 🔍 Google Search Integration: Finds and retrieves top search results for any query
  • 🕸️ Intelligent Web Scraping: Tailored scraping strategies for different website types:
    • 📂 GitHub repositories
    • 💬 Stack Overflow questions and answers
    • 📚 Documentation pages
    • 🌐 Generic websites
  • 🧠 AI-Powered Processing: Uses Mistral AI for understanding and processing scraped content
  • 🥷 Stealth Mode: Implements browser fingerprint protection to avoid detection
  • 💾 Content Saving: Automatically saves both screenshots and text content from scraped pages

🏗️ Architecture

This project uses a client-server architecture powered by MCP:

  • 🖥️ Server: Handles browser automation and web scraping tasks
  • 👤 Client: Provides the AI interface using Mistral AI and LangGraph
  • 📡 Communication: Uses stdio for client-server communication
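To make the stdio link concrete: MCP's stdio transport carries JSON-RPC 2.0 messages, one per line. The sketch below shows how a `tools/call` request for one of this server's tools might be framed. The helper names and the `query` argument name are assumptions for illustration; the project itself presumably relies on an MCP SDK to do this rather than hand-rolling it.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload never
    # spans more than one line on the wire.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_message(line: bytes) -> dict:
    """Parse one line read from the peer's stdout back into a message."""
    return json.loads(line.decode("utf-8"))


# "query" is an assumed argument name, not confirmed by the README.
request = make_tool_call(1, "get_top_google_url", {"query": "playwright python docs"})
wire = encode_message(request)
```

A client would write `wire` to the server process's stdin and read one line from its stdout for the matching response.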

⚙️ Requirements

  • 🐍 Python 3.8+
  • 🎭 Playwright
  • 🧩 MCP (Model Context Protocol)
  • 🔑 Mistral AI API key

📥 Installation

  1. Clone the repository:
git clone https://github.com/yourusername/browser-automation-agent.git
cd browser-automation-agent
  2. Install dependencies:
pip install -r requirements.txt
  3. Install Playwright browsers:
playwright install
  4. Create a .env file in the project root and add your Mistral AI API key:
MISTRAL_API_KEY=your_api_key_here
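As a rough illustration of how the key in .env could be read at startup, here is a dependency-free sketch. The function names are hypothetical, and the project may well use a helper library for this instead; only the `MISTRAL_API_KEY` variable name comes from the README.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_mistral_api_key(path: str = ".env") -> str:
    """Return the key, preferring a real environment variable over the file."""
    key = os.environ.get("MISTRAL_API_KEY") or load_env_file(path).get("MISTRAL_API_KEY")
    if not key:
        raise RuntimeError("MISTRAL_API_KEY is not set; add it to .env or the environment")
    return key
```

Failing early with a clear message makes a missing or misnamed key easier to spot than an authentication error from the API later on.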

📋 Usage

Running the Server

python main.py

Running the Client

python client.py

Sample Interaction

Once both the server and client are running:

  1. Enter your query when prompted
  2. The agent will:
    • 🔍 Search Google for relevant results
    • 🧭 Navigate to the top result
    • 📊 Scrape content based on the website type
    • 📸 Save screenshots and content to files
    • 📤 Return processed information

🛠️ Tool Functions

get_top_google_url

🔍 Searches Google and returns the top result URL for a given query.

browse_and_scrape

🌐 Navigates to a URL and scrapes content based on the website type.

scrape_github

📂 Specializes in extracting README content and code blocks from GitHub repositories.

scrape_stackoverflow

💬 Extracts questions, answers, comments, and code blocks from Stack Overflow pages.

scrape_documentation

📚 Optimized for extracting documentation content and code examples.

scrape_generic

🌐 Extracts paragraph text and code blocks from generic websites.
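The README does not say how browse_and_scrape decides which specialized scraper to use. One plausible approach is to inspect the URL, as in the sketch below. The `pick_scraper` helper and its matching rules are assumptions for illustration, not the project's actual logic; only the four scraper names come from the list above.

```python
from urllib.parse import urlparse


def pick_scraper(url: str) -> str:
    """Return the name of the scraper that best matches a URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path.lower()

    if host == "github.com":
        return "scrape_github"
    if host == "stackoverflow.com":
        return "scrape_stackoverflow"
    # Heuristic only: common conventions for documentation sites.
    if host.startswith("docs.") or host.endswith("readthedocs.io") or path.startswith("/docs"):
        return "scrape_documentation"
    return "scrape_generic"
```

For example, `pick_scraper("https://stackoverflow.com/questions/1")` selects `scrape_stackoverflow`, while an unrecognized site falls through to `scrape_generic`.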

📁 File Structure

browser-automation-agent/
├── main.py            # MCP server implementation
├── client.py          # Mistral AI client implementation
├── requirements.txt   # Project dependencies
├── .env               # Environment variables (API keys)
└── README.md          # Project documentation

📤 Output Files

The agent generates two types of output files with timestamps:

  • 📸 final_page_YYYYMMDD_HHMMSS.png: Screenshot of the final page state
  • 📄 scraped_content_YYYYMMDD_HHMMSS.txt: Extracted text content from the page
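The naming pattern above corresponds to the `strftime` format `%Y%m%d_%H%M%S`. A minimal sketch of generating both names from one timestamp follows; the `output_paths` helper and its `out_dir` parameter are hypothetical, and the project may build these names differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional, Tuple


def output_paths(out_dir: str = ".", now: Optional[datetime] = None) -> Tuple[Path, Path]:
    """Return (screenshot_path, text_path) sharing one timestamp."""
    # Using a single timestamp keeps the two files of one run paired.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    base = Path(out_dir)
    return base / f"final_page_{stamp}.png", base / f"scraped_content_{stamp}.txt"


screenshot_path, text_path = output_paths(now=datetime(2025, 5, 30, 14, 5, 9))
print(screenshot_path.name)  # final_page_20250530_140509.png
print(text_path.name)        # scraped_content_20250530_140509.txt
```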

⚙️ Customization

You can modify the following parameters in the code:

  • 🖥️ Browser window size: Adjust width and height in browse_and_scrape
  • 👻 Headless mode: Set headless=True for invisible browser operation
  • 🔢 Number of Google results: Change num_results in get_top_google_url
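For orientation, the sketch below shows where headless mode and window size usually plug into Playwright's Python API. The `BrowserSettings` class and `fetch_title` function are hypothetical, the default sizes are arbitrary, and the project may use Playwright's async API rather than the sync one shown here.

```python
from dataclasses import dataclass


@dataclass
class BrowserSettings:
    headless: bool = True   # True runs the browser without a visible window
    width: int = 1280       # assumed default, not taken from the project
    height: int = 800       # assumed default, not taken from the project

    def launch_kwargs(self) -> dict:
        """Keyword arguments for chromium.launch()."""
        return {"headless": self.headless}

    def context_kwargs(self) -> dict:
        """Keyword arguments for browser.new_context()."""
        return {"viewport": {"width": self.width, "height": self.height}}


def fetch_title(url: str, settings: BrowserSettings) -> str:
    """Open a page with the given settings and return its title."""
    # Imported lazily so the settings helpers work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**settings.launch_kwargs())
        context = browser.new_context(**settings.context_kwargs())
        page = context.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Keeping the settings in one object means a change such as `BrowserSettings(headless=False, width=1920, height=1080)` does not require touching the browsing code.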

❓ Troubleshooting

  • 🔌 Connection Issues: Ensure both server and client are running in separate terminals
  • 🎭 Playwright Errors: Make sure browsers are installed with playwright install
  • 🔑 API Key Errors: Verify your Mistral API key is correctly set in the .env file
  • 🛣️ Path Errors: Update the path to main.py in client.py if needed

📜 License

MIT License

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Built with 🧩 MCP, 🎭 Playwright, and 🧠 Mistral AI
