A FastMCP server that enables browser automation through natural language commands, allowing Language Models to browse the web, fill out forms, click buttons, and perform other web-based tasks via a simple API.
Install with a specific provider (e.g., OpenAI)
pip install -e "git+https://github.com/yourusername/browser-use-mcp.git#egg=browser-use-mcp[openai]"
Or install all providers
pip install -e "git+https://github.com/yourusername/browser-use-mcp.git#egg=browser-use-mcp[all-providers]"
Install Playwright browsers
playwright install chromium
Add the browser-use-mcp server to your MCP client configuration:
{
  "mcpServers": {
    "browser-use-mcp": {
      "command": "browser-use-mcp",
      "args": ["--model", "gpt-4o"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key",
        "DISPLAY": ":0"
      }
    }
  }
}
Set the API key variable for whichever provider you use; DISPLAY is for GUI environments.
Replace "your-openai-api-key" with your actual API key or use an environment variable reference like process.env.OPENAI_API_KEY.
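Plain JSON cannot read environment variables on its own, so one option is to generate the configuration from a short script. The sketch below is an illustration only and is not part of browser-use-mcp: the helper name `build_config` is hypothetical, and it reuses only the command, flag, and variable names shown above.

```python
import json
import os


def build_config(model: str = "gpt-4o", key_var: str = "OPENAI_API_KEY") -> dict:
    """Return an MCP client configuration with the API key read from the environment."""
    api_key = os.environ.get(key_var)
    if not api_key:
        # Fail early instead of writing an empty key into the configuration.
        raise RuntimeError(f"{key_var} is not set")
    return {
        "mcpServers": {
            "browser-use-mcp": {
                "command": "browser-use-mcp",
                "args": ["--model", model],
                "env": {key_var: api_key, "DISPLAY": ":0"},
            }
        }
    }


def to_json(config: dict) -> str:
    """Serialize the configuration as strict JSON (no comments)."""
    return json.dumps(config, indent=2)
```

Calling `print(to_json(build_config()))` with the key exported in your shell prints a configuration you can paste into your client, which keeps the key out of any file you check in.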
import asyncio
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient


async def main():
    # Load environment variables
    load_dotenv()

    # Create MCPClient from an inline config
    client = MCPClient(
        config={
            "mcpServers": {
                "browser-use-mcp": {
                    "command": "browser-use-mcp",
                    "args": ["--model", "gpt-4o"],
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY"),
                        "DISPLAY": ":0",
                    },
                }
            }
        }
    )

    # Create LLM
    llm = ChatOpenAI(model="gpt-4o")

    # Create agent with the client
    agent = MCPAgent(llm=llm, client=client, max_steps=30)

    # Run the query
    result = await agent.run(
        """
        Navigate to https://github.com, search for "browser-use-mcp", and summarize the project.
        """,
        max_steps=30,
    )
    print(f"Result: {result}")


if __name__ == "__main__":
    asyncio.run(main())
Claude Desktop configuration file locations:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %AppData%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "browser-use": {
      "command": "browser-use-mcp",
      "args": ["--model", "claude-3-opus-20240229"]
    }
  }
}
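If you would rather not edit the Claude Desktop file by hand, the same entry can be merged in from Python. This is a minimal sketch under stated assumptions: the helper names `claude_config_path` and `add_server` are hypothetical, only the two file locations listed above are handled, and `%AppData%` is read from the `APPDATA` environment variable.

```python
import os
import platform
from pathlib import Path
from typing import Optional


def claude_config_path(system: Optional[str] = None) -> Path:
    """Return the Claude Desktop config path for macOS or Windows."""
    system = system or platform.system()
    if system == "Darwin":
        base = Path.home() / "Library" / "Application Support"
    elif system == "Windows":
        base = Path(os.environ["APPDATA"])
    else:
        # Only the two locations given above are known to this sketch.
        raise RuntimeError(f"No known Claude Desktop config location for {system}")
    return base / "Claude" / "claude_desktop_config.json"


def add_server(config: dict, model: str = "claude-3-opus-20240229") -> dict:
    """Return a copy of config with the browser-use entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers["browser-use"] = {
        "command": "browser-use-mcp",
        "args": ["--model", model],
    }
    # Other top-level settings and other servers are left untouched.
    return {**config, "mcpServers": servers}
```

To apply it, load the existing file with `json.loads(path.read_text())`, pass the result through `add_server`, and write it back with `json.dumps(..., indent=2)`. Returning a copy rather than mutating in place makes it easy to inspect the result before overwriting anything.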
The following LLM providers are supported for browser automation:
Provider | API Key Environment Variable |
---|---|
OpenAI | OPENAI_API_KEY |
Anthropic | ANTHROPIC_API_KEY |
Google | GOOGLE_API_KEY |
Cohere | COHERE_API_KEY |
Mistral AI | MISTRAL_API_KEY |
Groq | GROQ_API_KEY |
Together AI | TOGETHER_API_KEY |
AWS Bedrock | AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY |
Fireworks | FIREWORKS_API_KEY |
Azure OpenAI | AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT |
Vertex AI | GOOGLE_APPLICATION_CREDENTIALS |
NVIDIA | NVIDIA_API_KEY |
AI21 | AI21_API_KEY |
Databricks | DATABRICKS_HOST and DATABRICKS_TOKEN |
IBM watsonx.ai | WATSONX_API_KEY |
xAI | XAI_API_KEY |
Upstage | UPSTAGE_API_KEY |
Hugging Face | HUGGINGFACE_API_KEY |
Ollama | OLLAMA_BASE_URL |
Llama.cpp | LLAMA_CPP_SERVER_URL |
For more information check out: https://python.langchain.com/docs/integrations/chat/
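Before starting the server, it can help to confirm that the variables for your provider are actually set. The following is a rough sketch, not something browser-use-mcp provides: `REQUIRED_ENV` and `missing_env` are hypothetical names, and the mapping copies only a few rows from the table above.

```python
import os
from typing import List, Mapping, Optional

# A subset of the provider table; add the remaining rows as needed.
REQUIRED_ENV = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "AWS Bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
    "Azure OpenAI": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "Ollama": ["OLLAMA_BASE_URL"],
}


def missing_env(provider: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the variables the provider needs that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]
```

For example, `missing_env("AWS Bedrock")` returns an empty list only when both AWS variables are present, so the result can be printed as a checklist of what still needs to be exported.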
You can create a .env file in the project directory with your API keys:
OPENAI_API_KEY=your_openai_key_here
# Or any other provider key
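If you run into problems:
Make sure your API key is set in your environment or in your .env file.
Check that the Playwright browser is installed by running playwright install chromium.
Use the --model flag to specify a valid model for your provider.
Use --debug to enable more detailed logging that can help identify issues.
License: MIT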