An interactive tool that enables users to benchmark vLLM endpoints through MCP, allowing performance testing of LLM models with customizable parameters.
This is a proof of concept of how to use MCP to interactively benchmark vLLM.
We are not new to benchmarking; read our blog:
This is just an exploration of possibilities with MCP.
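Under the hood, server.py exposes a benchmarking tool over MCP. The snippet below is a minimal sketch of what such a server could look like, assuming the FastMCP helper from the MCP Python SDK and vLLM's OpenAI-compatible /v1/completions endpoint; the tool name, parameters, and reported metrics are illustrative and not necessarily the actual implementation.

    # server.py (sketch) -- an MCP server exposing a simple vLLM benchmark tool.
    # Assumptions: MCP Python SDK (FastMCP) and a vLLM server with the
    # OpenAI-compatible /v1/completions route.
    import time
    import httpx
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("vllm-benchmark")

    @mcp.tool()
    def benchmark_vllm(base_url: str, model: str, num_prompts: int = 32) -> dict:
        """Send num_prompts completion requests to a vLLM endpoint and report latency stats."""
        latencies = []
        with httpx.Client(timeout=120.0) as client:
            for i in range(num_prompts):
                start = time.perf_counter()
                client.post(
                    f"{base_url}/v1/completions",
                    json={"model": model, "prompt": f"Benchmark prompt {i}", "max_tokens": 64},
                )
                latencies.append(time.perf_counter() - start)
        return {
            "num_prompts": num_prompts,
            "avg_latency_s": sum(latencies) / len(latencies),
            "max_latency_s": max(latencies),
        }

    if __name__ == "__main__":
        mcp.run()  # serves the tool over stdio, which matches the config below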
To use the tool, add the server to your MCP client configuration, pointing the path at your local checkout:

    {
      "mcpServers": {
        "mcp-vllm": {
          "command": "uv",
          "args": [
            "run",
            "/Path/TO/mcp-vllm-benchmarking-tool/server.py"
          ]
        }
      }
    }
Then you can prompt it, for example, like this:
Do a vllm benchmark for this endpoint: http://10.0.101.39:8888
benchmark the following model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
run the benchmark 3 times with 32 num prompts each, then compare the results, but ignore the first iteration as that is just a warmup.
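The comparison asked for in that prompt amounts to averaging the per-iteration results while skipping the warmup run. The helper below is a hypothetical illustration of that step; the result keys match the server.py sketch above and are assumptions, not the tool's actual output format.

    def summarize(iterations: list[dict]) -> dict:
        """Average per-iteration benchmark results, skipping the first (warmup) iteration."""
        measured = iterations[1:]  # ignore the warmup run
        return {
            "iterations_measured": len(measured),
            "avg_latency_s": sum(r["avg_latency_s"] for r in measured) / len(measured),
        }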