Provides voice recognition and text extraction services with support for file and base64 audio inputs, delivering structured, AIO protocol-compliant responses via Model Context Protocol (MCP) and stdio modes.
This service provides voice recognition and text extraction capabilities through both stdio and MCP modes.
- `voice_service.py` - Core service implementation
- `stdio_server.py` - stdio mode entry point
- `mcp_server.py` - MCP mode entry point
- `build.py` - Build script for executables
- `build_exec.sh` - Build execution script
- `test_*.sh` - Test scripts for different functionalities

git clone https://github.com/AIO-2030/mcp_voice_identify.git
cd mcp_voice_identify
pip install -r requirements.txt
Set the following in `.env`:
API_URL=your_api_url
API_KEY=your_api_key
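The README does not say how the service reads these settings, so the following is only a minimal sketch of one way a client or wrapper script could parse the same `KEY=VALUE` format and check that both values are present. The helper names `parse_env_text` and `load_config` are hypothetical and are not part of the repository.

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(env_text: str = "") -> dict:
    """Hypothetical helper: process environment variables override .env values."""
    config = parse_env_text(env_text)
    for name in ("API_URL", "API_KEY"):
        if name in os.environ:
            config[name] = os.environ[name]
    missing = [name for name in ("API_URL", "API_KEY") if not config.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return config
```

Letting real environment variables take precedence is an assumed design choice here; it keeps secrets such as `API_KEY` out of files when the service runs in a container or CI job.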
python stdio_server.py
{ "jsonrpc": "2.0", "method": "help", "params": {}, "id": 1 }
./dist/voice_stdio
python mcp_server.py
./dist/voice_mcp
The service follows the AIO protocol for response formatting. Here are examples of different response types:
{ "jsonrpc": "2.0", "output": { "type": "voice", "message": "Voice processed successfully", "text": "test test test", "metadata": { "language": "en", "emotion": "unknown", "audio_type": "speech", "speaker": "woitn", "raw_text": "test test test" } }, "id": 1 }
{ "jsonrpc": "2.0", "result": { "type": "voice_service", "description": "This service provides voice recognition and text extraction services", "author": "AIO-2030", "version": "1.0.0", "github": "https://github.com/AIO-2030/mcp_voice_identify", "transport": ["stdio"], "methods": [ { "name": "help", "description": "Show this help information." }, { "name": "identify_voice", "description": "Identify voice from file", "inputSchema": { "type": "object", "properties": { "file_path": { "type": "string", "description": "Voice file path" } }, "required": ["file_path"] } }, { "name": "identify_voice_base64", "description": "Identify voice from base64 encoded data", "inputSchema": { "type": "object", "properties": { "base64_data": { "type": "string", "description": "Base64 encoded voice data" } }, "required": ["base64_data"] } }, { "name": "extract_text", "description": "Extract text", "inputSchema": { "type": "object", "properties": { "text": { "type": "string", "description": "Text to extract" } }, "required": ["text"] } } ] }, "id": 1 }
{ "jsonrpc": "2.0", "output": { "type": "error", "message": "503 Server Error: Service Unavailable", "error_code": 503 }, "id": 1 }
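The help response above lists the available methods and their input schemas. As a rough sketch, requests for those methods could be built as shown below. It assumes that arguments go under `params` using the property names from each `inputSchema`, mirroring the `help` request shown earlier; the README does not show a full request for the voice methods, so treat that shape as an assumption. The function names are illustrative only.

```python
import base64
import json


def build_request(method: str, params: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request in the same shape as the help example."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def identify_voice_request(file_path: str, request_id: int = 1) -> str:
    """Request recognition of an audio file by path (assumed params layout)."""
    return build_request("identify_voice", {"file_path": file_path}, request_id)


def identify_voice_base64_request(audio_bytes: bytes, request_id: int = 1) -> str:
    """Base64-encode raw audio bytes and wrap them in a request (assumed layout)."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return build_request(
        "identify_voice_base64", {"base64_data": encoded}, request_id
    )
```

For example, `build_request("help", {})` produces the same JSON object as the help request shown earlier, and the resulting string can be written to the stdin of `stdio_server.py`.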
The service provides three types of responses:
Voice Recognition Response (using the `output` field):
| Field | Description | Example Value |
|-----------|--------------------------------------|---------------|
| type | Response type | "voice" |
| message | Status message | "Voice processed successfully" |
| text | Recognized text content | "test test test" |
| metadata | Additional information | See below |
Help Information Response (using the `result` field):
| Field | Description | Example Value |
|---------------|--------------------------------------|---------------|
| type | Service type | "voice_service" |
| description | Service description | "This service provides..." |
| author | Service author | "AIO-2030" |
| version | Service version | "1.0.0" |
| github | GitHub repository URL | "https://github.com/..." |
| transport | Supported transport modes | ["stdio"] |
| methods | Available methods | See methods list |
Error Response (using the `output` field):
| Field | Description | Example Value |
|-------------|--------------------------------------|---------------|
| type | Response type | "error" |
| message | Error message | "503 Server Error: Service Unavailable" |
| error_code | HTTP status code | 503 |
The `metadata` field in voice recognition responses contains:
| Field | Description | Example Value |
|------------|--------------------------------------|---------------|
| language | Language code | "en" |
| emotion | Emotion state | "unknown" |
| audio_type | Audio type | "speech" |
| speaker | Speaker identifier | "woitn" |
| raw_text | Original recognized text | "test test test" |
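Because successful recognition and errors both arrive under `output` while help information arrives under `result`, a client has to check which key is present and then inspect `type`. The sketch below shows one possible way to do that using only the fields documented in the tables above. `VoiceResult`, `VoiceServiceError`, and `parse_response` are hypothetical names, not part of the service.

```python
import json
from dataclasses import dataclass, field


@dataclass
class VoiceResult:
    """Recognized text plus the metadata object from a voice response."""
    text: str
    message: str
    metadata: dict = field(default_factory=dict)


class VoiceServiceError(Exception):
    """Raised when the service returns an output of type "error"."""

    def __init__(self, message: str, error_code=None):
        super().__init__(message)
        self.error_code = error_code


def parse_response(raw: str):
    """Return a dict for help responses or a VoiceResult for voice responses."""
    response = json.loads(raw)
    if "result" in response:
        # Help information uses the "result" field.
        return response["result"]
    output = response.get("output")
    if output is None:
        raise ValueError("Response has neither 'result' nor 'output'")
    if output.get("type") == "error":
        raise VoiceServiceError(output.get("message", ""), output.get("error_code"))
    return VoiceResult(
        text=output.get("text", ""),
        message=output.get("message", ""),
        metadata=output.get("metadata", {}),
    )
```

Raising an exception for the error type is a design choice made for this sketch; returning the error object instead would work equally well if the caller prefers to branch on `type`.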
chmod +x build_exec.sh
./build_exec.sh
./build_exec.sh mcp
The executables will be created at:
dist/voice_stdio
dist/voice_mcp
Run the test scripts:
chmod +x test_*.sh
./test_help.sh
./test_voice_file.sh
./test_voice_base64.sh
This project is licensed under the MIT License - see the LICENSE file for details.