An MCP server that helps determine whether two sets of data belong to the same entity by comparing both exact and semantic equality through text normalization and language model integration.
Identify whether two sets of data are from the same entity.
This is an MCP (Model Context Protocol) server.
This tool provides a comprehensive way to compare two sets of data, evaluating both exact and semantic equality of their values. It leverages text normalization and a language model to determine if the data originates from the same entity.
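As a rough illustration of the exact-equality half of this comparison, the sketch below normalizes two strings and then compares them. The names normalize_text_sketch and exact_match, and the specific normalization rules (Unicode NFKC folding, case folding, punctuation removal, whitespace collapsing), are assumptions made for illustration; the project's own normalize_text may apply different rules.

```python
import re
import unicodedata


def normalize_text_sketch(text: str) -> str:
    """Hypothetical normalizer; the project's normalize_text may differ."""
    # Fold compatibility characters (e.g. full-width letters) to a canonical form.
    text = unicodedata.normalize("NFKC", text)
    # Make the comparison case-insensitive.
    text = text.casefold()
    # Replace punctuation with spaces, keeping letters, digits, and underscores.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def exact_match(val1: str, val2: str) -> bool:
    """Return True when both values are identical after normalization."""
    return normalize_text_sketch(val1) == normalize_text_sketch(val2)


print(exact_match("John Doe", "john  doe"))  # True
print(exact_match("123 Main St, Anytown, USA", "123 Main Street, Anytown, USA"))  # False
```

The second pair differs only by an abbreviation ("St" versus "Street"), so normalization alone reports a mismatch. This is the kind of case where the semantic comparison by a language model is meant to help.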
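To use this tool, install its dependency with pip. The example code calls genai.GenerativeModel, which is provided by the google-generativeai package (imported as google.generativeai and conventionally aliased to genai):
pip install google-generativeai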
normalize_text(text): normalizes a text value before comparison.
compare_values(val1, val2): compares two values for exact and semantic equality.
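compare_json(json1, json2): compares two JSON objects key by key, using compare_values to evaluate each key's values.

Example usage:

    import json
    import re
    import google.generativeai as genai  # installed as the google-generativeai package

    # An API key must be configured before calling the model,
    # e.g. genai.configure(api_key=...).

    # Define your JSON objects
    json1 = {
        "name": "John Doe",
        "address": "123 Main St, Anytown, USA",
        "hobbies": ["reading", "hiking", "coding"],
    }
    json2 = {
        "name": "john doe",
        "address": "123 Main Street, Anytown, USA",
        "hobbies": ["coding", "hiking", "reading"],
    }

    # Compare the JSON objects (compare_json is the function described above)
    comparison_results = compare_json(json1, json2)

    # Generate final matching result
    model1 = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")
    result_matching = model1.generate_content(
        "Taking all of this information together, do you think the two sets of data "
        "can be judged to come from the same entity? "
        + json.dumps(comparison_results, ensure_ascii=False, indent=4)
    )
    print(result_matching.text)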
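The example above calls compare_json without showing its body. A minimal sketch of what such a function could look like follows, assuming each key yields a pair of exact and semantic flags. The names compare_json_sketch, compare_values_sketch, toy_semantic_check, and the semantic_check parameter are all hypothetical, and the language model is replaced by a plain callable so the sketch runs offline; the project's own functions may be structured differently.

```python
import json
from typing import Any, Callable, Dict


def _normalize(value: Any) -> str:
    """Reduce a value to a canonical lowercase string (assumed rule)."""
    if isinstance(value, list):
        # Treat lists as unordered: sort the normalized items.
        return json.dumps(sorted(_normalize(v) for v in value))
    # Case-fold and collapse whitespace.
    return " ".join(str(value).casefold().split())


def toy_semantic_check(val1: Any, val2: Any) -> bool:
    """Stand-in for a language-model call: expands one known abbreviation."""
    def expand(value: Any) -> str:
        return _normalize(value).replace(" st,", " street,")
    return expand(val1) == expand(val2)


def compare_values_sketch(
    val1: Any, val2: Any, semantic_check: Callable[[Any, Any], bool]
) -> Dict[str, bool]:
    exact = _normalize(val1) == _normalize(val2)
    # Only consult the (possibly expensive) semantic check when exact matching fails.
    semantic = exact or semantic_check(val1, val2)
    return {"exact": exact, "semantic": semantic}


def compare_json_sketch(
    json1: Dict[str, Any],
    json2: Dict[str, Any],
    semantic_check: Callable[[Any, Any], bool],
) -> Dict[str, Dict[str, bool]]:
    results = {}
    for key in sorted(set(json1) | set(json2)):
        if key not in json1 or key not in json2:
            # A key present on only one side cannot match.
            results[key] = {"exact": False, "semantic": False}
        else:
            results[key] = compare_values_sketch(json1[key], json2[key], semantic_check)
    return results


json1 = {"name": "John Doe", "address": "123 Main St, Anytown, USA",
         "hobbies": ["reading", "hiking", "coding"]}
json2 = {"name": "john doe", "address": "123 Main Street, Anytown, USA",
         "hobbies": ["coding", "hiking", "reading"]}

results = compare_json_sketch(json1, json2, toy_semantic_check)
print(json.dumps(results, indent=2))
```

Passing the semantic check as a parameter keeps the key-by-key logic testable without network access; in practice the callable would wrap a request to the language model.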
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
If you have any questions or suggestions, please contact me:
WeChat