An MCP server that extracts meaningful content from websites and converts HTML to high-quality Markdown, using Mozilla's Readability engine.
A command-line tool and MCP server for scraping websites and converting HTML to Markdown.
    # Install dependencies
    npm install

    # Build the project
    npm run build

    # Optionally, install globally
    npm install -g .
    # Print output to console
    scrape https://example.com

    # Save output to a file
    scrape https://example.com output.md

    # Convert a local HTML file to Markdown
    scrape --html-file input.html

    # Convert a local HTML file and save output to a file
    scrape --html-file input.html output.md

    # Show help
    scrape --help

    # Or run via npm script
    npm run start:cli -- https://example.com
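The invocation forms above follow a small grammar: a URL or `--html-file <path>`, then an optional output path, with `--help` taking priority. The sketch below shows one way such arguments could be classified. It is illustrative only and is not taken from `src/cli.ts`; the `CliCommand` type and the `parseCliArgs` function are hypothetical names, and treating an empty argument list as a request for help is an assumption.

```typescript
// Hypothetical sketch: NOT the project's actual src/cli.ts.
// Classifies the argument forms shown in the usage examples.
type CliCommand =
  | { kind: 'help' }
  | { kind: 'url'; url: string; output: string | undefined }
  | { kind: 'htmlFile'; path: string; output: string | undefined }
  | { kind: 'error'; message: string };

function parseCliArgs(argv: string[]): CliCommand {
  const [first, second, third] = argv;

  // Assumption: no arguments is treated the same as --help.
  if (first === undefined || argv.includes('--help')) {
    return { kind: 'help' };
  }

  // scrape --html-file <path> [output]
  if (first === '--html-file') {
    if (second === undefined) {
      return { kind: 'error', message: '--html-file requires a path' };
    }
    return { kind: 'htmlFile', path: second, output: third };
  }

  // scrape <url> [output]
  return { kind: 'url', url: first, output: second };
}
```

In a Node.js entry point, the input would typically be `process.argv.slice(2)`. When `output` is `undefined`, the Markdown would go to the console, matching the first and third usage examples.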
This tool can be used as a Model Context Protocol (MCP) server:
    # Start in MCP server mode
    npm start
src/index.ts - Core functionality and MCP server implementation
src/cli.ts - Command-line interface implementation
src/data_processing.ts - HTML to Markdown conversion functionality

The tool exports the following functions:
    // Scrape a website and convert to Markdown
    import { scrapeToMarkdown } from './build/index.js';

    // Convert HTML string to Markdown directly
    import { htmlToMarkdown } from './build/data_processing.js';

    async function example() {
      // Web scraping
      const markdown = await scrapeToMarkdown('https://example.com');
      console.log(markdown);

      // Direct HTML conversion
      const html = '<h1>Hello World</h1><p>This is <strong>bold</strong> text.</p>';
      const md = htmlToMarkdown(html);
      console.log(md);
    }
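When scraping several pages, one failed request should not discard the rest. The sketch below wraps a scraper function so that each URL yields either Markdown or an error message. It assumes `scrapeToMarkdown` takes a URL and returns a `Promise<string>`, as the example above suggests; the `ScrapeFn` and `ScrapeResult` types and the `scrapeOne` and `scrapeAll` helpers are hypothetical and not part of the tool's exports.

```typescript
// Assumed shape of scrapeToMarkdown, inferred from the usage example;
// the real function may accept additional options.
type ScrapeFn = (url: string) => Promise<string>;

// Hypothetical result record: exactly one of markdown/error is non-null.
interface ScrapeResult {
  url: string;
  markdown: string | null;
  error: string | null;
}

// Scrape a single URL, turning a rejection into an error record.
async function scrapeOne(url: string, scrape: ScrapeFn): Promise<ScrapeResult> {
  try {
    return { url, markdown: await scrape(url), error: null };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { url, markdown: null, error: message };
  }
}

// Scrape all URLs concurrently; results keep the order of the input.
async function scrapeAll(urls: string[], scrape: ScrapeFn): Promise<ScrapeResult[]> {
  return Promise.all(urls.map((url) => scrapeOne(url, scrape)));
}
```

With the import from `./build/index.js` in place, a call might look like `await scrapeAll(['https://example.com'], scrapeToMarkdown)`. Passing the scraper in as a parameter also lets the helper be exercised with a stub, without network access.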
License: ISC