Prysm MCP Server
The Prysm MCP (Model Context Protocol) Server lets AI assistants such as Claude scrape web content with high accuracy and flexibility.
Features
- Multiple Scraping Modes: Choose from focused (speed), balanced (default), or deep (thorough) modes
- Content Analysis: Analyze URLs to determine the best scraping approach
- Format Flexibility: Format results as markdown, HTML, or JSON
- Image Support: Optionally extract and even download images
- Smart Scrolling: Configure scroll behavior for single-page applications
- Responsive: Adapts to different website layouts and structures
- File Output: Save formatted results to your preferred directory
Quick Start
Installation
## Recommended: Install the LLM-optimized version
npm install -g @pinkpixel/prysm-mcp
## Or install the standard version
npm install -g prysm-mcp
## Or clone and build
git clone https://github.com/pinkpixel-dev/prysm-mcp.git
cd prysm-mcp
npm install
npm run build
Integration Guides
We provide detailed integration guides for popular MCP-compatible applications:
- Cursor Integration Guide
- Claude Desktop Integration Guide
- Windsurf Integration Guide
- Cline Integration Guide
- Roo Code Integration Guide
- Open WebUI Integration Guide
Usage
There are multiple ways to set up Prysm MCP Server:
Using mcp.json Configuration
Create an mcp.json file in the location given by the integration guide for your application.
{
  "mcpServers": {
    "prysm-scraper": {
      "description": "Prysm web scraper with custom output directories",
      "command": "npx",
      "args": [
        "-y",
        "@pinkpixel/prysm-mcp"
      ],
      "env": {
        "PRYSM_OUTPUT_DIR": "${workspaceFolder}/scrape_results",
        "PRYSM_IMAGE_OUTPUT_DIR": "${workspaceFolder}/scrape_results/images"
      }
    }
  }
}
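If you keep this configuration under version control or generate it per project, it can help to build it from a small script. The following is a minimal sketch that covers only the fields shown above; the `McpServerEntry` and `McpConfig` types and the `buildPrysmConfig` helper are illustrative names, not part of Prysm, and the `${workspaceFolder}` placeholder is passed through unchanged for the host application to interpret.

```typescript
// Shape of the mcp.json fragment shown above (only the fields used there).
interface McpServerEntry {
  description?: string;
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the "prysm-scraper" entry for a given output root.
// The image directory is placed in an "images" subfolder of that root.
function buildPrysmConfig(outputRoot: string): McpConfig {
  return {
    mcpServers: {
      "prysm-scraper": {
        description: "Prysm web scraper with custom output directories",
        command: "npx",
        args: ["-y", "@pinkpixel/prysm-mcp"],
        env: {
          PRYSM_OUTPUT_DIR: outputRoot,
          PRYSM_IMAGE_OUTPUT_DIR: `${outputRoot}/images`,
        },
      },
    },
  };
}

// A plain (non-template) string, so "${workspaceFolder}" stays literal here.
const workspaceRoot = "${workspaceFolder}/scrape_results";

// Serialize with two-space indentation, ready to write to mcp.json.
const configJson: string = JSON.stringify(buildPrysmConfig(workspaceRoot), null, 2);
console.log(configJson);
```

Writing `configJson` to disk is left out so the sketch has no side effects.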
Tools
The server provides the following tools:
scrapeFocused
Fast web scraping optimized for speed (fewer scrolls, main content only).
Please scrape https://example.com using the focused mode
Available Parameters:
- url (required): URL to scrape
- maxScrolls (optional): Maximum number of scroll attempts (default: 5)
- scrollDelay (optional): Delay between scrolls in ms (default: 1000)
- scrapeImages (optional): Whether to include images in results
- downloadImages (optional): Whether to download images locally
- maxImages (optional): Maximum images to extract
- output (optional): Output directory for downloaded images
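When an assistant acts on a prompt like the one above, the MCP client sends the server a JSON-RPC 2.0 `tools/call` request. The sketch below shows what such a message could look like for scrapeFocused. The parameter names come from the list above, but their types are assumptions (numbers, booleans, and strings), and the `ScrapeFocusedArgs` type and `makeScrapeFocusedCall` helper are illustrative names only.

```typescript
// Argument shape inferred from the parameter list above; the types are assumptions.
interface ScrapeFocusedArgs {
  url: string;
  maxScrolls?: number;
  scrollDelay?: number; // milliseconds
  scrapeImages?: boolean;
  downloadImages?: boolean;
  maxImages?: number;
  output?: string; // directory for downloaded images
}

// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical helper: wraps the arguments in a tools/call request.
function makeScrapeFocusedCall(
  id: number,
  args: ScrapeFocusedArgs,
): ToolCallRequest<ScrapeFocusedArgs> {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "scrapeFocused", arguments: args },
  };
}

const focusedRequest = makeScrapeFocusedCall(1, {
  url: "https://example.com",
  maxScrolls: 3,
  scrapeImages: true,
});
console.log(JSON.stringify(focusedRequest, null, 2));
```

In normal use the assistant's MCP client builds this message for you; the sketch is mainly useful when testing the server by hand.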
scrapeBalanced
Balanced web scraping approach with good coverage and reasonable speed.
Please scrape https://example.com using the balanced mode
Available Parameters:
- Same as scrapeFocused, with different defaults
- maxScrolls default: 10
- scrollDelay default: 2000
- Adds a timeout parameter to limit total scraping time (default: 30000 ms)
scrapeDeep
Maximum extraction web scraping (slower but thorough).
Please scrape https://example.com using the deep mode with maximum scrolls
Available Parameters:
- Same as scrapeFocused, with different defaults
- maxScrolls default: 20
- scrollDelay default: 3000
- maxImages default: 100
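The three modes differ mainly in their defaults, which can be summarized as a lookup table. The sketch below collects the documented defaults and merges caller overrides on top of them. The names `MODE_DEFAULTS`, `resolveModeOptions`, and `maxScrollWaitMs` are illustrative, and the wait estimate assumes one `scrollDelay` per scroll attempt, which the text above does not state.

```typescript
type ScrapeMode = "focused" | "balanced" | "deep";

interface ModeOptions {
  maxScrolls: number;
  scrollDelay: number; // milliseconds
  timeout?: number; // milliseconds; documented for balanced mode only
  maxImages?: number; // a default is documented for deep mode only
}

// Defaults as listed for each tool above.
const MODE_DEFAULTS: Record<ScrapeMode, ModeOptions> = {
  focused: { maxScrolls: 5, scrollDelay: 1000 },
  balanced: { maxScrolls: 10, scrollDelay: 2000, timeout: 30000 },
  deep: { maxScrolls: 20, scrollDelay: 3000, maxImages: 100 },
};

// Caller-supplied values win over the mode's defaults.
function resolveModeOptions(
  mode: ScrapeMode,
  overrides: Partial<ModeOptions> = {},
): ModeOptions {
  return { ...MODE_DEFAULTS[mode], ...overrides };
}

// Rough upper bound on time spent waiting between scrolls
// (assumption: one delay per scroll attempt).
function maxScrollWaitMs(opts: ModeOptions): number {
  return opts.maxScrolls * opts.scrollDelay;
}

for (const mode of ["focused", "balanced", "deep"] as const) {
  const opts = resolveModeOptions(mode);
  console.log(mode, opts, `scroll wait <= ${maxScrollWaitMs(opts)} ms`);
}
```

Under that assumption, the scroll wait alone can reach 5 seconds in focused mode, 20 seconds in balanced mode, and 60 seconds in deep mode, which is one way to reason about the speed and thoroughness trade-off when picking a mode.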
formatResult
Format scraped data into different structured formats (markdown, HTML, JSON).
Format the scraped data as markdown
Available Parameters:
- data (required): The scraped data to format
- format (required): Output format - "markdown", "html", or "json"
- includeImages (optional): Whether to include images in output (default: true)
- output (optional): File path to save the formatted result
You can also save formatted results to a file by specifying an output path:
Format the scraped data as markdown and save it to "my-results/output.md"
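To make the three output formats concrete, here is a small sketch of how one scraped page might be rendered in each. The `ScrapedData` shape is hypothetical, since the structure Prysm actually returns is not described here, and `formatScraped` is an illustration of the idea rather than the server's implementation.

```typescript
// Hypothetical shape for scraped data; Prysm's real structure may differ.
interface ScrapedData {
  title: string;
  url: string;
  content: string[]; // paragraphs of extracted text
  images: { url: string; alt: string }[];
}

type OutputFormat = "markdown" | "html" | "json";

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function formatScraped(
  data: ScrapedData,
  format: OutputFormat,
  includeImages: boolean = true,
): string {
  const images = includeImages ? data.images : [];
  switch (format) {
    case "markdown":
      return [
        `# ${data.title}`,
        `Source: ${data.url}`,
        ...data.content,
        ...images.map((img) => `![${img.alt}](${img.url})`),
      ].join("\n\n");
    case "html":
      return [
        `<h1>${escapeHtml(data.title)}</h1>`,
        ...data.content.map((p) => `<p>${escapeHtml(p)}</p>`),
        ...images.map(
          (img) => `<img src="${escapeHtml(img.url)}" alt="${escapeHtml(img.alt)}">`,
        ),
      ].join("\n");
    case "json":
      return JSON.stringify({ ...data, images }, null, 2);
  }
}

const samplePage: ScrapedData = {
  title: "Example Domain",
  url: "https://example.com",
  content: ["Hello <world>"],
  images: [{ url: "https://example.com/a.png", alt: "A" }],
};
console.log(formatScraped(samplePage, "markdown"));
console.log(formatScraped(samplePage, "html", false));
```

The `includeImages` flag mirrors the parameter of the same name above: when it is false, image entries are dropped from every format.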
Configuration
Output Directory
By default, formatted results are saved to ~/prysm-mcp/output/. You can customize this in three ways:
- Environment Variables: Set environment variables to your preferred directories:
## Linux/macOS
export PRYSM_OUTPUT_DIR="/path/to/custom/directory"
export PRYSM_IMAGE_OUTPUT_DIR="/path/to/custom/image/directory"
## Windows (Command Prompt)
set PRYSM_OUTPUT_DIR=C:\path\to\custom\directory
set PRYSM_IMAGE_OUTPUT_DIR=C:\path\to\custom\image\directory
## Windows (PowerShell)
$env:PRYSM_OUTPUT_DIR="C:\path\to\custom\directory"
$env:PRYSM_IMAGE_OUTPUT_DIR="C:\path\to\custom\image\directory"
- Tool Parameter: Specify output paths directly when calling the tools:
## For general results
Format the scraped data as markdown and save it to "/absolute/path/to/file.md"
## For image downloads when scraping
Please scrape https://example.com and download images to "/absolute/path/to/images"
- MCP Configuration: In your MCP configuration file (e.g., .cursor/mcp.json), you can set these environment variables:
{
  "mcpServers": {
    "prysm-scraper": {
      "command": "npx",
      "args": ["-y", "@pinkpixel/prysm-mcp"],
      "env": {
        "PRYSM_OUTPUT_DIR": "${workspaceFolder}/scrape_results",
        "PRYSM_IMAGE_OUTPUT_DIR": "${workspaceFolder}/scrape_results/images"
      }
    }
  }
}
If PRYSM_IMAGE_OUTPUT_DIR is not specified, it defaults to a subfolder named images inside PRYSM_OUTPUT_DIR.
If you provide only a relative path or filename, it will be saved relative to the configured output directory.
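The fallback order described above can be expressed in a few lines. This is a sketch of the documented behavior, not the server's actual code: `resolveOutputDirs` is an illustrative name, and treating an empty variable the same as an unset one is an assumption.

```typescript
import * as os from "node:os";
import * as path from "node:path";

interface OutputDirs {
  outputDir: string;
  imageDir: string;
}

// Resolve both directories from an environment-like object.
// `||` is used so that an empty string also falls back to the default (an assumption).
function resolveOutputDirs(env: Record<string, string | undefined>): OutputDirs {
  // Documented default: ~/prysm-mcp/output/
  const outputDir =
    env.PRYSM_OUTPUT_DIR || path.join(os.homedir(), "prysm-mcp", "output");
  // Documented default: an "images" subfolder inside the output directory.
  const imageDir = env.PRYSM_IMAGE_OUTPUT_DIR || path.join(outputDir, "images");
  return { outputDir, imageDir };
}

console.log(resolveOutputDirs(process.env));
console.log(resolveOutputDirs({ PRYSM_OUTPUT_DIR: "/data/scrapes" }));
```

Because the image directory is derived from the resolved output directory, setting only PRYSM_OUTPUT_DIR moves both locations together.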
Path Handling Rules
The formatResult tool handles paths in the following ways:
- Absolute paths: Used exactly as provided (/home/user/file.md)
- Relative paths: Saved relative to the configured output directory (subfolder/file.md)
- Filename only: Saved in the configured output directory (output.md)
- Directory path: If the path points to a directory, a filename is auto-generated based on content and timestamp
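The path rules above can be sketched as a single function. This is an approximation under stated assumptions, not the tool's real logic: `resolveOutputPath` is an illustrative name, a path counts as a directory if it ends in a separator or already exists as a directory, and the generated filename uses a fixed `prysm-result-` prefix plus a timestamp, whereas the actual tool is described as also using the content.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

function resolveOutputPath(
  requested: string,
  outputDir: string,
  now: Date = new Date(),
): string {
  // Absolute paths are used as provided; relative paths and bare
  // filenames are resolved against the configured output directory.
  const target = path.isAbsolute(requested)
    ? requested
    : path.join(outputDir, requested);

  // Directory detection (assumption): trailing separator or existing directory.
  const isDirectory =
    /[\\\/]$/.test(requested) ||
    (fs.existsSync(target) && fs.statSync(target).isDirectory());
  if (!isDirectory) {
    return target;
  }

  // Placeholder naming scheme: fixed prefix plus a filesystem-safe timestamp.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return path.join(target, `prysm-result-${stamp}.md`);
}

const outDir = "/tmp/prysm-demo-output";
console.log(resolveOutputPath("/home/user/file.md", outDir)); // absolute path
console.log(resolveOutputPath("subfolder/file.md", outDir)); // relative path
console.log(resolveOutputPath("output.md", outDir)); // filename only
console.log(resolveOutputPath("reports/", outDir)); // directory path
```

Passing `now` explicitly keeps the function deterministic, which makes the directory case easy to check in a test.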
Development
## Install dependencies
npm install
## Build the project
npm run build
## Run the server locally
node bin/prysm-mcp
## Debug MCP communication
DEBUG=mcp:* node bin/prysm-mcp
## Set custom output directories
PRYSM_OUTPUT_DIR=./my-output PRYSM_IMAGE_OUTPUT_DIR=./my-output/images node bin/prysm-mcp
Running via npx
You can run the server directly with npx without installing:
## Run with default settings
npx @pinkpixel/prysm-mcp
## Run with custom output directories
PRYSM_OUTPUT_DIR=./my-output PRYSM_IMAGE_OUTPUT_DIR=./my-output/images npx @pinkpixel/prysm-mcp
License
MIT
Credits
Developed by Pink Pixel
Powered by the Model Context Protocol and Puppeteer