RAG Documentation Search
by sanderkooger
Leverage retrieval augmented generation and a Qdrant vector database for precise, context-aware document search and retr
Provides semantic document search and retrieval through vector embeddings, enabling context-aware responses backed by specific documentation sources
best for
- Building documentation-aware AI assistants
- Developers needing context-aware tooling
- Teams wanting to search knowledge bases semantically
capabilities
- Search documentation using semantic vector embeddings
- Retrieve relevant context from multiple documentation sources
- Generate embeddings locally with Ollama or via OpenAI
- Process and index documentation automatically
- Augment AI responses with cited documentation sources
what it does
Enables semantic search through documentation using vector embeddings, allowing AI assistants to retrieve and cite relevant documentation context for user queries.
about
RAG Documentation Search is a community-built MCP server published by sanderkooger that provides AI assistants with tools and capabilities via the Model Context Protocol. It is categorized under ai ml, developer tools.
how to install
You can install RAG Documentation Search in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
license
MIT
RAG Documentation Search is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.
readme
MCP-server-ragdocs
An MCP server implementation that provides tools for retrieving and processing documentation through vector search, enabling AI assistants to augment their responses with relevant documentation context.
Table of Contents
- Usage
- Features
- Configuration
- Deployment
- Playwright Integration
- Tools
- Project Structure
- Using Ollama Embeddings
- License
- Development Workflow
- Contributing
- Forkception Acknowledgments
Usage
The RAG Documentation tool is designed for:
- Enhancing AI responses with relevant documentation
- Building documentation-aware AI assistants
- Creating context-aware tooling for developers
- Implementing semantic documentation search
- Augmenting existing knowledge bases
Features
- Vector-based documentation search and retrieval
- Support for multiple documentation sources
- Embedding generation either locally with Ollama or via OpenAI
- Semantic search capabilities
- Automated documentation processing
- Real-time context augmentation for LLMs
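As a rough illustration of what vector-based search means here, the sketch below ranks stored chunks by cosine similarity to a query embedding. It is a toy model, not the server's code: the `Chunk` type and both function names are hypothetical, and in the real setup the similarity search is presumably delegated to Qdrant.

```typescript
// Toy illustration of vector-based retrieval; not the server's actual code.
// `Chunk`, `cosineSimilarity`, and `rankChunks` are hypothetical names.
type Chunk = { url: string; text: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `limit` chunks most similar to the query embedding, best first.
function rankChunks(query: number[], chunks: Chunk[], limit: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```

A dedicated vector database replaces the linear scan with an approximate nearest-neighbour index, which is what makes this practical for large documentation sets.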
Configuration
{
  "mcpServers": {
    "rag-docs": {
      "command": "npx",
      "args": ["-y", "@sanderkooger/mcp-server-ragdocs"],
      "env": {
        "EMBEDDINGS_PROVIDER": "ollama",
        "QDRANT_URL": "your-qdrant-url",
        "QDRANT_API_KEY": "your-qdrant-key"
      }
    }
  }
}
QDRANT_API_KEY is only needed when your Qdrant instance requires an API key, for example on Qdrant Cloud.
Usage with Claude Desktop
Add this to your claude_desktop_config.json:
OpenAI Configuration
{
  "mcpServers": {
    "rag-docs-openai": {
      "command": "npx",
      "args": ["-y", "@sanderkooger/mcp-server-ragdocs"],
      "env": {
        "EMBEDDINGS_PROVIDER": "openai",
        "OPENAI_API_KEY": "your-openai-key-here",
        "QDRANT_URL": "your-qdrant-url",
        "QDRANT_API_KEY": "your-qdrant-key"
      }
    }
  }
}
Ollama Configuration
{
  "mcpServers": {
    "rag-docs-ollama": {
      "command": "npx",
      "args": ["-y", "@sanderkooger/mcp-server-ragdocs"],
      "env": {
        "EMBEDDINGS_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "QDRANT_URL": "your-qdrant-url",
        "QDRANT_API_KEY": "your-qdrant-key"
      }
    }
  }
}
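If you generate client configuration from a script, the two provider variants above can be built from one helper. The sketch below is only an illustration: `buildServerEntry`, `EntryOptions`, and `ServerEntry` are hypothetical names, while the package name and environment variable names are taken from the examples above.

```typescript
// Hypothetical helper that builds one "mcpServers" entry for either provider.
type Provider = "openai" | "ollama";

interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface EntryOptions {
  qdrantUrl: string;
  qdrantApiKey?: string; // only needed when Qdrant requires a key
  openaiApiKey?: string; // required when provider is "openai"
  ollamaBaseUrl?: string; // optional when provider is "ollama"
}

function buildServerEntry(provider: Provider, opts: EntryOptions): ServerEntry {
  const env: Record<string, string> = {
    EMBEDDINGS_PROVIDER: provider,
    QDRANT_URL: opts.qdrantUrl,
  };
  if (opts.qdrantApiKey) {
    env["QDRANT_API_KEY"] = opts.qdrantApiKey;
  }
  if (provider === "openai") {
    if (!opts.openaiApiKey) {
      throw new Error("OPENAI_API_KEY is required for the openai provider");
    }
    env["OPENAI_API_KEY"] = opts.openaiApiKey;
  } else if (opts.ollamaBaseUrl) {
    env["OLLAMA_BASE_URL"] = opts.ollamaBaseUrl;
  }
  return {
    command: "npx",
    args: ["-y", "@sanderkooger/mcp-server-ragdocs"],
    env,
  };
}

// Usage: print a config file body for a local Ollama plus Qdrant setup.
const exampleConfig = {
  mcpServers: {
    "rag-docs-ollama": buildServerEntry("ollama", {
      qdrantUrl: "http://localhost:6333",
      ollamaBaseUrl: "http://localhost:11434",
    }),
  },
};
console.log(JSON.stringify(exampleConfig, null, 2));
```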
Ollama run from this codebase
"ragdocs-mcp": {
  "command": "node",
  "args": [
    "/home/sander/code/mcp-server-ragdocs/build/index.js"
  ],
  "env": {
    "QDRANT_URL": "http://127.0.0.1:6333",
    "EMBEDDINGS_PROVIDER": "ollama",
    "OLLAMA_BASE_URL": "http://localhost:11434"
  },
  "alwaysAllow": [
    "run_queue",
    "list_queue",
    "list_sources",
    "search_documentation",
    "clear_queue",
    "remove_documentation",
    "extract_urls"
  ],
  "timeout": 3600
}
Environment Variables Reference
| Variable | Required For | Default | Remarks |
|---|---|---|---|
| EMBEDDINGS_PROVIDER | All | ollama | "openai" or "ollama" |
| OPENAI_API_KEY | OpenAI | - | Obtain from OpenAI dashboard |
| OLLAMA_BASE_URL | Ollama | http://localhost:11434 | Local Ollama server URL |
| QDRANT_URL | All | http://localhost:6333 | Qdrant endpoint URL |
| QDRANT_API_KEY | Cloud Qdrant | - | From Qdrant Cloud console |
| PLAYWRIGHT_WS_ENDPOINT | Playwright Remote | - | WebSocket endpoint for remote Playwright server (e.g., ws://localhost:3000/) |
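The table can be read as a set of defaults plus two validation rules. The sketch below shows one way to resolve it, assuming the defaults listed above; `resolveConfig` and `RagDocsConfig` are hypothetical names, and the server's own loading code may differ. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical resolver that applies the defaults from the reference table.
type Env = Record<string, string | undefined>;

interface RagDocsConfig {
  provider: "openai" | "ollama";
  openaiApiKey: string | undefined;
  ollamaBaseUrl: string;
  qdrantUrl: string;
  qdrantApiKey: string | undefined;
  playwrightWsEndpoint: string | undefined;
}

function resolveConfig(env: Env): RagDocsConfig {
  const provider = env["EMBEDDINGS_PROVIDER"] ?? "ollama";
  if (provider !== "openai" && provider !== "ollama") {
    throw new Error(`EMBEDDINGS_PROVIDER must be "openai" or "ollama", got "${provider}"`);
  }
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (provider === "openai" && !openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required when EMBEDDINGS_PROVIDER is openai");
  }
  return {
    provider,
    openaiApiKey,
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://localhost:11434",
    qdrantUrl: env["QDRANT_URL"] ?? "http://localhost:6333",
    qdrantApiKey: env["QDRANT_API_KEY"],
    playwrightWsEndpoint: env["PLAYWRIGHT_WS_ENDPOINT"],
  };
}

// Usage in Node: const config = resolveConfig(process.env);
```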
Local Deployment
The repository includes Docker Compose configuration for local development:
docker compose up -d
This starts:
- Qdrant vector database on port 6333
- Ollama LLM service on port 11434
Access endpoints:
- Qdrant: http://localhost:6333
- Ollama: http://localhost:11434
Cloud Deployment
For production deployments:
- Use hosted Qdrant Cloud service
- Set these environment variables:
QDRANT_URL=your-cloud-cluster-url
QDRANT_API_KEY=your-cloud-api-key
Playwright Integration
This project supports running Playwright either locally or via a Docker container. This provides flexibility for environments where Playwright's dependencies might be challenging to install directly.
How it Works
The src/api-client.ts file automatically detects the presence of the PLAYWRIGHT_WS_ENDPOINT environment variable:
- If PLAYWRIGHT_WS_ENDPOINT is set: The application will attempt to connect to a remote Playwright server at the specified WebSocket endpoint using chromium.connect(). This is ideal for using a containerized Playwright instance.
- If PLAYWRIGHT_WS_ENDPOINT is not set: The application will launch a local Playwright browser instance using chromium.launch().
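The detection logic described above can be sketched as follows. This is not a copy of src/api-client.ts: `chooseBrowserMode`, `openBrowser`, and the `BrowserLauncher` interface are hypothetical names, and treating an empty string as "not set" is an assumption. The launcher is typed structurally so the sketch runs without Playwright installed; Playwright's `chromium` object should satisfy the same shape.

```typescript
// Structural type covering the two Playwright calls named above.
interface BrowserLauncher<B> {
  connect(wsEndpoint: string): Promise<B>;
  launch(): Promise<B>;
}

type BrowserMode =
  | { kind: "remote"; wsEndpoint: string }
  | { kind: "local" };

// Decide between remote and local mode from the environment.
// Assumption: an empty PLAYWRIGHT_WS_ENDPOINT counts as "not set".
function chooseBrowserMode(env: Record<string, string | undefined>): BrowserMode {
  const wsEndpoint = env["PLAYWRIGHT_WS_ENDPOINT"];
  return wsEndpoint ? { kind: "remote", wsEndpoint } : { kind: "local" };
}

// Connect to the remote server if configured, otherwise launch locally.
async function openBrowser<B>(
  launcher: BrowserLauncher<B>,
  env: Record<string, string | undefined>
): Promise<B> {
  const mode = chooseBrowserMode(env);
  return mode.kind === "remote"
    ? launcher.connect(mode.wsEndpoint)
    : launcher.launch();
}

// Usage with Playwright installed:
//   import { chromium } from "playwright";
//   const browser = await openBrowser(chromium, process.env);
```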
Running Playwright in Docker
A playwright service has been added to the docker-compose.yml file to facilitate running Playwright in a Docker container.
To start the Playwright server in Docker:
docker-compose up playwright
This command will pull the mcr.microsoft.com/playwright:v1.53.0-noble image and start a Playwright server accessible on port 3000 of your host machine.
To configure your application to use this containerized Playwright instance, set the following environment variable:
PLAYWRIGHT_WS_ENDPOINT=ws://localhost:3000/
Tools
search_documentation
Search through stored documentation using natural language queries. Returns matching excerpts with context, ranked by relevance.
Inputs:
- query (string): The text to search for in the documentation. Can be a natural language query, specific terms, or code snippets.
- limit (number, optional): Maximum number of results to return (1-20, default: 5). Higher limits provide more comprehensive results but may take longer to process.
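For reference, an MCP client invokes this tool with a JSON-RPC `tools/call` request. The sketch below builds such a request and checks the documented limit range before sending; `buildSearchRequest` is a hypothetical helper, and requiring `limit` to be an integer is an assumption beyond the documented "1-20".

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: validate inputs, then build the request object.
function buildSearchRequest(id: number, query: string, limit: number = 5): ToolCallRequest {
  if (query.trim() === "") {
    throw new Error("query must not be empty");
  }
  // Assumption: limit is a whole number within the documented 1-20 range.
  if (!Number.isInteger(limit) || limit < 1 || limit > 20) {
    throw new Error("limit must be an integer between 1 and 20");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "search_documentation",
      arguments: { query, limit },
    },
  };
}

// Usage: serialize the request before writing it to the server.
console.log(JSON.stringify(buildSearchRequest(1, "how do I configure Qdrant?")));
```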
list_sources
List all documentation sources currently stored in the system. Returns a comprehensive list of all indexed documentation including source URLs, titles, and last update times. Use this to understand what documentation is available for searching or to verify if specific sources have been indexed.
extract_urls
Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue.
Inputs:
- url (string): The complete URL of the webpage to analyze (must include protocol, e.g., https://). The page must be publicly accessible.
- add_to_queue (boolean, optional): If true, automatically add extracted URLs to the processing queue for later indexing. Use with caution on large sites to avoid excessive queuing.
remove_documentation
Remove specific documentation sources from the system by their URLs. The removal is permanent and will affect future search results.
Inputs:
- urls (string[]): Array of URLs to remove from the database. Each URL must exactly match the URL used when the documentation was added.
list_queue
List all URLs currently waiting in the documentation processing queue. Shows pending documentation sources that will be processed when run_queue is called. Use this to monitor queue status, verify URLs were added correctly, or check processing backlog.
run_queue
Process and index all URLs currently in the documentation queue. Each URL is processed sequentially, with proper error handling and retry logic. Progress updates are provided as processing occurs. Long-running operations will process until the queue is empty or an unrecoverable error occurs.
clear_queue
Remove all pending URLs from the documentation processing queue. Use this to reset the queue when you want to start fresh, remove unwanted URLs, or cancel pending processing. This operation is immediate and permanent - URLs will need to be re-added if you want to process them later.
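The queue tools (list_queue, run_queue, clear_queue) can be pictured with a small in-memory model. The sketch below is purely illustrative and not the server's implementation: `DocQueue` and its methods are hypothetical, the retry count is an assumed parameter, and a URL that still fails after its retries is recorded and skipped rather than aborting the run.

```typescript
// Hypothetical in-memory model of the documentation processing queue.
class DocQueue {
  private pending: string[] = [];

  // Roughly what extract_urls with add_to_queue=true would do.
  add(urls: string[]): void {
    for (const url of urls) {
      this.pending.push(url);
    }
  }

  // list_queue: return a copy of the pending URLs.
  list(): string[] {
    return this.pending.slice();
  }

  // clear_queue: drop everything that is still pending.
  clear(): void {
    this.pending = [];
  }

  // run_queue: process URLs one at a time, retrying each up to maxAttempts.
  run(
    processUrl: (url: string) => boolean,
    maxAttempts: number = 3
  ): { indexed: string[]; failed: string[] } {
    const indexed: string[] = [];
    const failed: string[] = [];
    while (this.pending.length > 0) {
      const url = this.pending.shift() as string;
      let ok = false;
      for (let attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
        ok = processUrl(url);
      }
      (ok ? indexed : failed).push(url);
    }
    return { indexed, failed };
  }
}
```

After a run the queue is empty, which matches the description of run_queue processing until nothing is left.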
Project Structure
The package follows a modular architecture with clear separation between core components and MCP protocol handlers. See ARCHITECTURE.md for detailed structural documentation and design decisions.
Using Ollama Embeddings without Docker
- Install Ollama:
curl -fsSL https://ollama.com/install.sh | sh
- Download the nomic-embed-text model:
ollama pull nomic-embed-text
- Verify installation:
ollama list
License
This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.
Contributing
We welcome contributions! Please see our CONTRIBUTING.md for detailed guidelines, but here a
FAQ
- What is the RAG Documentation Search MCP server?
- RAG Documentation Search is a Model Context Protocol (MCP) server profile on explainx.ai. MCP lets AI hosts (e.g. Claude Desktop, Cursor) call tools and resources through a standard interface; this page summarizes categories, install hints, and community ratings.
- How do MCP servers relate to agent skills?
- Skills are reusable instruction packages (often SKILL.md); MCP servers expose live capabilities. Teams frequently combine both—skills for workflows, MCP for APIs and data. See explainx.ai/skills and explainx.ai/mcp-servers for parallel directories.
- How are reviews shown for RAG Documentation Search?
- This profile displays 34 aggregated ratings (sample rows for discoverability plus signed-in user reviews). Average score is about 4.6 out of 5—verify behavior in your own environment before production use.
Use Cases
Extended AI Capabilities
Add new capabilities to Claude beyond text generation
Example
Access external data sources, execute code, interact with tools and services
Transform Claude from chatbot to action-taking agent
Context Enhancement
Provide Claude with access to relevant context and data
Example
Load project documentation, access knowledge bases, query databases
Get more accurate, context-aware responses
Workflow Automation
Automate multi-step workflows combining AI and external tools
Example
Research → Summarize → Create document → Send notification
Complete complex tasks end-to-end without manual steps
Implementation Guide
Prerequisites
- Claude Desktop 0.7.0+ or Cursor IDE with MCP support
- Basic understanding of MCP architecture and capabilities
- Access credentials for integrated services (if required)
- Willingness to experiment and iterate on configuration
Time Estimate
15-60 minutes depending on server complexity
Installation Steps
1. Install the MCP server: npm install -g [package-name], or install it from GitHub
2. Add the server configuration to your client's MCP config file (for Claude Desktop, claude_desktop_config.json)
3. Provide required credentials and configuration
4. Restart Claude Desktop to load the new server
5. Test basic functionality with simple prompts
6. Explore capabilities and experiment with use cases
7. Document successful patterns for reuse
Troubleshooting
- MCP server not loading: Check config syntax, verify installation
- Connection errors: Check network, firewall, credentials
- Feature not working: Read server docs, check required parameters
- Performance issues: Monitor resource usage, check for network latency
- Conflicts with other servers: Check port assignments, namespace collisions
Best Practices
✓ Do
- Read server documentation thoroughly before setup
- Start with simple use cases to validate functionality
- Test in a non-production environment first
- Monitor resource usage and performance
- Keep servers updated for bug fixes and new features
- Document configuration for team members
- Use environment variables for sensitive configuration
✗ Don't
- Don't grant overly permissive access to MCP servers
- Don't skip reading security considerations in docs
- Don't expose sensitive data without proper controls
- Don't run untrusted MCP servers without code review
- Don't ignore error messages; investigate the root cause
💡 Pro Tips
- Combine multiple MCP servers for powerful workflows
- Create custom MCP servers for your specific needs
- Share successful configurations with your team
- Use the MCP inspector for debugging
- Join the MCP community for tips and troubleshooting
Technical Details
Architecture
Model Context Protocol standardizes how AI hosts (Claude, Cursor) communicate with external tools and data sources through server implementations.
Protocols
- Model Context Protocol (MCP)
- JSON-RPC 2.0
- stdio or HTTP transport
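On the stdio transport, each JSON-RPC message travels as a single line of JSON terminated by a newline. The sketch below illustrates that framing; `encodeMessage` and `LineDecoder` are hypothetical names, and a real client would normally rely on an MCP SDK rather than hand-rolled framing.

```typescript
// Loose JSON-RPC 2.0 message shape, enough for this illustration.
type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: unknown;
};

// Serialize one message per line. JSON.stringify without indentation
// escapes newlines inside strings, so the output stays on one line.
function encodeMessage(message: JsonRpcMessage): string {
  return JSON.stringify(message) + "\n";
}

// Accumulate stream chunks and emit every complete line as a message.
class LineDecoder {
  private buffer = "";

  push(chunk: string): JsonRpcMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as JsonRpcMessage);
  }
}
```

Buffering matters because a pipe can deliver a message in several chunks, or several messages in one chunk.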
Compatibility
- Claude Desktop
- Cursor IDE
- Custom MCP clients
When to Use This
✓ Use When
Use when you need Claude to access external data, execute actions, or integrate with tools. Best for extending AI capabilities beyond conversation.
✗ Avoid When
Avoid when native integrations exist (use official APIs directly), for real-time critical systems, or when security/compliance requires zero external dependencies.
Integration
- Tool composition: Chain multiple MCP tools in workflows
- Context augmentation: Provide AI with relevant external data
- Action delegation: Let AI execute tasks on external systems
- Bidirectional sync: Keep AI context and external systems in sync
Ratings
4.6 ★★★★★ (34 reviews)
- ★★★★★ Chaitanya Patil · Dec 28, 2024
According to our notes, RAG Documentation Search benefits from clear Model Context Protocol framing — fewer ambiguous “AI plugin” claims.
- ★★★★★ Hana Verma · Dec 24, 2024
RAG Documentation Search is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.
- ★★★★★ Henry White · Dec 20, 2024
According to our notes, RAG Documentation Search benefits from clear Model Context Protocol framing — fewer ambiguous “AI plugin” claims.
- ★★★★★ Anika Abbas · Dec 16, 2024
I recommend RAG Documentation Search for teams standardizing on MCP; the explainx.ai page compares cleanly with sibling servers.
- ★★★★★ Advait Rao · Dec 16, 2024
RAG Documentation Search reduced integration guesswork — categories and install configs on the listing matched the upstream repo.
- ★★★★★ Piyush G · Nov 19, 2024
We wired RAG Documentation Search into a staging workspace; the listing’s GitHub and npm pointers saved time versus hunting across READMEs.
- ★★★★★ Maya Srinivasan · Nov 15, 2024
Strong directory entry: RAG Documentation Search surfaces stars and publisher context so we could sanity-check maintenance before adopting.
- ★★★★★ Zara Gonzalez · Nov 11, 2024
We wired RAG Documentation Search into a staging workspace; the listing’s GitHub and npm pointers saved time versus hunting across READMEs.
- ★★★★★ Kiara Verma · Nov 7, 2024
We evaluated RAG Documentation Search against two servers with overlapping tools; this profile had the clearer scope statement.
- ★★★★★ Benjamin Kapoor · Oct 26, 2024
RAG Documentation Search is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.