A high-performance, local RAG (Retrieval-Augmented Generation) system built in Rust that integrates with Claude Desktop via the Model Context Protocol (MCP). Search and analyze your PDF documents directly within Claude conversations without sending data to external services.
This project demonstrates how to build a production-ready MCP server using Rust that:
- Processes PDF documents locally using poppler for text extraction
- Generates embeddings using local Ollama models (no external API calls)
- Provides semantic search through document collections
- Integrates with Claude Desktop via MCP
- Maintains privacy by keeping all data processing local
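As a rough illustration of the local extraction step, the sketch below shells out to poppler's `pdftotext` command-line tool and captures its output. This is an assumption about how poppler might be invoked, not the project's confirmed approach. The helper names `pdftotext_args` and `extract_text` are hypothetical, and `pdftotext` must be on `PATH`.

```rust
use std::ffi::OsString;
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds the arguments for `pdftotext <file> -`; the trailing `-` asks
/// pdftotext to write the extracted text to stdout instead of a file.
/// (Hypothetical helper, kept separate so it can be checked without a PDF.)
fn pdftotext_args(pdf: &Path) -> Vec<OsString> {
    vec![pdf.as_os_str().to_os_string(), OsString::from("-")]
}

/// Runs pdftotext on one file and returns the extracted text.
/// Assumption: poppler-utils is installed and `pdftotext` is on PATH.
fn extract_text(pdf: &Path) -> io::Result<String> {
    let output = Command::new("pdftotext")
        .args(pdftotext_args(pdf))
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!(
                "pdftotext failed: {}",
                String::from_utf8_lossy(&output.stderr)
            ),
        ));
    }
    // Convert lossily so an unexpected byte sequence does not abort extraction.
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `extract_text(Path::new("report.pdf"))` would return the document's text as a single `String`, ready to be split into chunks.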
The Model Context Protocol (MCP) is a standard that allows AI assistants like Claude to interact with external tools and data sources. Instead of Claude being limited to its training data, MCP enables it to:
- Call external tools and functions
- Access real-time data sources
- Integrate with local applications
- Maintain context across interactions
This implementation uses the rmcp crate, the official Rust SDK for MCP, to create a server that exposes RAG capabilities to Claude Desktop.
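To make the tool-calling idea concrete, here is a minimal sketch of the JSON-RPC 2.0 message a client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` fields follow the MCP specification; the tool name and argument names are taken from the `search_documents` signature in this README. The helpers `json_escape` and `search_request` are hypothetical and exist only for illustration, since rmcp handles this serialization in practice.

```rust
/// Escapes a string so it can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Formats a single-line `tools/call` request for the `search_documents` tool.
/// (Hypothetical helper; a real client builds this with a JSON library.)
fn search_request(id: u64, query: &str, top_k: usize) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"search_documents","arguments":{{"query":"{}","top_k":{}}}}}}}"#,
        id,
        json_escape(query),
        top_k
    )
}
```

Over the stdio transport, each message of this shape travels as one line of text, which is why newlines inside the query are escaped rather than written literally.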
    ┌─────────────────┐    MCP Protocol    ┌──────────────────┐
    │                 │   (stdin/stdout)   │                  │
    │  Claude Desktop │ ◄────────────────► │   Rust RAG       │
    │                 │                    │   MCP Server     │
    └─────────────────┘                    └──────────────────┘
                                                    │
                                                    ▼
                                           ┌──────────────────┐
                                           │  Local RAG Stack │
                                           │                  │
                                           │  • PDF Parser    │
                                           │  • Ollama        │
                                           │  • Vector Store  │
                                           │  • Search Engine │
                                           └──────────────────┘
    #[tool(tool_box)]
    impl ServerHandler for RagMcpServer {
        fn get_info(&self) -> ServerInfo {
            // Provides server metadata to Claude
        }
    }

Uses rmcp macros to expose RAG functionality as MCP tools:
    #[tool(description = "Search through uploaded documents using semantic similarity")]
    async fn search_documents(&self, query: String, top_k: Option<usize>) -> Result<CallToolResult, McpError>

    #[tool(description = "List all uploaded documents")]
    async fn list_documents(&self) -> Result<CallToolResult, McpError>

    #[tool(description = "Get RAG system statistics")]
    async fn get_stats(&self) -> Result<CallToolResult, McpError>

    // Uses stdin/stdout transport for Claude Desktop integration
    let service = server.serve(stdio()).await?;

- Vector-based similarity search using Ollama embeddings
- Configurable result count (top-k)
- Relevance scoring for search results
- Automatic PDF text extraction via poppler
- Document chunking for optimal embedding generation
- Real-time document list and statistics
- All processing happens locally
- No external API calls for document content
- Embeddings stored locally for fast retrieval
- Rust's memory safety and performance
- Async/await for non-blocking operations
- Efficient vector storage and retrieval
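The chunking step mentioned in the list above can be sketched as a sliding window over words. This is only one possible strategy, shown under the assumption of fixed-size, overlapping word windows; the project's actual chunk size, overlap, and splitting rules are not stated here, and `chunk_words` is a hypothetical name.

```rust
/// Splits text into chunks of at most `size` words, where consecutive chunks
/// share `overlap` words so that context is not lost at chunk boundaries.
/// Panics if `size` is zero or `overlap` is not smaller than `size`.
fn chunk_words(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need size > 0 and overlap < size");
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            // The final window reached the end of the text.
            break;
        }
        start += step;
    }
    chunks
}
```

For example, `chunk_words("a b c d e f g", 4, 1)` yields the two chunks `"a b c d"` and `"d e f g"`. Each chunk would then be sent to the embedding model separately.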
    # Install Rust
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    # Install Ollama
    brew install ollama
    # Install Poppler (for PDF parsing)
    brew install poppler
    # Start Ollama and install embedding model
    make setup-ollama

    git clone <this-repository>
    cd rust-local-rag
    make install

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
    {
      "mcpServers": {
        "rust-local-rag": {
          "command": "/Users/yourusername/.cargo/bin/rust-local-rag",
          "env": {
            "DATA_DIR": "/Users/yourusername/Documents/data",
            "DOCUMENTS_DIR": "/Users/yourusername/Documents/rag",
            "LOG_DIR": "/tmp/rust-local-rag",
            "LOG_LEVEL": "info",
            "LOG_MAX_MB": "10"
          }
        }
      }
    }

    # Add PDFs to documents directory
    cp your-files.pdf ~/Documents/rag/
    # Restart Claude Desktop
    # Now ask Claude: "Search my documents for information about X"

- 🦀 Rust: Core application language for performance and safety
- 📡 rmcp: Official Rust MCP SDK for Claude integration
- 🤖 Ollama: Local embedding generation (nomic-embed-text)
- 📄 Poppler: PDF text extraction
- 🗄️ Custom Vector Store: In-memory vector database for fast search
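The `env` block in the Claude Desktop configuration passes `DATA_DIR`, `DOCUMENTS_DIR`, `LOG_DIR`, `LOG_LEVEL`, and `LOG_MAX_MB` to the server process. The sketch below shows one way a server might read them. The `Config` struct, its field names, and every default value are assumptions made for illustration, not the project's actual code.

```rust
use std::path::PathBuf;

/// Hypothetical settings struct mirroring the variables in the `env` block.
#[derive(Debug, PartialEq)]
struct Config {
    data_dir: PathBuf,
    documents_dir: PathBuf,
    log_dir: PathBuf,
    log_level: String,
    log_max_mb: u64,
}

impl Config {
    /// Builds a Config from any key lookup, which keeps the parsing logic
    /// testable without touching the real process environment.
    /// All fallback values below are illustrative assumptions.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
        let path = |key: &str, default: &str| {
            get(key)
                .map(PathBuf::from)
                .unwrap_or_else(|| PathBuf::from(default))
        };
        let log_max_mb = match get("LOG_MAX_MB") {
            Some(raw) => raw
                .trim()
                .parse::<u64>()
                .map_err(|e| format!("LOG_MAX_MB is not a number: {}", e))?,
            None => 10,
        };
        Ok(Config {
            data_dir: path("DATA_DIR", "./data"),
            documents_dir: path("DOCUMENTS_DIR", "./documents"),
            log_dir: path("LOG_DIR", "/tmp/rust-local-rag"),
            log_level: get("LOG_LEVEL").unwrap_or_else(|| "info".to_string()),
            log_max_mb,
        })
    }

    /// Reads the same settings from the process environment.
    fn from_env() -> Result<Config, String> {
        Config::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Taking a lookup closure instead of calling `std::env::var` directly means a malformed `LOG_MAX_MB` surfaces as an error at startup rather than as a silent fallback.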
- Document Ingestion: PDFs → Text extraction → Chunking
- Embedding Generation: Text chunks → Ollama → Vector embeddings
- Indexing: Embeddings → Local vector store
- Search: Query → Embedding → Similarity search → Results
- MCP Integration: Results → Claude Desktop via MCP
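The search stage of this flow can be sketched as a brute-force scan over an in-memory list of embedded chunks. Cosine similarity is assumed here as the scoring function, since the README does not name the metric; `Chunk`, `cosine_similarity`, and `top_k` are hypothetical names for illustration.

```rust
/// A stored text chunk and its embedding vector (hypothetical layout).
struct Chunk {
    text: String,
    embedding: Vec<f32>,
}

/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Scores every chunk against the query embedding and returns the `k`
/// best matches as (score, text) pairs, highest score first.
fn top_k<'a>(query: &[f32], chunks: &'a [Chunk], k: usize) -> Vec<(f32, &'a str)> {
    let mut scored: Vec<(f32, &'a str)> = chunks
        .iter()
        .map(|c| (cosine_similarity(query, &c.embedding), c.text.as_str()))
        .collect();
    // total_cmp gives a total order on floats, so sorting cannot panic on NaN.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.truncate(k);
    scored
}
```

A linear scan like this costs one similarity computation per stored chunk, which is usually acceptable for a personal document collection; larger collections would typically call for an approximate nearest-neighbor index.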
| Aspect | MCP Approach | HTTP API Approach |
|---|---|---|
| Integration | Native Claude Desktop support | Requires custom client |
| Security | Process isolation, no network | Network exposure required |
| Performance | Direct stdin/stdout IPC | Network overhead |
| User Experience | Seamless tool integration | Manual API management |
- search_documents
  - Purpose: Semantic search across document collection
  - Input: Query string, optional result count
  - Output: Ranked search results with similarity scores
- list_documents
  - Purpose: Document inventory management
  - Input: None
  - Output: List of all indexed documents
- get_stats
  - Purpose: System monitoring and debugging
  - Input: None
  - Output: Embedding counts, memory usage, performance metrics
- Setup Guide: Complete installation and configuration
- Usage Guide: Claude Desktop integration and usage examples
Contributions are welcome! This project demonstrates practical MCP server implementation patterns that can be adapted for other use cases.
    # Run in development mode
    make run
    # Check formatting
    cargo fmt --check
    # Run linter
    cargo clippy

This project is licensed under the MIT License - see the LICENSE file for details.
- Model Context Protocol for the specification
- rmcp for the excellent Rust MCP SDK
- Ollama for local embedding generation
- Claude Desktop for MCP integration support
Built with ❤️ in Rust | Powered by MCP | Privacy-focused RAG