@reaatech/hybrid-rag-mcp-server
Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.
MCP (Model Context Protocol) server exposing 41+ tools for hybrid RAG integration with AI agent systems. Provides tool categories for retrieval, ingestion, evaluation, query analysis, session management, agent integration, cost management, quality assurance, observability, and administration.
Installation
npm install @reaatech/hybrid-rag-mcp-server @modelcontextprotocol/sdk
# or
pnpm add @reaatech/hybrid-rag-mcp-server @modelcontextprotocol/sdk
Feature Overview
41+ MCP tools across 10 categories covering the full RAG lifecycle
Transport flexibility — stdio, HTTP, and SSE transport support
Query analysis — intent classification, query decomposition, routing recommendations
Session management — multi-turn conversation context with session create/update/history
Agent integration — discover agents, route to specialized agents, register callbacks
Cost management — budget configuration, cost estimation, optimization, reporting
Quality assurance — LLM-as-judge, hallucination detection, A/B config comparison
Observability — real-time metrics, trace retrieval, health check, collection stats
Input validation — Zod-based request validation with structured error responses
Quick Start
Start the MCP Server (stdio)
import { createMCPServer } from '@reaatech/hybrid-rag-mcp-server';
import { RAGPipeline } from '@reaatech/hybrid-rag-pipeline';

const pipeline = new RAGPipeline({
  qdrantUrl: process.env.QDRANT_URL,
  embeddingProvider: 'openai',
  embeddingModel: 'text-embedding-3-small',
});

const server = createMCPServer({
  pipeline,
  transport: 'stdio',
});

await server.start();
Start via CLI
hybrid-rag server --qdrant-url http://localhost:6333 --collection documents
API Reference
createMCPServer(config)
Creates and configures an MCP server instance with all tools registered.
MCPServerConfig
| Property | Type | Description |
| --- | --- | --- |
| `pipeline` | `RAGPipeline` | Configured RAG pipeline instance |
| `transport` | `'stdio' \| 'http' \| 'sse'` | Transport layer |
| `port` | `number` | Port for HTTP/SSE transport (default: 3000) |
| `host` | `string` | Host for HTTP/SSE transport (default: 'localhost') |
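The defaults in the table can be pictured with a small sketch. The names `ServerConfigSketch` and `withDefaults` are hypothetical and are not exported by the package; treating `transport` as required is also an assumption, since the table lists no default for it.

```typescript
// Hypothetical mirror of the MCPServerConfig table; not the package's own type.
type Transport = 'stdio' | 'http' | 'sse';

interface ServerConfigSketch<P> {
  pipeline: P;
  transport: Transport;
  port?: number; // only meaningful for HTTP/SSE
  host?: string; // only meaningful for HTTP/SSE
}

// Fill in the documented defaults (port 3000, host 'localhost') when omitted.
function withDefaults<P>(config: ServerConfigSketch<P>): Required<ServerConfigSketch<P>> {
  return {
    ...config,
    port: config.port ?? 3000,
    host: config.host ?? 'localhost',
  };
}

const resolved = withDefaults({ pipeline: {}, transport: 'http' });
console.log(`${resolved.transport}://${resolved.host}:${resolved.port}`);
```

Whether the real server applies its defaults this way is not stated in this README; the sketch only restates the table.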
Tool Categories
Core RAG Tools (4 tools)
| Tool | Description |
| --- | --- |
| `rag.retrieve` | Execute hybrid retrieval (vector + BM25) with optional reranking |
| `rag.vector_search` | Execute vector-only semantic search |
| `rag.bm25_search` | Execute BM25 keyword-only search |
| `rag.rerank` | Rerank existing retrieval results using a cross-encoder |
Ingestion Tools (3 tools)
| Tool | Description |
| --- | --- |
| `rag.ingest_document` | Ingest a single document |
| `rag.ingest_batch` | Batch process multiple documents |
| `rag.chunk_document` | Preview chunking strategies on a document |
Evaluation Tools (3 tools)
| Tool | Description |
| --- | --- |
| `rag.evaluate` | Run evaluation on a dataset |
| `rag.ablation` | Execute ablation study |
| `rag.benchmark` | Run performance benchmarks |
Query Analysis Tools (3 tools)
| Tool | Description |
| --- | --- |
| `rag.analyze_query` | Query intent analysis and routing recommendation |
| `rag.decompose_query` | Multi-step query decomposition for complex questions |
| `rag.classify_intent` | Classify query intent for optimal retrieval strategy |
Session Management Tools (3 tools)
| Tool | Description |
| --- | --- |
| `rag.get_context` | Retrieve conversation context for multi-turn RAG |
| `rag.session_manage` | Create, update, and manage RAG sessions |
| `rag.session_history` | Retrieve session query history |
Agent Integration Tools (4 tools)
| Tool | Description |
| --- | --- |
| `rag.discover_agents` | Discover available agents in agent-mesh |
| `rag.route_to_agent` | Route query to specialized agent based on intent |
| `rag.get_agent_capabilities` | Query capabilities of registered agents |
| `rag.register_callback` | Register callback for async agent responses |
Cost Management Tools (6 tools)
| Tool | Description |
| --- | --- |
| `rag.get_cost_estimate` | Estimate cost for a query before execution |
| `rag.set_budget` | Configure budget limits (per-query, daily, monthly) |
| `rag.get_budget_status` | Current budget status and remaining capacity |
| `rag.optimize_cost` | Get cost optimization recommendations |
| `rag.get_cost_report` | Detailed cost breakdown by component |
| `rag.set_cost_controls` | Configure cost controls and alerts |
Quality Tools (6 tools)
| Tool | Description |
| --- | --- |
| `rag.judge_quality` | LLM-as-judge for result quality assessment |
| `rag.validate_results` | Validate retrieval results against quality criteria |
| `rag.detect_hallucination` | Detect potential hallucinations in results |
| `rag.compare_configs` | A/B test different RAG configurations |
| `rag.get_quality_metrics` | Real-time quality metrics dashboard |
| `rag.run_quality_check` | Run automated quality check for production queries |
Observability Tools (6 tools)
| Tool | Description |
| --- | --- |
| `rag.get_metrics` | Real-time system metrics (latency, throughput, errors) |
| `rag.get_trace` | Retrieve OpenTelemetry trace for a query |
| `rag.health_check` | Comprehensive system health status |
| `rag.get_performance` | Performance analytics and trends over time |
| `rag.get_collection_stats` | Statistics for specific Qdrant collections |
| `rag.monitor_alerts` | Active alerts and monitoring status |
Admin Tools (3 tools)
| Tool | Description |
| --- | --- |
| `rag.status` | System status and health overview |
| `rag.collections` | Qdrant collection management |
| `rag.config` | Configuration management and inspection |
Usage Patterns
Retrieval with Full Configuration
{
  "name": "rag.retrieve",
  "arguments": {
    "query": "How do I reset my password?",
    "topK": 10,
    "retrievalMode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "useReranker": true,
    "rerankerProvider": "cohere",
    "filter": { "department": "engineering" }
  }
}
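To give a feel for what `vectorWeight` and `bm25Weight` might control, here is a minimal sketch of weighted score fusion. This README does not say how the server combines scores, so the weighted sum, the `fuseScores` helper, and the assumption that both score sets are normalized to the 0..1 range are all illustrative.

```typescript
// Assumption: scores are keyed by chunk id and already normalized to [0, 1].
type Scores = Record<string, number>;

// Hypothetical helper: combine two score sets with a weighted sum.
// A chunk missing from one retriever contributes 0 for that retriever.
function fuseScores(
  vector: Scores,
  bm25: Scores,
  vectorWeight = 0.7,
  bm25Weight = 0.3,
): Array<{ id: string; score: number }> {
  const ids = Array.from(new Set([...Object.keys(vector), ...Object.keys(bm25)]));
  return ids
    .map((id) => ({
      id,
      score: vectorWeight * (vector[id] ?? 0) + bm25Weight * (bm25[id] ?? 0),
    }))
    .sort((a, b) => b.score - a.score); // highest fused score first
}

const ranked = fuseScores(
  { 'chunk-001': 0.9, 'chunk-002': 0.4 },
  { 'chunk-002': 1.0, 'chunk-003': 0.5 },
);
// Order: chunk-001, chunk-002, chunk-003
console.log(ranked.map((r) => r.id));
```

Other fusion schemes, such as reciprocal rank fusion, are common in hybrid search; the weighted sum is used here only because it maps directly onto the two weight parameters.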
Multi-Turn Conversation
// Create a session
{ "name": "rag.session_manage", "arguments": { "action": "create", "user_id": "user-123" } }
// Query with session context
{ "name": "rag.retrieve", "arguments": { "query": "What about macOS?", "session_id": "sess-abc", "use_context": true } }
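The two calls above can be built with small typed helpers so the session id is threaded consistently. `ToolCall`, `createSessionCall`, and `followUpCall` are hypothetical names for illustration; how the session id is returned by `rag.session_manage` is not documented here, so the caller is assumed to extract it.

```typescript
// Shape of a tool call as it appears in the JSON examples.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Build the call that opens a session for a user.
function createSessionCall(userId: string): ToolCall {
  return { name: 'rag.session_manage', arguments: { action: 'create', user_id: userId } };
}

// Build a follow-up retrieval that reuses prior turns.
// Assumption: sessionId was read from the create response by the caller.
function followUpCall(sessionId: string, query: string): ToolCall {
  return {
    name: 'rag.retrieve',
    arguments: { query, session_id: sessionId, use_context: true },
  };
}

const calls = [createSessionCall('user-123'), followUpCall('sess-abc', 'What about macOS?')];
console.log(JSON.stringify(calls, null, 2));
```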
Cost-Aware Querying
// Check budget first
{ "name": "rag.get_budget_status", "arguments": { "scope": { "user_id": "team-alpha" } } }
// If budget allows, use full config; otherwise use cheaper retrieval
{ "name": "rag.retrieve", "arguments": { "query": "...", "useReranker": false, "topK": 5 } }
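The branching described in the comments can be sketched as a small decision function. The response shape of `rag.get_budget_status` is not documented in this README, so `BudgetStatusSketch` and its `remainingUsd` field are assumptions, as is the idea of comparing against an estimate from `rag.get_cost_estimate`.

```typescript
// Hypothetical shape; the real budget status response may differ.
interface BudgetStatusSketch {
  remainingUsd: number;
}

interface RetrieveArgs {
  query: string;
  topK: number;
  useReranker: boolean;
}

// Pick full or cheaper retrieval settings based on remaining budget.
function chooseRetrieveArgs(
  query: string,
  budget: BudgetStatusSketch,
  estimatedFullCostUsd: number,
): RetrieveArgs {
  const canAffordFull = budget.remainingUsd >= estimatedFullCostUsd;
  return canAffordFull
    ? { query, topK: 10, useReranker: true } // full configuration
    : { query, topK: 5, useReranker: false }; // cheaper fallback
}

const args = chooseRetrieveArgs('How do I reset my password?', { remainingUsd: 0.01 }, 0.05);
console.log({ name: 'rag.retrieve', arguments: args });
```

The two argument sets mirror the full and reduced configurations shown in the examples; any real threshold would depend on your budget settings.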
Quality Check
{
  "name": "rag.judge_quality",
  "arguments": {
    "query": "How do I configure SSO?",
    "results": [
      { "chunk_id": "chunk-001", "content": "...", "score": 0.92 }
    ],
    "judge_model": "claude-opus",
    "criteria": ["relevance", "completeness"],
    "consensus_count": 3
  }
}
License
MIT