@reaatech/hybrid-rag-mcp-server

npm v0.1.0

Exposes over 40 Model Context Protocol (MCP) tools for managing RAG lifecycles, including retrieval, ingestion, evaluation, and observability. It provides a `createMCPServer` function that wraps a `RAGPipeline` instance and supports stdio, HTTP, or SSE transport layers.


Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

MCP (Model Context Protocol) server exposing 41+ tools for hybrid RAG integration with AI agent systems. Provides tool categories for retrieval, ingestion, evaluation, query analysis, session management, agent integration, cost management, quality assurance, observability, and administration.

Installation

terminal
npm install @reaatech/hybrid-rag-mcp-server @modelcontextprotocol/sdk
# or
pnpm add @reaatech/hybrid-rag-mcp-server @modelcontextprotocol/sdk

Feature Overview

  • 41+ MCP tools across 10 categories covering the full RAG lifecycle
  • Transport flexibility — stdio, HTTP, and SSE transport support
  • Query analysis — intent classification, query decomposition, routing recommendations
  • Session management — multi-turn conversation context with session create/update/history
  • Agent integration — discover agents, route to specialized agents, register callbacks
  • Cost management — budget configuration, cost estimation, optimization, reporting
  • Quality assurance — LLM-as-judge, hallucination detection, A/B config comparison
  • Observability — real-time metrics, trace retrieval, health check, collection stats
  • Input validation — Zod-based request validation with structured error responses

Quick Start

Start the MCP Server (stdio)

typescript
import { createMCPServer } from '@reaatech/hybrid-rag-mcp-server';
import { RAGPipeline } from '@reaatech/hybrid-rag-pipeline';
 
const pipeline = new RAGPipeline({
  qdrantUrl: process.env.QDRANT_URL,
  embeddingProvider: 'openai',
  embeddingModel: 'text-embedding-3-small',
});
 
const server = createMCPServer({
  pipeline,
  transport: 'stdio',
});
 
await server.start();

Start via CLI

terminal
hybrid-rag server --qdrant-url http://localhost:6333 --collection documents

API Reference

createMCPServer(config)

Creates and configures an MCP server instance with all tools registered.

MCPServerConfig

| Property | Type | Description |
| --- | --- | --- |
| `pipeline` | `RAGPipeline` | Configured RAG pipeline instance |
| `transport` | `'stdio' \| 'http' \| 'sse'` | Transport layer |
| `port` | `number` | Port for HTTP/SSE transport (default: 3000) |
| `host` | `string` | Host for HTTP/SSE transport (default: 'localhost') |
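
As a rough illustration of the properties listed above, the sketch below mirrors the transport-related settings as local TypeScript types and applies the documented defaults. `ServerOptions` and `withDefaults` are hypothetical names used only here; the package's own exported types may differ, and the `pipeline` property is left out so the example stays self-contained.

```typescript
// Local mirror of the documented transport settings (not the package's own types).
type Transport = 'stdio' | 'http' | 'sse';

interface ServerOptions {
  transport: Transport;
  port?: number; // only meaningful for 'http' and 'sse'
  host?: string; // only meaningful for 'http' and 'sse'
}

// Hypothetical helper: fill in the documented defaults (port 3000, host 'localhost').
function withDefaults(options: ServerOptions): Required<ServerOptions> {
  return {
    transport: options.transport,
    port: options.port ?? 3000,
    host: options.host ?? 'localhost',
  };
}

// Override the port, keep the default host.
const httpOptions = withDefaults({ transport: 'http', port: 8080 });
```

In a real setup, the same `transport`, `port`, and `host` values would be passed to `createMCPServer` alongside the `pipeline` instance.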

Tool Categories

Core RAG Tools (4 tools)

| Tool | Description |
| --- | --- |
| `rag.retrieve` | Execute hybrid retrieval (vector + BM25) with optional reranking |
| `rag.vector_search` | Execute vector-only semantic search |
| `rag.bm25_search` | Execute BM25 keyword-only search |
| `rag.rerank` | Rerank existing retrieval results using a cross-encoder |

Ingestion Tools (3 tools)

| Tool | Description |
| --- | --- |
| `rag.ingest_document` | Ingest a single document |
| `rag.ingest_batch` | Batch process multiple documents |
| `rag.chunk_document` | Preview chunking strategies on a document |

Evaluation Tools (3 tools)

| Tool | Description |
| --- | --- |
| `rag.evaluate` | Run evaluation on a dataset |
| `rag.ablation` | Execute an ablation study |
| `rag.benchmark` | Run performance benchmarks |

Query Analysis Tools (3 tools)

| Tool | Description |
| --- | --- |
| `rag.analyze_query` | Query intent analysis and routing recommendation |
| `rag.decompose_query` | Multi-step query decomposition for complex questions |
| `rag.classify_intent` | Classify query intent for optimal retrieval strategy |

Session Management Tools (3 tools)

| Tool | Description |
| --- | --- |
| `rag.get_context` | Retrieve conversation context for multi-turn RAG |
| `rag.session_manage` | Create, update, and manage RAG sessions |
| `rag.session_history` | Retrieve session query history |

Agent Integration Tools (4 tools)

| Tool | Description |
| --- | --- |
| `rag.discover_agents` | Discover available agents in agent-mesh |
| `rag.route_to_agent` | Route a query to a specialized agent based on intent |
| `rag.get_agent_capabilities` | Query capabilities of registered agents |
| `rag.register_callback` | Register a callback for async agent responses |

Cost Management Tools (6 tools)

| Tool | Description |
| --- | --- |
| `rag.get_cost_estimate` | Estimate cost for a query before execution |
| `rag.set_budget` | Configure budget limits (per-query, daily, monthly) |
| `rag.get_budget_status` | Current budget status and remaining capacity |
| `rag.optimize_cost` | Get cost optimization recommendations |
| `rag.get_cost_report` | Detailed cost breakdown by component |
| `rag.set_cost_controls` | Configure cost controls and alerts |

Quality Tools (6 tools)

| Tool | Description |
| --- | --- |
| `rag.judge_quality` | LLM-as-judge for result quality assessment |
| `rag.validate_results` | Validate retrieval results against quality criteria |
| `rag.detect_hallucination` | Detect potential hallucinations in results |
| `rag.compare_configs` | A/B test different RAG configurations |
| `rag.get_quality_metrics` | Real-time quality metrics dashboard |
| `rag.run_quality_check` | Run automated quality check for production queries |

Observability Tools (6 tools)

| Tool | Description |
| --- | --- |
| `rag.get_metrics` | Real-time system metrics (latency, throughput, errors) |
| `rag.get_trace` | Retrieve the OpenTelemetry trace for a query |
| `rag.health_check` | Comprehensive system health status |
| `rag.get_performance` | Performance analytics and trends over time |
| `rag.get_collection_stats` | Statistics for specific Qdrant collections |
| `rag.monitor_alerts` | Active alerts and monitoring status |

Admin Tools (3 tools)

| Tool | Description |
| --- | --- |
| `rag.status` | System status and health overview |
| `rag.collections` | Qdrant collection management |
| `rag.config` | Configuration management and inspection |

Usage Patterns

Retrieval with Full Configuration

json
{
  "name": "rag.retrieve",
  "arguments": {
    "query": "How do I reset my password?",
    "topK": 10,
    "retrievalMode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "useReranker": true,
    "rerankerProvider": "cohere",
    "filter": { "department": "engineering" }
  }
}
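
The `vectorWeight` and `bm25Weight` arguments suggest a weighted blend of the two result lists. One common way to do that is min-max normalization followed by a weighted sum, sketched below. This is an assumption for illustration only, not a description of how this package fuses scores; `Scored`, `normalize`, and `fuse` are hypothetical names.

```typescript
interface Scored {
  id: string;
  score: number;
}

type ScoreMap = Record<string, number>;

// Min-max normalize scores to [0, 1] so vector and BM25 scores share a scale.
function normalize(hits: Scored[]): ScoreMap {
  const out: ScoreMap = {};
  if (hits.length === 0) return out;
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const range = Math.max(...scores) - min;
  for (const h of hits) {
    // If every score is identical, treat all hits as equally strong.
    out[h.id] = range === 0 ? 1 : (h.score - min) / range;
  }
  return out;
}

// Weighted sum of normalized scores; a hit missing from one list contributes 0 there.
function fuse(
  vectorHits: Scored[],
  bm25Hits: Scored[],
  vectorWeight: number,
  bm25Weight: number,
  topK: number,
): Scored[] {
  const v = normalize(vectorHits);
  const k = normalize(bm25Hits);
  const ids = Object.keys({ ...v, ...k });
  return ids
    .map((id) => ({ id, score: vectorWeight * (v[id] ?? 0) + bm25Weight * (k[id] ?? 0) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Made-up scores, using the 0.7 / 0.3 weights from the request above.
const ranked = fuse(
  [{ id: 'a', score: 0.9 }, { id: 'b', score: 0.5 }, { id: 'c', score: 0.1 }],
  [{ id: 'b', score: 12 }, { id: 'd', score: 4 }],
  0.7,
  0.3,
  2,
);
```

With these inputs, `a` ranks first on vector similarity alone, while `b` comes second because it appears in both lists.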

Multi-Turn Conversation

json
// Create a session
{ "name": "rag.session_manage", "arguments": { "action": "create", "user_id": "user-123" } }
 
// Query with session context
{ "name": "rag.retrieve", "arguments": { "query": "What about macOS?", "session_id": "sess-abc", "use_context": true } }
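
The two calls above can be chained from client code. The sketch below takes the tool-calling function as a parameter so it does not depend on any particular MCP client. It assumes the `rag.session_manage` response carries a `session_id` field, which this README does not confirm; `CallTool` and `runConversation` are hypothetical names.

```typescript
type ToolCall = { name: string; arguments: Record<string, unknown> };
type CallTool = (call: ToolCall) => Promise<Record<string, unknown>>;

// Create one session, then send each query with that session's context enabled.
async function runConversation(
  callTool: CallTool,
  userId: string,
  queries: string[],
): Promise<unknown[]> {
  const created = await callTool({
    name: 'rag.session_manage',
    arguments: { action: 'create', user_id: userId },
  });

  // Assumption: the response exposes the new session's id as `session_id`.
  const sessionId = created['session_id'];
  if (typeof sessionId !== 'string') {
    throw new Error('session_id missing from rag.session_manage response');
  }

  const answers: unknown[] = [];
  for (const query of queries) {
    answers.push(
      await callTool({
        name: 'rag.retrieve',
        arguments: { query, session_id: sessionId, use_context: true },
      }),
    );
  }
  return answers;
}
```

Passing `callTool` in also makes the flow easy to exercise with a stub before pointing it at a running server.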

Cost-Aware Querying

json
// Check budget first
{ "name": "rag.get_budget_status", "arguments": { "scope": { "user_id": "team-alpha" } } }
 
// If budget allows, use full config; otherwise use cheaper retrieval
{ "name": "rag.retrieve", "arguments": { "query": "...", "useReranker": false, "topK": 5 } }
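
The branch described in the comment above might look like the following on the client side. The shape of the budget response is not documented here, so `BudgetStatus.remaining` and the `fullQueryCost` parameter are assumptions, and `chooseRetrieveArgs` is a hypothetical helper. The two argument sets reuse the values from the examples in this section.

```typescript
// Assumed shape: remaining budget in the same unit as the cost estimate.
interface BudgetStatus {
  remaining: number;
}

interface RetrieveArgs {
  query: string;
  topK: number;
  useReranker: boolean;
}

// Pick the full configuration when the budget covers it, otherwise a cheaper one.
function chooseRetrieveArgs(
  query: string,
  budget: BudgetStatus,
  fullQueryCost: number,
): RetrieveArgs {
  const canAffordFull = budget.remaining >= fullQueryCost;
  return canAffordFull
    ? { query, topK: 10, useReranker: true } // full configuration
    : { query, topK: 5, useReranker: false }; // cheaper: fewer results, no reranking
}
```

The cost of the full query could come from `rag.get_cost_estimate`, and the returned object can be sent as the `arguments` of a `rag.retrieve` call.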

Quality Check

json
{
  "name": "rag.judge_quality",
  "arguments": {
    "query": "How do I configure SSO?",
    "results": [
      { "chunk_id": "chunk-001", "content": "...", "score": 0.92 }
    ],
    "judge_model": "claude-opus",
    "criteria": ["relevance", "completeness"],
    "consensus_count": 3
  }
}

License

MIT