@reaatech/hybrid-rag-pipeline

npm v0.1.0

Provides a unified `RAGPipeline` class for orchestrating document ingestion, hybrid vector and BM25 retrieval, and reranking. It requires a Qdrant instance and integrates with various embedding and reranking providers to manage the end-to-end RAG lifecycle.

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Main RAGPipeline orchestrator — provides a unified interface for document ingestion and hybrid retrieval with optional reranking. This is the primary entry point for most users of the hybrid RAG ecosystem.

Installation

terminal
npm install @reaatech/hybrid-rag-pipeline
# or
pnpm add @reaatech/hybrid-rag-pipeline

Feature Overview

  • Single-class API — one RAGPipeline class handles ingestion and retrieval
  • Lazy initialization — automatic connection setup on first use, deduplicated concurrent init calls
  • Hybrid retrieval — vector + BM25 search with configurable fusion weights
  • Optional reranking — plug in Cohere, Jina, OpenAI, or local cross-encoder reranker
  • Configurable chunking — choose strategy and parameters at pipeline level
  • Pipeline stats — get collection stats, document counts, chunk counts
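
The lazy, deduplicated initialization described above is commonly implemented by caching the in-flight promise. The following is a minimal sketch of that pattern, not the package's actual implementation; `LazyConnection` and its `connect` callback are hypothetical names used only for illustration.

```typescript
// Hypothetical helper for illustration; not part of the package's API.
class LazyConnection {
  private readonly connect: () => Promise<void>;
  private initPromise: Promise<void> | null = null;

  constructor(connect: () => Promise<void>) {
    this.connect = connect;
  }

  // Safe to call repeatedly: concurrent callers share one in-flight promise.
  initialize(): Promise<void> {
    if (this.initPromise === null) {
      this.initPromise = this.connect().catch((err: unknown) => {
        // Drop the cached promise so a later call can retry after a failure.
        this.initPromise = null;
        throw err;
      });
    }
    return this.initPromise;
  }

  // Every operation awaits initialize() first, which is what makes setup lazy.
  async run<T>(operation: () => Promise<T>): Promise<T> {
    await this.initialize();
    return operation();
  }

  // Resetting the cached promise lets the next call reconnect.
  close(): void {
    this.initPromise = null;
  }
}
```

With this shape, calling `initialize()` explicitly is optional: the first `run()` call triggers setup, and overlapping calls reuse the same connection attempt.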

Quick Start

typescript
import { RAGPipeline } from '@reaatech/hybrid-rag-pipeline';
import { ChunkingStrategy } from '@reaatech/hybrid-rag';
 
const pipeline = new RAGPipeline({
  qdrantUrl: process.env.QDRANT_URL || 'http://localhost:6333',
  collectionName: 'knowledge-base',
 
  embeddingProvider: 'openai',
  embeddingModel: 'text-embedding-3-small',
  embeddingApiKey: process.env.OPENAI_API_KEY,
 
  chunkingStrategy: ChunkingStrategy.FIXED_SIZE,
  chunkSize: 512,
  chunkOverlap: 50,
 
  useHybrid: true,
  vectorWeight: 0.7,
  bm25Weight: 0.3,
  fusionStrategy: 'rrf',
 
  rerankerProvider: 'cohere',
  rerankerApiKey: process.env.COHERE_API_KEY,
  rerankTopK: 20,
  rerankFinalK: 10,
 
  topK: 10,
});
 
await pipeline.initialize();
 
// Ingest
const chunks = await pipeline.ingest([
  { id: 'doc-1', content: 'Password reset requires email verification...' },
  { id: 'doc-2', content: 'Refund policy: requests must be submitted within 14 days...' },
]);
 
// Query
const results = await pipeline.query('How do I reset my password?', {
  topK: 5,
  useReranker: true,
  filter: { department: 'engineering' },
});
 
for (const r of results) {
  console.log(`[${r.score.toFixed(3)}] ${r.content.substring(0, 80)}...`);
}
 
// Stats
const stats = await pipeline.getStats();
console.log(`Collection: ${stats.collectionName}, Docs: ${stats.totalDocuments}, Chunks: ${stats.totalChunks}`);
 
// Cleanup
await pipeline.close();
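
To make the `chunkSize: 512` and `chunkOverlap: 50` settings concrete, here is a rough sketch of fixed-size chunking with overlap. It assumes whitespace-separated words as a stand-in for tokens; the package's real tokenizer and chunker may behave differently, and `chunkFixedSize` is a hypothetical name.

```typescript
// Illustrative only: splits on whitespace instead of using a real tokenizer.
function chunkFixedSize(text: string, chunkSize = 512, chunkOverlap = 50): string[] {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive');
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError('chunkOverlap must be in [0, chunkSize)');
  }

  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks: string[] = [];

  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(' '));
    // Stop once a window has reached the end, to avoid a redundant tail chunk.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

Each chunk repeats the last `chunkOverlap` tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.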

API Reference

RAGPipeline

Constructor

typescript
new RAGPipeline(config: RAGPipelineConfig)

RAGPipelineConfig

| Category | Property | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Qdrant | `qdrantUrl` | `string` | (required) | Qdrant server URL |
| | `qdrantApiKey` | `string` | | Qdrant API key |
| | `collectionName` | `string` | `documents` | Qdrant collection name |
| Embedding | `embeddingProvider` | `'openai' \| 'vertex' \| 'local'` | `openai` | Embedding provider |
| | `embeddingModel` | `string` | `text-embedding-3-small` | Model name |
| | `embeddingApiKey` | `string` | | API key |
| Chunking | `chunkingStrategy` | `ChunkingStrategy` | `FIXED_SIZE` | Chunking strategy |
| | `chunkSize` | `number` | `512` | Chunk size in tokens |
| | `chunkOverlap` | `number` | `50` | Overlap in tokens |
| Retrieval | `topK` | `number` | `10` | Default result count |
| | `useHybrid` | `boolean` | `true` | Enable hybrid (vector + BM25) |
| | `vectorWeight` | `number` | `0.7` | Vector score weight |
| | `bm25Weight` | `number` | `0.3` | BM25 score weight |
| BM25 | `bm25K1` | `number` | `1.2` | BM25 term frequency saturation |
| | `bm25B` | `number` | `0.75` | BM25 length normalization |
| Fusion | `fusionStrategy` | `'rrf' \| 'weighted-sum' \| 'normalized'` | `rrf` | Fusion algorithm |
| Reranker | `rerankerProvider` | `string \| null` | `null` | Reranker provider (null = disabled) |
| | `rerankerModel` | `string` | | Reranker model name |
| | `rerankerApiKey` | `string` | | Reranker API key |
| | `rerankTopK` | `number` | `20` | Candidates to rerank |
| | `rerankFinalK` | `number` | `10` | Results after reranking |
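
As a rough guide to what `bm25K1` and `bm25B` control, the sketch below implements the standard Okapi BM25 per-term score with a commonly used smoothed IDF. It is the textbook formula, not necessarily the exact variant this package uses, and the function names are hypothetical.

```typescript
interface Bm25Params {
  k1: number; // term frequency saturation: higher values let repeats count for more
  b: number;  // length normalization: 0 ignores document length, 1 normalizes fully
}

// Smoothed inverse document frequency; always positive.
function bm25Idf(totalDocs: number, docsWithTerm: number): number {
  return Math.log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
}

// Score contribution of one query term for one document.
function bm25TermScore(
  termFrequency: number,
  docLength: number,
  avgDocLength: number,
  idf: number,
  params: Bm25Params = { k1: 1.2, b: 0.75 },
): number {
  const { k1, b } = params;
  const lengthNorm = 1 - b + b * (docLength / avgDocLength);
  return (idf * termFrequency * (k1 + 1)) / (termFrequency + k1 * lengthNorm);
}
```

With these defaults, the score for a term grows quickly for its first few occurrences and then flattens toward `idf * (k1 + 1)`, while documents longer than average are penalized in proportion to `b`.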

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `initialize()` | `Promise<void>` | Set up connections, lazy and deduplicated |
| `ingest(documents)` | `Promise<Chunk[]>` | Chunk and index documents |
| `query(text, options?)` | `Promise<RetrievalResult[]>` | Hybrid retrieval with optional reranking |
| `getStats()` | `Promise<PipelineStats>` | Collection stats and doc/chunk counts |
| `close()` | `Promise<void>` | Release connections and reset |
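
The "retrieval with optional reranking" flow behind `query()` can be pictured as a two-stage process: over-fetch `rerankTopK` candidates, rescore them, and keep `rerankFinalK`. The sketch below illustrates that idea under assumed types; `Candidate`, `Retriever`, `Reranker`, and `retrieveThenRerank` are hypothetical and do not reflect the package's internals.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
interface Candidate {
  id: string;
  content: string;
  score: number;
}

type Retriever = (query: string, limit: number) => Promise<Candidate[]>;
type Reranker = (query: string, candidates: Candidate[]) => Promise<Candidate[]>;

interface StageOptions {
  topK: number;
  rerankTopK: number;
  rerankFinalK: number;
  reranker: Reranker | null; // null means reranking is disabled
}

async function retrieveThenRerank(
  query: string,
  retrieve: Retriever,
  options: StageOptions,
): Promise<Candidate[]> {
  if (options.reranker === null) {
    // Single stage: return the top results exactly as retrieved.
    const hits = await retrieve(query, options.topK);
    return hits.slice(0, options.topK);
  }
  // Stage 1: over-fetch a wider candidate pool.
  const candidates = await retrieve(query, options.rerankTopK);
  // Stage 2: rescore the pool, then keep only the best few.
  const rescored = await options.reranker(query, candidates);
  return [...rescored].sort((a, b) => b.score - a.score).slice(0, options.rerankFinalK);
}
```

The pool size is a cost and quality trade-off: a larger `rerankTopK` gives the reranker more chances to surface a good result but increases reranker latency and API usage.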

QueryOptions

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| `topK` | `number` | pipeline default | Result count |
| `useReranker` | `boolean` | `true` if reranker configured | Enable reranking |
| `rerankTopK` | `number` | `20` | Candidates to rerank |
| `rerankFinalK` | `number` | `10` | Final result count after reranking |
| `vectorWeight` | `number` | `0.7` | Vector weight override |
| `bm25Weight` | `number` | `0.3` | BM25 weight override |
| `filter` | `Record<string, unknown>` | | Metadata filter |
| `retrievalMode` | `'hybrid' \| 'vector' \| 'bm25'` | `hybrid` | Search mode |
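
To show how `vectorWeight` and `bm25Weight` could influence the final ranking, here is a minimal sketch of weighted reciprocal rank fusion (RRF). Whether the package weights RRF this way is an assumption; `fuseRrf`, `RankedHit`, and the constant `k = 60` (a conventional RRF default) are illustrative choices, not confirmed details of the library.

```typescript
interface RankedHit {
  id: string;
  score: number;
}

// Both input lists are assumed to be sorted best-first.
function fuseRrf(
  vectorHits: RankedHit[],
  bm25Hits: RankedHit[],
  vectorWeight = 0.7,
  bm25Weight = 0.3,
  k = 60,
): RankedHit[] {
  const fused = new Map<string, number>();

  const accumulate = (hits: RankedHit[], weight: number): void => {
    hits.forEach((hit, index) => {
      const rank = index + 1;
      // RRF uses only the rank, so raw scores from different systems never mix.
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + weight / (k + rank));
    });
  };

  accumulate(vectorHits, vectorWeight);
  accumulate(bm25Hits, bm25Weight);

  return Array.from(fused, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Because RRF ignores raw score magnitudes, it sidesteps the problem that cosine similarities and BM25 scores live on different scales; a weighted-sum strategy would instead need to normalize both score sets first.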

Usage Patterns

Vector-Only Mode

typescript
const results = await pipeline.query('query', {
  retrievalMode: 'vector',
  topK: 10,
});

Cost-Conscious Mode (No Reranker)

typescript
const pipeline = new RAGPipeline({
  qdrantUrl: process.env.QDRANT_URL || 'http://localhost:6333',
  rerankerProvider: null, // skip reranker entirely
});
 
const results = await pipeline.query('query', { topK: 10 });

Metadata Filtering

typescript
const results = await pipeline.query('API rate limits', {
  filter: {
    department: 'engineering',
    status: 'published',
    version: 'v2',
  },
});
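
Since the pipeline is backed by Qdrant, a flat filter like the one above plausibly maps to a Qdrant `must` filter with one exact-match condition per key. The sketch below shows that translation as an assumption about the behavior, not the package's actual code; `toQdrantFilter` is a hypothetical helper, and it assumes metadata is stored under top-level payload keys.

```typescript
type MatchValue = string | number | boolean;

interface QdrantFieldCondition {
  key: string;
  match: { value: MatchValue };
}

interface QdrantFilter {
  must: QdrantFieldCondition[]; // all conditions must hold (logical AND)
}

// Converts { field: value } pairs into exact-match conditions.
function toQdrantFilter(filter: Record<string, unknown>): QdrantFilter {
  const must: QdrantFieldCondition[] = [];
  for (const [key, value] of Object.entries(filter)) {
    if (typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean') {
      must.push({ key, match: { value } });
    } else {
      // Arrays, ranges, and nested objects would need other Qdrant condition types.
      throw new TypeError(`Unsupported filter value for "${key}"`);
    }
  }
  return { must };
}
```

Under this reading, the three-key filter above would match only chunks whose payload has all of `department`, `status`, and `version` set to the given values. Note that Qdrant's exact match applies to keywords, integers, and booleans; floating-point fields would typically use a range condition instead.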

License

MIT