Skip to content
reaatechREAATECH

@reaatech/hybrid-rag

npm v0.1.0

Provides a shared library of TypeScript interfaces and Zod schemas for defining documents, retrieval results, and evaluation metrics within a hybrid RAG pipeline. It serves as the type-safe foundation for the `@reaatech/hybrid-rag` ecosystem, requiring only `zod` as a runtime dependency.

@reaatech/hybrid-rag

npm version License: MIT CI

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Core domain types, Zod schemas, and shared type utilities for the hybrid RAG ecosystem. This package is the single source of truth for all document, chunking, retrieval, evaluation, ablation, and benchmarking types used throughout the @reaatech/hybrid-rag-* family.

Installation

terminal
npm install @reaatech/hybrid-rag
# or
pnpm add @reaatech/hybrid-rag

Feature Overview

  • Domain typesDocument, Chunk, RetrievalResult, HybridResult, RerankedResult, and more
  • Enum typesChunkingStrategy with four strategies (fixed-size, semantic, recursive, sliding-window)
  • Zod schemas — runtime validation for documents, chunks, chunking configs, evaluation samples, ablation configs, vector queries, and BM25 queries
  • Evaluation typesEvaluationSample, EvaluationResult, summary metrics with standard deviations
  • Ablation typesAblationConfig, AblationResult with delta comparisons
  • Benchmarking typesBenchmarkResult with latency percentiles, component breakdowns, cost tracking, throughput, and environment info
  • Zero runtime dependencies beyond zod — lightweight and tree-shakeable
  • Dual ESM/CJS output — works with import and require

Quick Start

typescript
import {
  ChunkingStrategy,
  type Document,
  type Chunk,
  type ChunkingConfig,
  type RetrievalResult,
  type EvaluationSample,
} from '@reaatech/hybrid-rag';
 
const doc: Document = {
  id: 'doc-001',
  content: 'Your document text here...',
  source: '/data/knowledge-base.md',
  metadata: { department: 'engineering' },
};
 
const config: ChunkingConfig = {
  strategy: ChunkingStrategy.SEMANTIC,
  chunkSize: 512,
  overlap: 50,
};

Schema Validation

Every type has a matching Zod schema for runtime validation:

typescript
import { DocumentSchema, ChunkingConfigSchema, validateDocument } from '@reaatech/hybrid-rag';
 
// Validate at the boundary — throws ZodError on invalid data
const validDoc = DocumentSchema.parse(rawJson);
 
// Or use the convenience validators
const validatedDoc = validateDocument(rawJson);

Exports

Document Types

ExportDescription
DocumentSource document with id, content, source, metadata, optional title/author/date
DocumentSchema / DocumentInputZod schema and inferred type for document validation

Chunking Types

ExportDescription
ChunkingStrategyEnum: fixed-size, semantic, recursive, sliding-window
ChunkingConfigStrategy configuration: chunkSize, overlap, thresholds, separators
ChunkingConfigSchemaZod schema with defaults (512 chunk size, 50 overlap)
ChunkText chunk: id, documentId, content, embedding, tokenCount, position, metadata

Retrieval Types

ExportDescription
VectorQueryVector search input: vector, topK, distance metric, filter
BM25QueryBM25 search input: query text, topK, k1/b parameters, filter
RetrievalResultSingle result: chunkId, documentId, content, score, source
HybridResultPost-fusion results with vectorScore, bm25Score, vectorRank, bm25Rank
RerankedResultPost-reranker results with rerankerScore, fusedScore, finalScore

Evaluation Types

ExportDescription
EvaluationSampleGround truth: query_id, query, relevant_docs, relevant_chunks, ideal_answer
EvaluationResultPer-query metrics + aggregate summary with standard deviations
EvaluationSampleSchemaZod schema with required relevant_docs and relevant_chunks arrays

Ablation Types

ExportDescription
AblationConfigBaseline + variants with changes to chunking, retrieval, reranker, weights
AblationResultBaseline metrics + per-variant results with delta comparisons
AblationConfigSchemaZod schema with nullable reranker field

Benchmarking Types

ExportDescription
BenchmarkResultLatency percentiles (p50/p90/p95/p99), component breakdown, cost, throughput

License

MIT