rag-eval-pack · packages

Every package shipped from reaatech/rag-eval-pack, published or pending.

10 packages

@reaatech/rag-eval-cli

v0.1.0

A CLI, exposed as the `rag-eval-pack` command, that runs RAG evaluation suites and quality gates, compares runs, breaks down costs, generates markdown reports, performs LLM-based judging, and starts an MCP server. It also re-exports the full programmatic API of all `@reaatech/rag-eval-*` packages, so it can be imported as a single library.

status: published
published: 1 month ago

rag-eval-core

@reaatech/rag-eval-core

v0.1.0

Canonical TypeScript types and Zod schemas for RAG evaluation data shapes. Exports 18+ types (`EvaluationSample`, `EvalSuiteConfig`, `SampleEvalResult`, `GateConfig`, `JudgeConfig`, etc.) and two Zod schemas (`EvaluationSampleSchema`, `EvalSuiteConfigSchema`) for runtime validation, with zero runtime dependencies beyond `zod`.

status: published
published: 1 month ago

rag-eval-cost

@reaatech/rag-eval-cost

v0.1.0

Cost tracking, pricing, budgeting, and reporting infrastructure for RAG evaluations. It provides `CostTracker`, `Pricing`, `BudgetManager`, and `CostReporter` classes that track per-sample token consumption, enforce budget limits with configurable alert thresholds, and generate cost reports in JSON and JUnit XML formats.

status: published
published: 1 month ago
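
The budget enforcement described for `@reaatech/rag-eval-cost` can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `BudgetSketch`, `TokenPriceSketch`, the method name `record`, and the prices are hypothetical and are not the package's `BudgetManager` or `Pricing` API.

```typescript
// Hypothetical sketch: names, fields, and prices are illustrative only,
// not the API of @reaatech/rag-eval-cost.
interface TokenPriceSketch {
  inputUsdPerMTok: number;  // assumed price per million input tokens
  outputUsdPerMTok: number; // assumed price per million output tokens
}

interface RecordOutcome {
  costUsd: number;     // cost of this sample
  alerts: number[];    // thresholds crossed for the first time by this sample
  overBudget: boolean; // true once cumulative spend exceeds the limit
}

class BudgetSketch {
  private spentUsd = 0;
  private readonly fired = new Set<number>();
  private readonly limitUsd: number;
  private readonly thresholds: number[];
  private readonly price: TokenPriceSketch;

  constructor(limitUsd: number, thresholds: number[], price: TokenPriceSketch) {
    this.limitUsd = limitUsd;
    this.thresholds = [...thresholds].sort((a, b) => a - b);
    this.price = price;
  }

  // Record one sample's token usage and report any newly crossed thresholds.
  record(inputTokens: number, outputTokens: number): RecordOutcome {
    const costUsd =
      (inputTokens / 1_000_000) * this.price.inputUsdPerMTok +
      (outputTokens / 1_000_000) * this.price.outputUsdPerMTok;
    this.spentUsd += costUsd;

    const alerts: number[] = [];
    for (const t of this.thresholds) {
      // Each threshold is a fraction of the limit and fires at most once.
      if (this.spentUsd >= t * this.limitUsd && !this.fired.has(t)) {
        this.fired.add(t);
        alerts.push(t);
      }
    }
    return { costUsd, alerts, overBudget: this.spentUsd > this.limitUsd };
  }
}

const budget = new BudgetSketch(1, [0.5, 0.9], { inputUsdPerMTok: 1, outputUsdPerMTok: 4 });
const first = budget.record(250_000, 62_500);  // crosses the 50% threshold
const second = budget.record(250_000, 62_500); // crosses the 90% threshold
const third = budget.record(250_000, 62_500);  // exceeds the limit
```

Firing each threshold only once keeps alert output readable on long runs; a real implementation might also stop the run when `overBudget` becomes true.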

rag-eval-dataset

@reaatech/rag-eval-dataset

v0.1.0

A Zod-validated dataset loader and validator for RAG evaluation samples, supporting JSONL, JSON, and YAML formats with duplicate detection, synthetic generation from templates, and version tracking. Exports `DatasetLoader`, `DatasetValidator`, and `loadEvalConfig`.

status: published
published: 1 month ago
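
As a rough illustration of JSONL loading with validation and duplicate detection, the sketch below parses a string line by line. It is not the `DatasetLoader` API: the `SampleSketch` fields (`id`, `question`, `answer`) are assumed, and a hand-written type guard stands in for the package's Zod schema so the example has no dependencies.

```typescript
// Hypothetical sketch: field names and functions are illustrative only,
// not the API of @reaatech/rag-eval-dataset.
interface SampleSketch {
  id: string;
  question: string;
  answer: string;
}

// Hand-written shape check standing in for a Zod schema.
function isSampleSketch(value: unknown): value is SampleSketch {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["id"] === "string" &&
    typeof record["question"] === "string" &&
    typeof record["answer"] === "string"
  );
}

function parseJsonlSketch(text: string): { samples: SampleSketch[]; errors: string[] } {
  const samples: SampleSketch[] = [];
  const errors: string[] = [];
  const seenIds = new Set<string>();

  text.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // skip blank lines
    const lineNo = index + 1;

    let value: unknown;
    try {
      value = JSON.parse(line);
    } catch {
      errors.push(`line ${lineNo}: invalid JSON`);
      return;
    }
    if (!isSampleSketch(value)) {
      errors.push(`line ${lineNo}: missing or mistyped field`);
      return;
    }
    if (seenIds.has(value.id)) {
      errors.push(`line ${lineNo}: duplicate id "${value.id}"`);
      return;
    }
    seenIds.add(value.id);
    samples.push(value);
  });

  return { samples, errors };
}

const jsonl = [
  '{"id":"s1","question":"What is RAG?","answer":"Retrieval-augmented generation."}',
  '{"id":"s2","question":"What is a gate?","answer":"A pass/fail check."}',
  '{"id":"s1","question":"Duplicate id","answer":"Should be rejected."}',
  '{"id":"s3","question":"Missing answer"}',
  "not json",
].join("\n");

const loaded = parseJsonlSketch(jsonl);
```

Collecting errors with line numbers, rather than throwing on the first bad line, lets a single pass report every problem in a dataset file.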

rag-eval-gate

@reaatech/rag-eval-gate

v0.1.0

A quality gate engine for RAG evaluation pipelines that enforces threshold-based metric checks and baseline regression detection, returning a `GateResult` object with pass/fail status and per-gate failure messages. It pairs with `@reaatech/rag-eval-core` for evaluation result types and is designed for CI/CD integration with formatted output and configurable exit codes.

status: published
published: 1 month ago
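
The two kinds of check named in the `@reaatech/rag-eval-gate` description, absolute thresholds and baseline regression detection, can be sketched as follows. The shapes `GateSpecSketch` and `GateOutcomeSketch` and the function `evaluateGatesSketch` are hypothetical; the package's real `GateConfig` and `GateResult` types may look quite different.

```typescript
// Hypothetical sketch: types and function are illustrative only,
// not the API of @reaatech/rag-eval-gate.
interface GateSpecSketch {
  metric: string;
  minScore?: number;      // absolute threshold the score must reach
  maxRegression?: number; // largest allowed drop relative to the baseline
}

interface GateOutcomeSketch {
  passed: boolean;
  failures: string[];
}

function evaluateGatesSketch(
  gates: GateSpecSketch[],
  current: Record<string, number>,
  baseline: Record<string, number> = {},
): GateOutcomeSketch {
  const failures: string[] = [];
  for (const gate of gates) {
    const score: number | undefined = current[gate.metric];
    if (score === undefined) {
      failures.push(`${gate.metric}: no score reported`);
      continue;
    }
    // Threshold check: the score must not fall below minScore.
    if (gate.minScore !== undefined && score < gate.minScore) {
      failures.push(`${gate.metric}: ${score} is below threshold ${gate.minScore}`);
    }
    // Regression check: the drop from the baseline must stay within maxRegression.
    const base: number | undefined = baseline[gate.metric];
    if (gate.maxRegression !== undefined && base !== undefined && base - score > gate.maxRegression) {
      failures.push(`${gate.metric}: dropped ${(base - score).toFixed(3)} from baseline ${base}`);
    }
  }
  return { passed: failures.length === 0, failures };
}

const outcome = evaluateGatesSketch(
  [
    { metric: "faithfulness", minScore: 0.8 },
    { metric: "relevance", maxRegression: 0.05 },
  ],
  { faithfulness: 0.75, relevance: 0.9 },
  { relevance: 0.92 },
);

// A CI script could map the result to a process exit code.
const exitCode: number = outcome.passed ? 0 : 1;
```

In this run the faithfulness threshold fails while the relevance drop stays within its allowance, so the outcome carries exactly one failure message and a non-zero exit code.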

rag-eval-judge

@reaatech/rag-eval-judge

v0.1.0

A TypeScript class (`JudgeEngine`) that uses an LLM (Anthropic, OpenAI, or Google) to score RAG outputs on metrics like faithfulness and relevance, with optional consensus voting across multiple models and calibration against human labels.

status: published
published: 1 month ago
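
One plausible way to combine scores from several judge models is a median vote with an agreement check, sketched below. This is an assumption about how consensus could work, not a description of `JudgeEngine`: the function `consensusSketch`, the `maxSpread` parameter, and the model names are hypothetical, and the LLM calls themselves are omitted.

```typescript
// Hypothetical sketch: aggregation rule and names are illustrative only,
// not the API of @reaatech/rag-eval-judge. Votes are assumed to be in [0, 1].
interface ConsensusSketch {
  score: number;   // median of the votes
  spread: number;  // difference between highest and lowest vote
  agreed: boolean; // true when the spread is within maxSpread
}

function consensusSketch(votes: number[], maxSpread = 0.25): ConsensusSketch {
  if (votes.length === 0) {
    throw new Error("at least one vote is required");
  }
  const sorted = [...votes].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // The median resists a single outlier model better than the mean does.
  const score =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const spread = sorted[sorted.length - 1] - sorted[0];
  return { score, spread, agreed: spread <= maxSpread };
}

// Scores as they might come back from three judge models (made-up values).
const votesByModel: Record<string, number> = {
  "model-a": 0.75,
  "model-b": 0.5,
  "model-c": 1.0,
};

const verdict = consensusSketch(Object.values(votesByModel));
```

Reporting the spread alongside the score lets a caller flag samples where the judges disagree, which are natural candidates for human review or calibration.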

rag-eval-mcp-server

@reaatech/rag-eval-mcp-server

v0.1.0

An MCP server that exposes RAG evaluation tools as a three-layer API: atomic judge operations, orchestrated suite runs, and CI-style regression gates. It provides `createMcpServer()` and `startMcpServer()` functions for integration with MCP clients like Claude Desktop or Cursor.

status: published
published: 1 month ago

rag-eval-metrics

@reaatech/rag-eval-metrics

v0.1.0

Provides four heuristic metric scorers (faithfulness, relevance, context precision, context recall) for evaluating RAG outputs, plus a `MetricsEngine` orchestrator that runs them in parallel with configurable concurrency. Each scorer is a class with a `score` method that returns a numeric score and supporting details, using only NLP libraries (`compromise`, `natural`) with no LLM calls.

status: published
published: 1 month ago
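
To give a feel for what a heuristic scorer with no LLM calls can look like, the sketch below scores faithfulness as the fraction of answer tokens that also appear in the retrieved context. This is a deliberately crude token-overlap stand-in, not the algorithm used by `@reaatech/rag-eval-metrics` (which relies on `compromise` and `natural`); the function names are hypothetical.

```typescript
// Hypothetical sketch: a plain token-overlap heuristic for illustration,
// not the scoring algorithm of @reaatech/rag-eval-metrics.
interface OverlapResult {
  score: number;         // fraction of answer tokens found in the context
  unsupported: string[]; // answer tokens that never appear in the context
}

// Lowercase the text and collect its unique alphanumeric tokens.
function tokenizeSketch(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

function overlapFaithfulness(answer: string, contexts: string[]): OverlapResult {
  const answerTokens = tokenizeSketch(answer);
  const contextTokens = tokenizeSketch(contexts.join(" "));
  if (answerTokens.size === 0) {
    return { score: 0, unsupported: [] };
  }
  const unsupported = Array.from(answerTokens).filter((t) => !contextTokens.has(t));
  const supported = answerTokens.size - unsupported.length;
  return { score: supported / answerTokens.size, unsupported };
}

const contexts = ["The capital of France is Paris."];
const grounded = overlapFaithfulness("Paris is the capital of France", contexts);
const partial = overlapFaithfulness("Berlin is large", contexts);
```

Here `grounded` scores 1 because every answer token occurs in the context, while `partial` scores one third, with `berlin` and `large` listed as unsupported. Returning the unsupported tokens as supporting details mirrors the idea of a score plus an explanation.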

rag-eval-observability

@reaatech/rag-eval-observability

v0.1.0

Provides structured JSON logging via Pino, OpenTelemetry tracing, and OpenTelemetry metrics specifically for RAG evaluation pipelines, exporting functions like `createLogger`, `traceEvalRun`, and `recordEvalRun`.

status: published
published: 1 month ago

rag-eval-suite

@reaatech/rag-eval-suite

v0.1.0

A class (`EvaluationSuite`) that orchestrates RAG evaluation runs by executing heuristic metrics, optional LLM judge scoring, cost tracking, and quality gates against a dataset, returning a `SuiteRunResult` with aggregated metrics and gate pass/fail status.

status: published
published: 1 month ago
