REAATECH

classifier-evals · packages

Every package shipped from reaatech/classifier-evals, published or pending.

8 packages

@reaatech/classifier-evals

v0.1.0
Provides a shared library of TypeScript types, Zod schemas, and observability utilities for classification evaluation workflows. It includes pre-configured Pino logging, OpenTelemetry instrumentation, and PII redaction helpers to standardize data handling across the classifier-evals ecosystem.
status: published
published: 7 days ago

@reaatech/classifier-evals-cli

v0.1.0
Provides a CLI for executing classifier evaluations, comparing model performance, enforcing regression gates, and running LLM-as-judge workflows. It outputs results in JSON, HTML, or JUnit formats and is designed for integration into CI pipelines.
status: published
published: 7 days ago

@reaatech/classifier-evals-dataset

v0.1.0
Provides utilities for loading, validating, and partitioning classifier evaluation datasets from CSV, JSON, or JSONL files. It exports a set of functions for performing stratified splits, K-fold cross-validation, and label normalization on standardized dataset objects.
status: published
published: 7 days ago
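
The stratified split mentioned in this description can be illustrated with a minimal sketch. This is not the package's API: the `Sample` shape, `seededRandom`, and `stratifiedSplit` are hypothetical names chosen for illustration, and the package's own functions and signatures may differ. The idea is to shuffle each label group separately and take the same fraction from each, so the test set roughly preserves the label distribution.

```typescript
// Hypothetical sample shape; the real package's dataset types may differ.
interface Sample {
  text: string;
  label: string;
}

// Small seeded PRNG (a mulberry32-style generator) so splits are reproducible.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Illustrative stratified split: each label contributes the same fraction to the test set.
function stratifiedSplit(
  samples: Sample[],
  testFraction: number,
  seed = 42
): { train: Sample[]; test: Sample[] } {
  const rand = seededRandom(seed);

  // Group samples by label.
  const byLabel = new Map<string, Sample[]>();
  samples.forEach((s) => {
    const group = byLabel.get(s.label) ?? [];
    group.push(s);
    byLabel.set(s.label, group);
  });

  const train: Sample[] = [];
  const test: Sample[] = [];
  byLabel.forEach((group) => {
    // Fisher-Yates shuffle within the label group.
    const shuffled = [...group];
    for (let i = shuffled.length - 1; i > 0; i--) {
      const j = Math.floor(rand() * (i + 1));
      [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
    }
    // Take the requested fraction of this group for the test set.
    const nTest = Math.round(shuffled.length * testFraction);
    test.push(...shuffled.slice(0, nTest));
    train.push(...shuffled.slice(nTest));
  });
  return { train, test };
}
```

With 8 samples of one label and 4 of another, a `testFraction` of 0.25 places 2 and 1 samples of each label in the test set respectively, whatever the seed.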

@reaatech/classifier-evals-exporters

v0.1.0
Exports classifier evaluation results into JSON, interactive HTML reports, or observability traces for Arize Phoenix and Langfuse. It provides a set of utility functions that transform `EvalRun` objects into these formats for reporting and analysis.
status: published
published: 7 days ago

@reaatech/classifier-evals-gates

v0.1.0
Evaluates classification model performance against threshold, baseline, and distribution gates using a configurable engine. It provides a `GateEngine` instance that processes metrics and exports results into GitHub Actions annotations, JUnit XML, or PR comment markdown.
status: published
published: 7 days ago
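
To make the threshold and baseline gate ideas concrete, here is a minimal sketch. It does not use or reproduce the package's `GateEngine`; the types and the `evaluateGates` function are hypothetical, and the real configuration format may look quite different. A threshold gate requires a metric to reach a minimum value, and a baseline gate limits how far a metric may drop relative to a previous run.

```typescript
// Hypothetical types for illustration only; not the package's GateEngine API.
type Metrics = Record<string, number>;

interface ThresholdGate {
  kind: "threshold";
  metric: string;
  min: number; // metric must be >= min
}

interface BaselineGate {
  kind: "baseline";
  metric: string;
  maxDrop: number; // allowed decrease relative to the baseline value
}

type Gate = ThresholdGate | BaselineGate;

interface GateResult {
  gate: Gate;
  passed: boolean;
  message: string;
}

function evaluateGates(gates: Gate[], current: Metrics, baseline: Metrics = {}): GateResult[] {
  return gates.map((gate): GateResult => {
    const value: number | undefined = current[gate.metric];
    if (value === undefined) {
      return { gate, passed: false, message: `${gate.metric}: metric missing` };
    }
    if (gate.kind === "threshold") {
      return {
        gate,
        passed: value >= gate.min,
        message: `${gate.metric}=${value} (min ${gate.min})`,
      };
    }
    // Baseline gate: compare against the same metric from a previous run.
    const base: number | undefined = baseline[gate.metric];
    if (base === undefined) {
      return { gate, passed: false, message: `${gate.metric}: baseline missing` };
    }
    const drop = base - value;
    return {
      gate,
      passed: drop <= gate.maxDrop,
      message: `${gate.metric}=${value} (baseline ${base}, max drop ${gate.maxDrop})`,
    };
  });
}
```

In a CI job, a caller could exit with a non-zero status when any result has `passed` set to `false`. Treating a missing metric as a failure rather than a skip is a design choice of this sketch.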

@reaatech/classifier-evals-judge

v0.1.0
Evaluates classification model outputs using LLM-as-a-judge with support for consensus voting, real-time cost tracking, and PII redaction. It provides a `createJudgeEngine` factory function that returns an engine instance for executing batch evaluations against OpenAI or Anthropic APIs.
status: published
published: 7 days ago

@reaatech/classifier-evals-mcp-server

v0.1.0
Exposes classifier evaluation workflows—including running evaluations, checking regression gates, and performing LLM-as-judge comparisons—as a set of Model Context Protocol (MCP) tools. It provides a CLI executable and a `startMCPServer` function that runs over stdio, requiring the `@modelcontextprotocol/sdk` at runtime.
status: published
published: 7 days ago

@reaatech/classifier-evals-metrics

v0.1.0
Calculates classification performance metrics, including confusion matrices, multi-class F1 scores, and statistical model comparisons. It provides a collection of utility functions that operate on arrays of classification result objects defined by the `@reaatech/classifier-evals` core package.
status: published
published: 7 days ago
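
As a rough illustration of the metrics named in this description, the sketch below builds a confusion matrix and a macro-averaged F1 score. The `ClassificationResult` shape and both function names are assumptions made for this example; the actual types come from the `@reaatech/classifier-evals` core package and may differ.

```typescript
// Hypothetical result shape; the core package's real type may use other field names.
interface ClassificationResult {
  label: string; // ground-truth class
  predicted: string; // class chosen by the model
}

// matrix[i][j] counts samples whose true label is labels[i] and predicted label is labels[j].
function confusionMatrix(results: ClassificationResult[]): { labels: string[]; matrix: number[][] } {
  const seen = new Set<string>();
  results.forEach((r) => {
    seen.add(r.label);
    seen.add(r.predicted);
  });
  const labels = Array.from(seen).sort();
  const index = new Map<string, number>();
  labels.forEach((l, i) => index.set(l, i));

  const matrix = labels.map(() => labels.map(() => 0));
  results.forEach((r) => {
    const row = index.get(r.label)!;
    const col = index.get(r.predicted)!;
    matrix[row][col] += 1;
  });
  return { labels, matrix };
}

// Macro F1: compute F1 per class, then take the unweighted mean across classes.
function macroF1(matrix: number[][]): number {
  const n = matrix.length;
  if (n === 0) return 0;
  let total = 0;
  for (let i = 0; i < n; i++) {
    const tp = matrix[i][i];
    let rowSum = 0;
    let colSum = 0;
    for (let j = 0; j < n; j++) {
      rowSum += matrix[i][j]; // all samples truly in class i
      colSum += matrix[j][i]; // all samples predicted as class i
    }
    const fn = rowSum - tp;
    const fp = colSum - tp;
    const denom = 2 * tp + fp + fn;
    // A class with no true or predicted samples contributes 0 by convention here.
    total += denom === 0 ? 0 : (2 * tp) / denom;
  }
  return total / n;
}
```

Macro averaging weights every class equally, so a rare class that is classified poorly lowers the score as much as a common one would.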