classifier-evals · packages
Every package shipped from reaatech/classifier-evals, published or pending.
8 packages
@reaatech/classifier-evals
Provides a shared library of TypeScript types, Zod schemas, and observability utilities for classification evaluation workflows. It includes pre-configured Pino logging, OpenTelemetry instrumentation, and PII redaction helpers to standardize data handling across the classifier-evals ecosystem.
- status: published
- published: 7 days ago
@reaatech/classifier-evals-cli
Provides a CLI for executing classifier evaluations, comparing model performance, enforcing regression gates, and running LLM-as-judge workflows. It outputs results in JSON, HTML, or JUnit formats and is designed for integration into CI pipelines.
- status: published
- published: 7 days ago
@reaatech/classifier-evals-dataset
Provides utilities for loading, validating, and partitioning classifier evaluation datasets from CSV, JSON, or JSONL files. It exports a set of functions for performing stratified splits, K-fold cross-validation, and label normalization on standardized dataset objects.
- status: published
- published: 7 days ago
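To illustrate what a stratified split does, here is a minimal, self-contained sketch. It does not use the package's API: `LabeledExample`, `stratifiedSplit`, and the seeded `mulberry32` generator are hypothetical names chosen for this example, and the package's own functions and dataset objects may look quite different.

```typescript
// Hypothetical record shape for illustration; not the package's dataset type.
interface LabeledExample {
  text: string;
  label: string;
}

// Small seeded PRNG (mulberry32) so the shuffle is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Split examples into train/test so each label keeps roughly the same
// proportion in both partitions.
function stratifiedSplit<T extends { label: string }>(
  examples: T[],
  testFraction: number,
  seed = 42,
): { train: T[]; test: T[] } {
  const random = mulberry32(seed);

  // Group examples by label.
  const byLabel = new Map<string, T[]>();
  for (const ex of examples) {
    const group = byLabel.get(ex.label) ?? [];
    group.push(ex);
    byLabel.set(ex.label, group);
  }

  const train: T[] = [];
  const test: T[] = [];
  byLabel.forEach((group) => {
    // Fisher-Yates shuffle on a copy of the group.
    const shuffled = group.slice();
    for (let i = shuffled.length - 1; i > 0; i--) {
      const j = Math.floor(random() * (i + 1));
      const tmp = shuffled[i];
      shuffled[i] = shuffled[j];
      shuffled[j] = tmp;
    }
    // Take the same fraction of every label for the test partition.
    const testCount = Math.round(shuffled.length * testFraction);
    test.push(...shuffled.slice(0, testCount));
    train.push(...shuffled.slice(testCount));
  });

  return { train, test };
}
```

Splitting per label rather than over the whole dataset keeps rare classes from vanishing out of the test partition, which matters when evaluation metrics are averaged per class.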
@reaatech/classifier-evals-exporters
Exports classifier evaluation results into JSON, interactive HTML reports, or observability traces for Arize Phoenix and Langfuse. It provides a set of utility functions that transform `EvalRun` objects into these formats for reporting and analysis.
- status: published
- published: 7 days ago
@reaatech/classifier-evals-gates
Evaluates classification model performance against threshold, baseline, and distribution gates using a configurable engine. It provides a `GateEngine` instance that processes metrics and exports results into GitHub Actions annotations, JUnit XML, or PR comment markdown.
- status: published
- published: 7 days ago
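As a rough illustration of the threshold-gate idea, the sketch below checks a set of metrics against minimum values. It is not the `GateEngine` API: `ThresholdGate`, `GateResult`, and `checkThresholdGates` are hypothetical names invented for this example, and the metric keys are assumptions.

```typescript
// Hypothetical shapes for illustration; the real GateEngine may differ.
interface ThresholdGate {
  metric: string; // name of the metric to check, e.g. "accuracy"
  min: number;    // minimum acceptable value
}

interface GateResult {
  metric: string;
  min: number;
  actual: number | undefined; // undefined when the metric was not reported
  passed: boolean;
}

// Evaluate each gate against the reported metrics.
// A gate whose metric is missing is treated as failed.
function checkThresholdGates(
  metrics: Record<string, number>,
  gates: ThresholdGate[],
): GateResult[] {
  return gates.map((gate) => {
    const actual: number | undefined = metrics[gate.metric];
    const passed = actual !== undefined && actual >= gate.min;
    return { metric: gate.metric, min: gate.min, actual, passed };
  });
}

// Example: fail the run if any gate did not pass.
const results = checkThresholdGates(
  { accuracy: 0.91, macro_f1: 0.78 },
  [
    { metric: "accuracy", min: 0.9 },
    { metric: "macro_f1", min: 0.8 },
  ],
);
const allPassed = results.every((r) => r.passed);
```

Treating a missing metric as a failure is a deliberate choice in this sketch: in CI, a silently skipped gate is usually worse than a noisy one.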
@reaatech/classifier-evals-judge
Evaluates classification model outputs using LLM-as-a-judge with support for consensus voting, real-time cost tracking, and PII redaction. It provides a `createJudgeEngine` factory function that returns an engine instance for executing batch evaluations against OpenAI or Anthropic APIs.
- status: published
- published: 7 days ago
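Consensus voting can be pictured as a majority vote over several judge verdicts. The sketch below is a generic illustration only: `majorityVote` and its return shape are hypothetical, it makes no LLM calls, and it is not how `createJudgeEngine` is known to work internally.

```typescript
// Hypothetical helper: combine verdicts from several judge runs.
// Returns the winning label (or null on a tie or empty input) and the
// share of votes held by the most common label.
function majorityVote(votes: string[]): { label: string | null; agreement: number } {
  if (votes.length === 0) {
    return { label: null, agreement: 0 };
  }

  // Count how often each verdict appears.
  const counts = new Map<string, number>();
  for (const vote of votes) {
    counts.set(vote, (counts.get(vote) ?? 0) + 1);
  }

  let best: string | null = null;
  let bestCount = 0;
  let tie = false;
  for (const [label, count] of Array.from(counts)) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
      tie = false;
    } else if (count === bestCount) {
      tie = true; // another label shares the current top count
    }
  }

  return { label: tie ? null : best, agreement: bestCount / votes.length };
}
```

Returning `null` on a tie leaves the decision to the caller, who might request another judge run or flag the sample for human review.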
@reaatech/classifier-evals-mcp-server
Exposes classifier evaluation workflows (running evaluations, checking regression gates, and performing LLM-as-judge comparisons) as a set of Model Context Protocol (MCP) tools. It provides a CLI executable and a `startMCPServer` function that runs over stdio, and it requires `@modelcontextprotocol/sdk` at runtime.
- status: published
- published: 7 days ago
@reaatech/classifier-evals-metrics
Calculates classification performance metrics, including confusion matrices, multi-class F1 scores, and statistical model comparisons. It provides a collection of utility functions that operate on arrays of classification result objects defined by the `@reaatech/classifier-evals` core package.
- status: published
- published: 7 days ago
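For readers unfamiliar with the metrics named above, here is a minimal sketch of a confusion matrix and macro-averaged F1 over an array of results. The `ClassificationResult` shape and both function names are assumptions made for this example; the actual result type comes from the `@reaatech/classifier-evals` core package and may use different field names.

```typescript
// Hypothetical result shape; the core package's type may differ.
interface ClassificationResult {
  expected: string;
  predicted: string;
}

// Confusion matrix as matrix[expected][predicted] = count.
function confusionMatrix(
  results: ClassificationResult[],
): Record<string, Record<string, number>> {
  const matrix: Record<string, Record<string, number>> = {};
  for (const { expected, predicted } of results) {
    const row = matrix[expected] ?? (matrix[expected] = {});
    row[predicted] = (row[predicted] ?? 0) + 1;
  }
  return matrix;
}

// Macro F1: compute F1 per label, then take the unweighted mean.
function macroF1(results: ClassificationResult[]): number {
  const labels = new Set<string>();
  const tp = new Map<string, number>();
  const fp = new Map<string, number>();
  const fn = new Map<string, number>();
  const bump = (m: Map<string, number>, k: string) => m.set(k, (m.get(k) ?? 0) + 1);

  for (const { expected, predicted } of results) {
    labels.add(expected);
    labels.add(predicted);
    if (expected === predicted) {
      bump(tp, expected);
    } else {
      bump(fp, predicted); // predicted label gained a false positive
      bump(fn, expected);  // expected label gained a false negative
    }
  }

  if (labels.size === 0) return 0;

  let sum = 0;
  for (const label of Array.from(labels)) {
    const t = tp.get(label) ?? 0;
    const denom = 2 * t + (fp.get(label) ?? 0) + (fn.get(label) ?? 0);
    sum += denom === 0 ? 0 : (2 * t) / denom; // F1 = 2TP / (2TP + FP + FN)
  }
  return sum / labels.size;
}
```

Because macro F1 weights every label equally, a poorly handled rare class pulls the score down more than it would under plain accuracy.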
