@reaatech/llm-judge-infra

npm v0.1.0

Provides infrastructure utilities for LLM evaluation: a `BatchProcessor` for concurrent execution, a `CostTracker` for budget enforcement, and a `MetricsCollector` for monitoring performance. These are exported as classes, alongside structured logging helpers built on Pino.

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Infrastructure utilities including cost tracking with budget enforcement, structured Pino logging, metrics collection, and batch processing with configurable concurrency and retry.

Installation

terminal
npm install @reaatech/llm-judge-infra
# or
pnpm add @reaatech/llm-judge-infra

Feature Overview

  • CostTracker with period-aware cost aggregation and budget alerts
  • Pino logger with structured log helpers (judgment, error, cache hit/miss, budget exceeded)
  • MetricsCollector tracking judgments, latency, scores, costs, and cache hit rates
  • BatchProcessor with concurrency control, progress callbacks, and automatic retry
  • Zero external dependencies beyond pino for logging

Quick Start

typescript
import { CostTracker } from '@reaatech/llm-judge-infra';

// Track costs against a daily budget limit of 10.0.
const tracker = new CostTracker({
  budget: { limit: 10.0, period: 'daily' },
});

// `judgment` is a completed judgment result obtained elsewhere (not shown).
tracker.track(judgment);

const report = tracker.generateReport();
console.log(report.totalCost, report.averageCostPerJudgment);

typescript
import { BatchProcessor } from '@reaatech/llm-judge-infra';

// `judgmentEngine` and `items` are created elsewhere (not shown).
const processor = new BatchProcessor({
  engine: judgmentEngine,
  concurrency: 5,
  onProgress: (done, total) => console.log(`${done}/${total}`),
  onError: (id, error) => console.error(`Failed ${id}:`, error.message),
});

const results = await processor.process(items);

API Reference

CostTracker

| Export | Description |
| --- | --- |
| `constructor({budget?, eventBus?})` | Create a tracker with optional budget and event bus |
| `track(judgment)` | Record judgment cost (throws `BudgetExceededError` if over limit) |
| `getTotalCost()` | Sum of all tracked costs |
| `getPeriodCost()` | Cost within the current period window |
| `getCostByCriteria()` | Cost filtered by evaluation criteria |
| `getCostByProvider()` | Cost filtered by provider name |
| `getCostByModel()` | Cost filtered by model name |
| `generateReport()` | Full cost report with breakdowns |
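
To make the period window and budget check concrete, here is a minimal sketch of how such a tracker could work. It is not the package's implementation: `SketchCostTracker`, `SketchBudgetExceededError`, `CostEntry`, and the `hourly` period are hypothetical names, the window is assumed to be rolling rather than calendar-aligned, and the entry is assumed to be recorded before the error is thrown.

```typescript
// Hypothetical sketch; not the actual @reaatech/llm-judge-infra implementation.
type Period = 'hourly' | 'daily';

interface CostEntry {
  cost: number;      // assumed unit: USD
  timestamp: number; // epoch milliseconds
  provider: string;
}

// Window length per period (assumption: rolling windows, not calendar days).
const PERIOD_MS: Record<Period, number> = {
  hourly: 60 * 60 * 1000,
  daily: 24 * 60 * 60 * 1000,
};

class SketchBudgetExceededError extends Error {
  constructor(public readonly spent: number, public readonly limit: number) {
    super(`Budget exceeded: spent ${spent} of ${limit}`);
    this.name = 'SketchBudgetExceededError';
  }
}

class SketchCostTracker {
  private readonly entries: CostEntry[] = [];

  constructor(private readonly budget?: { limit: number; period: Period }) {}

  // Records the entry first, then throws if the window ending at the
  // entry's timestamp is over the limit.
  track(entry: CostEntry): void {
    this.entries.push(entry);
    if (!this.budget) return;
    const spent = this.getPeriodCost(entry.timestamp);
    if (spent > this.budget.limit) {
      throw new SketchBudgetExceededError(spent, this.budget.limit);
    }
  }

  getTotalCost(): number {
    return this.entries.reduce((sum, e) => sum + e.cost, 0);
  }

  // Sum of costs whose timestamps fall inside the window ending at `now`.
  getPeriodCost(now: number = Date.now()): number {
    if (!this.budget) return this.getTotalCost();
    const windowStart = now - PERIOD_MS[this.budget.period];
    return this.entries
      .filter((e) => e.timestamp > windowStart && e.timestamp <= now)
      .reduce((sum, e) => sum + e.cost, 0);
  }

  getCostByProvider(): Record<string, number> {
    const byProvider: Record<string, number> = {};
    for (const e of this.entries) {
      byProvider[e.provider] = (byProvider[e.provider] ?? 0) + e.cost;
    }
    return byProvider;
  }
}
```

Because the sketch records before it throws, an over-budget judgment still appears in the totals; a caller that wants to reject the judgment outright would need to check the period cost before tracking.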

BatchProcessor

| Export | Description |
| --- | --- |
| `constructor({engine, concurrency?, onProgress?, onError?})` | Create with engine, concurrency (default 3), and callbacks |
| `process(items[])` | Evaluate all items, return `BatchResult[]` |
| `processWithRetry(items[], options?)` | Process with automatic retries on transient errors |
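
The combination of a concurrency limit, progress callbacks, and retry can be illustrated with a small worker-pool sketch. This is an assumption about the general technique, not the package's code: `processBatch`, `SketchBatchResult`, and `maxRetries` are hypothetical names, and unlike the real `processWithRetry` (which is documented to retry transient errors) this version retries every failure immediately with no backoff.

```typescript
// Hypothetical sketch of a concurrency-limited worker pool with retry.
interface SketchBatchResult<R> {
  index: number;
  ok: boolean;
  value?: R;
  error?: Error;
  attempts: number;
}

interface SketchBatchOptions {
  concurrency?: number; // defaults to 3, mirroring the documented default
  maxRetries?: number;  // extra attempts after the first failure (assumed name)
  onProgress?: (done: number, total: number) => void;
}

async function processBatch<T, R>(
  items: T[],
  evaluate: (item: T) => Promise<R>,
  options: SketchBatchOptions = {},
): Promise<SketchBatchResult<R>[]> {
  const concurrency = Math.max(1, options.concurrency ?? 3);
  const maxRetries = options.maxRetries ?? 0;
  const results = new Array<SketchBatchResult<R>>(items.length);
  let next = 0; // index of the next unclaimed item
  let done = 0; // number of items finished (success or final failure)

  // Each worker repeatedly claims the next item until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      let attempts = 0;
      for (;;) {
        attempts++;
        try {
          const value = await evaluate(items[index]);
          results[index] = { index, ok: true, value, attempts };
          break;
        } catch (err) {
          if (attempts > maxRetries) {
            const error = err instanceof Error ? err : new Error(String(err));
            results[index] = { index, ok: false, error, attempts };
            break;
          }
        }
      }
      done++;
      options.onProgress?.(done, items.length);
    }
  }

  const workerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results are stored by input index, so the output order matches the input order even when items finish out of order.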

MetricsCollector

| Export | Description |
| --- | --- |
| `recordJudgment()` | Record a single judgment |
| `recordCacheHit()` | Increment cache hit counter |
| `recordCacheMiss()` | Increment cache miss counter |
| `recordFailure()` | Increment failure counter |
| `snapshot()` | Return `MetricsSnapshot` with all current values |
| `reset()` | Reset all counters to zero |
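
As a rough illustration of what a snapshot might aggregate, the sketch below keeps running counters and derives averages and a cache hit rate on demand. The class name, the argument shape of `recordJudgment`, and every field of `MetricsSnapshotSketch` are assumptions made for this example; the real `MetricsSnapshot` type may differ.

```typescript
// Hypothetical sketch; field names and argument shapes are assumptions.
interface MetricsSnapshotSketch {
  judgments: number;
  failures: number;
  averageLatencyMs: number;
  averageScore: number;
  totalCost: number;
  cacheHitRate: number; // hits / (hits + misses), or 0 when there were no lookups
}

// Division that returns 0 instead of NaN for an empty denominator.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

class SketchMetricsCollector {
  private judgments = 0;
  private failures = 0;
  private cacheHits = 0;
  private cacheMisses = 0;
  private latencySumMs = 0;
  private scoreSum = 0;
  private totalCost = 0;

  recordJudgment(j: { latencyMs: number; score: number; cost: number }): void {
    this.judgments++;
    this.latencySumMs += j.latencyMs;
    this.scoreSum += j.score;
    this.totalCost += j.cost;
  }

  recordCacheHit(): void { this.cacheHits++; }
  recordCacheMiss(): void { this.cacheMisses++; }
  recordFailure(): void { this.failures++; }

  // Derives averages from the running sums at call time.
  snapshot(): MetricsSnapshotSketch {
    return {
      judgments: this.judgments,
      failures: this.failures,
      averageLatencyMs: ratio(this.latencySumMs, this.judgments),
      averageScore: ratio(this.scoreSum, this.judgments),
      totalCost: this.totalCost,
      cacheHitRate: ratio(this.cacheHits, this.cacheHits + this.cacheMisses),
    };
  }

  reset(): void {
    this.judgments = 0;
    this.failures = 0;
    this.cacheHits = 0;
    this.cacheMisses = 0;
    this.latencySumMs = 0;
    this.scoreSum = 0;
    this.totalCost = 0;
  }
}
```

Storing sums rather than per-judgment samples keeps memory constant, at the price of not being able to report percentiles.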

Logging Helpers

| Export | Description |
| --- | --- |
| `logger` | Raw Pino instance |
| `logJudgment()` | Log a completed judgment with duration |
| `logError()` | Log an error with optional context |
| `logCacheHit()` | Log a cache hit event |
| `logCacheMiss()` | Log a cache miss event |
| `logBudgetExceeded()` | Log when budget threshold is exceeded |
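
Structured helpers of this kind are typically thin wrappers that attach a fixed set of fields to each log call. The sketch below shows one possible shape, written against a small `PinoLikeLogger` interface that follows Pino's `(mergingObject, message)` call convention. The interface, the `createLogHelpers` factory, the argument lists, and the `event` field are all assumptions for illustration; the package's helpers use its own module-level `logger` and may take different parameters.

```typescript
// Hypothetical sketch; helper signatures and field names are assumptions.
// The interface mirrors Pino's `(mergingObject, message)` call style.
interface PinoLikeLogger {
  info(obj: object, msg?: string): void;
  warn(obj: object, msg?: string): void;
  error(obj: object, msg?: string): void;
}

function createLogHelpers(logger: PinoLikeLogger) {
  return {
    logJudgment(judgmentId: string, durationMs: number): void {
      logger.info({ event: 'judgment', judgmentId, durationMs }, 'judgment completed');
    },
    // Pino conventionally serializes errors passed under the `err` key.
    logError(error: Error, context: Record<string, unknown> = {}): void {
      logger.error({ event: 'error', err: error, ...context }, error.message);
    },
    logCacheHit(key: string): void {
      logger.info({ event: 'cache_hit', key }, 'cache hit');
    },
    logCacheMiss(key: string): void {
      logger.info({ event: 'cache_miss', key }, 'cache miss');
    },
    logBudgetExceeded(spent: number, limit: number): void {
      logger.warn({ event: 'budget_exceeded', spent, limit }, 'budget exceeded');
    },
  };
}
```

Injecting the logger keeps the helpers testable with an in-memory recorder; a Pino logger instance should satisfy the same interface.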

License

MIT