Computes latency metrics, enforces SLA budgets, and identifies performance bottlenecks for AI agent trajectories. It provides a suite of utility functions and a `LatencyTracker` class to analyze turn-level and component-specific timing data.
Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.
Turn-level and trajectory-level latency monitoring with SLA enforcement and optimization analysis. Computes P50/P90/P99 percentiles, detects anomalies, and provides actionable bottleneck recommendations for AI agent latency budgets.
Installation
```bash
npm install @reaatech/agent-eval-harness-latency
```
Feature Overview
- Percentile computation — P50, P90, P99 latency metrics computed per turn and aggregated across the full trajectory
- Component breakdown — separates LLM call latency from tool invocation latency and system overhead for targeted optimization
- SLA enforcement — configurable per-turn and per-trajectory latency thresholds with severity-graded violation detection and early-warning signals
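To illustrate what the percentile metrics mean, here is a minimal nearest-rank sketch. This is a standalone example for intuition only; the package's actual percentile algorithm may differ (e.g. it could interpolate between samples).

```typescript
// Nearest-rank percentile: smallest sample such that at least p% of samples
// are less than or equal to it. A sketch, not the library's implementation.
function percentile(latenciesMs: number[], p: number): number {
  if (latenciesMs.length === 0) return 0;
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Ten turn latencies in milliseconds, with one slow outlier.
const turnLatencies = [120, 340, 95, 410, 230, 180, 1500, 260, 310, 205];
console.log(`P50=${percentile(turnLatencies, 50)}ms`);  // 230ms
console.log(`P90=${percentile(turnLatencies, 90)}ms`);  // 410ms
console.log(`P99=${percentile(turnLatencies, 99)}ms`);  // 1500ms (the outlier)
```

Note how P99 surfaces the single slow turn that P50 hides, which is why SLA budgets are typically expressed against the higher percentiles.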
Each preset also includes per-component budgets. To enforce LLM call, tool invocation, and overhead thresholds independently, pass a custom `LatencyBudget` with a `components` field:
```typescript
import {
  createLatencyBudget,
  enforceBudget,
  monitorLatency,
} from '@reaatech/agent-eval-harness-latency';

const budget = createLatencyBudget('strict');
// budget.components = { llmCall: 400, toolInvocation: 100, overhead: 50 }

const result = monitorLatency(trajectory);
const enforcement = enforceBudget(result, budget);

for (const v of enforcement.violations) {
  console.log(`[${v.severity.toUpperCase()}] ${v.type}: ${v.description}`);
}

// Enforcement score: 1.0 = perfect, deducts 0.4 for critical, 0.25 for high, etc.
console.log(`Enforcement score: ${enforcement.score}`);
```
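The scoring rule noted in the comment above can be sketched as a severity-weighted deduction from a perfect 1.0. Only the critical (0.4) and high (0.25) deductions are documented here; the `medium` and `low` weights below are assumptions for illustration, not the package's actual values.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

// critical and high match the documented deductions; medium/low are assumed.
const DEDUCTIONS: Record<Severity, number> = {
  critical: 0.4,
  high: 0.25,
  medium: 0.1,  // assumption
  low: 0.05,    // assumption
};

function enforcementScore(violations: { severity: Severity }[]): number {
  const total = violations.reduce((sum, v) => sum + DEDUCTIONS[v.severity], 0);
  return Math.max(0, 1.0 - total);  // clamp so the score never goes negative
}

// One critical plus one high violation: 1.0 - 0.4 - 0.25
console.log(enforcementScore([{ severity: 'critical' }, { severity: 'high' }]));
```

Under this model a trajectory with two critical and one high violation bottoms out near zero, so any nonzero score still signals that some budget headroom remains.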
Advanced: Optimization Analysis
The optimizer identifies the most impactful bottlenecks and generates actionable, priority-ranked recommendations:
```typescript
import { analyzeOptimization, LatencyTracker } from '@reaatech/agent-eval-harness-latency';

const optimization = analyzeOptimization(latencyResult, trajectory);

console.log(`Bottlenecks: ${optimization.bottlenecks.length}`);
for (const b of optimization.bottlenecks) {
  console.log(`  ${b.type}: severity=${b.severity.toFixed(2)}, ${b.description}`);
}

console.log('Top recommendations:');
for (const r of optimization.recommendations.slice(0, 3)) {
  console.log(`  [${r.priority}] ${r.description} (effort: ${r.effort}, est. gain: ${r.expectedImprovementMs}ms)`);
}

// Track latency across multiple evaluation runs
const tracker = new LatencyTracker();
tracker.record(latencyResult);
console.log(`Trend: ${tracker.getTrend().improving ? 'improving' : 'degrading'}`);
console.log(`Average score: ${tracker.getAverageScore()}`);
```
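One simple way to think about the improving/degrading signal that `getTrend()` exposes is a split-half comparison of recorded scores. The heuristic below is a hedged sketch of that idea, not the tracker's actual algorithm:

```typescript
// Assumed trend heuristic: compare the mean score of the earlier half of runs
// against the later half. Higher recent scores read as "improving".
function isImproving(scores: number[]): boolean {
  if (scores.length < 2) return true;  // too few runs to call a trend
  const mid = Math.floor(scores.length / 2);
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return mean(scores.slice(mid)) >= mean(scores.slice(0, mid));
}

// Scores trending upward across four evaluation runs.
console.log(isImproving([0.6, 0.65, 0.7, 0.8]) ? 'improving' : 'degrading');
```

A split-half mean is robust to single-run noise, which matters because one outlier evaluation should not flip the trajectory-level trend.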