@reaatech/llm-judge-consensus

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Multi-judge consensus strategies for combining individual judgment scores into a final evaluation. Includes majority voting, weighted voting, and a cheap-first tiebreaker strategy to minimize API costs.

Installation

terminal

npm install @reaatech/llm-judge-consensus
# or
pnpm add @reaatech/llm-judge-consensus

Feature Overview

Three consensus strategies that implement a shared interface
Confidence-weighted score aggregation
Automatic agreement score computation using a variance-based formula
Cheap-first pattern: N cheap-model judgments, with expensive tiebreakers consulted only when needed
Zero external dependencies beyond the shared types package
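
As a rough illustration of what a variance-based agreement score can look like, here is a minimal sketch. The exact formula used by the package is not documented in this README, so the helper name `agreementFromVariance` and the normalization by 0.25 (the largest possible population variance for scores in the 0–1 range) are assumptions, not the library's implementation.

```typescript
// Hypothetical helper, not part of the package's API.
// Assumes every score lies in [0, 1]. Under that assumption the largest
// possible population variance is 0.25 (half the scores at 0, half at 1).
function agreementFromVariance(scores: number[]): number {
  if (scores.length === 0) {
    throw new Error('at least one score is required');
  }
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const variance =
    scores.reduce((sum, s) => sum + (s - mean) * (s - mean), 0) / scores.length;
  const MAX_VARIANCE = 0.25;
  // Clamp so floating-point drift never pushes the result outside [0, 1].
  return Math.min(1, Math.max(0, 1 - variance / MAX_VARIANCE));
}

console.log(agreementFromVariance([0.5, 0.5, 0.5])); // 1 (identical scores)
console.log(agreementFromVariance([0, 1]));          // 0 (maximal disagreement)
```

Under this sketch, identical scores give an agreement of 1, and the score falls toward 0 as the judges spread apart.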

Quick Start

typescript

import { MajorityVoting, CheapFirstTiebreaker } from '@reaatech/llm-judge-consensus';

// judgment1..judgment3 are Judgment objects produced by your judges
const strategy = new MajorityVoting();
const result = strategy.execute([judgment1, judgment2, judgment3]);

console.log(result.finalScore, result.agreementScore);
// e.g. 0.82 0.94

typescript

// Compare the first 2 (cheap) judgments before consulting anything else
const cheapFirst = new CheapFirstTiebreaker(2);

const result = cheapFirst.execute([
  gpt4oMiniJudgment,
  gpt4oMiniJudgment2,
  gpt4oJudgment,    // tiebreaker: only used if the cheap judges disagree
]);

console.log(result.tiebreakerUsed);
// false (or true if the cheap judges disagreed)

API Reference

MajorityVoting

Property	Description
`strategy`	`majority-voting` — confidence-weighted average with agreement computation
`execute(judgments)`	Weight scores by confidence, return consensus
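
To make "weight scores by confidence" concrete, the sketch below computes a confidence-weighted mean. It is an illustration only: the `JudgmentLike` shape, its `score` and `confidence` fields, and the fallback for all-zero confidence are assumptions. The real `Judgment` type comes from `@reaatech/llm-judge-types` and may differ.

```typescript
// Hypothetical minimal shape; the real Judgment type may have different
// or additional fields.
interface JudgmentLike {
  score: number;      // assumed to lie in [0, 1]
  confidence: number; // assumed to lie in [0, 1]
}

function confidenceWeightedScore(judgments: JudgmentLike[]): number {
  if (judgments.length === 0) {
    throw new Error('at least one judgment is required');
  }
  const totalConfidence = judgments.reduce((sum, j) => sum + j.confidence, 0);
  if (totalConfidence === 0) {
    // Assumption: with no confidence signal at all, use a plain mean.
    return judgments.reduce((sum, j) => sum + j.score, 0) / judgments.length;
  }
  const weightedSum = judgments.reduce((sum, j) => sum + j.score * j.confidence, 0);
  return weightedSum / totalConfidence;
}

// A confident judge (0.75) pulls the result toward its score.
console.log(
  confidenceWeightedScore([
    { score: 1, confidence: 0.75 },
    { score: 0, confidence: 0.25 },
  ]),
); // 0.75
```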

CheapFirstTiebreaker

Property	Description
`strategy`	`cheap-first-tiebreaker`
`constructor(cheapCount)`	Number of fast/cheap judgments to compare first (default 2)
`agreementThreshold`	0.8; escalation happens when agreement among the cheap judgments falls below this value
`execute(judgments)`	Compare the first `cheapCount` judgments; escalate to the remaining judges if their agreement is below `agreementThreshold`
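
The escalation rule described in the table can be sketched as follows. This is a simplified model, not the package source: the `cheapFirst` function, the `JudgmentLike` shape, the variance-based `agreement` helper, and the choice to average all scores after escalation are all assumptions made for illustration.

```typescript
// Hypothetical minimal shapes for illustration only.
interface JudgmentLike {
  score: number; // assumed to lie in [0, 1]
}

interface EscalationResult {
  finalScore: number;
  tiebreakerUsed: boolean;
}

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

// Assumed agreement formula: 1 - variance / 0.25 for scores in [0, 1].
function agreement(scores: number[]): number {
  const m = mean(scores);
  const variance = mean(scores.map((s) => (s - m) * (s - m)));
  return Math.max(0, 1 - variance / 0.25);
}

function cheapFirst(
  judgments: JudgmentLike[],
  cheapCount = 2,
  agreementThreshold = 0.8,
): EscalationResult {
  if (cheapCount < 1 || judgments.length < cheapCount) {
    throw new Error(`need at least ${cheapCount} judgments and cheapCount >= 1`);
  }
  const cheapScores = judgments.slice(0, cheapCount).map((j) => j.score);
  const remaining = judgments.slice(cheapCount);

  // The cheap judges agree (or nobody is left to ask): stop here.
  if (agreement(cheapScores) >= agreementThreshold || remaining.length === 0) {
    return { finalScore: mean(cheapScores), tiebreakerUsed: false };
  }
  // Assumption: after escalation, every judgment contributes equally.
  return { finalScore: mean(judgments.map((j) => j.score)), tiebreakerUsed: true };
}

console.log(cheapFirst([{ score: 0.8 }, { score: 0.82 }, { score: 0.3 }]).tiebreakerUsed); // false
console.log(cheapFirst([{ score: 0.1 }, { score: 0.9 }, { score: 0.7 }]).tiebreakerUsed); // true
```

In the first call the two cheap scores are close, so the third judgment is ignored. In the second they diverge, so the remaining judgment is pulled in.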

WeightedVoting

Property	Description
`strategy`	`weighted-voting`
`constructor(weights)`	User-defined weights array (must match `judgments.length`)
`execute(judgments)`	Weight scores by provided weights, return consensus
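
The length requirement on `weights` matters because each weight is paired with the judgment at the same index. A minimal sketch of that pairing follows. The function `weightedScore` is hypothetical, and normalizing by the sum of the weights is an assumption about how the consensus score is formed.

```typescript
// Hypothetical helper, not part of the package's API.
function weightedScore(scores: number[], weights: number[]): number {
  if (weights.length !== scores.length) {
    throw new Error(`expected ${scores.length} weights, got ${weights.length}`);
  }
  const totalWeight = weights.reduce((sum, w) => sum + w, 0);
  if (totalWeight <= 0) {
    throw new Error('weights must sum to a positive value');
  }
  // Pair each weight with the score at the same index.
  const weightedSum = weights.reduce((sum, w, i) => sum + w * (scores[i] ?? 0), 0);
  // Assumption: normalize so weights need not sum to 1.
  return weightedSum / totalWeight;
}

// The first judge counts three times as much as the second.
console.log(weightedScore([1, 0], [3, 1])); // 0.75
```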

ConsensusStrategy Interface

Member	Type	Description
`name`	`string`	Strategy identifier
`execute(judgments)`	`(judgments: Judgment[]) => ConsensusResult`	Execute consensus on input judgments
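
Because every strategy shares this interface, a custom strategy could in principle be written against it. The sketch below re-declares the documented shapes locally so it runs on its own; in real code they would be imported from `@reaatech/llm-judge-types`. The `Judgment` field `score`, the `MinScoreVoting` class, and its range-based agreement value are assumptions for illustration.

```typescript
// Local stand-ins mirroring the tables in this README. The real types
// may carry additional fields.
interface Judgment {
  score: number; // assumed field, range [0, 1]
}

interface ConsensusResult {
  finalScore: number;
  agreementScore: number;
  method: string;
  individualJudgments: Judgment[];
  tiebreakerUsed: boolean;
}

interface ConsensusStrategy {
  name: string;
  execute(judgments: Judgment[]): ConsensusResult;
}

// Hypothetical strategy: the strictest judge decides.
class MinScoreVoting implements ConsensusStrategy {
  readonly name = 'min-score';

  execute(judgments: Judgment[]): ConsensusResult {
    if (judgments.length === 0) {
      throw new Error('at least one judgment is required');
    }
    const scores = judgments.map((j) => j.score);
    const lowest = Math.min(...scores);
    const highest = Math.max(...scores);
    return {
      finalScore: lowest,
      // Stand-in agreement measure: 1 minus the score range.
      agreementScore: 1 - (highest - lowest),
      method: this.name,
      individualJudgments: judgments,
      tiebreakerUsed: false,
    };
  }
}

const strictest: ConsensusStrategy = new MinScoreVoting();
console.log(strictest.execute([{ score: 0.9 }, { score: 0.5 }]).finalScore); // 0.5
```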

ConsensusResult

Field	Type	Description
`finalScore`	`number`	Consensus score (0–1)
`agreementScore`	`number`	Inter-judge agreement (0–1)
`method`	`string`	Strategy name used
`individualJudgments`	`Judgment[]`	Input judgments
`tiebreakerUsed`	`boolean`	Whether escalation happened (CheapFirstTiebreaker)

Related Packages

@reaatech/llm-judge-types: ConsensusStrategy, ConsensusResult, and Judgment types

License

MIT
