prompt-injection-bench
Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.
Umbrella package and CLI for the prompt-injection-bench suite. Re-exports the full public API from all @reaatech/pi-bench-* packages and provides the prompt-injection-bench CLI for running benchmarks from the command line.
Installation
npm install prompt-injection-bench
# or
pnpm add prompt-injection-bench
# Global install for CLI usage
npm install -g prompt-injection-bench
Feature Overview
- Full API re-export: a single import covers all @reaatech/pi-bench-* packages
- CLI tool: prompt-injection-bench command with 6 subcommands
- 9 adapters: all defense adapters are available via the --defense flag
- Dual ESM/CJS output: works with import and require
CLI Quick Start
# Run a benchmark with the mock adapter
prompt-injection-bench benchmark --defense mock
# Compare two defense results
prompt-injection-bench compare --results rebuff.json lakera.json
# View leaderboard
prompt-injection-bench leaderboard view
# Generate an HTML report
prompt-injection-bench report --results latest.json --format html --output report.html
CLI Commands
benchmark
Run a full benchmark against a defense adapter:
prompt-injection-bench benchmark \
--defense rebuff \
--corpus default \
--categories direct-injection,role-playing \
--parallel 10 \
--timeout 30000 \
--output results/benchmark.json
| Flag | Description |
|---|---|
| --defense | Defense adapter name (mock, rebuff, lakera, llm-guard, garak, moderation-openai, etc.) |
| --corpus | Corpus source: default or path to corpus directory |
| --categories | Comma-separated attack categories (default: all) |
| --parallel | Max parallel attack executions (default: 10) |
| --timeout | Per-attack timeout in ms (default: 30000) |
| --output | JSON output file path |
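To make the --parallel and --timeout flags concrete, here is a minimal sketch of a bounded worker pool with a per-task timeout. It is not taken from the package: the names Task, Outcome, withTimeout, and runPool are hypothetical, and the CLI's actual scheduling may differ.

```typescript
// Hypothetical illustration only; not the prompt-injection-bench implementation.
type Task<T> = () => Promise<T>;

type Outcome<T> =
  | { status: "ok"; value: T }
  | { status: "timeout" }
  | { status: "error"; error: unknown };

// Settle a task as "ok", "error", or "timeout", whichever happens first.
// The underlying task is not cancelled on timeout; its late result is ignored.
function withTimeout<T>(task: Task<T>, timeoutMs: number): Promise<Outcome<T>> {
  return new Promise<Outcome<T>>((resolve) => {
    const timer = setTimeout(() => resolve({ status: "timeout" }), timeoutMs);
    Promise.resolve()
      .then(task)
      .then(
        (value) => {
          clearTimeout(timer);
          resolve({ status: "ok", value });
        },
        (error) => {
          clearTimeout(timer);
          resolve({ status: "error", error });
        },
      );
  });
}

// Run tasks with at most `parallel` in flight; results keep the input order.
async function runPool<T>(
  tasks: Task<T>[],
  parallel: number,
  timeoutMs: number,
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await withTimeout(tasks[index], timeoutMs);
    }
  }

  const workerCount = Math.max(1, Math.min(parallel, tasks.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

Under these assumptions, runPool(tasks, 10, 30000) would mirror the documented defaults: up to 10 attacks in flight, each given 30 seconds before it is recorded as a timeout.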
attack
Run attacks from a single category:
prompt-injection-bench attack \
--category direct-injection \
--defense mock \
--count 100
compare
Compare two defense result files:
prompt-injection-bench compare \
--results results/rebuff.json results/lakera.json \
--significance 0.05
corpus
Manage attack corpora:
# List available categories
prompt-injection-bench corpus list
# Generate a new corpus
prompt-injection-bench corpus generate --output corpus/v2026.05
# Validate an existing corpus
prompt-injection-bench corpus validate --input corpus/v2026.04
# Export corpus to JSON
prompt-injection-bench corpus export --input corpus/v2026.04 --format json
leaderboard
View and manage the leaderboard:
prompt-injection-bench leaderboard view
prompt-injection-bench leaderboard submit \
--results results/latest.json \
--defense my-defense \
--version 1.0.0
report
Generate reports from benchmark results:
prompt-injection-bench report \
--results results/latest.json \
--format html \
--output reports/benchmark.html
Supported formats: json, html, markdown.
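As a rough illustration of what a markdown report could contain, the sketch below turns a results object into a per-category table. The result shape is an assumption: attackResults and detected appear in the library example in this README, while the category field and the toMarkdownSummary helper are hypothetical and not part of the package's documented output.

```typescript
// Assumed result shape for illustration; `category` is a hypothetical field.
interface AttackResult {
  category: string;
  detected: boolean;
}

interface BenchmarkResult {
  attackResults: AttackResult[];
}

interface Tally {
  total: number;
  detected: number;
}

// Build a markdown table with one row per attack category.
function toMarkdownSummary(result: BenchmarkResult): string {
  // Prototype-free object so category names never collide with built-ins.
  const byCategory: Record<string, Tally> = Object.create(null);

  for (const r of result.attackResults) {
    let tally = byCategory[r.category];
    if (tally === undefined) {
      tally = { total: 0, detected: 0 };
      byCategory[r.category] = tally;
    }
    tally.total += 1;
    if (r.detected) {
      tally.detected += 1;
    }
  }

  const lines = [
    "| Category | Attacks | Detected | Detection rate |",
    "|---|---|---|---|",
  ];
  // Sort categories so the output is stable across runs.
  for (const category of Object.keys(byCategory).sort()) {
    const { total, detected } = byCategory[category];
    const rate = ((detected / total) * 100).toFixed(1);
    lines.push(`| ${category} | ${total} | ${detected} | ${rate}% |`);
  }
  return lines.join("\n");
}
```

The real report command may include more detail (timings, false positives, metadata); this sketch only shows the aggregation step.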
Library Quick Start
import { createBenchmarkEngine, createMockAdapter, generateDefaultCorpus } from "prompt-injection-bench";
const adapter = createMockAdapter(0.95, 0.03);
const corpus = generateDefaultCorpus();
const engine = createBenchmarkEngine({ defense: adapter });
const result = await engine.runBenchmark(corpus);
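// Share of attacks the defense flagged, as a percentage
const detectedCount = result.attackResults.filter((r) => r.detected).length;
const detectionRate = (detectedCount / result.attackResults.length) * 100;
console.log(`Detection rate: ${detectionRate.toFixed(1)}%`);
Re-exported Packages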
| Package | Description |
|---|---|
| @reaatech/pi-bench-core | Core types, taxonomy, Zod schemas |
| @reaatech/pi-bench-observability | Logging, tracing, and metrics |
| @reaatech/pi-bench-corpus | Corpus builder, validator, template engine |
| @reaatech/pi-bench-adapters | Defense adapter implementations |
| @reaatech/pi-bench-scoring | Scoring engine and statistical analysis |
| @reaatech/pi-bench-runner | Benchmark execution engine |
| @reaatech/pi-bench-leaderboard | Leaderboard management |
| @reaatech/pi-bench-mcp-server | MCP server and reproducibility tools |
Related Packages
Each package can also be installed individually:
pnpm add @reaatech/pi-bench-core
pnpm add @reaatech/pi-bench-adapters
pnpm add @reaatech/pi-bench-scoring