prompt-injection-bench

npm v1.0.1

A CLI and library for benchmarking LLM prompt-injection defenses against standardized attack corpora. It exports a `createBenchmarkEngine` function and a CLI with six subcommands (`benchmark`, `attack`, `compare`, `corpus`, `leaderboard`, `report`), and works with defense adapters for services such as Rebuff, Lakera, and OpenAI Moderation.

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Umbrella package and CLI for the prompt-injection-bench suite. Re-exports the full public API from all @reaatech/pi-bench-* packages and provides the prompt-injection-bench CLI for running benchmarks from the command line.

Installation

terminal
npm install prompt-injection-bench
# or
pnpm add prompt-injection-bench
terminal
# Global install for CLI usage
npm install -g prompt-injection-bench

Feature Overview

  • Full API re-export: single import for all @reaatech/pi-bench-* packages
  • CLI tool: prompt-injection-bench command with 6 subcommands
  • 9 adapters: all defense adapters available via the --defense flag
  • Dual ESM/CJS output: works with import and require

CLI Quick Start

terminal
# Run a benchmark with the mock adapter
prompt-injection-bench benchmark --defense mock
 
# Compare two defense results
prompt-injection-bench compare --results rebuff.json lakera.json
 
# View leaderboard
prompt-injection-bench leaderboard view
 
# Generate an HTML report
prompt-injection-bench report --results latest.json --format html --output report.html

CLI Commands

benchmark

Run a full benchmark against a defense adapter:

terminal
prompt-injection-bench benchmark \
  --defense rebuff \
  --corpus default \
  --categories direct-injection,role-playing \
  --parallel 10 \
  --timeout 30000 \
  --output results/benchmark.json

| Flag | Description |
| --- | --- |
| --defense | Defense adapter name (mock, rebuff, lakera, llm-guard, garak, moderation-openai, etc.) |
| --corpus | Corpus source: default or path to corpus directory |
| --categories | Comma-separated attack categories (default: all) |
| --parallel | Max parallel attack executions (default: 10) |
| --timeout | Per-attack timeout in ms (default: 30000) |
| --output | JSON output file path |
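
To make the --parallel and --timeout semantics concrete, here is a minimal sketch of a bounded worker pool with a per-task timeout. It is not the package's runner: `runPool`, `withTimeout`, and `Outcome` are hypothetical names introduced for illustration, and the real engine may schedule attacks differently.

```typescript
// Hypothetical sketch: none of these names come from prompt-injection-bench.
type Task<T> = () => Promise<T>;
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

/** Rejects if `promise` does not settle within `ms` milliseconds. */
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

/** Runs tasks with at most `parallel` in flight; outcomes keep input order. */
async function runPool<T>(
  tasks: Task<T>[],
  parallel: number,
  timeoutMs: number,
): Promise<Outcome<T>[]> {
  const outcomes = new Array<Outcome<T>>(tasks.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      // Claiming an index is synchronous, so two workers never take the same task.
      const index = next++;
      try {
        const value = await withTimeout(tasks[index](), timeoutMs);
        outcomes[index] = { ok: true, value };
      } catch (error) {
        outcomes[index] = { ok: false, error };
      }
    }
  }

  const workerCount = Math.min(parallel, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return outcomes;
}
```

Under this sketch, `runPool(tasks, 10, 30000)` would mirror the documented defaults. Note that the timeout only stops waiting for a task; it does not cancel the underlying request.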

attack

Run attacks from a single category:

terminal
prompt-injection-bench attack \
  --category direct-injection \
  --defense mock \
  --count 100

compare

Compare two defense result files:

terminal
prompt-injection-bench compare \
  --results results/rebuff.json results/lakera.json \
  --significance 0.05
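
The source does not say which statistical test `compare` applies. As an illustration of what a 0.05 significance threshold can mean when comparing two detection rates, the sketch below runs a two-sided two-proportion z-test. The function names and the choice of test are assumptions, not the package's scoring API.

```typescript
// Hypothetical sketch: the test actually used by `compare` is not documented here.

/** Abramowitz-Stegun 7.1.26 approximation of erf(x). */
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

/** Standard normal cumulative distribution function. */
function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

interface ZTestResult {
  z: number;
  pValue: number;
  significant: boolean;
}

/** Two-sided two-proportion z-test on detection counts from two defenses. */
function compareDetectionRates(
  detectedA: number,
  totalA: number,
  detectedB: number,
  totalB: number,
  alpha = 0.05,
): ZTestResult {
  const pA = detectedA / totalA;
  const pB = detectedB / totalB;
  // Pooled rate under the null hypothesis that both defenses detect equally well.
  const pooled = (detectedA + detectedB) / (totalA + totalB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  if (se === 0) {
    // Both rates are exactly 0 or exactly 1: no evidence of a difference.
    return { z: 0, pValue: 1, significant: false };
  }
  const z = (pA - pB) / se;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));
  return { z, pValue, significant: pValue < alpha };
}
```

For example, `compareDetectionRates(950, 1000, 900, 1000)` tests whether a 95% versus 90% detection rate over 1000 attacks each is likely to be more than noise at the chosen threshold.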

corpus

Manage attack corpora:

terminal
# List available categories
prompt-injection-bench corpus list
 
# Generate a new corpus
prompt-injection-bench corpus generate --output corpus/v2026.05
 
# Validate an existing corpus
prompt-injection-bench corpus validate --input corpus/v2026.04
 
# Export corpus to JSON
prompt-injection-bench corpus export --input corpus/v2026.04 --format json

leaderboard

View and manage the leaderboard:

terminal
prompt-injection-bench leaderboard view
prompt-injection-bench leaderboard submit \
  --results results/latest.json \
  --defense my-defense \
  --version 1.0.0

report

Generate reports from benchmark results:

terminal
prompt-injection-bench report \
  --results results/latest.json \
  --format html \
  --output reports/benchmark.html

Supported formats: json, html, markdown.

Library Quick Start

typescript
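import { createBenchmarkEngine, createMockAdapter, generateDefaultCorpus } from "prompt-injection-bench";

// Mock defense adapter, so the example runs without a real defense service
const adapter = createMockAdapter(0.95, 0.03);
const corpus = generateDefaultCorpus();
const engine = createBenchmarkEngine({ defense: adapter });

// Top-level await requires an ES module context
const result = await engine.runBenchmark(corpus);

// Share of attacks the defense flagged; guard against an empty result set
const total = result.attackResults.length;
const detected = result.attackResults.filter((r) => r.detected).length;
const rate = total > 0 ? (detected / total) * 100 : 0;
console.log(`Detection rate: ${rate.toFixed(1)}%`);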
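
The quick start reports one overall detection rate. A per-category breakdown is often more informative, and could be sketched as follows. Only the `detected` field appears in the example above; the `category` field, the `AttackResultLike` type, and the `detectionRateByCategory` helper are assumptions made for illustration and may not match the package's actual result shape.

```typescript
// Assumed shape: `detected` comes from the quick start; `category` is hypothetical.
interface AttackResultLike {
  category: string;
  detected: boolean;
}

/** Groups results by category and returns the detected fraction (0 to 1) for each. */
function detectionRateByCategory(results: AttackResultLike[]): Map<string, number> {
  const counts = new Map<string, { detected: number; total: number }>();
  for (const r of results) {
    const entry = counts.get(r.category) ?? { detected: 0, total: 0 };
    entry.total += 1;
    if (r.detected) entry.detected += 1;
    counts.set(r.category, entry);
  }

  const rates = new Map<string, number>();
  counts.forEach((entry, category) => {
    rates.set(category, entry.detected / entry.total);
  });
  return rates;
}

// Illustrative data only, not real benchmark output.
const sample: AttackResultLike[] = [
  { category: "direct-injection", detected: true },
  { category: "direct-injection", detected: false },
  { category: "role-playing", detected: true },
];

detectionRateByCategory(sample).forEach((rate, category) => {
  console.log(`${category}: ${(rate * 100).toFixed(1)}%`);
});
```

If the real result objects carry the category under a different name, only the interface and the grouping key need to change.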

Re-exported Packages

| Package | Description |
| --- | --- |
| @reaatech/pi-bench-core | Core types, taxonomy, Zod schemas |
| @reaatech/pi-bench-observability | Logging, tracing, and metrics |
| @reaatech/pi-bench-corpus | Corpus builder, validator, template engine |
| @reaatech/pi-bench-adapters | Defense adapter implementations |
| @reaatech/pi-bench-scoring | Scoring engine and statistical analysis |
| @reaatech/pi-bench-runner | Benchmark execution engine |
| @reaatech/pi-bench-leaderboard | Leaderboard management |
| @reaatech/pi-bench-mcp-server | MCP server and reproducibility tools |

Each package can also be installed individually:

terminal
pnpm add @reaatech/pi-bench-core
pnpm add @reaatech/pi-bench-adapters
pnpm add @reaatech/pi-bench-scoring

License

MIT