AI Discovery Doc Review for Small Litigation Firm

Make document review economic for small cases with automated summarization and privilege flagging.

legal-tech document-review discovery litigation nextjs openai document-pipeline cost-tracking

The problem

A litigation partner at a 5-attorney firm faces discovery requests on cases with modest budgets. Manually reviewing thousands of documents costs more than these matters can bear, often forcing a settlement or leaving the client to proceed pro se. The partner ends up reviewing documents on weekends, burning out and missing key evidence, and small matters become unprofitable because of the cost of discovery.

Intro

Small litigation firms face a brutal math problem: a 5-attorney firm handling document review for a $50k case can’t afford 200 hours of associate time. This recipe builds an AI-powered discovery document review pipeline that automates OCR, summarization, privilege analysis, cost tracking, and golden-comparison validation, using the Vercel AI SDK with OpenAI on the Next.js 16+ App Router. By the end, you’ll have a working pipeline that accepts documents via POST and returns structured review results with cost breakdowns.
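To make "structured review results with cost breakdowns" concrete, here is a minimal sketch of the result shape, inferred from the fields the pipeline orchestrator in this recipe returns (summary, privilegeFlags, cost). The ReviewResult name and the isReviewResult guard are illustrative assumptions, and the field types (for example, category as a plain string) are guesses rather than the recipe's published definitions.

```typescript
// Shapes inferred from the orchestrator's return value; exact field types are assumptions.
interface PrivilegeFlag {
  category: string; // e.g. "attorney-client" (illustrative value)
  excerpt: string; // passage that triggered the flag
  confidence: number; // assumed to be a score between 0 and 1
}

interface CostBreakdown {
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

interface ReviewResult {
  summary: string;
  privilegeFlags: PrivilegeFlag[];
  cost: CostBreakdown;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Hypothetical runtime guard a client could apply to the JSON it receives.
function isReviewResult(value: unknown): value is ReviewResult {
  if (!isRecord(value)) return false;
  const { summary, privilegeFlags, cost } = value;
  if (typeof summary !== "string" || !Array.isArray(privilegeFlags)) return false;
  if (!isRecord(cost)) return false;
  const flagsOk = privilegeFlags.every(
    (flag: unknown) =>
      isRecord(flag) &&
      typeof flag.category === "string" &&
      typeof flag.excerpt === "string" &&
      typeof flag.confidence === "number",
  );
  return (
    flagsOk &&
    typeof cost.inputTokens === "number" &&
    typeof cost.outputTokens === "number" &&
    typeof cost.costUsd === "number"
  );
}
```

A guard like this lets a caller reject malformed responses before showing privilege flags to a reviewer, instead of trusting the JSON blindly.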

Prerequisites

Node.js 22+ (check with node --version)
pnpm 10+ (corepack enable && corepack prepare pnpm@10 --activate)
OpenAI API key — set in .env as OPENAI_API_KEY
Upstash Vector account — get a URL and token for UPSTASH_VECTOR_REST_URL and UPSTASH_VECTOR_REST_TOKEN
Basic familiarity with Next.js App Router, TypeScript, and the Vercel AI SDK
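Since the pipeline depends on three environment variables, it can help to fail fast when one is missing. The following is a minimal sketch, not part of the recipe's published code; findMissingEnv and assertEnv are hypothetical helper names, and only the variable names come from the list above.

```typescript
// Variable names taken from the prerequisites list.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "UPSTASH_VECTOR_REST_URL",
  "UPSTASH_VECTOR_REST_TOKEN",
] as const;

type RequiredEnvVar = (typeof REQUIRED_ENV_VARS)[number];

// Returns the names of required variables that are unset or blank.
function findMissingEnv(env: Record<string, string | undefined>): RequiredEnvVar[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical helper: throw early with a readable message.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = findMissingEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling assertEnv(process.env) once at startup would surface a misconfigured .env before the first document is processed.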

You’ll build this project incrementally, pasting code blocks as you go. Each file is self-contained. When you’re done, run pnpm test and pnpm typecheck to verify that everything works.

Step 1: Scaffold the project

Create a Next.js 16+ project with App Router, TypeScript, and ESLint. Run this in an empty directory:

terminal

pnpm create next-app@latest . --typescript --eslint --app

Example artifact

A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.

209 kB · 105 tests · 100.0% coverage · vitest passing

SHA-256: 4921a952dc1e550341b18c6e787133d9bf434c03c9c8b4b2a922b2f2f3ada4b7
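If you download the example archive, you can compare its digest against the published SHA-256 value. This is a minimal sketch using Node's built-in crypto module; sha256Hex and matchesChecksum are hypothetical helper names, and the archive filename in the comment is an assumption.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// True when the data hashes to the expected digest (case-insensitive, whitespace-tolerant).
function matchesChecksum(data: Uint8Array | string, expectedHex: string): boolean {
  return sha256Hex(data) === expectedHex.trim().toLowerCase();
}

// Usage (filename is hypothetical):
//   import { readFileSync } from "node:fs";
//   matchesChecksum(
//     readFileSync("example.zip"),
//     "4921a952dc1e550341b18c6e787133d9bf434c03c9c8b4b2a922b2f2f3ada4b7",
//   );
```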

import crypto from "node:crypto";
import type { ArtifactStore, ArtifactMeta, StorageResult } from "@reaatech/media-pipeline-mcp-storage";

// In-memory registry of artifact metadata, keyed by artifact id.
export class ArtifactRegistry {
  private artifacts = new Map<string, Record<string, unknown>>();

  // Registers an artifact under a freshly generated UUID.
  register(artifact: Omit<Record<string, unknown>, "id">): Record<string, unknown> {
    return this.registerWithId(crypto.randomUUID(), artifact);
  }

  // Registers an artifact under a caller-supplied id.
  registerWithId(id: string, artifact: Omit<Record<string, unknown>, "id">): Record<string, unknown> {
    // Spread first so a stray "id" key on the input cannot override the map key.
    const entry = { ...artifact, id, createdAt: new Date().toISOString() };
    this.artifacts.set(id, entry);
    return entry;
  }

  get(id: string): Record<string, unknown> | undefined {
    return this.artifacts.get(id);
  }

  delete(id: string): boolean {
    return this.artifacts.delete(id);
  }

  list(): Record<string, unknown>[] {
    return Array.from(this.artifacts.values());
  }

  // Stub: source-step lookup is not implemented and never finds anything.
  findBySourceStep(_stepId: string): Record<string, unknown> | undefined {
    return undefined;
  }

  // Stub: source-step deletion is not implemented and removes nothing.
  deleteBySourceStep(_stepId: string): number {
    return 0;
  }

  clear(): void {
    this.artifacts.clear();
  }

  size(): number {
    return this.artifacts.size;
  }
}

// In-memory ArtifactStore for local development and tests.
export class LocalArtifactStore implements ArtifactStore {
  private store = new Map<string, Buffer>();

  // Only Buffer payloads are stored; stream payloads are silently ignored.
  put(id: string, data: Buffer | NodeJS.ReadableStream, _meta: ArtifactMeta): Promise<string> {
    if (Buffer.isBuffer(data)) {
      this.store.set(id, data);
    }
    return Promise.resolve(`/artifacts/${id}`);
  }

  get(id: string): Promise<StorageResult> {
    const stored = this.store.get(id);
    if (!stored) {
      // Reject instead of throwing synchronously so callers can rely on promise handling.
      return Promise.reject(new Error(`Artifact not found: ${id}`));
    }
    const meta: ArtifactMeta = { id, type: "document" } as ArtifactMeta;
    return Promise.resolve({ data: stored, meta });
  }

  // Signed URLs are not supported locally; returns an empty string.
  getSignedUrl(_id: string, _expiresIn?: number): Promise<string> {
    return Promise.resolve("");
  }

  delete(id: string): Promise<void> {
    this.store.delete(id);
    return Promise.resolve();
  }

  // Listing is not implemented locally; always returns an empty array.
  list(_prefix?: string): Promise<ArtifactMeta[]> {
    return Promise.resolve([]);
  }

  healthCheck(): Promise<boolean> {
    return Promise.resolve(true);
  }
}
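In the registry above, findBySourceStep and deleteBySourceStep are stubs that always return undefined and 0. One way they could be filled in is sketched below, assuming each artifact record carries a sourceStep string field. That field name is hypothetical; the source does not show how artifacts are tagged with the step that produced them.

```typescript
type ArtifactRecord = Record<string, unknown>;

// Hypothetical lookup: first artifact whose `sourceStep` equals stepId.
function findBySourceStep(
  artifacts: Map<string, ArtifactRecord>,
  stepId: string,
): ArtifactRecord | undefined {
  for (const artifact of artifacts.values()) {
    if (artifact["sourceStep"] === stepId) return artifact;
  }
  return undefined;
}

// Hypothetical bulk delete: removes every artifact produced by stepId
// and returns how many entries were removed.
function deleteBySourceStep(
  artifacts: Map<string, ArtifactRecord>,
  stepId: string,
): number {
  let removed = 0;
  // Deleting the current entry while iterating a Map is safe in JavaScript.
  for (const [id, artifact] of artifacts) {
    if (artifact["sourceStep"] === stepId) {
      artifacts.delete(id);
      removed += 1;
    }
  }
  return removed;
}
```

A linear scan is fine for the handful of artifacts a single document produces; a secondary index keyed by step id would only be worth adding if the registry grew large.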

import { DocumentExtractionService } from "./document-extraction.js";
import { OpenAIProviderAdapter } from "./openai-provider-adapter.js";
import { compareWithGolden } from "./golden-evaluation-service.js";
import { analyzePrivilege } from "./privilege-analyzer.js";
import { createCostSpan } from "./cost-telemetry-service.js";
import { DEFAULT_MODEL } from "../lib/constants.js";
import { DocumentNotFoundError, ExtractionFailedError, PipelineError } from "../lib/errors.js";
import type { PrivilegeFlag, CostBreakdown } from "../lib/types.js";

export class DocumentPipeline {
  private extraction: DocumentExtractionService;

  constructor() {
    this.extraction = new DocumentExtractionService();
    const adapter = new OpenAIProviderAdapter();
    this.extraction.registerProvider("openai", adapter);
  }

  // Runs OCR, summarization, and privilege analysis for one artifact, then reports cost.
  async processDocument(
    artifactId: string,
    caseRef?: string,
  ): Promise<{ summary: string; privilegeFlags: PrivilegeFlag[]; cost: CostBreakdown }> {
    if (!artifactId || artifactId.trim().length === 0) {
      throw new DocumentNotFoundError(artifactId);
    }

    let summaryText = "";
    let totalInput = 0;
    let totalOutput = 0;

    // Stage 1: OCR, then summarization. Token counts are fixed placeholders,
    // and the summary text is a placeholder string, not the model's output.
    try {
      await this.extraction.ocr(artifactId);
      totalInput += 100;
      totalOutput += 50;

      await this.extraction.summarize(artifactId);
      summaryText = `Summary of document ${artifactId}`;
      totalInput += 50;
      totalOutput += 100;
    } catch (error) {
      // Note: a failure in either call is reported under the "ocr" stage.
      throw new ExtractionFailedError(
        "ocr",
        error instanceof Error ? error.message : "Unknown error",
      );
    }

    // Stage 2: privilege analysis on the summary.
    let privilegeFlags: PrivilegeFlag[];
    try {
      const analysis = await analyzePrivilege(summaryText);
      privilegeFlags = [
        {
          category: analysis.category,
          excerpt: analysis.excerpt,
          confidence: analysis.confidence,
        },
      ];
      totalInput += 200;
      totalOutput += 100;
    } catch {
      throw new PipelineError(
        "Privilege analysis failed",
        "PRIVILEGE_ANALYSIS_FAILED",
        502,
        { stage: "privilegeAnalysis" },
      );
    }

    // Stage 3: record cost for the accumulated token totals.
    const costSpan = createCostSpan(
      "openai",
      DEFAULT_MODEL,
      totalInput,
      totalOutput,
      caseRef ?? "unknown",
      "discovery-review",
    );

    // Golden comparison is invoked with empty inputs here and its result is discarded.
    compareWithGolden(
      { turns: [], metadata: {} },
      { turns: [], metadata: {} },
    );

    const cost: CostBreakdown = {
      inputTokens: costSpan.inputTokens,
      outputTokens: costSpan.outputTokens,
      costUsd: costSpan.costUsd,
    };

    return { summary: summaryText, privilegeFlags, cost };
  }
}
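The orchestrator hands its token totals to createCostSpan and reads back costUsd, but the source does not show how that figure is derived. A common approach is to multiply token counts by per-million-token rates. The sketch below assumes that approach; TokenRates, EXAMPLE_RATES, and estimateCostUsd are hypothetical names, and the rates are placeholders rather than real OpenAI prices.

```typescript
interface TokenRates {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Placeholder rates for illustration only; look up your model's actual pricing.
const EXAMPLE_RATES: TokenRates = { inputPerMillionUsd: 1, outputPerMillionUsd: 4 };

// Estimated USD cost for one document, given token totals and rates.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  rates: TokenRates,
): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  const microDollars =
    inputTokens * rates.inputPerMillionUsd + outputTokens * rates.outputPerMillionUsd;
  return microDollars / 1e6;
}

// The orchestrator's placeholder totals are 350 input and 250 output tokens
// (100 + 50 + 200 and 50 + 100 + 100).
const perDocumentUsd = estimateCostUsd(350, 250, EXAMPLE_RATES);
```

With these placeholder rates the per-document estimate is (350 × 1 + 250 × 4) / 1,000,000 = $0.00135, so the per-case figure scales linearly with the number of documents reviewed.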

Intro

Prerequisites

Step 1: Scaffold the project

Step 2: Configure environment variables

Step 3: Create shared types, errors, and validation

Step 4: Build the document extraction service and artifact store

Step 5: Wire up the OpenAI provider adapter

Step 6: Implement privilege analysis with structured output

Step 7: Add cost telemetry tracking

Step 8: Create the golden evaluation and tool registry services

Step 9: Build the pipeline orchestrator

Step 10: Add vector storage, evaluation service, and public exports

Step 11: Create Next.js API routes

Step 12: Set up test infrastructure and write tests

Step 13: Create frontend pages

Next steps