A solo recruiter at a 5-person firm spends 6+ hours per role manually scoring 50-200 resumes against a rubric. Inconsistent scoring leads to missed top candidates and client complaints. Enterprise ATS scoring tools are too expensive and complex. The recruiter needs a fast, fair, and auditable way to rank candidates without hiring more staff.
The Automated Resume Scoring Agent lets boutique recruiters score 50-200 resumes per role in seconds using a consistent LLM-powered rubric. You upload a resume file (PDF or DOCX), define criteria with weights, and the agent returns structured criterion scores, an overall score, and an independent quality evaluation — all cached and cost-tracked. No spreadsheets, no manual grading.
This tutorial walks you through building the full agent from scratch: resume parsing, LLM scoring with retry logic, semantic caching, budget tracking, judge evaluation, and a set of REST API routes. You’ll wire up six @reaatech/* packages alongside open-source parsing libraries and the Vercel AI SDK.
Prerequisites
Node.js 22+ and pnpm 10+
An OpenAI-compatible API key (set as OPENAI_API_KEY in your environment)
Familiarity with TypeScript and Next.js App Router conventions
Basic knowledge of Zod schemas and multipart/form-data uploads
Step 1: Scaffold the Next.js project and install dependencies
Create a new Next.js project with TypeScript and the App Router, then install all the packages you’ll need.
Expected output: Your package.json now lists every dependency pinned to an exact version, with no range prefixes such as ^ or ~, and pnpm-lock.yaml records the resolved versions.
Step 2: Configure environment variables
Open .env.example and replace its contents with the variables the agent reads at runtime.
env
# Env vars used by agnostic-recruiter-resume-scoring-agent-2.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
NODE_ENV=development
OPENAI_API_KEY=<your-openai-api-key>
OPENAI_BASE_URL=<https://api.openai.com/v1>
SCORING_MODEL=<model-name>
EMBEDDING_MODEL=text-embedding-3-small
JUDGE_MODEL=<judge-model-name>
JUDGE_PROVIDER=claude|gpt4|gemini|openrouter
BLOB_READ_WRITE_TOKEN=<your-blob-token>
AGENT_MEMORY_EMBEDDING_API_KEY=<your-embedding-key>
AGENT_MEMORY_EMBEDDING_MODEL=text-embedding-3-small
Copy this to .env and fill in real values for local development.
Expected output: The .env.example file has all 10 variables listed with placeholder values. No real secrets are committed.
Step 3: Define domain types
Create src/types.ts with the core interfaces your agent uses throughout the pipeline. These describe resumes, rubrics, scores, candidates, and trajectories.
Expected output: src/types.ts contains 11 interfaces, one const object, and one type alias. It has no runtime dependencies, just pure TypeScript types.
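To make the later steps easier to follow, here is a partial sketch of what a few of these types might look like. The fields on ResumeText, Rubric, and ScoringConfig are inferred from how the parser, the jobs route, and the scorer use them in this tutorial. The fields on RubricCriterion are assumptions, and the real file defines more types than shown here.

```typescript
// Partial sketch of src/types.ts, inferred from usage in later steps.
// Fields marked "assumed" do not appear anywhere in the tutorial text.

export interface ResumeText {
  id: string; // filename without its extension
  text: string; // raw text extracted from the file
  format: string; // detected extension, e.g. "pdf" or "docx"
}

export interface RubricCriterion {
  id: string; // assumed
  name: string; // assumed
  weight: number; // assumed: relative importance of this criterion
  maxScore: number; // assumed
}

export interface Rubric {
  id: string;
  jobTitle: string;
  criteria: RubricCriterion[];
  overallMaxScore: number;
}

export interface ScoringConfig {
  baseUrl: string;
  apiKey: string;
  model: string;
  maxRetries: number;
}

// Example value that satisfies the sketch above.
export const exampleRubric: Rubric = {
  id: "rubric-1",
  jobTitle: "Backend Engineer",
  criteria: [
    { id: "ts", name: "TypeScript experience", weight: 0.6, maxScore: 10 },
    { id: "api", name: "API design", weight: 0.4, maxScore: 10 },
  ],
  overallMaxScore: 100,
};
```

Keeping these as plain interfaces means the file compiles away entirely, which matches the "no runtime dependencies" note above.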
Step 4: Build the resume parser
The parser detects a file’s format by inspecting its binary signature (not the extension), then dispatches to the right library: pdf-parse for PDFs and mammoth for DOCX files.
Create src/lib/resume-parser.ts:
ts
import { PDFParse } from "pdf-parse";
import mammoth from "mammoth";
import { fileTypeFromBuffer } from "file-type";
import type { ResumeText } from "../types.js";

export async function detectFileType(
  buffer: Buffer,
): Promise<{ ext: string; mime: string } | undefined> {
  return fileTypeFromBuffer(buffer);
}

export async function parsePdf(buffer: Buffer): Promise<string> {
  const parser = new PDFParse({ data: buffer });
  return parser.getText().then(r => r.text);
}

export async function parseDocx(buffer: Buffer): Promise<string> {
  return mammoth.extractRawText({ buffer }).then(result => result.value);
}

export async function parseResume(buffer: Buffer, filename: string): Promise<ResumeText> {
  const type = await fileTypeFromBuffer(buffer);
  if (!type) {
    throw new Error("Unsupported resume format: unknown");
  }
  const { ext } = type;
  let text: string;
  if (ext === "pdf") {
    text = await parsePdf(buffer);
  } else if (ext === "docx") {
    text = await parseDocx(buffer);
  } else {
    throw new Error(`Unsupported resume format: ${ext}`);
  }
  return {
    id: filename.replace(/\.[^.]+$/, ""),
    text,
    format: ext,
  };
}
Expected output: The parser correctly identifies PDF vs DOCX via fileTypeFromBuffer and extracts raw text. Unknown formats throw with a descriptive error message.
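If you are curious what "inspecting the binary signature" means in practice, the sketch below checks the first few bytes by hand. It is only an illustration: file-type handles many more formats and also looks inside the ZIP container to tell DOCX apart from other archives. The sniffFormat name and its return values are hypothetical.

```typescript
// Hand-rolled signature check, for illustration only.
// file-type does considerably more; this sketch cannot tell a DOCX
// from any other ZIP archive.
export type SniffedFormat = "pdf" | "zip-container" | "unknown";

export function sniffFormat(bytes: Uint8Array): SniffedFormat {
  const startsWith = (signature: number[]): boolean =>
    bytes.length >= signature.length &&
    signature.every((byte, i) => bytes[i] === byte);

  // PDF files begin with the ASCII bytes "%PDF-".
  if (startsWith([0x25, 0x50, 0x44, 0x46, 0x2d])) return "pdf";

  // DOCX is a ZIP archive; ZIP local file headers begin with "PK\x03\x04".
  if (startsWith([0x50, 0x4b, 0x03, 0x04])) return "zip-container";

  return "unknown";
}
```

Because the check reads the content rather than the filename, a text file renamed to resume.pdf still comes back as "unknown".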
Step 5: Create the LLM scorer
The scorer uses the Vercel AI SDK’s generateText with Output.object() for structured JSON output. Each criterion is listed in the system prompt, and the model returns per-criterion scores plus an overall score. The scorer also implements exponential-backoff retry and instructor-based structured extraction.
Create src/lib/scorer.ts:
ts
import { generateText, Output } from "ai";
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { z } from "zod";
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";
import type { ResumeText, Rubric, ResumeScore, ScoringConfig } from "../types.js";

export function createScoringModel(config: ScoringConfig) {
  return createOpenAICompatible({
    name: "resumeScorer",
    baseURL: config.baseUrl,
    apiKey: config.apiKey,
  })(config.model);
}

export const CriterionScoreSchema = z.object({
  criterionId: z.string
Expected output: The scorer retries on failure with exponential backoff. The delay is computed as 1000 * Math.pow(2, attempt) with attempt starting at 0, so the waits are 1s, 2s, 4s, and so on. An empty rubric throws "Rubric must have at least one criterion". batchScoreResumes stops on the first failure rather than returning partial results.
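The retry behavior can be pulled out into a small generic helper, which makes it easier to test without calling a model. The sketch below mirrors the loop structure used in the scorer; withRetry and baseDelayMs are hypothetical names, not part of the tutorial's code.

```typescript
// Generic exponential-backoff helper. `withRetry` and `baseDelayMs`
// are hypothetical names introduced for this sketch.
export async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        // The wait doubles on each failed attempt: base, 2x base, 4x base, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed; surface the most recent error.
  throw lastError;
}
```

With maxRetries set to 3, the function makes at most four calls in total. Passing a baseDelayMs of 0 in tests keeps the suite fast while still exercising the retry path.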
Step 6: Wire up semantic caching
The cache service uses @reaatech/llm-cache to store scores keyed by the resume text. On a cache hit, the score is returned instantly without calling the LLM again.
Expected output: getCachedScore returns a ResumeScore on hit or null on miss. setCachedScore stores the score under the "resume-scoring" use case segment. The similarity threshold is 0.8 (cosine), and the default TTL is 1 hour.
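To see how a cosine threshold and a TTL interact, here is a minimal in-memory sketch of a semantic cache. It is not the @reaatech/llm-cache API: the class, its method names, and the assumption that the caller supplies embeddings are all made up for illustration. Only the 0.8 threshold and 1 hour TTL defaults come from this step.

```typescript
// In-memory sketch of a semantic cache. NOT the @reaatech/llm-cache API;
// all names here are hypothetical and embeddings are supplied by the caller.
interface CacheEntry<V> {
  embedding: number[];
  value: V;
  expiresAt: number; // epoch milliseconds
}

export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const length = Math.min(a.length, b.length);
  for (let i = 0; i < length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

export class SemanticCache<V> {
  private entries: CacheEntry<V>[] = [];
  private readonly threshold: number;
  private readonly ttlMs: number;

  constructor(threshold = 0.8, ttlMs = 60 * 60 * 1000) {
    this.threshold = threshold;
    this.ttlMs = ttlMs;
  }

  set(embedding: number[], value: V, now = Date.now()): void {
    this.entries.push({ embedding, value, expiresAt: now + this.ttlMs });
  }

  get(embedding: number[], now = Date.now()): V | null {
    // Drop expired entries before searching.
    this.entries = this.entries.filter((entry) => entry.expiresAt > now);
    let best: CacheEntry<V> | null = null;
    let bestSimilarity = this.threshold;
    for (const entry of this.entries) {
      const similarity = cosineSimilarity(embedding, entry.embedding);
      // Keep the closest entry that meets the threshold.
      if (similarity >= bestSimilarity) {
        best = entry;
        bestSimilarity = similarity;
      }
    }
    return best ? best.value : null;
  }
}
```

A linear scan is fine at 50-200 resumes per role; a larger deployment would likely want a vector index instead.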
Step 7: Add budget tracking for LLM costs
The budget service uses @reaatech/agent-budget-spend-tracker to record every LLM call and expose cumulative spend, rate, and anomaly detection.
Create src/lib/budget-service.ts:
ts
import { SpendStore } from "@reaatech/agent-budget-spend-tracker";
import { BudgetScope } from "@reaatech/agent-budget-types";

export function createBudgetStore(maxEntries?: number): SpendStore {
  return new SpendStore({ maxEntries: maxEntries ?? 100_000 });
}

export function recordScoringCost(
  store: SpendStore,
  requestId: string,
  cost: number,
  inputTokens: number,
  outputTokens: number,
  modelId: string,
): number {
  return store.record({
    requestId,
    scopeType: BudgetScope.User,
    scopeKey: "resume-scorer",
    cost,
    inputTokens,
    outputTokens,
    modelId,
    provider: "agnostic",
    timestamp: new Date(),
  });
}

export function getTotalSpend(store: SpendStore): number {
  return store.getSpend(BudgetScope.User, "resume-scorer");
}

export function getSpendRate(store: SpendStore, windowMinutes?: number): number {
  return store.getRate(BudgetScope.User, "resume-scorer", windowMinutes ?? 5);
}

export function detectSpendSpikes(store: SpendStore) {
  return store.detectSpikes(BudgetScope.User, "resume-scorer", 2);
}
Expected output: Each call to recordScoringCost returns an entry ID. getTotalSpend returns the accumulated cost for the resume-scorer scope. detectSpendSpikes returns entries exceeding 2 standard deviations from the mean.
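The "2 standard deviations" rule is easy to reason about with a small standalone example. The sketch below flags costs whose z-score exceeds a threshold. It uses the population standard deviation, which is an assumption: the tutorial does not say how the library computes it, and findSpikes is a hypothetical name.

```typescript
// Standalone z-score spike detection. `findSpikes` is a hypothetical helper;
// using the population standard deviation is an assumption of this sketch.
export function findSpikes(costs: number[], threshold = 2): number[] {
  if (costs.length < 2) return [];

  const mean = costs.reduce((sum, cost) => sum + cost, 0) / costs.length;
  const variance =
    costs.reduce((sum, cost) => sum + (cost - mean) ** 2, 0) / costs.length;
  const stdDev = Math.sqrt(variance);

  // Identical costs have no spread, so nothing can be a spike.
  if (stdDev === 0) return [];

  // Only flag values above the mean; unusually cheap calls are not a concern.
  return costs.filter((cost) => (cost - mean) / stdDev > threshold);
}
```

For nine calls costing 1 and one call costing 10, the mean is 1.9 and the standard deviation is 2.7, so the expensive call sits 3 standard deviations above the mean and is flagged.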
Step 8: Build the judge evaluator
The judge provides an independent quality assessment of each score, separate from the scorer model. You configure it with its own model and provider.
Create src/lib/judge-service.ts:
ts
import {
  JudgeEngine,
  JudgeCalibrator,
  type JudgeScore,
  type JudgeRequest,
} from "@reaatech/agent-eval-harness-judge";
import type { Rubric, ResumeScore } from "../types.js";

export function createJudgeService(model: string): JudgeEngine {
  return new JudgeEngine({
    model,
    provider: (process.env.JUDGE_PROVIDER ?? "openrouter") as
      | "claude"
      | "gpt4"
      | "gemini"
      | "openrouter",
    temperature: 0.1,
  });
}

export async function evaluateScore(
  judge: JudgeEngine,
  rubric: Rubric,
  score: ResumeScore,
): Promise<JudgeScore> {
  const request: JudgeRequest = {
    type: "overall_quality",
    context: JSON.stringify(rubric),
    response: JSON.stringify(score),
  };
  return judge.judge(request);
}

export async function evaluateScoreBatch(
  judge: JudgeEngine,
  items: Array<{ id: string; request: JudgeRequest }>,
) {
  return judge.judgeBatch(items, 3);
}

export function calibrateJudge(
  judgeScores: JudgeScore[],
  humanLabels: Array<{ sampleId: string; score: number; type: string; explanation?: string }>,
) {
  const calibrator = new JudgeCalibrator("temperature_scaling");
  calibrator.addCalibrationData(humanLabels, judgeScores);
  calibrator.calibrate();
  return calibrator;
}
Expected output: evaluateScore sends the rubric and score to an independent judge model and returns a JudgeScore. calibrateJudge returns a calibrated JudgeCalibrator using temperature scaling.
Step 9: Add cost telemetry and agent memory
The telemetry service records cost spans for each LLM call, and the memory service stores candidate scoring history for retrieval.
Expected output: createCostSpan returns a span validated through CostSpanSchema. storeCandidateScore calls extractAndStore on the memory instance, and retrieveCandidateHistory returns up to 10 relevant memory entries.
Step 10: Set up golden trajectories and file storage
The golden service lets you create reference scoring trajectories and compare new runs against them for regression testing. The storage layer uploads resume files to Vercel Blob.
Create src/lib/golden-service.ts:
ts
import {
  quickCreateGolden,
  compareAgainstGolden,
  batchCompare,
  createCurator,
  batchQualityCheck,
  type GoldenTrajectory,
  type GoldenCurator,
} from "@reaatech/agent-eval-harness-golden";
import type { Trajectory } from "../types.js";

export function createGoldenTrajectory(
  trajectory: Trajectory,
  description: string,
  tags: string[],
) {
  return quickCreateGolden(trajectory, description, tags);
}

export function compareRunAgainstGolden(
  golden: GoldenTrajectory,
  candidate: Trajectory,
  threshold?: number,
) {
  return compareAgainstGolden(golden, candidate, { similarityThreshold: threshold ?? 0.8 });
}

export function batchCompareRuns(
  golden: GoldenTrajectory,
  candidates: Trajectory[],
) {
  return batchCompare(golden, candidates);
}

export function createCurationWorkflow(trajectory: Trajectory): GoldenCurator {
  return createCurator(trajectory);
}

export function qualityCheckGoldens(
  goldens: GoldenTrajectory[],
) {
  return batchQualityCheck(goldens);
}
Create the candidate in-memory store (src/candidate-store.ts):
ts
import type { CandidateResult } from "./types.js";

export const candidates = new Map<string, CandidateResult>();
Expected output: compareRunAgainstGolden returns a comparison result with similarity and passesThreshold fields. uploadResume returns the blob URL from Vercel Blob.
Step 11: Build the score API route
This route ties everything together: it receives a multipart request with a resume file and rubric JSON, parses the file, checks the cache, calls the scorer on a miss, records the cost, evaluates quality with the judge, and returns the combined result.
Create app/api/score/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { parseResume } from "../../../src/lib/resume-parser.js";
import { scoreResume } from "../../../src/lib/scorer.js";
import { createCacheService, getCachedScore, setCachedScore } from "../../../src/lib/cache-service.js";
import { createBudgetStore, recordScoringCost } from "../../../src/lib/budget-service.js";
import { createJudgeService, evaluateScore } from "../../../src/lib/judge-service.js";
import type { ResumeScore, Rubric } from "../../../src/types.js";

let cache: ReturnType<typeof createCacheService> | null = null;
let budgetStore: ReturnType<typeof
Expected output: A curl request with a valid PDF resume and rubric returns {"score": {...}, "evaluation": {...}, "cached": false} with status 200. A request with no file returns 400. An unsupported file format returns 415. An LLM failure returns 502.
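One way to keep those status codes consistent is to classify each failure first and map it to a status in a single place. The sketch below assumes a tagged ScoreFailure type and a statusForFailure helper, both hypothetical. Only the "Unsupported resume format" message prefix comes from the parser in Step 4, and returning 500 for other parse failures is an assumption.

```typescript
// Hypothetical failure classification for the score route.
// Only the "Unsupported resume format" prefix comes from the parser in Step 4.
export type ScoreFailure =
  | { kind: "missing-file" }
  | { kind: "parse"; message: string }
  | { kind: "llm"; message: string };

export function statusForFailure(failure: ScoreFailure): number {
  switch (failure.kind) {
    case "missing-file":
      // The multipart body had no resume file.
      return 400;
    case "parse":
      // Unsupported formats are a client problem; any other parse failure
      // is treated as a server error (an assumption of this sketch).
      return failure.message.startsWith("Unsupported resume format") ? 415 : 500;
    case "llm":
      // The upstream model call failed even after retries.
      return 502;
  }
}
```

Inside the route handler, each catch block would build the matching ScoreFailure and pass the returned number to NextResponse.json as the status.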
Step 12: Create the API routes for jobs, candidates, and health
The jobs route manages job postings with their rubrics. The candidates routes store scored results and retrieve scoring history from the memory layer. A simple health check rounds out the API.
Create app/api/jobs/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import type { JobPosting, Rubric, RubricCriterion } from "../../../src/types.js";

const jobs = new Map<string, JobPosting>();

export async function POST(req: NextRequest) {
  const body = await req.json() as { title: string; description: string; criteria: RubricCriterion[] };
  const { title, description, criteria } = body;
  const rubric: Rubric = {
    id: crypto.randomUUID(),
    jobTitle: title,
    criteria: criteria,
    overallMaxScore: 100,
  };
  const job: JobPosting = {
    id: crypto.randomUUID(),
    title,
    description,
    rubric,
    createdAt: new Date(),
  };
  jobs.set(job.id, job);
  return NextResponse.json(job, { status: 201 });
}

export function GET() {
  const all = Array.from(jobs.values());
  return NextResponse.json(all, { status: 200 });
}

export async function DELETE(req: NextRequest) {
  const { id } = await req.json() as { id: string };
  jobs.delete(id);
  return NextResponse.json({ success: true }, { status: 200 });
}
Create app/api/candidates/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { AgentMemory, OpenAILLMProvider } from "@reaatech/agent-memory";
import type { CandidateResult, ResumeScore } from "../../../src/types.js";
import { candidates } from "../../../src/candidate-store.js";
import { storeCandidateScore } from "../../../src/lib/memory-service.js";

let memory: AgentMemory | null = null;

function getMemory(): AgentMemory {
  if (!memory) {
    memory = new AgentMemory({
      storage: { provider: "memory" },
      embedding: {
        provider: "openai",
        model: process.env.EMBEDDING_MODEL ?? "text-embedding-3-small",
        apiKey: process.env.OPENAI_API_KEY ?? "",
      },
      extraction: {
        llmProvider: new OpenAILLMProvider({
          apiKey: process.env.OPENAI_API_KEY ?? "",
          model: process.env.SCORING_MODEL ?? "gpt-4o",
        }),
        enabledTypes: [],
        batchSize: 10,
        confidenceThreshold: 0.7,
      },
    });
  }
  return memory;
}

export async function POST(req: NextRequest) {
  const body = await req.json() as {
    jobId: string;
    candidateName: string;
    resumeFilename: string;
    score: ResumeScore;
  };
  const { jobId, candidateName, resumeFilename, score } = body;
  const candidate: CandidateResult = {
    id: crypto.randomUUID(),
    jobId,
    candidateName,
    resumeFilename,
    score,
    createdAt: new Date(),
  };
  candidates.set(candidate.id, candidate);
  const mem = getMemory();
  await storeCandidateScore(mem, candidate.id, candidate.score, jobId);
  return NextResponse.json(candidate, { status: 201 });
}

export function GET(req: NextRequest) {
  const jobId = req.nextUrl.searchParams.get("jobId");
  let all = Array.from(candidates.values());
  if (jobId) {
    all = all.filter((c) => c.jobId === jobId);
  }
  return NextResponse.json(all, { status: 200 });
}
Create app/api/health/route.ts:
ts
import { NextResponse } from "next/server";

export function GET() {
  return NextResponse.json(
    { status: "ok", timestamp: new Date().toISOString() },
    { status: 200 },
  );
}
Expected output: POST /api/jobs with { title, description, criteria } returns 201 with the created job. GET /api/candidates?jobId=<id> returns only candidates for that job. GET /api/candidates/<id> returns the candidate plus their scoring history. GET /api/health returns {"status": "ok", "timestamp": "..."}.
Step 13: Write tests and run the suite
Every module needs tests covering the happy path, error handling, and edge cases. Here are key test patterns from the suite (the full test suite has 15 files covering all modules and routes).
ts
import { describe, it, expect } from "vitest";
import { GET } from "../app/api/health/route.js";

describe("health route", () => {
  it("GET returns 200 with status ok", async () => {
    const response = GET();
    const body = await response.json() as Record<string, unknown>;
    expect(response.status).toBe(200);
    expect(body).toHaveProperty("status", "ok");
  });
});
Now run the full test suite:
terminal
pnpm test
Expected output: pnpm test runs vitest with coverage. All tests pass. Coverage for lines, branches, functions, and statements is 90% or higher on the runtime source files (src/**/*.ts and app/**/route.ts). Then run:
terminal
pnpm typecheck
pnpm lint
Both exit with status 0, confirming no type errors and no lint violations.
Next steps
Add a dashboard UI — Build a Next.js client page that lets recruiters upload resumes through a drag-and-drop form and view scored results in a sortable table.
Integrate with ATS platforms — Connect to Greenhouse, Lever, or Ashby via their APIs to auto-score incoming applications as they arrive.
Persist to a database — Replace the in-memory stores (Map<string, ...>) with PostgreSQL or SQLite so scores survive server restarts and can be queried historically.
  const systemPrompt = `You are a resume scoring assistant. Score the following resume against the rubric criteria.\n\nRubric criteria:\n${criteriaLines}`;
  let lastError: unknown;
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      const result = await generateText({
        model: createScoringModel(config),
        system: systemPrompt,
        prompt: `Score this resume against the rubric:\n\n${text}`,
        output: Output.object({
          schema: ResumeScoreSchema,
          name: "ResumeScore",
        }),
      });
      return { ...result.output, scoredAt: new Date() };
    } catch (error) {
      lastError = error;
      if (attempt < config.maxRetries) {
        await new Promise(resolve => setTimeout(resolve, 1000 * Math.pow(2, attempt)));
      }
    }
  }
  throw lastError;
}

export async function extractStructuredResume(text: string, config: ScoringConfig): Promise<Record<string, unknown>> {
  const oai = new OpenAI({ apiKey: config.apiKey, baseURL: config.baseUrl });