Veterinary practice managers and associate vets spend 15+ minutes per referral case manually summarizing medical history, lab results, and treatment plans into a letter for specialists. This paperwork eats into appointment time and leads to delayed referrals, frustrated specialists, and lost revenue. The process is error-prone, with vets often omitting critical details due to time pressure. An AI agent that reads the medical record, extracts relevant data, and drafts a formatted letter can cut drafting time by 90% and improve referral quality.
This tutorial walks you through building an Automated Referral Letter Generator for veterinary practices. You’ll create a Next.js app that accepts a PDF or DOCX veterinary PIMS (Practice Information Management System) record, extracts clinical data using AI, and drafts a formatted referral letter — cutting manual paperwork from 15+ minutes to under 2 minutes. The project wires together six REAA packages (agent-memory, llm-cache, guardrail-chain, context-window-planner, and their supporting libraries) with the Vercel AI SDK into a single pipeline accessible through a REST API.
Prerequisites
Node.js 22+ and pnpm 10.x installed on your machine
An OpenAI API key with access to gpt-5.2 and text-embedding-3-small
Langfuse account (optional, for observability tracing)
Basic familiarity with TypeScript, Next.js App Router route handlers, and the multipart/form-data POST pattern
Step 1: Scaffold the Next.js project and pin all dependencies
Create a new Next.js 16 project with TypeScript and the App Router. You’ll use pnpm create next-app with the minimal template, then replace the default dependencies with the exact versions needed.
Open the generated package.json and replace its dependencies and devDependencies with these exact pinned versions — every version is locked with no ^ or ~ range prefix.
Now install everything and remove any lockfile artifacts from the scaffold:
terminal
rm -rf node_modules pnpm-lock.yaml
pnpm install
Expected output: pnpm resolves all 24 packages and writes a fresh pnpm-lock.yaml. You should see no errors or warnings about missing peer dependencies.
Step 2: Define the domain types
Create the type definitions that every service layer will reference. Start with the veterinary records and referral letter shapes.
Create src/types/index.ts to re-export everything from the two type files, document.ts and pipeline.ts:
ts
export type {
  ExtractedRecord,
  ReferralLetter,
  DocumentSource,
  LetterFormat,
} from "./document.js";
export type {
  AppConfig,
  PipelineConfig,
  ClinicalSummary,
  PipelineResult,
} from "./pipeline.js";
Expected output: pnpm typecheck runs with zero errors. The four interfaces and two type aliases are ready for every downstream service module to import.
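The contents of the two type files are not reproduced in this step. As a rough guide, here is a minimal sketch of some of the shapes, inferred only from how later steps read them (the prompt template's field list, the Zod schema for the clinical summary, and the extractor's switch on DocumentSource.type). The helper emptyRecord is hypothetical and is not part of the project; the real files may declare more fields or different types.

```typescript
// Hypothetical reconstruction: field names are inferred from how later steps
// use these objects, not copied from the project's actual type files.
type LetterFormat = "docx" | "pdf";

interface DocumentSource {
  type: "pdf" | "docx";
  buffer: Uint8Array;
}

interface ExtractedRecord {
  patientName: string;
  species: string;
  breed: string;
  age: string;
  weight: string;
  presentingComplaint: string;
  history: string;
  examFindings: string;
  labResults: string;
  diagnosis: string;
  treatmentPlan: string;
  medications: string;
  vetName: string;
  practiceName: string;
  date: string;
  rawText: string;
}

interface ClinicalSummary {
  keyFindings: string[];
  recommendedActions: string[];
  urgency: "routine" | "urgent" | "emergency";
  referralRationale: string;
}

// Assumed helper (not in the original): a record whose parsed fields are all
// empty strings, keeping only the raw text that was extracted from the upload.
function emptyRecord(rawText: string): ExtractedRecord {
  return {
    patientName: "", species: "", breed: "", age: "", weight: "",
    presentingComplaint: "", history: "", examFindings: "", labResults: "",
    diagnosis: "", treatmentPlan: "", medications: "",
    vetName: "", practiceName: "", date: "",
    rawText,
  };
}
```

Keeping every parsed field as a plain string makes it straightforward to interpolate the record into a prompt without null checks.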
Step 3: Create the environment configuration module
The config module reads environment variables, side-effect imports dotenv so .env is loaded at module scope, and exposes loadConfig() and getPipelineConfig().
Create src/config/index.ts:
ts
import "dotenv/config";
import type { AppConfig, PipelineConfig } from "../types/index.js";

export function loadConfig(): AppConfig {
  const openaiApiKey = process.env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    throw new Error("Missing required env: OPENAI_API_KEY");
  }
  return {
    openaiApiKey,
    langfusePublicKey: process.env.LANGFUSE_PUBLIC_KEY || "",
    langfuseSecretKey: process.env.LANGFUSE_SECRET_KEY || "",
    langfuseHost: process.env.LANGFUSE_HOST || "",
  };
}

export function getPipelineConfig(): PipelineConfig {
  return {
    model: "gpt-5.2",
    maxTokens: 4096,
    temperature: 0.3,
    letterFormat: "docx",
  };
}
Notice that Langfuse keys are optional — if missing they default to empty strings, so observability degrades gracefully instead of crashing the pipeline.
Now create .env.example with the environment variables the app reads:
env
# Env vars used by agnostic-referral-letter-drafter-2.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
NODE_ENV=development
OPENAI_API_KEY=<your-openai-key>
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_HOST=<your-langfuse-host>
Expected output: Running pnpm typecheck compiles without errors. The config module will throw a clear error if OPENAI_API_KEY is unset at runtime.
Step 4: Build the document text extraction layer
The extraction layer reads raw text from PDF or DOCX uploads. Two dedicated extractors handle each format, and a dispatcher routes by DocumentSource.type.
Create src/services/document-extraction/extractor.ts:
ts
import { PDFParse } from "pdf-parse";
import mammoth from "mammoth";
import type { DocumentSource } from "../../types/index.js";
import { Buffer } from "node:buffer";

export async function extractFromPdf(buffer: Uint8Array): Promise<string> {
  const parser = new PDFParse({ data: buffer });
  try {
    const result = await parser.getText();
    return result.text;
  } finally {
    // Release parser resources even if text extraction throws.
    await parser.destroy();
  }
}

export async function extractFromDocx(buffer: Uint8Array): Promise<string> {
  // mammoth expects a Node Buffer, so wrap plain Uint8Array input.
  const buf = Buffer.isBuffer(buffer) ? buffer : Buffer.from(buffer);
  const result = await mammoth.extractRawText({ buffer: buf });
  return result.value;
}

// Dispatch to the matching extractor based on the declared source type.
export async function extractText(source: DocumentSource): Promise<string> {
  switch (source.type) {
    case "pdf":
      return extractFromPdf(source.buffer);
    case "docx":
      return extractFromDocx(source.buffer);
  }
}
Next, create the PIMS record parser at src/services/document-extraction/parser.ts. It uses regex patterns to match common veterinary record section labels:
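The parser source is not reproduced here. The following is a minimal sketch of the label-matching approach, not the project's parseVeterinaryRecord: the function name parseFields, the label spellings, and the subset of fields are all assumptions chosen for illustration. Each pattern anchors on a line that starts with a known label and captures the rest of that line; a field with no match falls back to an empty string.

```typescript
// Hypothetical label patterns; real PIMS exports vary and need their own list.
// [ \t]* is used instead of \s* so an empty value cannot swallow the next line.
const FIELD_PATTERNS: Record<string, RegExp> = {
  patientName: /^Patient(?: Name)?[ \t]*:[ \t]*(.+)$/im,
  species: /^Species[ \t]*:[ \t]*(.+)$/im,
  breed: /^Breed[ \t]*:[ \t]*(.+)$/im,
  presentingComplaint: /^(?:Presenting Complaint|Chief Complaint)[ \t]*:[ \t]*(.+)$/im,
  diagnosis: /^(?:Diagnosis|Assessment)[ \t]*:[ \t]*(.+)$/im,
};

// Returns one string per known field ("" when the label is absent or empty),
// plus the untouched raw text for downstream LLM analysis.
function parseFields(rawText: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [field, pattern] of Object.entries(FIELD_PATTERNS)) {
    const match = pattern.exec(rawText);
    out[field] = match?.[1]?.trim() ?? "";
  }
  out.rawText = rawText;
  return out;
}

const sample = "Patient Name: Bella\nSpecies: Canine\nBreed:\nDiagnosis: CCL rupture";
console.log(parseFields(sample).diagnosis); // prints "CCL rupture"
```

Because unmatched labels yield empty strings rather than errors, a partial record still produces a usable object, and the raw text remains available for the LLM to fill the gaps.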
Create a barrel export at src/services/document-extraction/index.ts:
ts
export { extractText, extractFromPdf, extractFromDocx } from "./extractor.js";
export { parseVeterinaryRecord } from "./parser.js";
Expected output: pnpm typecheck passes. The extractor module can turn any PDF or DOCX buffer into plain text, and the parser will extract structured fields even from partial or messy records (unmatched fields return empty strings).
Step 5: Add AI-powered clinical analysis and letter generation
This step creates the prompt templates and the LLM service that calls gpt-5.2 via the Vercel AI SDK’s generateText with structured output.
First, create the prompt templates at src/services/ai/prompt-templates.ts:
ts
import type { ExtractedRecord, ClinicalSummary } from "../../types/index.js";

export const SYSTEM_PROMPT_CLINICAL_EXTRACTION =
  "You are a veterinary medical records analyst. Extract structured clinical data from the provided PIMS record. Return only the requested fields; do not fabricate data not present in the record.";

export const SYSTEM_PROMPT_LETTER_DRAFTING =
  "You are a veterinary referral specialist. Draft a professional referral letter using the provided clinical record and any relevant memory context. Include sections: Referring Veterinarian, Patient Information, Reason for Referral, Clinical Summary, Relevant History, Current Medications, Diagnostic Results, Recommended Assessment. Use formal veterinary terminology.";

export function formatLetterPrompt(
  record: ExtractedRecord,
  summary: ClinicalSummary,
  context: string,
): string {
  return `Patient Name: ${record.patientName}
Species: ${record.species}
Breed: ${record.breed}
Age: ${record.age}
Weight: ${record.weight}
Presenting Complaint: ${record.presentingComplaint}
History: ${record.history}
Exam Findings: ${record.examFindings}
Lab Results: ${record.labResults}
Diagnosis: ${record.diagnosis}
Treatment Plan: ${record.treatmentPlan}
Medications: ${record.medications}
Veterinarian: ${record.vetName}
Practice: ${record.practiceName}
Date: ${record.date}

Clinical Summary:
- Key Findings: ${summary.keyFindings.join(", ")}
- Recommended Actions: ${summary.recommendedActions.join(", ")}
- Urgency: ${summary.urgency}
- Referral Rationale: ${summary.referralRationale}

Retrieved Context:
${context}

Please draft a professional veterinary referral letter using the above information.`;
}
Now create the LLM service at src/services/ai/llm-service.ts:
ts
import { generateText, Output } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import type { ExtractedRecord, ClinicalSummary } from "../../types/index.js";
import {
  SYSTEM_PROMPT_CLINICAL_EXTRACTION,
  SYSTEM_PROMPT_LETTER_DRAFTING,
  formatLetterPrompt,
} from "./prompt-templates.js";

export const ClinicalSummarySchema = z.object({
  keyFindings: z.array(z.string()),
  recommendedActions: z.array(z.string()),
  urgency: z.enum(["routine", "urgent", "emergency"]),
  referralRationale: z.string(),
});

export async function analyzeRecord(
  record: ExtractedRecord,
): Promise<ClinicalSummary> {
  const result = await generateText({
    model: openai("gpt-5.2"),
    output: Output.object({ schema: ClinicalSummarySchema }),
    system: SYSTEM_PROMPT_CLINICAL_EXTRACTION,
    prompt: record.rawText,
  });
  return result.output;
}

export async function generateLetterContent(
  record: ExtractedRecord,
  summary: ClinicalSummary,
  retrievedContext: string,
): Promise<string> {
  const result = await generateText({
    model: openai("gpt-5.2"),
    system: SYSTEM_PROMPT_LETTER_DRAFTING,
    messages: [
      {
        role: "user",
        content: formatLetterPrompt(record, summary, retrievedContext),
      },
    ],
    maxOutputTokens: 4096,
    temperature: 0.3,
  });
  return result.text;
}
Create the barrel at src/services/ai/index.ts:
ts
export { analyzeRecord, generateLetterContent, ClinicalSummarySchema } from "./llm-service.js";
export {
  formatLetterPrompt,
  SYSTEM_PROMPT_CLINICAL_EXTRACTION,
  SYSTEM_PROMPT_LETTER_DRAFTING,
} from "./prompt-templates.js";
Expected output: Typecheck passes. You now have two LLM-powered functions — one that extracts structured clinical data from raw PIMS text, and one that drafts the full referral letter with the clinical summary as context.
Step 6: Wire up agent memory and LLM cache
The memory service uses three REAA packages together: @reaatech/agent-memory provides the AgentMemory facade with in-memory storage and LLM-powered extraction; @reaatech/agent-memory-embedding wraps embeddings in a cached layer; and @reaatech/agent-memory-retrieval provides semantic retrieval with a context injector.
Create src/services/memory/memory-service.ts:
ts
import { AgentMemory, OpenAILLMProvider, MemoryType } from "@reaatech/agent-memory";
import {
  OpenAIEmbeddingProvider,
  CachedEmbeddingProvider,
  InMemoryEmbeddingCache,
} from "@reaatech/agent-memory-embedding";
import { MemoryRetriever, ContextInjector, RetrievalStrategy } from "@reaatech/agent-memory-retrieval";
import type { AppConfig, ExtractedRecord } from "../../types/index.js";

export function createMemoryService(config: AppConfig): Promise<AgentMemory> {
  if (!config.openaiApiKey) {
    throw new Error("OpenAI API key is required for memory service");
  }
  const memory = new AgentMemory({
    storage: { provider: "memory" },
    embedding: {
      provider: "openai",
      model: "text-embedding-3-small",
      apiKey: config.openaiApiKey,
    },
    extraction: {
      llmProvider: new OpenAILLMProvider({
        apiKey: config.openaiApiKey,
        model: "gpt-5.2",
      }),
      enabledTypes: [MemoryType.FACT, MemoryType.PREFERENCE],
      batchSize: 10,
      confidenceThreshold: 0.7,
    },
  });
  memory.events.on("memory:stored", (event) => {
    console.log("[memory:stored]", event.payload);
  });
  return Promise.resolve(memory);
}

export function createRetriever(memory: AgentMemory): MemoryRetriever {
  const storage = memory.getStorage();
  const baseEmbedder = new OpenAIEmbeddingProvider({
    apiKey: process.env.OPENAI_API_KEY ?? "",
    model: "text-embedding-3-small",
  });
  const cachedEmbedder = new CachedEmbeddingProvider(
    baseEmbedder,
    new InMemoryEmbeddingCache({ maxSize: 1000, ttlMs: 60000 }),
  );
  return new MemoryRetriever(storage, cachedEmbedder, {
    defaultLimit: 5,
    useCrossEncoder: false,
    diversityFactor: 0.3,
    strategies: [RetrievalStrategy.SEMANTIC, RetrievalStrategy.RECENCY],
  });
}

export function createContextInjector(): ContextInjector {
  return new ContextInjector(100000, 4);
}

export async function storeRecordMemory(
  memory: AgentMemory,
  record: ExtractedRecord,
): Promise<void> {
  await memory.extractAndStore([
    { speaker: "user", content: record.rawText, timestamp: new Date() },
  ]);
}

export async function retrieveContext(
  retriever: MemoryRetriever,
  injector: ContextInjector,
  query: string,
  maxTokens: number,
): Promise<string> {
  const memories = await retriever.retrieve(query, { limit: 5 });
  const result = await injector.injectMemoriesIntoContext([], memories, maxTokens);
  return result;
}

export async function closeMemory(memory: AgentMemory): Promise<void> {
  await memory.close();
}
Next, create the LLM cache service at src/services/cache/cache-service.ts. It uses @reaatech/llm-cache’s CacheEngine with in-memory storage and OpenAI embeddings for semantic similarity lookups:
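The cache service source is not reproduced here, and the sketch below does not use the @reaatech/llm-cache API. It only illustrates the general idea of a semantic lookup, under the assumption that the caller already has an embedding vector for each prompt: a stored response is reused when the cosine similarity between the query vector and a stored vector reaches a threshold. The class name SemanticCacheSketch and the 0.95 default threshold are hypothetical.

```typescript
interface CacheEntry {
  embedding: number[];
  response: string;
}

// Cosine similarity of two vectors; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Illustrative in-memory cache keyed by embedding similarity (hypothetical API).
class SemanticCacheSketch {
  private readonly entries: CacheEntry[] = [];
  private readonly threshold: number;

  constructor(threshold = 0.95) {
    this.threshold = threshold;
  }

  set(embedding: number[], response: string): void {
    this.entries.push({ embedding, response });
  }

  // Returns the most similar stored response at or above the threshold.
  get(queryEmbedding: number[]): string | undefined {
    let best: CacheEntry | undefined;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosineSimilarity(queryEmbedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best?.response;
  }
}
```

A linear scan is adequate for a small in-memory store; a larger deployment would typically replace it with a vector index.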
Create the barrel files. First ensure the directories exist:
terminal
mkdir -p src/services/memory src/services/cache
src/services/memory/index.ts:
ts
export {
  createMemoryService,
  createRetriever,
  createContextInjector,
  storeRecordMemory,
  retrieveContext,
  closeMemory,
} from "./memory-service.js";
src/services/cache/index.ts:
ts
export { createCacheService, getCachedResponse, setCachedResponse } from "./cache-service.js";
Expected output: pnpm typecheck reports zero errors. The memory service stores every processed record and retrieves semantically similar context, while the cache avoids re-invoking the LLM for identical or near-identical prompts.
Step 7: Add context window planning and output guardrails
The context planner from @reaatech/context-window-planner budgets tokens and packs system prompts, conversation turns, and generation buffers into a single window. The guardrail chain from @reaatech/guardrail-chain runs content safety and structural checks on the generated letter before it’s shipped.
Create src/services/guardrail/guardrail-service.ts:
ts
import {
  GuardrailChain,
  ChainBuilder,
  setLogger,
  ConsoleLogger,
  createChainContext,
  generateCorrelationId,
  type Guardrail,
  type GuardrailResult,
  type ChainContext,
  type ChainResult,
  type BudgetConfig,
} from "@reaatech/guardrail-chain";

setLogger(new ConsoleLogger());

// Fails the letter if any of the mandatory section headings is absent.
class RequiredSectionsGuardrail implements Guardrail<string, string> {
  readonly id = "required-sections";
  readonly name = "Required Sections";
  readonly type = "output" as const;
  readonly enabled = true;

  async execute(input: string, _context: ChainContext): Promise<GuardrailResult<string>> {
    const required = [
      "Referring Veterinarian",
      "Patient",
      "Reason for Referral",
      "Clinical Summary",
    ] as const;
    for (const section of required) {
      if (!input.includes(section)) {
        return { passed: false, output: input, error: new Error(`Missing required section: ${section}`) };
      }
    }
    return { passed: true, output: input };
  }
}

// Fails the letter if it contains an SSN-like number or a hedging phrase.
class ContentSafetyGuardrail implements Guardrail<string, string> {
  readonly id = "content-safety";
  readonly name = "Content Safety";
  readonly type = "output" as const;
  readonly enabled = true;
  readonly priority = 0;

  async execute(input: string, _context: ChainContext): Promise<GuardrailResult<string>> {
    const ssnPattern = /\d{3}-\d{2}-\d{4}/;
    const hallucinationMarkers = ["I am not sure", "Please verify"];
    if (ssnPattern.test(input)) {
      return { passed: false, output: input, error: new Error("Content safety check failed") };
    }
    for (const marker of hallucinationMarkers) {
      if (input.includes(marker)) {
        return { passed: false, output: input, error: new Error("Content safety check failed") };
      }
    }
    return { passed: true, output: input };
  }
}

const budgetConfig: BudgetConfig = { maxLatencyMs: 2000, maxTokens: 1000 };

function createGuardrailChain(): GuardrailChain {
  return new ChainBuilder()
    .withBudget(budgetConfig)
    .withGuardrail(new ContentSafetyGuardrail())
    .withGuardrail(new RequiredSectionsGuardrail())
    .build();
}

function validateLetter(chain: GuardrailChain, letter: string): Promise<ChainResult> {
  return chain.execute(
    letter,
    createChainContext({ userId: "recipe", sessionId: generateCorrelationId() }, budgetConfig),
  );
}

export { createGuardrailChain, validateLetter, RequiredSectionsGuardrail, ContentSafetyGuardrail };
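To see what the two guardrails accept and reject without running the chain, the same checks can be expressed as one plain function. This is a sketch only: checkLetter and CheckResult are hypothetical names, and the function mirrors the patterns and section names from the guardrail classes rather than calling @reaatech/guardrail-chain. Note that the "Patient" requirement is a substring test, so a "Patient Information" heading satisfies it.

```typescript
const REQUIRED_SECTIONS = [
  "Referring Veterinarian",
  "Patient",
  "Reason for Referral",
  "Clinical Summary",
];
const SSN_PATTERN = /\d{3}-\d{2}-\d{4}/;
const UNCERTAINTY_MARKERS = ["I am not sure", "Please verify"];

type CheckResult = { passed: true } | { passed: false; reason: string };

// Runs the content-safety checks first, then the required-section check,
// matching the order in which the guardrails are added to the chain.
function checkLetter(letter: string): CheckResult {
  if (SSN_PATTERN.test(letter)) {
    return { passed: false, reason: "content-safety: SSN-like pattern" };
  }
  const marker = UNCERTAINTY_MARKERS.find((m) => letter.includes(m));
  if (marker !== undefined) {
    return { passed: false, reason: `content-safety: contains "${marker}"` };
  }
  const missing = REQUIRED_SECTIONS.find((s) => !letter.includes(s));
  if (missing !== undefined) {
    return { passed: false, reason: `required-sections: missing "${missing}"` };
  }
  return { passed: true };
}

const draft = [
  "Referring Veterinarian: Dr. A",
  "Patient Information: Bella",
  "Reason for Referral: lameness",
  "Clinical Summary: stable",
].join("\n");
console.log(checkLetter(draft).passed); // prints true
```

Because the SSN pattern has no word boundaries, any digit run shaped like 123-45-6789 will trip it, including some phone-style or lab reference numbers, which is worth keeping in mind when interpreting failures.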
Barrel files:
src/services/context/index.ts:
ts
export { createContextPlanner, packLetterContext } from "./context-planner-service.js";
export type { ContextPlanner, PackingResult } from "@reaatech/context-window-planner";
src/services/guardrail/index.ts:
ts
export { createGuardrailChain, validateLetter, RequiredSectionsGuardrail, ContentSafetyGuardrail } from "./guardrail-service.js";
Expected output: You now have a token-budgeted context planner that fits system instructions, retrieved context, and a generation buffer within 8,000 tokens, plus a two-stage guardrail chain that rejects letters that are missing a required section, contain an SSN-like number pattern, or include hedging phrases such as "Please verify".
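The context planner service itself is not reproduced in this step. As an illustration of the budgeting idea only, the sketch below reserves a generation buffer, then packs items in priority order until the remaining budget is exhausted. It does not use the @reaatech/context-window-planner API; packContext, ContextItem, and the four-characters-per-token estimate are assumptions, and a real tokenizer would give more accurate counts.

```typescript
interface ContextItem {
  id: string;
  text: string;
  priority: number; // higher values are packed first
}

interface PackResult {
  included: string[];
  dropped: string[];
  usedTokens: number;
}

// Rough heuristic: about four characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedy packing: reserve the generation buffer, then add items from highest
// to lowest priority, skipping any item that no longer fits.
function packContext(items: ContextItem[], budget: number, generationBuffer: number): PackResult {
  const available = budget - generationBuffer;
  let remaining = available;
  const included: string[] = [];
  const dropped: string[] = [];
  for (const item of [...items].sort((a, b) => b.priority - a.priority)) {
    const cost = estimateTokens(item.text);
    if (cost <= remaining) {
      included.push(item.id);
      remaining -= cost;
    } else {
      dropped.push(item.id);
    }
  }
  return { included, dropped, usedTokens: available - remaining };
}

const result = packContext(
  [
    { id: "system", text: "x".repeat(80), priority: 10 },
    { id: "context", text: "x".repeat(200), priority: 5 },
    { id: "extra", text: "x".repeat(40), priority: 1 },
  ],
  100,
  40,
);
console.log(result.included); // prints [ 'system', 'extra' ]
```

In this toy run, 60 tokens are available after the buffer: the 20-token system prompt fits, the 50-token context does not, and the 10-token extra item still does. With the tutorial's 8,000-token window, the same function would be called with a budget of 8000 and whatever generation buffer the pipeline reserves.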
Step 8: Generate DOCX and PDF referral letters
The letter generator takes a populated ReferralLetter object and produces a real file — either a .docx (using the docx npm package) or a .pdf (using pdf-lib).
Create the PDF generator at src/services/letter-generation/pdf-generator.ts:
ts
import { PDFDocument, StandardFonts, rgb } from "pdf-lib";
import type { ReferralLetter } from "../../types/index.js";

export async function generatePdf(letter: ReferralLetter): Promise<Uint8Array> {
  const pdfDoc = await PDFDocument.create();
  const font = await pdfDoc.embedFont(StandardFonts.TimesRoman);
  const boldFont = await pdfDoc.embedFont(StandardFonts.TimesRomanBold);
  const fontSize = 11;
  const labelFontSize = 11;
  const margin = 50;
  const pageWidth = 612; // US Letter, in points
  const pageHeight = 792;
  const maxWidth = pageWidth - margin * 2;
  const lineHeight = fontSize * 1.4;
  const topY = pageHeight - margin;

  let page = pdfDoc.addPage([pageWidth, pageHeight]);
  let y = topY;

  function addNewPage(): void {
    page = pdfDoc.addPage([pageWidth, pageHeight]);
    y = topY;
  }

  // Greedy word wrap: measure each candidate line and break before it overflows.
  function drawText(text: string, opts: { bold?: boolean; size?: number } = {}): void {
    const size = opts.size ?? fontSize;
    const f = opts.bold ? boldFont : font;
    const words = text.split(" ");
    let line = "";
    for (const word of words) {
      const testLine = line.length > 0 ? `${line} ${word}` : word;
      const tw = f.widthOfTextAtSize(testLine, size);
      if (tw > maxWidth && line.length > 0) {
        if (y - lineHeight < margin) {
          addNewPage();
        }
        page.drawText(line, { x: margin, y, size, font: f, color: rgb(0, 0, 0) });
        y -= lineHeight;
        line = word;
      } else {
        line = testLine;
      }
    }
    if (line.length > 0) {
      if (y - lineHeight < margin) {
        addNewPage();
      }
      page.drawText(line, { x: margin, y, size, font: f, color: rgb(0, 0, 0) });
      y -= lineHeight;
    }
  }

  if (y - lineHeight * 2 < margin) {
    addNewPage();
  }
  drawText("Veterinary Referral Letter", { bold: true, size: 16 });
  y -= lineHeight;

  const entries = Object.entries(letter) as Array<[string, string]>;
  for (const [key, value] of entries) {
    if (value.trim().length > 0) {
      // Turn a camelCase key such as "reasonForReferral" into "Reason For Referral".
      const label = key.replace(/([A-Z])/g, " $1").replace(/^./, (s) => s.toUpperCase());
      y -= lineHeight * 0.5;
      if (y - lineHeight < margin) {
        addNewPage();
      }
      drawText(`${label}:`, { bold: true, size: labelFontSize });
      drawText(value, { size: fontSize });
    }
  }

  return await pdfDoc.save();
}
Create the format selector at src/services/letter-generation/format-selector.ts:
ts
import type { ReferralLetter } from "../../types/index.js";
import { generateDocx } from "./docx-generator.js";
import { generatePdf } from "./pdf-generator.js";

export async function generateLetter(
  letter: ReferralLetter,
  format: string,
): Promise<Uint8Array> {
  if (format === "docx") {
    return generateDocx(letter);
  }
  if (format === "pdf") {
    return generatePdf(letter);
  }
  throw new Error(`Unknown format: ${format}`);
}
Barrel at src/services/letter-generation/index.ts:
ts
export { generateDocx } from "./docx-generator.js";
export { generatePdf } from "./pdf-generator.js";
export { generateLetter } from "./format-selector.js";
Expected output: pnpm typecheck is clean. The letter generators produce valid DOCX and PDF buffers from any ReferralLetter object, even one with mostly empty fields.
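The PDF generator relies on two small pieces of logic that are easier to reason about in isolation: greedy word wrapping against a maximum width, and converting camelCase field keys into display labels. The sketch below extracts both as pure functions. The names wrapText and fieldLabel are hypothetical, and the measure callback stands in for pdf-lib's font.widthOfTextAtSize so the example runs without a PDF document.

```typescript
// Greedy wrap: keep appending words while the measured line fits, otherwise
// start a new line. A single word wider than maxWidth is kept on its own line.
function wrapText(text: string, maxWidth: number, measure: (s: string) => number): string[] {
  const lines: string[] = [];
  let line = "";
  for (const word of text.split(" ")) {
    const candidate = line.length > 0 ? `${line} ${word}` : word;
    if (measure(candidate) > maxWidth && line.length > 0) {
      lines.push(line);
      line = word;
    } else {
      line = candidate;
    }
  }
  if (line.length > 0) {
    lines.push(line);
  }
  return lines;
}

// "reasonForReferral" -> "Reason For Referral"
function fieldLabel(key: string): string {
  return key.replace(/([A-Z])/g, " $1").replace(/^./, (s) => s.toUpperCase());
}

// Using character count as a stand-in for rendered width.
console.log(wrapText("aaa bb cc dddd", 6, (s) => s.length)); // prints [ 'aaa bb', 'cc', 'dddd' ]
console.log(fieldLabel("reasonForReferral")); // prints "Reason For Referral"
```

With pdf-lib, the measure argument would be something like (s) => font.widthOfTextAtSize(s, size), which keeps the wrapping logic testable independently of font metrics.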
Step 9: Wire the pipeline orchestrator and observability
This is the heart of the application. The pipeline factory wires every service into a single processReferral function that runs each stage in order: extract text, parse record, store in memory, analyze with LLM, retrieve context, pack context, check cache or generate letter, validate with guardrails, build the letter object, and render it in the chosen format.
Create src/services/pipeline/referral-pipeline.ts. The file opens with these imports:
ts
import { extractText, parseVeterinaryRecord } from "../document-extraction/index.js";
import { analyzeRecord, generateLetterContent, SYSTEM_PROMPT_LETTER_DRAFTING } from "../ai/index.js";
import {
  createMemoryService,
  createRetriever,
  createContextInjector,
  storeRecordMemory,
  retrieveContext,
} from "../memory/index.js";
import { createCacheService, getCachedResponse, setCachedResponse } from "../cache/index.js";
import { createContextPlanner, packLetterContext } from "../context/index.js";
import { createGuardrailChain, validateLetter } from "../guardrail/index.js";
import type { ChainResult } from "@reaatech/guardrail-chain";
import { generateLetter } from "../letter-generation/index.js";
import { getPipelineConfig } from "../../config/index.js";
import type { AppConfig, ClinicalSummary, PipelineResult } from "../../types/index.js";
import type { DocumentSource, ExtractedRecord, LetterFormat, ReferralLetter } from "../../types/index.js";
Create the Langfuse observability adapter at src/services/pipeline/langfuse-observer.ts. It gracefully degrades to a no-op tracer when Langfuse keys aren’t configured:
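The adapter source is not reproduced here. The fallback idea can be sketched without the Langfuse SDK: when either key is an empty string, return a tracer whose methods do nothing, so pipeline code can call it unconditionally. Everything named below (Tracer, Span, createTracer, and the makeRealTracer factory that would wrap the actual Langfuse client) is hypothetical and is not the project's initObservability.

```typescript
interface Span {
  end(): void;
}

interface Tracer {
  enabled: boolean;
  startSpan(name: string): Span;
}

interface ObservabilityKeys {
  langfusePublicKey: string;
  langfuseSecretKey: string;
}

// Shared span that ignores every call.
const NOOP_SPAN: Span = { end() {} };

// Returns a no-op tracer when keys are missing; otherwise defers to the
// supplied factory, which is where a real Langfuse client would be created.
function createTracer(keys: ObservabilityKeys, makeRealTracer: () => Tracer): Tracer {
  if (!keys.langfusePublicKey || !keys.langfuseSecretKey) {
    return { enabled: false, startSpan: () => NOOP_SPAN };
  }
  return makeRealTracer();
}

// With empty keys (the defaults produced by loadConfig), tracing calls are safe no-ops.
const tracer = createTracer({ langfusePublicKey: "", langfuseSecretKey: "" }, () => {
  throw new Error("not reached when keys are missing");
});
tracer.startSpan("extract-text").end();
console.log(tracer.enabled); // prints false
```

Keeping the no-op behind the same interface means the pipeline stages never need to branch on whether observability is configured.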
Barrel at src/services/pipeline/index.ts:
ts
export { createPipeline, buildReferralLetter } from "./referral-pipeline.js";
export { initObservability } from "./langfuse-observer.js";
Expected output: pnpm typecheck is clean. The pipeline orchestrator is a single async function that sequences every service layer, wraps each stage in a named try/catch, and throws descriptive errors like "extraction-failed: <reason>" or "guardrail-failed: content-safety" so the API layer can map them to appropriate HTTP status codes.
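The named try/catch pattern described above can be captured in one small helper. This is a sketch under the assumption that each stage is a function returning a value or a promise; runStage is a hypothetical name and the pipeline may structure its error handling differently.

```typescript
// Runs one pipeline stage and rethrows any failure with a "<stage>-failed: "
// prefix so callers can tell which stage broke from the message alone.
async function runStage<T>(stage: string, fn: () => Promise<T> | T): Promise<T> {
  try {
    return await fn();
  } catch (error: unknown) {
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(`${stage}-failed: ${reason}`);
  }
}

// Example: a failing extraction stage surfaces as "extraction-failed: empty file".
runStage("extraction", async () => {
  throw new Error("empty file");
}).catch((error: Error) => {
  console.log(error.message); // prints "extraction-failed: empty file"
});
```

A stable prefix is what lets the route handler decide between a 422 and a 500 with a simple startsWith check.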
Step 10: Create the Next.js API route handlers
The API has two routes — a health check and the letter generation endpoint that accepts multipart form data.
Create app/api/health/route.ts:
ts
import { type NextRequest, NextResponse } from "next/server";

export async function GET(_req: NextRequest) {
  return NextResponse.json({ status: "ok", timestamp: new Date().toISOString() });
}
Create app/api/referral-letter/route.ts:
ts
import { type NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { loadConfig } from "../../../src/config/index.js";
import { createPipeline } from "../../../src/services/pipeline/index.js";

const bodySchema = z.object({
  format: z.enum(["docx", "pdf"]).default("docx"),
});

let pipelinePromise: ReturnType<typeof createPipeline> | null = null;

async function getPipeline() {
  if (!pipelinePromise) {
    const config = loadConfig();
    pipelinePromise = createPipeline(config);
  }
  return pipelinePromise;
}

export async function POST(req: NextRequest) {
  try {
    const formData = await req.formData();
    const file = formData.get("file");
    if (!file || !(file instanceof File)) {
      return NextResponse.json({ error: "File is required" }, { status: 422 });
    }

    const formatRaw = (formData.get("format") as string) || "docx";
    const parsed = bodySchema.safeParse({ format: formatRaw });
    if (!parsed.success) {
      return NextResponse.json({ error: "Format must be docx or pdf" }, { status: 400 });
    }
    const { format } = parsed.data;

    const buffer = new Uint8Array(await file.arrayBuffer());
    const type = file.name.toLowerCase().endsWith(".pdf") ? "pdf" : "docx";

    const pipeline = await getPipeline();
    const result = await pipeline.processReferral({ type, buffer }, format);

    return NextResponse.json({
      letter: Buffer.from(result.letterBytes).toString("base64"),
      format: result.format,
      metadata: result.metadata,
    });
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : "Unknown error";
    if (message.startsWith("extraction-failed") || message.startsWith("guardrail-failed")) {
      return NextResponse.json({ error: message }, { status: 422 });
    }
    return NextResponse.json({ error: message }, { status: 500 });
  }
}
Expected output: The route handlers use NextRequest / NextResponse.json() as required by the Next.js App Router conventions. POSTing a valid record file returns a 200 with a base64-encoded letter, format string, and metadata object.
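To exercise the endpoint from a script, a client needs to send the two form fields the handler reads (file and format) and decode the base64 letter in the response. The sketch below is one way to do that with the global fetch, FormData, and atob available in recent Node.js versions. The helper names requestLetter and decodeLetter are hypothetical, the base URL is supplied by the caller (http://localhost:3000 is the default Next.js dev address), and the metadata shape is left as unknown because it is not specified here.

```typescript
interface ReferralResponse {
  letter: string; // base64-encoded file bytes
  format: "docx" | "pdf";
  metadata: unknown;
}

// Turns the JSON response into bytes plus a suggested filename.
function decodeLetter(res: ReferralResponse): { filename: string; bytes: Uint8Array } {
  const bytes = Uint8Array.from(atob(res.letter), (c) => c.charCodeAt(0));
  return { filename: `referral-letter.${res.format}`, bytes };
}

// Posts a record file as multipart/form-data and returns the parsed response.
async function requestLetter(
  baseUrl: string,
  file: Blob,
  fileName: string,
  format: "docx" | "pdf",
): Promise<ReferralResponse> {
  const form = new FormData();
  form.append("file", file, fileName);
  form.append("format", format);
  const res = await fetch(`${baseUrl}/api/referral-letter`, { method: "POST", body: form });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}: ${await res.text()}`);
  }
  return (await res.json()) as ReferralResponse;
}
```

A caller would pass the result of requestLetter to decodeLetter and write the bytes to disk under the returned filename, for example with writeFile from node:fs/promises.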
Step 11: Write and run the test suite
Create the MSW test setup at tests/setup.ts that mocks the OpenAI and Langfuse HTTP endpoints:
Open referral-letter.docx in your word processor — it will contain a fully formatted veterinary referral letter with all sections populated from your source record.
Next steps
Add a specialist-name field to the POST body so the letter targets a specific veterinarian. Pass it through the pipeline to populate referralLetter.specialistName.
Store letters in cloud storage — after generation, upload the buffer to S3 or R2 and return a signed download URL instead of base64-encoded bytes.
Integrate with a real PIMS — swap the file upload for an API call to a veterinary practice management system like Avimark or eVetPractice to fetch records by patient ID.
Add multi-language support — extend the prompt templates to accept a language parameter so letters can be drafted in Spanish, French, or other languages common in multi-region practices.