A solo immigration attorney spends over 30 minutes per new client manually gathering case details, checking for conflicts across years of paper files, and entering data into the case management system. This administrative drag means fewer billable hours and delayed responses to prospective clients. The attorney often loses leads because intake takes too long and the process feels impersonal. They need a way to automate the initial triage without sacrificing accuracy or compliance.
This tutorial builds an AI-powered intake automation agent for a solo immigration attorney. A prospective client sends a message through a web form, and the agent screens them, checks for conflicts against past cases stored in a Postgres + pgvector database, classifies their legal need, generates a structured case summary, and responds with empathy — all while scrubbing PII, enforcing compliance disclaimers, and tracking telemetry with Langfuse. By the end, intake drops from 30 minutes to 5.
You’ll wire up 6 REAA packages (agent-mesh, hybrid-rag, agent-memory, guardrail-chain, agent-handoff, llm-cache) into a Next.js 16 App Router project with Zod-validated env config, an LLM service built on the Vercel AI SDK (ai), a PDF/OCR document parser, and a full test suite with msw HTTP mocking.
Prerequisites
Node.js 22+ with pnpm 10 installed
An OpenAI API key with access to gpt-5.2 and text-embedding-3-small
A Langfuse account (free tier works) with public and secret keys
A PostgreSQL database with the pgvector extension enabled
Familiarity with Next.js App Router route handlers and TypeScript generics
Step 1: Scaffold the project and install dependencies
Start from an empty directory. Create the project with Next.js and install every dependency at exact pinned versions.
terminal
pnpm create next-app@16.2.7 . --typescript --app --src-dir --eslint --import-alias "@/*"
Add the REAA packages and supporting libraries. Every version is pinned to an exact semver — no ^ or ~ allowed.
Expected output: A package.json with all dependencies at exact versions and a pnpm-lock.yaml. The scaffold provides next.config.ts, tsconfig.json, vitest.config.ts, and eslint.config.mjs.
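One way to check the exact-pin rule is to scan the dependency maps from package.json for anything that is not a plain version number. The sketch below is illustrative only: `findUnpinned` is a hypothetical helper that is not part of the recipe, and the package names and versions in `sample` are made up.

```typescript
// Hypothetical helper (not part of the recipe): list dependencies whose
// version is not an exact semver such as "1.2.3" or "1.2.3-beta.1".
type DependencyMap = Record<string, string>;

const EXACT_SEMVER = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

function findUnpinned(deps: DependencyMap): string[] {
  return Object.entries(deps)
    .filter(([, version]) => !EXACT_SEMVER.test(version))
    .map(([name]) => name)
    .sort();
}

// Made-up entries covering the common range styles.
const sample: DependencyMap = {
  "pinned-lib": "4.1.0",
  "caret-lib": "^4.1.0",
  "tilde-lib": "~0.9.2",
  "tag-lib": "latest",
};

// Logs the names of every entry except "pinned-lib".
console.log(findUnpinned(sample));
```

A check like this could run in CI against both `dependencies` and `devDependencies` so a stray `^` fails the build early.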
Step 2: Configure environment variables with Zod
Environment variables are the backbone of configuration. You’ll parse them at import time using Zod so every access is type-safe and missing values fail fast.
# Env vars used by agnostic-intake-automation-agent.
# Keep placeholders only; never commit real values.
NODE_ENV=development
OPENAI_API_KEY=<your-openai-key>
DEFAULT_LLM_MODEL=gpt-5.2
INTAKE_AGENT_MAX_TOKENS=2048
LLM_CACHE_SIMILARITY_THRESHOLD=0.8
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
POSTGRES_URL=<your-postgres-connection-string>
Expected output: A parsed env object that throws a ZodError at module load time if any required key is missing. The defaults for DEFAULT_LLM_MODEL, LLM_CACHE_SIMILARITY_THRESHOLD, and INTAKE_AGENT_MAX_TOKENS let you omit them.
Step 3: Define the domain types and prompt constants
Centralise the types and prompts. This keeps your services loosely coupled — each imports only what it needs.
export const CONFLICT_THRESHOLD = 0.75;
export const DEFAULT_CACHE_TTL_SEC = 3600;

export const INTENT_CLASSIFICATION_PROMPT = `Classify the client's immigration intake message.
Return the category, urgency level, and detected language.`;

export const CONFLICT_CHECK_PROMPT = `You are a conflict-checking assistant for a solo immigration law practice.
Given a new client case and an existing case document, determine if there is a real conflict of interest.
Return YES if the parties are adverse or the cases share materially related subject matter.`;

export const CASE_SUMMARY_PROMPT = `You are an experienced immigration paralegal.
Summarize the client's case description and any relevant documents into a structured intake summary.
Include key facts, immigration category, and recommended next steps.`;
Create src/lib/errors.ts with typed error classes:
Expected output: Three files under src/lib/: types.ts (5 interfaces), constants.ts (2 numeric constants + 3 prompt strings), and errors.ts (5 error classes).
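The listings for types.ts and errors.ts are not reproduced here, so the following is only a sketch of what a few of those declarations might look like. The field names are inferred from how later steps use `IntakeSession`, `DocumentInfo`, `ConflictCheckResult`, and `DocumentParseError`; the default error code and the sample values are assumptions, and the recipe's real files may define more fields and classes.

```typescript
// Shapes inferred from later steps; treat them as assumptions.
interface IntakeSession {
  sessionId: string;
  clientName: string;
  clientEmail: string;
  caseDescription: string;
  status: string;
  createdAt: Date;
}

interface DocumentInfo {
  id: string;
  fileName: string;
  mimeType: string;
  textContent: string;
}

interface ConflictCheckResult {
  hasConflict: boolean;
  conflictingCaseIds: string[];
  details: string;
}

// Assumed error shape: a message plus a machine-readable code.
class DocumentParseError extends Error {
  readonly code: string;

  constructor(message: string, code: string = "document_parse_failed") {
    super(message);
    this.name = "DocumentParseError";
    this.code = code;
    // Keeps instanceof reliable when compiling to older JavaScript targets.
    Object.setPrototypeOf(this, DocumentParseError.prototype);
  }
}

// Sample values (invented) showing each shape in use.
const session: IntakeSession = {
  sessionId: "session-1",
  clientName: "Sample Client",
  clientEmail: "client@example.com",
  caseDescription: "Family-based petition inquiry",
  status: "active",
  createdAt: new Date(0),
};
const doc: DocumentInfo = { id: "doc-1", fileName: "notes.txt", mimeType: "text/plain", textContent: "..." };
const noConflict: ConflictCheckResult = { hasConflict: false, conflictingCaseIds: [], details: "No conflicts detected" };
const parseError = new DocumentParseError("Unsupported mime type", "unsupported_mime_type");

console.log(session.status, doc.fileName, noConflict.details, parseError.code);
```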
Step 4: Set up the database with pgvector
The database stores intake sessions and case documents. The case_documents table includes a vector(1536) column for embedding-based similarity search, which powers the conflict checker.
Create src/services/db.ts:
ts
import { sql } from "@vercel/postgres";
import pgvector from "pgvector";
import type { IntakeSession, DocumentInfo } from "../lib/types.js";

export async function initDatabase() {
  await sql`CREATE EXTENSION IF NOT EXISTS vector`;
  await sql`CREATE TABLE IF NOT EXISTS intake_sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    client_name TEXT NOT NULL,
    client_email TEXT,
    client_phone TEXT,
    case_description TEXT,
    status TEXT DEFAULT 'active',
    conflict_status TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
  )`;
  await sql`CREATE TABLE IF NOT EXISTS case_documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID REFERENCES intake_sessions(id),
    file_name TEXT,
    mime_type TEXT,
    text_content TEXT,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT NOW()
  )`;
}

export function toSql(vector: number[]) {
  return pgvector.toSql(vector);
}

export async function saveSession(session: IntakeSession) {
  await sql`INSERT INTO intake_sessions (id, client_name, client_email, case_description, status)
    VALUES (${session.sessionId}, ${session.clientName}, ${session.clientEmail}, ${session.caseDescription}, ${session.status})
    ON CONFLICT (id) DO UPDATE SET
      client_name = ${session.clientName},
      client_email = ${session.clientEmail},
      case_description = ${session.caseDescription},
      status = ${session.status},
      updated_at = NOW()`;
}

export async function getSession(sessionId: string): Promise<IntakeSession | null> {
  const { rows } = await sql`SELECT * FROM intake_sessions WHERE id = ${sessionId}`;
  if (rows.length === 0) return null;
  const row = rows[0];
  return {
    sessionId: String(row.id),
    clientName: String(row.client_name),
    clientEmail: String(row.client_email),
    caseDescription: String(row.case_description),
    status: String(row.status),
    createdAt: new Date(String(row.created_at)),
  };
}

export async function saveDocument(doc: DocumentInfo, emb: number[]) {
  await sql`INSERT INTO case_documents (id, file_name, mime_type, text_content, embedding)
    VALUES (${doc.id}, ${doc.fileName}, ${doc.mimeType}, ${doc.textContent}, ${toSql(emb)}::vector)`;
}

export async function searchSimilarDocuments(embedding: number[], limit: number = 10) {
  const { rows } = await sql`SELECT id, session_id, file_name, text_content
    FROM case_documents
    ORDER BY embedding <-> ${toSql(embedding)}::vector
    LIMIT ${limit}`;
  return rows;
}
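Expected output: Six exported functions: initDatabase, toSql, saveSession, getSession, saveDocument, and searchSimilarDocuments. The searchSimilarDocuments function orders results with pgvector's <-> operator, which is L2 (Euclidean) distance; the cosine-distance operator is <=>. For unit-length embeddings such as those returned by text-embedding-3-small, both operators produce the same ranking.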
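To make the difference between L2 distance and cosine distance concrete, the sketch below computes both in plain TypeScript. It is for illustration only: pgvector evaluates these measures inside Postgres, and `l2Distance` and `cosineDistance` are hypothetical names rather than functions from the recipe.

```typescript
// Plain-TypeScript versions of the two distance measures, for illustration.
function l2Distance(a: number[], b: number[]): number {
  let sum = 0;
  a.forEach((x, i) => {
    const diff = x - (b[i] ?? 0);
    sum += diff * diff;
  });
  return Math.sqrt(sum);
}

function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  // 0 / 0 yields NaN when either vector is all zeros.
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const query = [1, 0];
const sameDirectionLonger = [2, 0]; // not unit length
const unitButRotated = [0.8, 0.6]; // unit length, different direction

console.log(l2Distance(query, sameDirectionLonger), l2Distance(query, unitButRotated));
console.log(cosineDistance(query, sameDirectionLonger), cosineDistance(query, unitButRotated));
console.log(cosineDistance(query, [0, 0]));
```

With these inputs, L2 distance ranks the rotated unit vector closer to the query, while cosine distance ranks the same-direction vector closer, so the two measures only agree when the vectors are normalized. The last line shows that an all-zero vector has no direction, so this cosine function returns NaN for it, whereas L2 distance remains well defined.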
Step 5: Build the LLM service with the Vercel AI SDK
The LLM service wraps three operations: intent classification (structured output), case summarization, and free-form response generation. The Vercel AI SDK provides a provider-agnostic generateText function together with Output.object for typed structured outputs.
Create src/services/llm.ts:
ts
import { generateText, Output } from "ai";
import { openai } from "@ai-sdk/openai";
import { env } from "../lib/env.js";
import { z } from "zod";
import { INTENT_CLASSIFICATION_PROMPT, CASE_SUMMARY_PROMPT } from "../lib/constants.js";

export function getModel() {
  return openai(env.DEFAULT_LLM_MODEL);
}

export async function classifyIntent(input: string) {
  const result = await generateText({
    model: getModel(),
    system: INTENT_CLASSIFICATION_PROMPT,
    output: Output.object({
      schema: z.object({
        category: z.enum(["visa_inquiry", "green_card", "citizenship", "asylum", "other"]),
        urgency: z.enum(["low", "medium", "high"]),
        language: z.string().optional(),
      }),
    }),
    prompt: input,
  });
  return result.output;
}

export async function summarizeCase(description: string, documents: string[]) {
  const result = await generateText({
    model: getModel(),
    system: CASE_SUMMARY_PROMPT,
    prompt: `Case description: ${description}\n\nRelevant documents:\n${documents.join("\n---\n")}`,
  });
  return result.text;
}

export async function generateResponse(context: string, clientMessage: string) {
  const result = await generateText({
    model: getModel(),
    maxOutputTokens: env.INTAKE_AGENT_MAX_TOKENS,
    system: "You are a professional immigration paralegal assistant helping with client intake. Be empathetic, precise, and never provide legal advice.",
    messages: [{ role: "user", content: `Context: ${context}\n\nClient message: ${clientMessage}` }],
  });
  return result.text;
}
Expected output: Three functions. classifyIntent returns a typed object with category, urgency, and optional language. summarizeCase and generateResponse return plain strings.
Step 6: Build the document parser with pdf-parse and Tesseract.js
The parser handles PDF extraction and OCR for scanned images (PNG, JPEG, TIFF). A chunkText utility splits long documents by sentence boundaries so each chunk fits within an embedding model’s token limit.
Create src/services/document-parser.ts:
ts
import { createWorker } from "tesseract.js";
import { PDFParse } from "pdf-parse";
import { DocumentParseError } from "../lib/errors.js";

export async function parsePdf(buffer: Buffer): Promise<string> {
  try {
    const parser = new PDFParse({ data: new Uint8Array(buffer) });
    const textResult = await parser.getText();
    return textResult.text;
  } catch (e) {
    throw new DocumentParseError(e instanceof Error ? e.message : String(e));
  }
}

export async function ocrImage(buffer: Buffer): Promise<string> {
  try {
    const worker = await createWorker("eng");
    try {
      const ret = await worker.recognize(buffer);
      return ret.data.text;
    } finally {
      // Always release the worker, even when recognition throws.
      await worker.terminate();
    }
  } catch (e) {
    throw new DocumentParseError(e instanceof Error ? e.message : String(e));
  }
}

export async function parseDocument(buffer: Buffer, mimeType: string): Promise<string> {
  if (mimeType === "application/pdf") return parsePdf(buffer);
  if (mimeType === "image/png" || mimeType === "image/jpeg" || mimeType === "image/tiff") return ocrImage(buffer);
  throw new DocumentParseError("Unsupported mime type", "unsupported_mime_type");
}

export function chunkText(text: string, maxChunkSize: number = 512): string[] {
  const chunks: string[] = [];
  let current = "";
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/);
  for (const sentence of sentences) {
    if ((current + " " + sentence).trim().length > maxChunkSize) {
      if (current) chunks.push(current.trim());
      current = sentence;
      // A single sentence longer than the limit is cut into fixed-size pieces.
      while (current.length > maxChunkSize) {
        chunks.push(current.slice(0, maxChunkSize));
        current = current.slice(maxChunkSize);
      }
    } else {
      current = current ? current + " " + sentence : sentence;
    }
  }
  if (current) chunks.push(current.trim());
  return chunks;
}
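Expected output: Four exported functions. parseDocument dispatches by mime type. chunkText splits on sentence boundaries and falls back to fixed-size character splits when a single sentence is longer than maxChunkSize. Note that maxChunkSize counts characters, not tokens, so the default of 512 is a conservative proxy for the embedding model's token limit.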
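The sentence-boundary pattern in chunkText treats any ".", "!" or "?" followed by whitespace as the end of a sentence. The short sketch below applies the same pattern in isolation to show what that means for abbreviations; `splitSentences` is a hypothetical helper, and the sample text is invented.

```typescript
// Same pattern as chunkText, built with the RegExp constructor so the
// lookbehind is only evaluated at runtime.
const SENTENCE_BOUNDARY = new RegExp("(?<=[.!?])\\s+");

function splitSentences(text: string): string[] {
  return text.split(SENTENCE_BOUNDARY);
}

const parts = splitSentences("Dr. Lee filed Form I-130 in 2021. It was approved! What next?");

// The abbreviation "Dr." ends up as its own piece.
console.log(parts);
```

Because chunkText regroups the pieces up to maxChunkSize, a stray split like "Dr." usually lands back in the same chunk as the text that follows it. It only matters when the split falls right at a chunk boundary.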
Step 7: Implement the guardrail chain
The guardrail chain runs input and output checks. Three guardrails are wired together: a PII scrubber (redacts SSNs, phone numbers, and emails from input), a compliance checker (ensures responses contain “not legal advice”), and a toxicity filter. The chain is built with @reaatech/guardrail-chain’s ChainBuilder with a 1-second latency budget and automatic retry.
Expected output: Three guardrail classes, a singleton chain builder, and two entry points (runInputGuardrails, runOutputGuardrails) that each return { passed, output }.
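The guardrail classes themselves are not shown in this section, so the following is only a rough sketch of the two simplest checks written as standalone functions. It does not use the @reaatech/guardrail-chain API; `scrubPii`, `checkDisclaimer`, the redaction labels, and the regular expressions (US-style SSNs and phone numbers, simple email addresses) are all assumptions made for illustration.

```typescript
// Illustrative patterns only; real PII detection needs broader coverage.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "[REDACTED_SSN]", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "[REDACTED_EMAIL]", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "[REDACTED_PHONE]", pattern: /(?:\+1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b/g },
];

// Input guardrail: replace each match with its redaction label.
function scrubPii(text: string): string {
  return PII_PATTERNS.reduce((acc, { label, pattern }) => acc.replace(pattern, label), text);
}

// Output guardrail: pass only when the response carries a disclaimer.
// The result mirrors the { passed, output } shape described above.
function checkDisclaimer(response: string): { passed: boolean; output: string } {
  const passed = /not\s+(?:constitute\s+)?legal\s+advice/i.test(response);
  return { passed, output: response };
}

console.log(scrubPii("My SSN is 123-45-6789, call (555) 123-4567 or mail ana@example.com."));
console.log(checkDisclaimer("You may qualify. This is not legal advice.").passed);
```

The SSN pattern runs before the phone pattern so that a nine-digit SSN is never partially consumed by the looser phone expression.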
Step 8: Add memory for client conversation context
The memory service stores facts extracted from conversations — client preferences, corrections, and factual details — and retrieves them as context for responses. It wraps @reaatech/agent-memory’s AgentMemory with an OpenAI embedding model and an OpenAILLMProvider for extraction.
Create src/services/memory-service.ts:
ts
import { AgentMemory, MemoryType, OpenAILLMProvider, type ConversationTurn } from "@reaatech/agent-memory";
import { env } from "../lib/env.js";

export let _memory: AgentMemory | null = null;

export function getMemory(): AgentMemory {
  if (!_memory) {
    _memory = new AgentMemory({
      storage: { provider: "memory" },
      embedding: { provider: "openai", model: "text-embedding-3-small", apiKey: env.OPENAI_API_KEY },
      extraction: {
        llmProvider: new OpenAILLMProvider({ apiKey: env.OPENAI_API_KEY, model: "gpt-4o-mini" }),
        enabledTypes: [MemoryType.FACT, MemoryType.PREFERENCE, MemoryType.CORRECTION],
        batchSize: 10,
        confidenceThreshold: 0.7,
      },
    });
  }
  return _memory;
}

export async function storeClientMemory(sessionId: string, conversation: ConversationTurn[]) {
  const memory = getMemory();
  const stored = await memory.extractAndStore(conversation);
  return stored;
}

export async function queryClientMemory(sessionId: string, query: string, limit: number = 5): Promise<string[]> {
  const memory = getMemory();
  const results = await memory.retrieve(query, { limit });
  const texts: string[] = [];
  for (const m of results) {
    texts.push(m.content);
  }
  return texts;
}

export async function closeMemory() {
  if (_memory) {
    await _memory.close();
    _memory = null;
  }
}
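Expected output: A singleton AgentMemory that extracts facts, preferences, and corrections. storeClientMemory accepts a ConversationTurn[] and queryClientMemory returns an array of memory content strings. Note that both helpers accept a sessionId but, as written, do not pass it to AgentMemory, so stored memories are not scoped to an individual client session.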
Step 9: Set up the LLM response cache
The cache service avoids redundant LLM calls by storing prior responses with semantic similarity matching. It wraps @reaatech/llm-cache’s CacheEngine with an in-memory adapter and an OpenAI embedder for vector-based lookups.
Expected output: A singleton CacheEngine with semantic cosine-similarity lookups at the configured threshold. The getCachedOrGenerate helper checks the cache first and falls back to calling the provided function.
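The check-then-fall-back pattern behind a semantic cache can be sketched without any library. The code below is a simplified, synchronous stand-in, not the @reaatech/llm-cache API: `SemanticCache`, `getCachedOrCompute`, and the two-dimensional embeddings are hypothetical, and a real implementation would obtain embeddings from a model and would normally be asynchronous.

```typescript
interface CacheEntry<T> {
  embedding: number[];
  value: T;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class SemanticCache<T> {
  private readonly entries: CacheEntry<T>[] = [];
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Return the best-matching value if its similarity meets the threshold.
  lookup(embedding: number[]): T | undefined {
    let best: CacheEntry<T> | undefined;
    let bestScore = -Infinity;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score > bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best !== undefined && bestScore >= this.threshold ? best.value : undefined;
  }

  store(embedding: number[], value: T): void {
    this.entries.push({ embedding, value });
  }
}

// Check the cache first; compute and store only on a miss.
function getCachedOrCompute<T>(cache: SemanticCache<T>, embedding: number[], compute: () => T): T {
  const hit = cache.lookup(embedding);
  if (hit !== undefined) return hit;
  const value = compute();
  cache.store(embedding, value);
  return value;
}

const demoCache = new SemanticCache<string>(0.8); // same value as LLM_CACHE_SIMILARITY_THRESHOLD
console.log(getCachedOrCompute(demoCache, [1, 0], () => "first answer"));
console.log(getCachedOrCompute(demoCache, [0.9, 0.1], () => "never computed"));
```

The second call reuses the first answer because its embedding is nearly parallel to the stored one. A lower threshold increases the hit rate at the cost of returning answers to questions that are only loosely related.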
Step 10: Build the conflict checker
The conflict checker searches the pgvector-powered case document index for semantically similar cases and uses an LLM to determine if each match represents a real conflict of interest.
Create src/services/conflict-checker.ts:
ts
import { type Document as RAGDocument, DocumentSchema } from "@reaatech/hybrid-rag";
import { generateText } from "ai";
import { getModel } from "./llm.js";
import { searchSimilarDocuments, saveDocument } from "./db.js";
import { CONFLICT_CHECK_PROMPT } from "../lib/constants.js";
import type { ConflictCheckResult, DocumentInfo } from "../lib/types.js";

export async function indexCaseDocument(doc: RAGDocument, embedding: number[]) {
  const validDoc = DocumentSchema.parse(doc);
  const docInfo: DocumentInfo = {
    id: crypto.randomUUID(),
    fileName: "case-document",
    mimeType: "text/plain",
    textContent: validDoc.content,
  };
  await saveDocument(docInfo, embedding);
}

export async function checkConflicts(
  clientName: string,
  caseDescription: string,
  embedding: number[],
): Promise<ConflictCheckResult> {
  const similar = await searchSimilarDocuments(embedding, 5);
  const conflictingCaseIds: string[] = [];
  for (const doc of similar) {
    const conflictCheck = await generateText({
      model: getModel(),
      system: CONFLICT_CHECK_PROMPT,
      prompt: `New case: ${clientName} — ${caseDescription}\nExisting case: ${JSON.stringify(doc)}`,
    });
    if (conflictCheck.text.toLowerCase().includes("yes")) {
      const docId = String(doc.id);
      conflictingCaseIds.push(docId);
    }
  }
  return {
    hasConflict: conflictingCaseIds.length > 0,
    conflictingCaseIds,
    details:
      conflictingCaseIds.length > 0
        ? "Found " + String(conflictingCaseIds.length) + " potentially conflicting case(s)"
        : "No conflicts detected",
  };
}
Expected output: Two functions. indexCaseDocument validates input with DocumentSchema.parse and persists it. checkConflicts searches the top-5 similar documents and uses the LLM to judge each one.
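The substring test in checkConflicts (`includes("yes")`) also matches when "yes" appears inside another word or later in a negative answer. A stricter option is to look only at the first word of the model's reply. The sketch below is one way to do that; `parseVerdict` is a hypothetical helper, and it assumes the model is prompted to begin its answer with YES or NO, which the current CONFLICT_CHECK_PROMPT does not strictly require.

```typescript
type Verdict = "yes" | "no" | "unclear";

// Classify a reply by its first word only; anything else is "unclear".
function parseVerdict(text: string): Verdict {
  const firstWord = text.trim().toLowerCase().match(/^[a-z]+/)?.[0];
  if (firstWord === "yes") return "yes";
  if (firstWord === "no") return "no";
  return "unclear";
}

const reply = "No. The only overlap is that both clients mention their eyes were examined.";

console.log(reply.toLowerCase().includes("yes")); // substring check: "eyes" contains "yes"
console.log(parseVerdict(reply));
```

Treating "unclear" as a potential conflict, and flagging it for manual review, is the safer default for a conflict check.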
Step 11: Create the intake agent orchestrator
The orchestrator ties every service together. It creates intake sessions, processes messages through the guardrail chain, retrieves client memory for context, generates LLM responses, indexes uploaded documents, and emits events for handoff.
Create src/services/intake-agent.ts:
ts
import {
  IncomingRequestSchema,
  type IncomingRequest,
  AgentResponseSchema,
  type AgentResponse,
} from "@reaatech/agent-mesh";
import { env } from "../lib/env.js";
import type { IntakeRequest, IntakeSession } from "../lib/types.js";
import { saveSession, getSession } from "./db.js";
import { summarizeCase, generateResponse } from "./llm.js";
import { storeClientMemory, queryClientMemory } from "./memory-service.js";
import { runInputGuardrails, runOutputGuardrails } from "./guardrail-service.js";
import { indexCaseDocument } from "./conflict-checker.js";
import { parseDocument, chunkText } from "./document-parser.js";
import { traceLLMCall, traceIntakeSession } from
Expected output: Four exported orchestrator functions and re-exports. processIntake validates with @reaatech/agent-mesh schemas and returns a typed AgentResponse.
Step 12: Wire up API routes
Five route handlers expose the intake functionality as HTTP endpoints. Routes use Next.js 16 App Router conventions with NextRequest and NextResponse.
import { type NextRequest, NextResponse } from "next/server";
import { getSession } from "@/src/services/db.js";
import { checkConflicts } from "@/src/services/conflict-checker.js";

interface ConflictCheckBody {
  sessionId: string;
}

export async function POST(req: NextRequest) {
  try {
    const body: ConflictCheckBody = await req.json() as ConflictCheckBody;
    if (!body.sessionId) {
      return NextResponse.json(
        { error: "sessionId is required" },
        { status: 400 },
      );
    }
    const session = await getSession(body.sessionId);
    if (!session) {
      return NextResponse.json(
        { error: "session not found" },
        { status: 404 },
      );
    }
    // Placeholder: an all-zero vector stands in for a real embedding of the case description.
    const embedding = new Array(1536).fill(0) as Array<number>;
    const result = await checkConflicts(
      session.clientName,
      session.caseDescription,
      embedding,
    );
    return NextResponse.json({
      hasConflict: result.hasConflict,
      conflictingCaseIds: result.conflictingCaseIds,
      details: result.details,
    });
  } catch {
    return NextResponse.json(
      { error: "conflict check failed" },
      { status: 500 },
    );
  }
}
Create app/api/health/route.ts:
ts
import { NextResponse } from "next/server";

export function GET() {
  return NextResponse.json({
    status: "ok",
    version: "0.1.0",
    uptime: process.uptime(),
  });
}
Expected output: Five route handler files under app/api/. All routes handle happy paths, validation errors (400), not-found (404), and internal errors (500).
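The `as ConflictCheckBody` cast in the conflict-check route performs no runtime check, so a body such as `{ "sessionId": 42 }` passes the `!body.sessionId` test and reaches the database query. A small type guard is one way to tighten the 400 path without adding a dependency. The sketch below is an assumption about how that could look; `isConflictCheckBody` is a hypothetical helper, not part of the recipe.

```typescript
interface ConflictCheckBody {
  sessionId: string;
}

// Narrow an unknown JSON payload to ConflictCheckBody at runtime.
function isConflictCheckBody(value: unknown): value is ConflictCheckBody {
  if (typeof value !== "object" || value === null) return false;
  const sessionId = (value as Record<string, unknown>)["sessionId"];
  return typeof sessionId === "string" && sessionId.trim().length > 0;
}

const payloads: unknown[] = [{ sessionId: "abc-123" }, { sessionId: 42 }, { sessionId: "   " }, null];

for (const payload of payloads) {
  console.log(JSON.stringify(payload), isConflictCheckBody(payload));
}
```

In a route handler, the guard would run on the result of `await req.json()`, returning the 400 response whenever it yields false.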
Step 13: Run the tests
The test suite uses vitest with MSW HTTP mocking to simulate the OpenAI, Anthropic, and Langfuse APIs. Create the test setup and a few representative test files.
Create tests/setup.ts — an MSW server that intercepts OpenAI, Anthropic, and Langfuse requests:
The tests/env.test.ts file validates that env parsing works with all keys, throws on missing OPENAI_API_KEY, and correctly applies the DEFAULT_LLM_MODEL default:
Expected output: vitest prints a summary with zero failed tests and coverage above 90% across lines, branches, functions, and statements. The coverage report shows src/ and app/**/route.ts files with high coverage — UI files (app/**/page.tsx, layout.tsx) are excluded by the vitest config, so focus stays on service logic and route handlers.
Next steps
Add a real vector embedding step. The current onNewDocument function fills embeddings with zeros as a placeholder. Wire the CacheEngine’s embedder or @reaatech/hybrid-rag’s chunking pipeline to produce real 1536-dimensional embeddings from document content.
Replace in-memory adapters with persistent storage. The cache and memory services use InMemoryAdapter — switch to Redis or Postgres-backed adapters so data survives restarts and scales across instances.
Extend the blocking list. The ToxicityGuardrail has an empty BLOCKLIST array. Populate it with actual blocked terms or integrate with an external moderation API.
Deploy with a real database. Point POSTGRES_URL at a managed Postgres instance with pgvector enabled, then run initDatabase() on startup to create the schema.
"./tracing.js"
;
import { getIntakeHandoffConfig, intakeEvents } from "./handoff-service.js";
export async function startIntakeSession(request: IntakeRequest): Promise<IntakeSession> {
const finalResponse = outputResult.passed ? response : response + "\n\n*This response is for informational purposes only and does not constitute legal advice.*";
await storeClientMemory(sessionId, [
{ speaker: "user", content: message, timestamp: new Date() },
{ speaker: "agent", content: finalResponse, timestamp: new Date() },
]);
return finalResponse;
}
export async function processIntake(request: IncomingRequest): Promise<AgentResponse> {