Small law firms receive client inquiries across email and web; manually logging them in Clio, drafting initial replies, and assigning the right attorney consumes hours of non‑billable time.
A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.
This tutorial walks you through building a multi-agent legal intake system for Clio, the legal practice management platform. Incoming client messages are triaged by a Claude-powered agent that classifies the intent (billing question, consultation request, document review, case update, or general inquiry), then hands off to a drafting agent that generates a professional reply and saves it as a Clio document, or to a case agent that summarizes the conversation and updates the Clio matter record. A budget engine caps LLM spend per agent, a LangGraph background worker monitors handoff events, and Langfuse provides observability.
You’ll use the @reaatech/agent-mesh package family for agent orchestration, handoff, and routing, along with the Anthropic SDK, LangGraph, Zod, and Langfuse. This recipe targets Node.js 22+, pnpm, and Next.js 16 (App Router).
Prerequisites
Node.js >= 22 installed on your machine
pnpm >= 10 installed (npm install -g pnpm if needed)
Expected output: pnpm resolves all packages and writes pnpm-lock.yaml, with no warnings about missing peer dependencies.
Step 2: Configure environment variables
Create .env.example with placeholder values for every integration. This file is the source of truth for which environment variables the system needs. Never commit real values.
Expected output: The .env file is created. You’ll edit it with your Anthropic API key and Clio OAuth2 credentials. The system reads these via process.env.X at runtime.
Step 3: Define the shared types and Zod schemas
Create src/types/clio.ts — these interfaces model Clio’s API resource shapes and include a custom error class:
Finally, create src/types/index.ts to re-export everything from a single entry point:
ts
export type { ClioDocument, ClioMatter, ClioOAuthToken, ClioOAuthConfig } from "./clio.js";
export { ClioApiError } from "./clio.js";
export type { AgentIntent, DraftDocument, CaseUpdate } from "./agent.js";
export { AgentIntentSchema, DraftDocumentSchema, CaseUpdateSchema } from "./agent.js";
Expected output: Three files under src/types/. Run pnpm typecheck — it should report no errors.
Step 4: Create the Anthropic client
Create src/lib/anthropicClient.ts — this module instantiates the Anthropic SDK singleton and exports three functions for the three agent roles: classifying intent, drafting a reply, and summarizing case notes.
ts
import Anthropic from "@anthropic-ai/sdk";
import type { AgentIntent } from "../types/agent.js";

export const anthropicClient = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY ?? "",
});

export async function classifyIntent(
  message: string,
  matterContext?: string
): Promise<AgentIntent> {
  const systemMsg = matterContext
    ? `Classify the following legal client message into one category: billing, consultation, document_review, case_update, or general. Matter context: ${matterContext}. Return JSON with keys "category", "confidence" (0-1), and "entities" (object).`
    : 'Classify the following legal client message into one category: billing, consultation, document_review, case_update, or general. Return JSON with keys "category", "confidence" (0-1), and "entities" (object).'
Key design decisions here:
classifyIntent uses the fast Claude Haiku model (cheap, low latency) for classification and returns a typed AgentIntent with category, confidence score, and extracted entities.
draftReply uses Claude Sonnet (higher quality) for legal drafting, since the output is client-facing.
summarizeCaseNotes uses Haiku again — summarization is a simpler task.
All three functions cap token usage with max_tokens, pass system prompts via the top-level system parameter (not a message with role "system"), and catch Anthropic.APIError to provide typed error messages.
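As a rough illustration of the request shape these decisions imply, the sketch below builds the parameters for a classification call without sending it. The helper name buildClassifyParams and the max_tokens value of 256 are assumptions made for this example, not taken from the recipe's source; the model ID is the Haiku identifier that appears elsewhere in this recipe.

```typescript
// Hypothetical helper: builds (but does not send) the parameters for a
// classification request. The object mirrors the Messages API shape:
// model, max_tokens, a top-level system string, and a messages array.
interface ClassifyParams {
  model: string;
  max_tokens: number;
  system: string;
  messages: Array<{ role: "user" | "assistant"; content: string }>;
}

const CATEGORIES = [
  "billing",
  "consultation",
  "document_review",
  "case_update",
  "general",
] as const;

function buildClassifyParams(message: string, matterContext?: string): ClassifyParams {
  // Matter context is optional; it is folded into the system prompt when present.
  const contextPart = matterContext ? ` Matter context: ${matterContext}.` : "";
  return {
    model: "claude-haiku-4-5-20251001",
    max_tokens: 256, // assumed cap; a small JSON answer needs few tokens
    // The system prompt is a top-level field, not a message with role "system".
    system:
      `Classify the following legal client message into one category: ${CATEGORIES.join(", ")}.` +
      `${contextPart} Return JSON with keys "category", "confidence" (0-1), and "entities" (object).`,
    messages: [{ role: "user", content: message }],
  };
}
```

An object of this shape is what you would hand to anthropicClient.messages.create(...); keeping the builder separate from the network call makes the prompt easy to unit test.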
Expected output: A single file src/lib/anthropicClient.ts. Verify with pnpm typecheck.
Step 5: Register agents and build the handoff router
Two modules together handle agent discovery and routing.
Agent registry
Create src/lib/agentRegistry.ts — this registers three agents in the @reaatech/agent-handoff-routing registry, declaring their skills, domains, specializations, and availability:
Create src/lib/handoffRouter.ts — this sets up the capability-based router from @reaatech/agent-handoff-routing and wraps it in a typed helper:
ts
import { CapabilityBasedRouter } from "@reaatech/agent-handoff-routing";
import { createHandoffConfig } from "@reaatech/agent-handoff";
import type { HandoffPayload, RoutingDecision, AgentCapabilities } from "@reaatech/agent-handoff";

export const handoffRouter = new CapabilityBasedRouter({
  minConfidenceThreshold: 0.7,
  ambiguityThreshold: 0.15,
  maxAlternatives: 3,
  policy: "best_effort",
});

export const handoffConfig = createHandoffConfig({
  routing: { minConfidenceThreshold: 0.7 },
});

export async function routeAgent(
  payload: HandoffPayload,
  candidates: AgentCapabilities[]
): Promise<RoutingDecision> {
  const decision = await handoffRouter.route(payload, candidates);
  return decision;
}
The CapabilityBasedRouter compares the payload’s intent and required skills against each agent’s declared capabilities and returns a RoutingDecision with the best match (or a fallback).
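To make the idea of capability matching concrete, here is a small standalone sketch. It is not the CapabilityBasedRouter implementation: the scoring rule (fraction of required skills an agent declares), the AgentProfile and Decision types, and the skill names are all assumptions chosen for illustration. Only the 0.7 threshold is borrowed from the configuration above.

```typescript
// Illustrative only: a toy capability matcher, NOT the library's algorithm.
interface AgentProfile {
  agentId: string;
  skills: string[];
}

interface Decision {
  type: "primary" | "fallback";
  agentId?: string;
  confidence: number;
}

// Score = share of the required skills that the agent declares (0..1).
function scoreAgent(requiredSkills: string[], agent: AgentProfile): number {
  if (requiredSkills.length === 0) return 0;
  const matched = requiredSkills.filter((skill) => agent.skills.includes(skill)).length;
  return matched / requiredSkills.length;
}

// Pick the highest-scoring candidate; fall back when nobody clears the threshold.
function pickAgent(
  requiredSkills: string[],
  candidates: AgentProfile[],
  minConfidence = 0.7,
): Decision {
  let best: { agent: AgentProfile; score: number } | undefined;
  for (const agent of candidates) {
    const score = scoreAgent(requiredSkills, agent);
    if (!best || score > best.score) {
      best = { agent, score };
    }
  }
  if (best && best.score >= minConfidence) {
    return { type: "primary", agentId: best.agent.agentId, confidence: best.score };
  }
  return { type: "fallback", confidence: best?.score ?? 0 };
}
```

The real router also weighs domains, specializations, and availability, and can return alternatives when two agents score within the ambiguity threshold; this sketch leaves those out.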
Expected output: Two files: src/lib/agentRegistry.ts and src/lib/handoffRouter.ts. Run pnpm typecheck to confirm clean compilation.
Step 6: Instrument with Langfuse
Create src/lib/instrumentation.ts — this module initializes Langfuse for LLM observability. The register() function is called at Next.js startup via the instrumentation hook. Langfuse is imported dynamically so the module can safely run in both Node and Edge runtimes.
ts
let langfuseInstance: { shutdownAsync: () => Promise<void> } | undefined;

export async function register(): Promise<void> {
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { Langfuse } = await import("langfuse");
    langfuseInstance = new Langfuse({
      secretKey: process.env.LANGFUSE_SECRET_KEY ?? "",
      publicKey: process.env.LANGFUSE_PUBLIC_KEY ?? "",
      baseUrl: process.env.LANGFUSE_BASE_URL ?? "https://cloud.langfuse.com",
    });
  }
}

export function getLangfuse(): { shutdownAsync: () => Promise<void> } {
  if (!langfuseInstance) {
    throw new Error("Langfuse not initialized. Call register() first.");
  }
  return langfuseInstance;
}

export async function shutdown(): Promise<void> {
  if (langfuseInstance) {
    await langfuseInstance.shutdownAsync();
  }
}
For register() to actually fire, you MUST enable the Next.js instrumentation hook in next.config.ts:
The NEXT_RUNTIME === "nodejs" guard ensures the dynamic import of Langfuse only happens in the Node.js runtime, not in Edge workers where Node-only APIs are unavailable.
Expected output: src/lib/instrumentation.ts created and next.config.ts updated with instrumentationHook: true. Verify with pnpm typecheck.
Step 7: Implement LLM spend tracking with the budget service
Create src/services/budgetService.ts — this wires up @reaatech/agent-budget-engine to cap spending per agent. Each agent has a dollar limit, a soft cap (80%) that triggers a model downgrade from Sonnet to Haiku, and a hard cap that stops all requests.
triage-agent: $1.00 limit (classification calls are cheap at ~$0.005 each)
drafting-agent: $2.00 limit (Sonnet is more expensive at ~$0.03 per call)
case-agent: $0.50 limit (summarization is infrequent)
The autoDowngrade policy means that once an agent hits 80% of its budget, subsequent requests that would use Sonnet are automatically downgraded to Haiku until the budget resets or the hard cap stops all requests.
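The soft-cap and hard-cap behaviour described above can be sketched in a few lines of plain TypeScript. This is not the @reaatech/agent-budget-engine API: the function decideBudget, its types, and the choice to measure the soft cap against spend so far (rather than projected spend) are assumptions for illustration. Model tiers are written as "sonnet" and "haiku" labels rather than concrete model IDs.

```typescript
// Illustrative budget check, independent of the real budget engine.
type ModelTier = "sonnet" | "haiku";

interface AgentBudget {
  limitUsd: number; // hard cap in dollars
  spentUsd: number; // spend recorded so far
  softCapRatio: number; // e.g. 0.8 for an 80% soft cap
}

type BudgetDecision =
  | { allowed: true; tier: ModelTier; downgraded: boolean }
  | { allowed: false; reason: "hard_cap" };

function decideBudget(
  budget: AgentBudget,
  estimatedCostUsd: number,
  requestedTier: ModelTier,
): BudgetDecision {
  // Hard cap: refuse any request that would push spend past the limit.
  if (budget.spentUsd + estimatedCostUsd > budget.limitUsd) {
    return { allowed: false, reason: "hard_cap" };
  }
  // Soft cap: once spend reaches the ratio, Sonnet requests drop to Haiku.
  const overSoftCap = budget.spentUsd >= budget.limitUsd * budget.softCapRatio;
  if (overSoftCap && requestedTier === "sonnet") {
    return { allowed: true, tier: "haiku", downgraded: true };
  }
  return { allowed: true, tier: requestedTier, downgraded: false };
}
```

With the drafting agent's $2.00 limit, a Sonnet request at $0.50 spent goes through unchanged, the same request at $1.70 spent is downgraded to Haiku, and a $0.03 request at $1.99 spent is refused.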
Expected output: src/services/budgetService.ts. Run pnpm typecheck.
Step 8: Build the Clio REST API client
Create src/services/clioService.ts — this is the integration layer that talks to Clio’s API v4. It handles OAuth2 token refresh, rate-limit retries with exponential backoff (via withRetry from @reaatech/agent-handoff), and exposes typed CRUD methods for documents, matters, and notes.
ts
import { withRetry } from "@reaatech/agent-handoff";
import type { ClioOAuthConfig, ClioDocument, ClioMatter } from "../types/clio.js";
import { ClioApiError } from "../types/clio.js";

const CLIO_API_BASE = "https://app.clio.com/api/v4";

export class ClioApiClient {
  private config: ClioOAuthConfig;
  private accessToken: string | null = null;
  private tokenExpiresAt: number = 0;

  constructor(config: ClioOAuthConfig) {
    this.config = config;
The OAuth2 flow works like this: on first request, getAccessToken() calls POST /oauth/token with a refresh_token grant to get an access_token. That token is cached until it expires (based on expires_in). If a 401 comes back mid-request, the cached token is cleared and the retry logic (via withRetry) triggers a fresh token fetch. 429 rate-limits also retry with exponential backoff up to 3 attempts.
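The caching part of that flow might look roughly like the sketch below. It is a simplified stand-in, not the recipe's ClioApiClient: the TokenCache class, the injected fetchToken and now functions, and the backoffDelayMs helper with its 500 ms base are all hypothetical, introduced so the logic can run without network access.

```typescript
// Simplified token cache; fetchToken stands in for the POST /oauth/token call.
interface TokenResponse {
  access_token: string;
  expires_in: number; // lifetime in seconds
}

class TokenCache {
  private accessToken: string | null = null;
  private expiresAt = 0; // epoch milliseconds
  private readonly fetchToken: () => Promise<TokenResponse>;
  private readonly now: () => number;

  constructor(fetchToken: () => Promise<TokenResponse>, now: () => number = () => Date.now()) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  // Return the cached token while it is still valid; otherwise fetch a new one.
  async getAccessToken(): Promise<string> {
    if (this.accessToken !== null && this.now() < this.expiresAt) {
      return this.accessToken;
    }
    const res = await this.fetchToken();
    this.accessToken = res.access_token;
    this.expiresAt = this.now() + res.expires_in * 1000;
    return res.access_token;
  }

  // Call this on a 401 so the next request fetches a fresh token.
  invalidate(): void {
    this.accessToken = null;
    this.expiresAt = 0;
  }
}

// Exponential backoff: the delay doubles with each attempt (0-based).
function backoffDelayMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** attempt;
}
```

Injecting the clock and the fetcher keeps the class easy to test; in the real client the fetcher would send the refresh_token grant and the retry wrapper would wait backoffDelayMs(attempt) between tries.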
Expected output: src/services/clioService.ts. Run pnpm typecheck.
Step 9: Implement the triage, drafting, and case agents
Triage agent
Create src/agents/triageAgent.ts — this is the entry point for message processing. It checks the budget, classifies the message with Claude (or falls back to keyword rules if the budget is exceeded), and hands off to the appropriate specialist agent.
ts
import { classifyIntent } from "../lib/anthropicClient.js";
import { handoffToAgent } from "../services/handoffOrchestrator.js";
import { checkBudget, recordSpend } from "../services/budgetService.js";
import { AgentResponseSchema } from "@reaatech/agent-mesh";
import type { AgentResponse } from "@reaatech/agent-mesh";
import { buildTurnEntry } from "@reaatech/agent-mesh-router";
import type { HandoffPayload } from "@reaatech/agent-handoff";

function ruleBasedClassification(rawInput: string): {
  category: "billing" | "consultation" | "document_review" | "case_update" |
The routing logic is:
Billing / consultation questions → hand off to the drafting agent (a substantive reply is needed)
Case update / document review requests → hand off to the case agent (the record needs updating)
General or low-confidence intents → escalate to human review
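The three routing rules above reduce to a small pure function. The sketch below is one way to write it; the function name selectTarget, the "human-review" label, and reusing 0.7 as the low-confidence cutoff are assumptions for illustration rather than the recipe's actual code.

```typescript
// Illustrative routing table for the triage step.
type Category = "billing" | "consultation" | "document_review" | "case_update" | "general";
type Target = "drafting-agent" | "case-agent" | "human-review";

function selectTarget(category: Category, confidence: number, minConfidence = 0.7): Target {
  // Low-confidence classifications always go to a human, whatever the category.
  if (confidence < minConfidence) return "human-review";
  switch (category) {
    case "billing":
    case "consultation":
      return "drafting-agent"; // a substantive reply is needed
    case "case_update":
    case "document_review":
      return "case-agent"; // the matter record needs updating
    default:
      return "human-review"; // "general" inquiries
  }
}
```

Keeping the mapping in one function makes it straightforward to add a category later: extend the Category union and add a case.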
Drafting agent
Create src/agents/draftingAgent.ts — it generates a legal reply using Claude Sonnet, creates a Clio document from it, and returns the result:
If Clio is unreachable, the draft content is preserved in the workflow state so it can be retried or manually saved later.
Case agent
Create src/agents/caseAgent.ts — it summarizes the conversation history and updates the Clio matter with a note:
ts
import { summarizeCaseNotes } from "../lib/anthropicClient.js";
import { ClioApiClient } from "../services/clioService.js";
import { checkBudget, recordSpend } from "../services/budgetService.js";
import { AgentResponseSchema } from "@reaatech/agent-mesh";
import type { AgentResponse } from "@reaatech/agent-mesh";

const clioClient = new ClioApiClient({
  clientId: process.env.CLIO_CLIENT_ID ?? "",
  clientSecret: process.env.CLIO_CLIENT_SECRET ?? "",
  redirectUri: process.env.CLIO_REDIRECT_URI ?? "",
  refreshToken: process.env.CLIO_REFRESH_TOKEN ?? "",
});

export async function updateCase(
  matterId: string,
  history: Array<{ role: string; content: string }>,
  sessionId: string
): Promise<AgentResponse> {
  const budgetCheck = checkBudget("case-agent", 0.01, "claude-haiku-4-5-20251001");
  if (!budgetCheck.allowed) {
    return AgentResponseSchema.parse({
      content: "Case update skipped: budget exceeded for case agent.",
      workflow_complete: false,
      workflow_state: { matterId, budgetExceeded: true },
    });
  }

  const noteSummary = await summarizeCaseNotes(history);
  recordSpend("case-agent", 0.01, 200, 100, "claude-haiku-4-5-20251001");

  try {
    await clioClient.updateMatter(matterId, { status: "Awaiting Response" });
  } catch {
    // Matter status update is best-effort
  }

  try {
    await clioClient.createNote(matterId, noteSummary || "Case summarized by AI assistant.");
  } catch (error) {
    return AgentResponseSchema.parse({
      content: `Case notes generated but Clio save failed for matter ${matterId}.`,
      workflow_complete: false,
      workflow_state: {
        matterId,
        clioError: error instanceof Error ? error.message : "Clio save failed",
        noteSummary,
      },
    });
  }

  return AgentResponseSchema.parse({
    content: `Case updated for matter ${matterId}. Summary logged.`,
    workflow_complete: true,
    workflow_state: { matterId, noteCreated: true, sessionId },
  });
}
Expected output: Three files under src/agents/. Run pnpm typecheck — all three should compile cleanly.
Step 10: Wire up the handoff orchestrator
Create src/services/handoffOrchestrator.ts — this is the coordination layer that connects incoming webhook messages to the triage and handoff flow. It validates incoming requests using IncomingRequestSchema from @reaatech/agent-mesh, classifies intent, routes via dispatchToAgent() from @reaatech/agent-mesh-router, and emits typed handoff events.
ts
import {
  dispatchToAgent,
  buildTurnEntry,
  shouldCloseSession,
  getUpdatedWorkflowState,
  mcpClientFactory,
} from "@reaatech/agent-mesh-router";
import { IncomingRequestSchema, type IncomingRequest, AgentConfigSchema } from "@reaatech/agent-mesh";
import { TypedEventEmitter, HandoffError } from "@reaatech/agent-handoff";
import type { HandoffPayload, HandoffResult } from "@reaatech/agent-handoff";
import { classifyIntent } from "../lib/anthropicClient.js";
import { handoffRouter } from "../lib/handoffRouter.js";
import { getAllAgentCapabilities } from "../lib/agentRegistry.js";
import { checkBudget, recordSpend } from "./budgetService.js";

export const handoffEvents = new TypedEventEmitter<{
  handoffInitiated
Expected output: src/services/handoffOrchestrator.ts, the largest file in the project (~224 lines). Run pnpm typecheck.
Step 11: Create the LangGraph background worker
Create src/services/backgroundWorker.ts — this LangGraph state machine monitors handoff events and syncs case status to Clio asynchronously.
The graph runs every time a handoffCompleted event fires: it records the event in the message log and then attempts to sync a summary to Clio. Clio sync failures are best-effort — the worker swallows errors so a Clio outage doesn’t block subsequent handoffs.
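The record-then-sync behaviour can be illustrated without LangGraph. The sketch below is plain TypeScript, not the recipe's state graph: the HandoffCompletedEvent and WorkerState shapes, the onHandoffCompleted function, and the syncErrors counter are assumptions added for the example.

```typescript
// Plain-TypeScript stand-in for the two worker steps: log the event, then sync.
interface HandoffCompletedEvent {
  sessionId: string;
  matterId: string;
  summary: string;
}

interface WorkerState {
  messageLog: string[];
  syncErrors: number; // hypothetical counter, useful for monitoring
}

async function onHandoffCompleted(
  state: WorkerState,
  event: HandoffCompletedEvent,
  syncToClio: (event: HandoffCompletedEvent) => Promise<void>,
): Promise<WorkerState> {
  // Step 1: always record the event, regardless of what happens next.
  const messageLog = [...state.messageLog, `handoffCompleted:${event.sessionId}`];
  try {
    // Step 2: best-effort sync to Clio.
    await syncToClio(event);
    return { messageLog, syncErrors: state.syncErrors };
  } catch {
    // Swallow the failure so a Clio outage never blocks later handoffs.
    return { messageLog, syncErrors: state.syncErrors + 1 };
  }
}
```

Counting swallowed failures, rather than dropping them silently, is one way to keep an outage visible in dashboards while still letting handoffs proceed.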
Expected output: src/services/backgroundWorker.ts. Run pnpm typecheck.
Step 12: Create the API routes
All routes use NextRequest and NextResponse from next/server rather than bare Request or Response objects. NextResponse.json() sets the Content-Type: application/json header for you, which a bare new Response(JSON.stringify(...)) does not, and NextRequest adds Next.js conveniences such as cookies and nextUrl.
The project ships with 18 test files covering every module. Here are two representative examples.
Testing the budget service
tests/services/budgetService.test.ts mocks @reaatech/agent-budget-engine and tests the checkBudget(), recordSpend(), and getBudgetState() functions across happy paths, budget exhaustion, and model downgrade scenarios:
Coverage is measured against runtime code only (src/**/*.ts and app/**/route.ts) — UI files like page.tsx and layout.tsx are excluded from the coverage gates.
Try the recipe
Start the dev server:
terminal
pnpm dev
Send a test message to the webhook:
terminal
curl -X POST http://localhost:3000/api/webhooks/incoming-message \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "session-123",
    "raw_input": "I need help with my bill from last month",
    "employee_id": "emp-001"
  }'
Expected output: A JSON response indicating the message was classified and handed off to the drafting agent:
json
{
  "content": "Classified as billing. Handing off to drafting agent.",
  "sessionId": "session-123",
  "workflowComplete": false
}
Add a Slack or email intake channel — create a new webhook route that accepts messages from Slack’s Events API or SendGrid inbound parse and forwards them to processIncomingMessage
Implement a human-in-the-loop approval step — route high-confidence draft responses through a review queue before saving them to Clio, using the workflow_complete: false signal
Extend agent specializations — add a discovery agent that searches Clio for related matters by client name or case number before routing
Add budget dashboards — expose getBudgetState() through a /api/admin/budgets endpoint that shows real-time spend per agent
Deploy to production — build with pnpm build, deploy to Vercel or a Node.js host, and point your Clio webhook or intake forms at the /api/webhooks/incoming-message endpoint