Veterinary practice owners and associate vets dread after-hours calls because every pet owner thinks their issue is an emergency. Without a triage system, vets must interrupt their personal time to manually assess each case, often leading to fatigue and inconsistent decisions. Clients get frustrated by long wait times or feel dismissed, risking practice reputation and loyalty. A 24/7 triage agent that asks structured questions and provides a severity score can filter true emergencies from routine concerns, letting vets focus on critical cases.
This recipe builds a 24/7 after-hours pet emergency triage agent. When a client calls outside clinic hours, the agent asks structured questions, transcribes the conversation (STT), assesses urgency through an LLM, and speaks back a severity rating (TTS) — all over a Twilio phone call. Critical cases are flagged for immediate vet callback while routine concerns get appropriate next-day guidance. You’ll wire together six REAA voice-agent packages into a dual-server architecture: a Hono edge server for real-time telephony and a Next.js dashboard for call monitoring.
Prerequisites
Node.js 22+ and pnpm 10 installed on your machine
A Deepgram API key for speech-to-text and text-to-speech (DEEPGRAM_API_KEY)
An OpenAI API key for LLM-based triage assessment (OPENAI_API_KEY)
A Langfuse account and project for observability (optional — LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY)
Basic familiarity with TypeScript and Next.js App Router
Step 1: Scaffold the project and install dependencies
The scaffold agent has already created a Next.js 16 App Router project and run pnpm install. Your package.json is pinned:
Copy the .env.example file to get your environment variables:
env
# Env vars used by agnostic-afterhours-triage-agent.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
NODE_ENV=development
DEEPGRAM_API_KEY=<your-deepgram-key>
OPENAI_API_KEY=<your-openai-key>
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
OTEL_EXPORTER_OTLP_ENDPOINT=<your-otel-endpoint>
TRIAGE_MODEL=gpt-5.2-mini
PORT=3001
API_URL=http://localhost:3001
NEXT_PUBLIC_API_URL=http://localhost:3001
Fill in the placeholder values with your real API keys before running the server.
Expected output: After running pnpm install, you’ll have a node_modules/ directory and a pnpm-lock.yaml with all packages at their exact pinned versions.
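As a rough illustration of how these variables might be consumed, the sketch below reads them into a typed config object. Only the variable names and the values shown in .env.example come from the recipe. The `AppConfigSketch` shape, the `loadConfigSketch` name, the placeholder check, and the fallback values are assumptions made for this example; the recipe's own `loadConfig` may differ.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface AppConfigSketch {
  nodeEnv: string;
  deepgramApiKey: string;
  openaiApiKey: string;
  triageModel: string;
  port: number;
  apiUrl: string;
  langfuseEnabled: boolean;
}

type Env = Record<string, string | undefined>;

// Returns the value only if it is present and not an unfilled "<your-...>" placeholder.
function readVar(env: Env, name: string): string | undefined {
  const value = env[name];
  if (value === undefined || value.trim() === "" || value.indexOf("<your-") === 0) {
    return undefined;
  }
  return value;
}

function requireVar(env: Env, name: string): string {
  const value = readVar(env, name);
  if (value === undefined) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfigSketch(env: Env): AppConfigSketch {
  const port = Number(readVar(env, "PORT") ?? "3001");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return {
    nodeEnv: readVar(env, "NODE_ENV") ?? "development",
    deepgramApiKey: requireVar(env, "DEEPGRAM_API_KEY"),
    openaiApiKey: requireVar(env, "OPENAI_API_KEY"),
    // Fallbacks mirror .env.example; whether the real code falls back to them is an assumption.
    triageModel: readVar(env, "TRIAGE_MODEL") ?? "gpt-5.2-mini",
    port,
    apiUrl: readVar(env, "API_URL") ?? "http://localhost:3001",
    // Langfuse is optional: enable it only when both keys are filled in.
    langfuseEnabled:
      readVar(env, "LANGFUSE_PUBLIC_KEY") !== undefined &&
      readVar(env, "LANGFUSE_SECRET_KEY") !== undefined,
  };
}
```

In a Node process this would typically be called once at startup as `loadConfigSketch(process.env)`, so that a missing or unfilled key fails fast instead of surfacing mid-call.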
Step 2: Define the triage types and app configuration
Create the type definitions for the triage domain. These types describe what a caller reports, what the LLM assesses, and how calls are tracked. Write src/types/triage.ts:
Expected output: Two TypeScript source files with no type errors. Run pnpm typecheck to confirm.
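For orientation, here is a minimal sketch of what the triage types could look like. The field names on `TriageRequest` and `TriageAssessment` are inferred from how the triage service in Step 8 uses them, and the five severity levels match the cases in `classifyByProtocol`. The `Species` union, the type of `patientAge`, the `CallRecordSketch` fields beyond `patientInfo` and `triageAssessment`, and the `isTriageSeverity` helper are assumptions.

```typescript
// The five severity levels handled by classifyByProtocol.
const TRIAGE_SEVERITIES = ["critical", "emergency", "urgent", "non-urgent", "wellness"] as const;
type TriageSeverity = (typeof TRIAGE_SEVERITIES)[number];

// Assumed union; the recipe only shows "other" being used as a fallback value.
type Species = "dog" | "cat" | "other";

interface TriageRequest {
  sessionId: string;
  symptoms: string[];
  species: Species;
  patientAge?: number; // assumed to be a number of years
  knownConditions: string[];
  medications: string[];
}

interface TriageAssessment {
  severity: TriageSeverity;
  confidence: number; // compared against 0.5 when deciding on escalation
  symptoms: string[];
  recommendedAction: string;
  disclaimer: string;
}

// Hypothetical call record; only patientInfo and triageAssessment appear in the recipe.
interface CallRecordSketch {
  callSid: string;
  streamSid?: string;
  patientInfo?: { species?: Species; age?: number };
  triageAssessment?: TriageAssessment;
}

// Runtime guard for values coming from outside the type system (e.g. LLM output).
function isTriageSeverity(value: unknown): value is TriageSeverity {
  return typeof value === "string" && (TRIAGE_SEVERITIES as readonly string[]).indexOf(value) !== -1;
}

// Example request, shaped like the one the triage service builds.
const exampleRequest: TriageRequest = {
  sessionId: "CA-example",
  symptoms: ["vomiting for two hours"],
  species: "dog",
  knownConditions: [],
  medications: [],
};
```

Deriving `TriageSeverity` from a constant array keeps the compile-time union and the runtime guard in sync, so adding a level only requires one change.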
Step 3: Build the LLM-powered triage engine
The triage engine is the core medical reasoning component. It builds a veterinary triage system prompt with species-specific danger signs, sends symptoms to the LLM, and parses the structured JSON assessment. Write src/lib/triage-engine.ts:
ts
import { z } from "zod";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import {
  IncomingRequestSchema,
  AgentResponseSchema,
  type IncomingRequest,
  type AgentResponse,
} from "@reaatech/agent-mesh";
import {
  ModelDefinitionSchema,
  type ModelDefinition,
} from "@reaatech/llm-router-core";
import { langfuse } from "./observability.js";
import { type TriageRequest, type TriageAssessment, type TriageSeverity } from "../types/triage.js";
Notice how assessUrgency uses the Vercel AI SDK’s generateText with the openai() provider, validates the LLM output through a Zod schema, and optionally sends a trace to Langfuse. The three validate* functions at the bottom wrap the agent-mesh and llm-router-core schemas for safe deserialization at the edges of the system.
Expected output: The file compiles cleanly. The classifyByProtocol function maps all five severity levels to concrete wait-time advice and escalation flags.
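The step of turning the model's reply into a trusted assessment can be sketched as follows. The recipe validates the output with a Zod schema; this example swaps in a hand-written check so that it runs without dependencies. The `parseAssessmentSketch` name, the result shape, and the choice to extract the outermost `{...}` from the reply are all assumptions for illustration.

```typescript
type SeveritySketch = "critical" | "emergency" | "urgent" | "non-urgent" | "wellness";

interface ParsedAssessment {
  severity: SeveritySketch;
  confidence: number;
  recommendedAction: string;
}

type ParseResult =
  | { ok: true; value: ParsedAssessment }
  | { ok: false; error: string };

const KNOWN_SEVERITIES: readonly string[] = ["critical", "emergency", "urgent", "non-urgent", "wellness"];

function parseAssessmentSketch(raw: string): ParseResult {
  // Models sometimes wrap JSON in extra prose, so keep only the outermost object.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { ok: false, error: "no JSON object found" };
  }

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "not an object" };
  }

  const obj = data as Record<string, unknown>;
  const severity = obj["severity"];
  const confidence = obj["confidence"];
  const action = obj["recommendedAction"];

  if (typeof severity !== "string" || KNOWN_SEVERITIES.indexOf(severity) === -1) {
    return { ok: false, error: "unknown severity" };
  }
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) {
    return { ok: false, error: "confidence must be between 0 and 1" };
  }
  if (typeof action !== "string" || action.length === 0) {
    return { ok: false, error: "missing recommendedAction" };
  }

  return {
    ok: true,
    value: { severity: severity as SeveritySketch, confidence, recommendedAction: action },
  };
}
```

Returning a result object rather than throwing lets the caller decide what a failed parse means. For a triage agent, one reasonable choice is to treat it like a low-confidence assessment and escalate to a vet.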
Step 4: Build the call session store
The call store tracks in-progress and completed calls in memory with TTL-based cleanup. Write src/lib/call-store.ts:
The store maintains a reverse index from streamSid to callSid so you can look up calls by either identifier. The 60-second cleanup interval prunes records older than the TTL (default 24 hours).
Expected output: A call-store module you can import and use across the application. The callStore singleton is reused by the Hono server routes and the triage service.
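The store described above (primary map, reverse index, TTL pruning) might look roughly like this. `getCall` and `updateCall` are names used elsewhere in the recipe; the class name `InMemoryCallStoreSketch`, the `createCall`, `getCallByStreamSid`, and `prune` methods, the record fields, and the injectable clock are assumptions added for the example.

```typescript
interface StoredCall {
  callSid: string;
  streamSid?: string;
  createdAt: number; // epoch milliseconds
  triageSummary?: string;
}

class InMemoryCallStoreSketch {
  private readonly calls = new Map<string, StoredCall>();
  // Reverse index: streamSid -> callSid, so either identifier can find a call.
  private readonly streamToCall = new Map<string, string>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // Default TTL is 24 hours; the clock is injectable to make pruning testable.
  constructor(ttlMs: number = 24 * 60 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  createCall(callSid: string, streamSid?: string): StoredCall {
    const record: StoredCall = { callSid, createdAt: this.now() };
    if (streamSid !== undefined) {
      record.streamSid = streamSid;
      this.streamToCall.set(streamSid, callSid);
    }
    this.calls.set(callSid, record);
    return record;
  }

  getCall(callSid: string): StoredCall | undefined {
    return this.calls.get(callSid);
  }

  getCallByStreamSid(streamSid: string): StoredCall | undefined {
    const callSid = this.streamToCall.get(streamSid);
    return callSid === undefined ? undefined : this.calls.get(callSid);
  }

  updateCall(callSid: string, patch: { triageSummary?: string }): boolean {
    const record = this.calls.get(callSid);
    if (record === undefined) return false;
    Object.assign(record, patch);
    return true;
  }

  // Removes records older than the TTL and returns how many were dropped.
  prune(): number {
    const cutoff = this.now() - this.ttlMs;
    let removed = 0;
    this.calls.forEach((record, callSid) => {
      if (record.createdAt < cutoff) {
        this.calls.delete(callSid);
        if (record.streamSid !== undefined) this.streamToCall.delete(record.streamSid);
        removed += 1;
      }
    });
    return removed;
  }
}
```

A 60-second cleanup would then amount to calling `prune()` from a timer. Pruning both maps together matters: a stale entry left in the reverse index would otherwise point at a call that no longer exists.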
Step 5: Set up observability
Observability wires together OpenTelemetry tracing and Langfuse for LLM cost tracking. Write src/lib/observability.ts:
The latencyBudget sets per-stage targets (STT: 200ms, MCP: 400ms, TTS: 200ms) with an 800ms total target and a 1200ms hard cap — these drive the LatencyBudgetEnforcer in the pipeline.
Expected output: A module that initializes OTel on startup, exposes a latencyBudget, and provides graceful shutdown.
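To make the budget numbers concrete, the sketch below encodes them and classifies a turn's measured timings. The millisecond values come from the recipe; the object layout, the `evaluateTurnSketch` function, and the three status labels are assumptions, and this is not the `LatencyBudgetEnforcer` from the REAA packages.

```typescript
// Per-stage targets and overall limits, in milliseconds (values from the recipe).
const latencyBudgetSketch = {
  stages: { stt: 200, mcp: 400, tts: 200 },
  totalTargetMs: 800,
  hardCapMs: 1200,
} as const;

type StageName = keyof typeof latencyBudgetSketch.stages;
type StageTimings = Record<StageName, number>;

interface BudgetReport {
  totalMs: number;
  overBudgetStages: StageName[];
  status: "ok" | "over-target" | "over-cap";
}

function evaluateTurnSketch(timings: StageTimings): BudgetReport {
  const stages = Object.keys(latencyBudgetSketch.stages) as StageName[];
  // A stage can exceed its own target even when the turn as a whole is fine.
  const overBudgetStages = stages.filter((s) => timings[s] > latencyBudgetSketch.stages[s]);
  const totalMs = stages.reduce((sum, s) => sum + timings[s], 0);
  const status =
    totalMs > latencyBudgetSketch.hardCapMs
      ? "over-cap"
      : totalMs > latencyBudgetSketch.totalTargetMs
        ? "over-target"
        : "ok";
  return { totalMs, overBudgetStages, status };
}
```

For example, timings of 150 ms (STT), 500 ms (MCP), and 100 ms (TTS) total 750 ms, so the turn is within the 800 ms target even though the MCP stage alone exceeds its 400 ms share. Note that the three stage targets sum exactly to the total target, leaving the gap between 800 ms and 1200 ms as headroom before the hard cap.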
Step 6: Wire up the voice pipeline
The pipeline wires STT (Deepgram), TTS (Deepgram Aura), session management, latency enforcement, and cost tracking into a single Pipeline instance. Write src/lib/pipeline.ts:
The pipeline is the central orchestrator. Each call becomes a session, each utterance becomes a turn that flows STT to MCP to TTS, and the LatencyBudgetEnforcer ensures real-time constraints are met.
Expected output: Factory functions for STT, TTS, and pipeline creation, all using the REAA package APIs.
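The turn flow described above (STT, then MCP, then TTS, with each stage timed) can be sketched independently of any provider. The `TurnStages` interface, `runTurnSketch`, and the result shape are hypothetical; the real Pipeline class from the REAA packages is not shown in this recipe and will have its own API.

```typescript
// Each stage is injected, so real providers or test stubs can be plugged in.
interface TurnStages {
  stt: (audio: Uint8Array) => Promise<string>;
  mcp: (transcript: string) => Promise<string>;
  tts: (text: string) => Promise<Uint8Array>;
}

interface TurnResult {
  transcript: string;
  reply: string;
  audio: Uint8Array;
  timingsMs: { stt: number; mcp: number; tts: number };
}

// Runs one async stage and reports how long it took in wall-clock milliseconds.
async function timed<T>(fn: () => Promise<T>): Promise<[T, number]> {
  const start = Date.now();
  const value = await fn();
  return [value, Date.now() - start];
}

// One utterance becomes one turn: the stages run strictly in sequence
// because each one consumes the previous stage's output.
async function runTurnSketch(audio: Uint8Array, stages: TurnStages): Promise<TurnResult> {
  const [transcript, stt] = await timed(() => stages.stt(audio));
  const [reply, mcp] = await timed(() => stages.mcp(transcript));
  const [spoken, tts] = await timed(() => stages.tts(reply));
  return { transcript, reply, audio: spoken, timingsMs: { stt, mcp, tts } };
}
```

The recorded `timingsMs` are the kind of per-stage measurements a latency enforcer would compare against its budget after each turn.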
Step 7: Create the telephony WebSocket handler
This handler interprets the Twilio Media Streams WebSocket protocol — receiving audio chunks from the caller, routing them through the voice pipeline, and sending TTS audio back. Write src/services/telephony.ts:
There are two onUtterance listeners: one forwards interim (partial) transcripts for barge-in detection, and the other handles final transcripts by running them through the triage service and streaming the response back as TTS audio.
Expected output: A WebSocket handler function that wires STT utterances — both interim and final — through barge-in and triage processing.
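As a sketch of the protocol handling involved, the functions below interpret inbound Twilio Media Streams messages and build the outbound ones. The event names (`start`, `media`, `stop`, `clear`) and message fields follow Twilio's documented JSON format; the function names, the `StreamAction` type, and the decision to leave the base64 audio undecoded are assumptions for this example.

```typescript
type StreamAction =
  | { kind: "started"; streamSid: string; callSid: string }
  | { kind: "audio"; streamSid: string; payloadBase64: string }
  | { kind: "stopped"; streamSid: string }
  | { kind: "ignored" };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Maps one raw WebSocket text frame to an action for the rest of the handler.
function interpretTwilioMessage(raw: string): StreamAction {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: "ignored" };
  }
  if (!isRecord(msg)) return { kind: "ignored" };

  // "connected" frames carry no streamSid and need no action here.
  const streamSid = msg["streamSid"];
  if (typeof streamSid !== "string") return { kind: "ignored" };

  const event = msg["event"];
  if (event === "start") {
    const start = msg["start"];
    const callSid = isRecord(start) ? start["callSid"] : undefined;
    return typeof callSid === "string" ? { kind: "started", streamSid, callSid } : { kind: "ignored" };
  }
  if (event === "media") {
    const media = msg["media"];
    const payload = isRecord(media) ? media["payload"] : undefined;
    // The payload is base64-encoded audio; decoding is left to the STT side.
    return typeof payload === "string" ? { kind: "audio", streamSid, payloadBase64: payload } : { kind: "ignored" };
  }
  if (event === "stop") return { kind: "stopped", streamSid };
  return { kind: "ignored" };
}

// Frame that plays synthesized audio back to the caller.
function outboundMediaFrame(streamSid: string, payloadBase64: string): string {
  return JSON.stringify({ event: "media", streamSid, media: { payload: payloadBase64 } });
}

// Frame that discards audio Twilio has buffered, useful when the caller barges in.
function clearAudioFrame(streamSid: string): string {
  return JSON.stringify({ event: "clear", streamSid });
}
```

The `start` frame is where the `streamSid` to `callSid` mapping becomes known, so that is the natural point to register the call in the store. On an interim transcript that signals barge-in, sending the `clear` frame stops the agent from talking over the caller.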
Step 8: Build the triage service
The triage service connects a textual user input (from speech or text) to the LLM-powered triage engine and formats the response as a veterinary advice message. Write src/services/triage-service.ts:
ts
import {
  ContextPacketSchema,
  AgentResponseSchema,
  type AgentResponse,
} from "@reaatech/agent-mesh";
import { assessUrgency, classifyByProtocol } from "../lib/triage-engine.js";
import { type TriageAssessment, type TriageRequest } from "../types/triage.js";
import { type CallSessionStore } from "../lib/call-store.js";

export class TriageService {
  constructor(private readonly callStore: CallSessionStore) {}

  // Runs one caller utterance through triage and returns the advice to speak back.
  async processUserInput(callSid: string, transcript: string): Promise<AgentResponse> {
    const record = this.callStore.getCall(callSid);
    if (!record) {
      const safeResponse = AgentResponseSchema.parse({
        content: "I'm sorry, I couldn't find your session. Please try again.",
        workflow_complete: false,
      });
      return safeResponse;
    }

    // Validate the context packet shape before doing any LLM work.
    const contextPacket = {
      session_id: callSid,
      request_id: `${callSid}-${String(Date.now())}`,
      employee_id: "triage-agent",
      raw_input: transcript,
      turn_history: [],
      workflow_state: {},
    };
    ContextPacketSchema.parse(contextPacket);

    const species = record.patientInfo?.species || "other";
    const request: TriageRequest = {
      sessionId: callSid,
      symptoms: [transcript],
      species: species as TriageRequest["species"],
      patientAge: record.patientInfo?.age,
      knownConditions: [],
      medications: [],
    };

    const assessment = await assessUrgency(request);
    this.callStore.updateCall(callSid, { triageAssessment: assessment });

    const protocol = classifyByProtocol(assessment.severity);
    // Use "an" before vowel-initial levels ("emergency", "urgent") and "a" otherwise.
    const article = /^[aeiou]/.test(assessment.severity) ? "an" : "a";
    const adviceText = [
      `Based on the symptoms described, this appears to be ${article} ${assessment.severity} situation.`,
      assessment.recommendedAction,
      protocol.advice,
      assessment.disclaimer,
    ].join(" ");

    const agentResponse = AgentResponseSchema.parse({
      content: adviceText,
      workflow_complete: true,
    });
    return agentResponse;
  }

  // Escalate on high severity, or whenever the model is unsure of its own assessment.
  shouldEscalateToVet(assessment: TriageAssessment): boolean {
    if (assessment.severity === "critical" || assessment.severity === "emergency") {
      return true;
    }
    if (assessment.confidence < 0.5) {
      return true;
    }
    return false;
  }

  // One-paragraph summary for the vet who picks up the callback.
  getVetSummary(callSid: string): string {
    const record = this.callStore.getCall(callSid);
    if (!record || !record.triageAssessment) {
      return "No assessment available";
    }
    const a = record.triageAssessment;
    const species = record.patientInfo?.species || "unknown";
    const summary = [
      `Species: ${species}.`,
      `Symptoms: ${a.symptoms.join("; ")}.`,
      `Assessment: ${a.severity} - ${a.recommendedAction}.`,
      `Confidence: ${String(a.confidence)}.`,
      a.disclaimer,
    ].join(" ");
    return summary;
  }
}
The processUserInput method validates the incoming context through ContextPacketSchema from @reaatech/agent-mesh, runs the triage engine, stores the assessment in the call record, and returns a structured AgentResponse — ready for TTS synthesis.
Expected output: A service class that orchestrates the full text-input-to-advice pipeline and provides escalation logic.
Step 9: Create the Hono server with API routes
The Hono server runs alongside the Next.js app on port 3001 and handles real-time telephony endpoints. Write src/index.ts:
ts
import { Hono } from "hono";
import { serve } from "@hono/node-server";
import { WebSocketServer } from "ws";
import { createServer } from "http";
import { loadConfig } from "./types/config.js";
import { init, shutdown } from "./lib/observability.js";
import { createSttProvider, createTtsProvider, createVoicePipeline } from "./lib/pipeline.js";
import { callStore } from "./lib/call-store.js";
import { createTwilioWebSocketHandler } from "./services/telephony.js";
import { TriageService } from "./services/triage-service.js";
import { assessUrgency } from "./lib/triage-engine.js";
import { type TriageRequest } from "./types/triage.js";
This file is the entry point. It creates the Hono app, registers REST and TwiML routes, starts the WebSocket server for Twilio Media Streams, wires all dependencies together, and handles graceful shutdown.
Expected output: A Hono server running on port 3001 with health, triage, calls, and Twilio endpoints.
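One piece of this server that benefits from an example is the TwiML returned when Twilio hits the incoming-call webhook. The sketch below builds a response that greets the caller and then connects the call to a bidirectional media stream using the standard `<Say>`, `<Connect>`, and `<Stream>` verbs. The function names, the greeting text, and the WebSocket URL used in the usage note are assumptions; the recipe's actual route may differ.

```typescript
// Escapes text for safe use inside XML element content and attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds TwiML that speaks a greeting, then hands the call audio to a WebSocket.
function buildIncomingCallTwiml(streamUrl: string, greeting: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say>${escapeXml(greeting)}</Say>`,
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

A route handler would call something like `buildIncomingCallTwiml("wss://your-host.example/twilio/media-stream", "Thanks for calling the after-hours line.")` (a hypothetical URL and greeting) and return the result with a `text/xml` content type. Twilio requires a `wss://` URL for the stream, so the server needs to be reachable over TLS even in development, for example through a tunnel.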
Step 10: Add the Next.js App Router proxy routes
The Next.js app serves as a dashboard proxy. These route handlers forward requests from the browser to the Hono API. Write app/api/health/route.ts:
ts
import { NextResponse } from "next/server";

export function GET() {
  return NextResponse.json({ status: "ok" });
}
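For routes that do forward to the Hono API, the core logic might look like the sketch below. It is written against a minimal fetch-like interface so it runs without Next.js; `proxyGetSketch`, the `Fetcher` type, the 502 fallback, and the `/health` upstream path are assumptions rather than the recipe's actual code.

```typescript
// The subset of a fetch response that the proxy needs.
interface UpstreamResponse {
  status: number;
  json(): Promise<unknown>;
}

type Fetcher = (url: string) => Promise<UpstreamResponse>;

interface ProxyResult {
  status: number;
  body: unknown;
}

// Forwards a GET to the API server and passes status and JSON body through.
async function proxyGetSketch(path: string, apiUrl: string, fetcher: Fetcher): Promise<ProxyResult> {
  // Join base URL and path with exactly one slash between them.
  const target = `${apiUrl.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
  try {
    const upstream = await fetcher(target);
    return { status: upstream.status, body: await upstream.json() };
  } catch {
    // Network failure or non-JSON body: report a gateway error instead of crashing.
    return { status: 502, body: { error: "Upstream API unreachable" } };
  }
}
```

Inside a Next.js route handler, the global `fetch` can be passed as the fetcher and `API_URL` from the environment as the base, with the result wrapped as `NextResponse.json(result.body, { status: result.status })`. Injecting the fetcher keeps the forwarding logic testable without a running Hono server.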
The tests also check coverage thresholds (90% on lines, branches, functions, and statements for runtime code). You can view the detailed coverage report:
terminal
cat coverage/coverage-summary.json
The integration test (tests/integration/triage-flow.test.ts) runs a full end-to-end flow: creating a call in the store, processing a symptom transcript through the triage service, and verifying the critical-severity assessment is stored and reflected in the response.
Next steps
Add SMS triage — Wire in a Twilio SMS handler via the Hono server to let clients text symptoms and receive a text-based severity assessment without a phone call.
Extend species support — Add red-flag rules for exotic pets (birds, reptiles, rabbits) in the triage prompt, and update the species type to include "avian" | "reptile".
Persist calls to a database — Replace the in-memory CallSessionStore with SQLite (via better-sqlite3) or Postgres so call history survives server restarts.
Deploy with Twilio Elastic SIP Trunk — Point a real Twilio phone number at the Hono server’s /twilio/incoming-call endpoint to accept live after-hours calls.
Add a vet callback queue — Build a separate route and UI component that lists calls needing vet follow-up, with one-click call-back buttons via Twilio’s REST API.
Excerpt from src/lib/triage-engine.ts (Step 3): the classifyByProtocol function and the three validate* helpers.
ts
// Maps each severity level to wait-time guidance and an escalation flag.
export function classifyByProtocol(severity: TriageSeverity): {
  recommendedWaitTime: string;
  requiresCall: boolean;
  advice: string;
} {
  switch (severity) {
    case "critical":
      return {
        recommendedWaitTime: "See vet immediately", requiresCall: true,
        advice: "Bring your pet to the nearest emergency veterinary hospital right away. Do not wait.",
      };
    case "emergency":
      return {
        recommendedWaitTime: "Within 30 minutes", requiresCall: true,
        advice: "Contact your veterinarian or an emergency clinic immediately.",
      };
    case "urgent":
      return {
        recommendedWaitTime: "Within 2 hours", requiresCall: false,
        advice: "Schedule an urgent appointment with your veterinarian today.",
      };
    case "non-urgent":
      return {
        recommendedWaitTime: "Within 24 hours", requiresCall: false,
        advice: "Schedule a routine appointment with your veterinarian.",
      };
    case "wellness":
      return {
        recommendedWaitTime: "Schedule next available appointment", requiresCall: false,
        advice: "This appears to be a routine question. Schedule at your convenience.",
      };
  }
}

// Schema wrappers for safe deserialization at the edges of the system.
export function validateTriageInput(raw: unknown): IncomingRequest {
  return IncomingRequestSchema.parse(raw);
}

export function validateAgentResponse(raw: unknown): AgentResponse {
  return AgentResponseSchema.parse(raw);
}

export function validateModelDefinition(raw: unknown): ModelDefinition {
  return ModelDefinitionSchema.parse(raw);
}