Small property management businesses lose tenants when maintenance requests go unanswered after hours. An automated voice system can take calls, but generic IVRs frustrate callers because they neither understand the issue nor create actual work orders.
This tutorial walks you through building a voice agent that answers after-hours property maintenance calls, transcribes what the caller says, classifies the issue into categories like plumbing or electrical, and creates work orders in AppFolio. You’ll use a Next.js application with the @reaatech/* voice agent package family, Deepgram for speech-to-text and text-to-speech, a Databricks-hosted LLM for intent classification, and Twilio for telephony.
By the end you’ll have a deployable voice intake system that turns a tenant’s spoken maintenance request into a structured work order — no manual triage required.
Prerequisites
Node.js 22+ and pnpm 10 installed
A Twilio account with a purchased phone number that has Voice capabilities
A Deepgram API key (for STT and TTS)
A Databricks workspace with a model serving endpoint deployed (DBRX-Instruct, Llama, or any chat model exposed via an OpenAI-compatible endpoint)
A Langfuse account (for LLM observability)
An AppFolio account with API credentials (for work order creation)
Familiarity with TypeScript and Next.js App Router
Step 1: Scaffold the project and install dependencies
Create the project directory and add a package.json. The exact-pinned dependencies include the @reaatech/* voice agent packages, Deepgram, Twilio, the Vercel AI SDK for Databricks LLM access, Zod for validation, Langfuse for tracing, and the ws library for WebSocket media streaming.
Expected output: pnpm install completes without errors and pnpm typecheck exits 0.
Step 2: Configure environment variables
Create .env.example with placeholder values for every external service the voice agent connects to. The application reads these at startup and validates them with Zod.
env
# Env vars used by databricks-voice-agent-for-after-hours-property-maintenance-intake.
# Keep placeholders only; never commit real values.
NODE_ENV=development
DEEPGRAM_API_KEY=<your-deepgram-api-key>
TWILIO_ACCOUNT_SID=<your-twilio-account-sid>
TWILIO_AUTH_TOKEN=<your-twilio-auth-token>
TWILIO_PHONE_NUMBER=<your-twilio-phone-number>
DATABRICKS_HOST=<your-workspace-host-url>
DATABRICKS_TOKEN=<your-personal-access-token>
DATABRICKS_SERVING_ENDPOINT=<model-serving-endpoint-name>
LANGFUSE_PUBLIC_KEY=<pk-lf-...>
LANGFUSE_SECRET_KEY=<sk-lf-...>
LANGFUSE_HOST=https://cloud.langfuse.com
APPOFOLIO_CLIENT_ID=<appfolio-client-id>
APPOFOLIO_CLIENT_SECRET=<appfolio-client-secret>
APPOFOLIO_BASE_URL=https://api.appfolio.com
WS_PORT=8080
WS_URL=wss://your-domain.com/media
Now create the runtime environment loader in src/config/environment.ts. It reads process.env for each variable and throws a descriptive error if any are missing:
ts
// Read a required environment variable or fail fast with its name.
function readEnvVar(key: string): string {
  const value = process.env[key];
  if (!value) {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

export const env = {
  deepgramKey: readEnvVar("DEEPGRAM_API_KEY"),
  twilioAccountSid: readEnvVar("TWILIO_ACCOUNT_SID"),
  twilioAuthToken: readEnvVar("TWILIO_AUTH_TOKEN"),
  twilioPhoneNumber: readEnvVar("TWILIO_PHONE_NUMBER"),
  databricksHost: readEnvVar("DATABRICKS_HOST"),
  databricksToken: readEnvVar("DATABRICKS_TOKEN"),
  databricksServingEndpoint: readEnvVar("DATABRICKS_SERVING_ENDPOINT"),
  langfusePublicKey: readEnvVar("LANGFUSE_PUBLIC_KEY"),
  langfuseSecretKey: readEnvVar("LANGFUSE_SECRET_KEY"),
  langfuseHost: readEnvVar("LANGFUSE_HOST"),
  appfolioClientId: readEnvVar("APPOFOLIO_CLIENT_ID"),
  appfolioClientSecret: readEnvVar("APPOFOLIO_CLIENT_SECRET"),
  appfolioBaseUrl: readEnvVar("APPOFOLIO_BASE_URL"),
  // WS_PORT is optional and defaults to 8080.
  wsPort: (() => {
    const raw = process.env.WS_PORT;
    if (!raw) return 8080;
    const port = Number(raw);
    // Accept only integers in the valid TCP port range.
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error("WS_PORT must be an integer between 1 and 65535");
    }
    return port;
  })(),
};

// Latency budget shared with the pipeline orchestrator.
export const DEFAULT_LATENCY_BUDGET = {
  target: 800,
  hardCap: 1200,
  stt: 200,
  mcp: 400,
  tts: 200,
} as const;
Expected output: If any required variable is missing, importing env from this module throws an error that names the first missing variable, so you know exactly what to fill in.
Step 3: Define types and configuration schemas
Create src/types/index.ts to export the domain types your application shares across modules. The MaintenanceCategory union drives both intent classification and work-order dispatch:
Expected output: These files type-check cleanly. The maintenanceCategorySchema enum gives you a single source of truth for category routing.
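As a rough illustration of the single-source-of-truth idea, the sketch below derives both the MaintenanceCategory union and a runtime check from one list. It is dependency-free on purpose: the names MAINTENANCE_CATEGORIES and isMaintenanceCategory are hypothetical stand-ins for the Zod-based maintenanceCategorySchema, and the category values are the ones named in the classifier prompt in Step 7.

```typescript
// Hypothetical names for illustration; the project itself uses a Zod enum.
const MAINTENANCE_CATEGORIES = [
  "plumbing",
  "electrical",
  "hvac",
  "general",
  "emergency",
] as const;

// The union type is derived from the list, so the two cannot drift apart.
type MaintenanceCategory = (typeof MAINTENANCE_CATEGORIES)[number];

// Runtime guard for values that arrive as untyped data (for example, LLM output).
function isMaintenanceCategory(value: unknown): value is MaintenanceCategory {
  return (
    typeof value === "string" &&
    (MAINTENANCE_CATEGORIES as readonly string[]).includes(value)
  );
}
```

Adding a category then means editing one array, after which the compiler flags every switch statement that does not handle the new value.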
Step 4: Build the conversation store with session continuity
The conversation store manages multi-turn call sessions using @reaatech/session-continuity. Create a local InMemoryStorageAdapter that implements the IStorageAdapter interface, and a SimpleTokenCounter for token budget tracking. Wrap everything in a SessionManager instance.
ts
import { randomUUID } from "node:crypto";
import { SessionManager, SessionNotFoundError, ConcurrencyError } from "@reaatech/session-continuity";
import type {
  Session,
  Message,
  IStorageAdapter,
  TokenCounter,
  SessionId,
  MessageId,
  HealthStatus,
  CreateSessionOptions,
} from "@reaatech/session-continuity";

export class SimpleTokenCounter implements TokenCounter {
  count(text: string): number {
    return Math.ceil(text.length / 4);
  }

  countMessages(messages:
Expected output: Calling createConversationStore() returns a SessionManager. Creating a call session with "CS123" returns a session object with status: "active" and userId: "CS123".
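To make the storage idea concrete, here is a minimal sketch of an in-memory session store keyed by call SID. The CallSession shape and the InMemorySessionStore class are simplified, hypothetical stand-ins; the real IStorageAdapter interface in @reaatech/session-continuity may look quite different. The token estimate uses the same four-characters-per-token heuristic as SimpleTokenCounter.

```typescript
// Hypothetical, simplified shapes; not the library's actual interfaces.
type Role = "user" | "assistant";

interface CallSession {
  id: string;
  userId: string;
  status: "active" | "ended";
  messages: { role: Role; content: string }[];
}

class InMemorySessionStore {
  private readonly sessions = new Map<string, CallSession>();

  // Create a session for a call; the call SID doubles as the user id.
  create(callSid: string): CallSession {
    const session: CallSession = {
      id: callSid,
      userId: callSid,
      status: "active",
      messages: [],
    };
    this.sessions.set(callSid, session);
    return session;
  }

  // Append one conversational turn to an existing session.
  append(callSid: string, role: Role, content: string): void {
    this.require(callSid).messages.push({ role, content });
  }

  // Mark the session as ended when the call completes.
  end(callSid: string): void {
    this.require(callSid).status = "ended";
  }

  get(callSid: string): CallSession | undefined {
    return this.sessions.get(callSid);
  }

  private require(callSid: string): CallSession {
    const session = this.sessions.get(callSid);
    if (!session) throw new Error(`Session not found: ${callSid}`);
    return session;
  }
}

// Rough token estimate: about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Because everything lives in a Map, sessions disappear on restart, which is fine for local development but is the reason a persistent backend is worth considering for production.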
Step 5: Connect speech-to-text with Deepgram
The STT service wraps @reaatech/voice-agent-stt’s DeepgramSTTProvider. It exposes functions for connecting, streaming audio, subscribing to transcripts, and disconnecting.
ts
import { DeepgramSTTProvider } from "@reaatech/voice-agent-stt";
import type { AudioChunk, Utterance } from "../types/index.js";
import { env } from "../config/environment.js";

export const provider = new DeepgramSTTProvider({
  reconnectAttempts: 3,
  reconnectInterval: 1000,
});

// Open the streaming connection using Twilio's audio format (8 kHz mu-law).
export async function connectSTT(): Promise<void> {
  await provider.connect({
    provider: "deepgram",
    apiKey: env.deepgramKey,
    model: "nova-2",
    language: "en",
    sampleRate: 8000,
    encoding: "mulaw",
    smartFormat: true,
    interimResults: true,
    endpointing: 300,
  });
}

export function onTranscript(cb: (utterance: Utterance) => void): void {
  provider.onUtterance(cb);
}

export function onEndOfSpeech(cb: () => void): void {
  provider.onEndOfSpeech(cb);
}

export function onSttError(cb: (error: Error) => void): void {
  provider.onError(cb);
}

export function streamAudio(chunk: AudioChunk): void {
  provider.streamAudio(chunk);
}

export async function disconnectSTT(): Promise<void> {
  await provider.close();
}

export function isSttConnected(): boolean {
  return provider.isConnected();
}
Expected output: connectSTT() resolves once the Deepgram streaming connection is established. isSttConnected() returns true after connection.
Step 6: Connect text-to-speech with Deepgram
The TTS service wraps @reaatech/voice-agent-tts’s DeepgramTTSProvider. It provides an async generator that streams synthesized audio chunks, and a cancelSynthesis() function for barge-in support.
ts
import { DeepgramTTSProvider, TTSProviderInterface } from "@reaatech/voice-agent-tts";
import type { AudioChunk } from "../types/index.js";
import { env } from "../config/environment.js";

export const provider = new DeepgramTTSProvider();

// Stream synthesized audio chunk by chunk so playback can start early.
export async function* synthesizeResponse(text: string): AsyncIterable<AudioChunk> {
  const stream = provider.synthesize(text, {
    provider: "deepgram",
    apiKey: env.deepgramKey,
    voice: "asteria",
    model: "aura",
    encoding: "mulaw",
    sampleRate: 8000,
  });
  for await (const chunk of stream) {
    yield chunk;
  }
}

// Called on barge-in to stop speaking when the caller interrupts.
export function cancelSynthesis(): void {
  provider.cancel();
}

export function formatAudioForTwilio(chunk: AudioChunk): AudioChunk {
  return TTSProviderInterface.formatAudioForTwilio(chunk);
}
Expected output: synthesizeResponse("Hello") yields audio chunks from Deepgram’s Aura voice. cancelSynthesis() stops an ongoing synthesis mid-stream.
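The barge-in behaviour can be pictured as a cancellable async generator. The sketch below is not how DeepgramTTSProvider works internally; fakeSynthesize and speak are hypothetical helpers that use a standard AbortController to show why a streamed response can be cut off between chunks.

```typescript
// Hypothetical stand-in for a TTS stream: yields one fake "chunk" per word.
async function* fakeSynthesize(
  text: string,
  signal: AbortSignal,
): AsyncGenerator<string> {
  for (const word of text.split(" ")) {
    if (signal.aborted) return; // barge-in: stop producing audio
    await new Promise<void>((resolve) => setTimeout(resolve, 1));
    yield word;
  }
}

// Play chunks until the stream ends or the caller interrupts.
// `bargeInAfter` simulates the caller speaking after that many chunks.
async function speak(
  text: string,
  controller: AbortController,
  bargeInAfter: number,
): Promise<string[]> {
  const sent: string[] = [];
  for await (const chunk of fakeSynthesize(text, controller.signal)) {
    sent.push(chunk);
    if (sent.length === bargeInAfter) controller.abort();
  }
  return sent;
}
```

Checking the cancellation flag before each chunk, rather than once per response, is what keeps the agent from talking over the caller for the rest of a long sentence.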
Step 7: Classify intents with Databricks and the confidence router
The intent classifier sends the transcribed utterance to a Databricks model serving endpoint (via @ai-sdk/openai-compatible) and uses @reaatech/confidence-router-core’s DecisionEngine to decide whether to route to a specific maintenance category, ask a clarifying question, or fall back.
ts
import { DecisionEngine, mergeConfig } from "@reaatech/confidence-router-core";
import type { Prediction, ClassificationResult, RoutingDecision } from "@reaatech/confidence-router-core";
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { generateText } from "ai";
import { env } from "../config/environment.js";
import type { ClassifiedIntent, MaintenanceCategory } from "../types/index.js";

// Note: Databricks exposes its OpenAI-compatible API under
// <workspace-url>/serving-endpoints, so make sure DATABRICKS_HOST resolves to that base path.
const databricksModel = createOpenAICompatible({
  baseURL: env.databricksHost,
  name: "databricks",
  apiKey: env.databricksToken,
}).chatModel(env.databricksServingEndpoint);

const engine = new DecisionEngine(
  mergeConfig({
    routeThreshold: 0.8,
    fallbackThreshold: 0.3,
    clarificationEnabled: true,
  })
);

const CLASSIFICATION_PROMPT = `You are a maintenance intake classifier. Analyze the transcript and output JSON with predictions for the following categories: plumbing, electrical, hvac, general, emergency.
Respond with valid JSON only in this format:
{"predictions":[{"label":"plumbing","confidence":0.92},{"label":"general","confidence":0.08}]}
The confidence scores must sum to 1.0.
Only output the JSON object, no other text.`;

// Used whenever classification fails or confidence is too low to act on.
function createFallbackIntent(sessionId: string, transcript: string): ClassifiedIntent {
  return {
    sessionId,
    transcript,
    confidence: 0,
    clarificationNeeded: true,
  };
}

export async function classifyIntent(
  sessionId: string,
  transcript: string
): Promise<ClassifiedIntent> {
  let predictions: Prediction[];
  try {
    const { text } = await generateText({
      model: databricksModel,
      system: CLASSIFICATION_PROMPT,
      prompt: transcript,
      temperature: 0,
    });
    const parsed = JSON.parse(text) as { predictions?: Prediction[] };
    if (!parsed.predictions || parsed.predictions.length === 0) {
      return createFallbackIntent(sessionId, transcript);
    }
    predictions = parsed.predictions;
  } catch {
    // Network failures and malformed JSON both end up here.
    return createFallbackIntent(sessionId, transcript);
  }

  try {
    const result: ClassificationResult = { predictions };
    const decision: RoutingDecision = engine.decide(result);
    // The model is not guaranteed to return predictions sorted, so pick the top one explicitly.
    const top = [...predictions].sort((a, b) => b.confidence - a.confidence)[0] as
      | Prediction
      | undefined;

    if (decision.type === "ROUTE") {
      if (top === undefined) {
        return createFallbackIntent(sessionId, transcript);
      }
      return {
        sessionId,
        transcript,
        category: top.label as MaintenanceCategory,
        confidence: top.confidence,
        clarificationNeeded: false,
        predictions,
        decision,
      };
    }

    if (decision.type === "CLARIFY") {
      return {
        sessionId,
        transcript,
        confidence: top?.confidence ?? 0,
        clarificationNeeded: true,
        predictions,
        decision,
      };
    }

    return createFallbackIntent(sessionId, transcript);
  } catch {
    // Router errors and unexpected errors are handled the same way.
    return createFallbackIntent(sessionId, transcript);
  }
}
Expected output: classifyIntent("s1", "the sink is leaking") returns a ClassifiedIntent with category: "plumbing", a high confidence such as 0.92, and clarificationNeeded: false. A low-confidence transcript like “hello” returns clarificationNeeded: true.
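One way to read the two thresholds configured above is as three bands of confidence. The sketch below is an assumption about that behaviour, written without the library; it is not the DecisionEngine implementation, and the decide function and Decision type are hypothetical.

```typescript
interface Prediction {
  label: string;
  confidence: number;
}

type Decision =
  | { type: "ROUTE"; label: string }
  | { type: "CLARIFY" }
  | { type: "FALLBACK" };

// Assumed semantics: route at or above routeThreshold, fall back below
// fallbackThreshold, and ask a clarifying question in between.
function decide(
  predictions: Prediction[],
  routeThreshold = 0.8,
  fallbackThreshold = 0.3,
): Decision {
  const top = [...predictions].sort((a, b) => b.confidence - a.confidence)[0];
  if (!top || top.confidence < fallbackThreshold) return { type: "FALLBACK" };
  if (top.confidence >= routeThreshold) return { type: "ROUTE", label: top.label };
  return { type: "CLARIFY" };
}
```

Under these assumptions, a 0.92 plumbing prediction routes straight to a work order, an even split between two categories triggers a clarifying question, and a transcript with no usable signal falls back.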
Step 8: Dispatch work orders to AppFolio
The dispatcher uses @reaatech/agent-handoff’s TypedEventEmitter for lifecycle events and withRetry for resilient HTTP calls. Each MaintenanceCategory gets its own handler function that builds a WorkOrderPayload and POSTs it to AppFolio’s REST API.
Expected output: dispatchWorkOrder({ category: "plumbing", ... }) returns { success: true, workOrderId: "wo_xxx" } on success, or { success: false, error: "..." } on failure. Failed requests retry up to 3 times with exponential backoff.
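The retry behaviour can be sketched without the library. The helper below is a hypothetical, dependency-free illustration of exponential backoff, not the withRetry function from @reaatech/agent-handoff; it assumes three attempts in total and a delay that doubles after each failure.

```typescript
// Hypothetical helper: retry an async operation with exponential backoff.
// With baseDelayMs = 200 the waits between attempts are 200 ms, then 400 ms.
async function withBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

A dispatcher would wrap only the HTTP call in such a helper, so that a transient network error is retried while a validation error in the payload still fails immediately elsewhere.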
Step 9: Wire up Twilio telephony
The telephony service wraps @reaatech/voice-agent-telephony’s createTwilioHandler. It handles incoming WebSocket connections from Twilio Media Streams, routes audio chunks to the STT service, handles barge-in events (which cancel TTS), and cleans up on call end.
Expected output: createCallHandler() returns a handler object with on, sendAudio, clearAudio, close, and acceptConnection methods.
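Underneath the handler, Twilio Media Streams delivers JSON text frames over the WebSocket, with an event field such as "start", "media", or "stop", and base64-encoded audio in media.payload. The sketch below shows one way to decode those frames by hand. It is illustrative only: parseTwilioMessage and StreamEvent are hypothetical names, and createTwilioHandler presumably performs this work for you.

```typescript
type StreamEvent =
  | { kind: "start"; streamSid: string; callSid: string }
  | { kind: "media"; audio: Buffer }
  | { kind: "stop" }
  | { kind: "ignored" };

// Decode one text frame from a Twilio Media Stream WebSocket.
function parseTwilioMessage(raw: string): StreamEvent {
  const msg = JSON.parse(raw) as {
    event?: string;
    start?: { streamSid: string; callSid: string };
    media?: { payload: string };
  };

  switch (msg.event) {
    case "start":
      return msg.start
        ? { kind: "start", streamSid: msg.start.streamSid, callSid: msg.start.callSid }
        : { kind: "ignored" };
    case "media":
      // The payload is base64-encoded audio (8 kHz mu-law for phone calls).
      return msg.media
        ? { kind: "media", audio: Buffer.from(msg.media.payload, "base64") }
        : { kind: "ignored" };
    case "stop":
      return { kind: "stop" };
    default:
      // "connected", "mark", and other events are not needed for this sketch.
      return { kind: "ignored" };
  }
}
```

Each decoded media buffer is what would be forwarded to the STT service, while a stop event is the cue to close the STT connection and end the session.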
Step 10: Create the pipeline orchestrator
The pipeline is the central coordinator. It uses @reaatech/voice-agent-core’s createPipeline() to wire the STT provider, TTS provider, MCP client (which routes through intent classification and work-order dispatch), and a latency budget enforcer. Pipeline events log transcripts, MCP responses, and turn completions.
ts
import {
  createPipeline,
  initializeSessionManager,
  createLatencyBudget,
  LatencyBudgetEnforcer,
  type Pipeline,
  type VoiceAgentKitConfig,
  type PipelineEvent,
} from "@reaatech/voice-agent-core";
import type { TwilioMediaStreamHandler } from "@reaatech/voice-agent-telephony";
import { DEFAULT_LATENCY_BUDGET } from "../config/environment.js";
import { provider as sttProvider, connectSTT } from "./stt.js";
import { provider as ttsProvider } from "./tts.js";
import { createCallHandler, closeCallHandler } from "./telephony.js";
import { createCallSession, endCallSession } from "./conversation-store.js"
Expected output: startPipeline("CS123") starts a pipeline session and returns a Pipeline object. endPipeline("CS123") ends the session and cleans up.
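To show what enforcing a latency budget might involve, the sketch below checks one turn's stage timings against the values from DEFAULT_LATENCY_BUDGET. It is a guess at the general idea rather than the LatencyBudgetEnforcer implementation: checkTurn is a hypothetical function, and treating the numbers as milliseconds is an assumption.

```typescript
// Values copied from DEFAULT_LATENCY_BUDGET; assumed to be milliseconds.
const BUDGET = { target: 800, hardCap: 1200, stt: 200, mcp: 400, tts: 200 } as const;

type Stage = "stt" | "mcp" | "tts";
type TurnStatus = "ok" | "degraded" | "exceeded";

// Compare one turn's per-stage timings with the budget.
function checkTurn(timings: Record<Stage, number>): {
  total: number;
  status: TurnStatus;
  overBudgetStages: Stage[];
} {
  const total = timings.stt + timings.mcp + timings.tts;
  const stages: Stage[] = ["stt", "mcp", "tts"];
  const overBudgetStages = stages.filter((stage) => timings[stage] > BUDGET[stage]);
  const status: TurnStatus =
    total > BUDGET.hardCap ? "exceeded" : total > BUDGET.target ? "degraded" : "ok";
  return { total, status, overBudgetStages };
}
```

Note that the three stage budgets add up to exactly the 800 target, so a turn can only stay on target if no stage overruns by more than the others save.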
Step 11: Create webhook routes and the WebSocket server
The incoming call webhook receives Twilio’s voice callback, creates a call session, and returns TwiML that instructs Twilio to connect a Media Stream WebSocket. The status callback handles call-completed events, tracking the outcome.
Create app/api/calls/incoming/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { handleIncomingCall } from "../../../../src/api/calls/webhook.js";

export async function POST(req: NextRequest): Promise<NextResponse> {
  try {
    // Twilio posts form-encoded data; copy it into URLSearchParams for the handler.
    const formData = await req.formData();
    const params = new URLSearchParams();
    const entries = Array.from(formData.entries());
    for (const [key, value] of entries) {
      params.set(key, typeof value === "string" ? value : "");
    }
    const result = await handleIncomingCall(params);
    return new NextResponse(result.twiml, {
      headers: { "Content-Type": "text/xml" },
    });
  } catch (error) {
    console.error("Incoming call webhook failed:", error);
    return NextResponse.json(
      { error: "webhook processing failed" },
      { status: 500 },
    );
  }
}
The webhook handler in src/api/calls/webhook.ts ties it all together — it creates the call session, starts a Langfuse trace, and returns TwiML that connects a <Stream> to the WebSocket URL:
ts
import twilio from "twilio";
import { createCallSession, endCallSession } from "../../services/conversation-store.js";
import { trackCallTrace, trackCallScore } from "../../services/observability.js";

const WS_URL = process.env.WS_URL ?? "wss://localhost:8080/media";

// Active Langfuse traces, keyed by call SID.
const callTraces = new Map<string, ReturnType<typeof trackCallTrace>>();

export async function handleIncomingCall(reqBody: URLSearchParams): Promise<{ twiml: string }> {
  const callSid = reqBody.get("CallSid");
  const from = reqBody.get("From") ?? "";
  const to = reqBody.get("To") ?? "";
  if (!callSid) {
    throw new Error("Missing CallSid");
  }

  await createCallSession(callSid, { caller: from, dialed: to });
  const trace = trackCallTrace(callSid);
  callTraces.set(callSid, trace);

  // Tell Twilio to open a Media Stream to the WebSocket server.
  const response = new twilio.twiml.VoiceResponse();
  const connect = response.connect();
  connect.stream({ url: WS_URL });
  return { twiml: response.toString() };
}

export async function handleStatusCallback(reqBody: URLSearchParams): Promise<void> {
  const callStatus = reqBody.get("CallStatus");
  const callSid = reqBody.get("CallSid");
  if (callSid && callStatus === "completed") {
    await endCallSession(callSid);
    const trace = callTraces.get(callSid);
    if (trace) {
      trackCallScore(trace, "work_order_created");
      callTraces.delete(callSid);
    }
  }
}
Now create the WebSocket server at src/services/websocket-server.ts. This starts alongside your Next.js app and handles the Twilio Media Stream connections:
ts
import { WebSocketServer } from "ws";
import { createCallHandler, handleIncomingMediaStream } from "./telephony.js";
import { env } from "../config/environment.js";

export let wss: WebSocketServer | null = null;

export function startWebSocketServer(): void {
  try {
    wss = new WebSocketServer({ port: env.wsPort });

    // Each Twilio Media Stream connection gets its own call handler.
    wss.on("connection", (ws) => {
      try {
        const handler = createCallHandler();
        handleIncomingMediaStream(handler, ws);
      } catch (error) {
        console.error("Failed to handle WebSocket connection:", error);
      }
    });

    wss.on("error", (error) => {
      console.error("WebSocket server error:", error);
    });

    console.log("WebSocket server started on port " + String(env.wsPort));
  } catch (error) {
    console.error("Failed to start WebSocket server:", error);
    throw error;
  }
}

export async function stopWebSocketServer(): Promise<void> {
  return new Promise((resolve) => {
    if (wss) {
      // Close open client connections before shutting the server down.
      for (const client of wss.clients) {
        client.close();
      }
      wss.close(() => {
        wss = null;
        resolve();
      });
    } else {
      resolve();
    }
  });
}
Start the WebSocket server at boot time via Next.js instrumentation at src/instrumentation.ts:
ts
export async function register(): Promise<void> {
  // Only start the WebSocket server in the Node.js runtime, not on the edge.
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { startWebSocketServer } = await import("./services/websocket-server.js");
    startWebSocketServer();
  }
}
Expected output: Sending a POST to /api/calls/incoming with CallSid and From returns TwiML containing <Connect><Stream url="wss://..."/></Connect>. The WebSocket server starts on the configured WS_PORT.
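For reference when checking the webhook response by hand, the sketch below builds a TwiML document of the same shape as a plain string. The project generates its TwiML with twilio.twiml.VoiceResponse; buildStreamTwiml and escapeXml are hypothetical helpers that only illustrate what the response body looks like.

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a TwiML response that connects the call to a Media Stream.
function buildStreamTwiml(wsUrl: string): string {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Connect><Stream url="${escapeXml(wsUrl)}"/></Connect></Response>`
  );
}
```

The Stream URL needs to use the wss scheme and be reachable from Twilio, which is why the WS_URL value matters as much as the port the server listens on.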
Step 12: Add observability with Langfuse
Create src/services/observability.ts to trace every call, classification, and dispatch event through Langfuse. This gives you a dashboard to review transcripts, intent decisions, and work-order outcomes.
Expected output: Every call creates a trace named after the callSid in Langfuse. Completed calls have a score of 1 when a work order was created, 0 otherwise.
Step 13: Run the tests
The project includes a comprehensive test suite covering every service, route handler, and lib module. Tests mock external services (Deepgram, Twilio, Databricks LLM, AppFolio) using vi.mock so they run without real credentials.
Run the full test suite with coverage:
terminal
pnpm test
Check type correctness:
terminal
pnpm typecheck
Verify lint:
terminal
pnpm lint
Expected output: All tests pass with 0 failures. TypeScript type-checking exits with no errors. ESLint reports no issues.
Next steps
Deploy behind a reverse proxy: Use ngrok (for local testing) or a cloud load balancer to give your Next.js app and WebSocket server public HTTPS and WSS URLs that Twilio can reach.
Add a user verification step: Before creating a work order, ask the caller for their tenant ID or property address and validate it against your property management database.
Extend the maintenance categories: Add new handlers for pest control, lockouts, or appliance repair by adding entries to the MaintenanceCategory union and the dispatch switch statement.
Swap the conversation store backend: Replace InMemoryStorageAdapter with Firestore, DynamoDB, or Redis by implementing the IStorageAdapter interface from @reaatech/session-continuity.