Vertex AI Voice Agent for Buildium Maintenance Requests
A 24/7 voice receptionist that authenticates tenants, creates work orders in Buildium, and provides real‑time status updates — so property managers never miss a maintenance call.
Property management companies using Buildium lose after‑hours calls from tenants reporting leaks or HVAC failures. Manual entry into Buildium is slow, and missed requests lead to unhappy tenants and damage escalation.
In this tutorial you’ll build a 24/7 voice receptionist that answers Twilio calls, authenticates tenants against Buildium’s REST API, and uses Vertex AI Gemini to create or update maintenance work orders. When a tenant calls after hours to report a leak, the voice agent handles the entire interaction — transcribing speech with Deepgram Nova-2, reasoning with Gemini 2.5 Flash, and responding with Cartesia Sonic TTS.
You’ll use Next.js 16 (App Router), TypeScript, and the @reaatech/voice-agent-core pipeline to wire up eight cloud services.
Prerequisites
Node.js 22+ and pnpm 10
A Twilio account with a phone number that has voice capabilities
A Google Cloud project with the Vertex AI API enabled and a service account key
Deepgram API key (Nova-2 model)
Cartesia API key (Sonic TTS model)
Buildium account with API credentials (client ID and secret)
Upstash Redis database (URL and token)
Langfuse account (optional, for observability)
Basic familiarity with TypeScript, Next.js App Router, and WebSocket concepts
Step 1: Scaffold the project and configure environment variables
The scaffold agent has already set up the Next.js project shell for you, so this step only requires installing dependencies with pnpm install. Start by examining the package.json and .env.example to see what’s wired up.
The .env.example file lists every environment variable the application reads:
env
# Env vars used by vertex-ai-voice-agent-for-buildium-maintenance-requests.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
NODE_ENV=development
TWILIO_ACCOUNT_SID=<your-twilio-account-sid>
TWILIO_AUTH_TOKEN=<your-twilio-auth-token>
TWILIO_PHONE_NUMBER=<your-twilio-phone-number>
DEEPGRAM_API_KEY=<your-deepgram-api-key>
CARTESIA_API_KEY=<your-cartesia-api-key>
GOOGLE_CLOUD_PROJECT=<your-gcp-project-id>
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_APPLICATION_CREDENTIALS=<path-to-service-account.json>
UPSTASH_REDIS_URL=<your-upstash-redis-url>
UPSTASH_REDIS_TOKEN=<your-upstash-redis-token>
BUILDIUM_CLIENT_ID=<your-buildium-client-id>
BUILDIUM_CLIENT_SECRET=<your-buildium-client-secret>
BUILDIUM_API_BASE_URL=https://api.buildium.com
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_BASE_URL=https://cloud.langfuse.com
SESSION_TTL_SECONDS=3600
SESSION_MAX_TURNS=20
OTEL_EXPORTER_OTLP_ENDPOINT=<your-otel-endpoint>
Copy .env.example to .env.local and fill in real values for every service you plan to use.
The next.config.ts enables the instrumentation hook, which is required because this recipe uses src/instrumentation.ts to initialize observability at startup:
ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  experimental: {
    instrumentationHook: true,
  },
} as NextConfig;

export default nextConfig;
Expected output: pnpm install completes without errors and pnpm dev starts the Next.js dev server on port 3000.
Step 2: Create the Buildium API client with OAuth2
The Buildium API client handles tenant lookup and work-order CRUD operations. It authenticates via OAuth2 client credentials, caches the access token, and retries failed requests with exponential backoff.
Expected output: The client validates config with Zod, fetches an OAuth2 token on first call, caches it (with a 60-second safety margin), and re-authenticates on 401.
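The token caching described above can be sketched as follows. This is a minimal illustration, not the recipe's actual client: the AccessToken shape, the TokenCache class, the injected fetchToken function, and the backoff base and cap values are all assumptions introduced here. Only the 60-second safety margin and the re-authentication on 401 come from the text.

```typescript
// Hypothetical token shape; the real client's types are not shown in this tutorial.
interface AccessToken {
  value: string;
  expiresAtMs: number; // absolute expiry time in epoch milliseconds
}

const SAFETY_MARGIN_MS = 60_000; // refresh 60 seconds before the real expiry

// A token is usable only while it still outlives the safety margin.
function isTokenFresh(token: AccessToken, nowMs: number): boolean {
  return nowMs < token.expiresAtMs - SAFETY_MARGIN_MS;
}

// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, up to a cap.
// The base and cap defaults are illustrative values.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

class TokenCache {
  private token: AccessToken | null = null;
  private readonly fetchToken: () => Promise<AccessToken>;
  private readonly now: () => number;

  constructor(fetchToken: () => Promise<AccessToken>, now: () => number = Date.now) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  // Returns the cached token, fetching a new one only when it is missing or near expiry.
  async get(): Promise<string> {
    const cached = this.token;
    if (cached !== null && isTokenFresh(cached, this.now())) {
      return cached.value;
    }
    const fresh = await this.fetchToken();
    this.token = fresh;
    return fresh.value;
  }

  // Call this after a 401 response so the next get() re-authenticates.
  invalidate(): void {
    this.token = null;
  }
}
```

Injecting the clock and the fetch function keeps the cache testable offline, which fits the mocked test suite described in Step 13.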
Step 3: Build the Twilio client
The Twilio client wraps the Twilio SDK for phone number lookup and SMS. It also sets up the @reaatech/voice-agent-telephony handler that manages Twilio Media Streams.
Expected output: The Twilio handler is configured with barge-in support (the caller can interrupt the agent), a 300ms minimum speech duration, and a 0.7 confidence threshold for speech detection.
Step 4: Implement the Vertex AI LLM service with Gemini function calling
The LLM service initializes Vertex AI with Gemini 2.5 Flash, maps conversation messages to Vertex’s content format, and declares BUILDUM_TOOLS — function declarations that Gemini can invoke to look up tenants and manage work orders.
The implementation creates a Vertex AI GenerativeModel with a low temperature (0.2) for consistent behavior, and maps your LlmMessage objects to Vertex’s content format.
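As a rough sketch of that mapping: Vertex AI expects each turn as a role ("user" or "model") plus a list of parts, and takes system text separately as a system instruction. The LlmMessage shape below is an assumption, since the tutorial does not show the real type, and toVertexRequest is a hypothetical helper name.

```typescript
// Assumed shape of the tutorial's LlmMessage; the real type lives in llm-service.ts.
interface LlmMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Vertex AI represents each conversation turn as a role plus a list of parts.
interface VertexContent {
  role: "user" | "model";
  parts: { text: string }[];
}

function toVertexRequest(messages: LlmMessage[]): {
  systemInstruction: string | undefined;
  contents: VertexContent[];
} {
  // System messages are pulled out and joined into a single system instruction.
  const systemText = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");

  // Remaining turns keep their order; "assistant" becomes Vertex's "model" role.
  const contents = messages
    .filter((m) => m.role !== "system")
    .map((m): VertexContent => ({
      role: m.role === "assistant" ? "model" : "user",
      parts: [{ text: m.content }],
    }));

  return { systemInstruction: systemText || undefined, contents };
}
```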
The function declarations tell Gemini what tools it can call:
ts
// Buildium tool declarations for Gemini function calling
export const BUILDUM_TOOLS: ToolDeclaration[] = [
  {
    name: "lookupTenant",
    description: "Look up a tenant by phone number in Buildium",
    parameters: {
      type: SchemaType.OBJECT,
      properties: {
        phone: { type: SchemaType.STRING, description: "Phone number to look up" },
      },
      required: ["phone"],
    },
  },
  {
    name: "createWorkOrder",
    description: "Create a new maintenance work order in Buildium",
    parameters: {
      type: SchemaType.OBJECT,
      properties: {
        tenantId: { type: SchemaType.STRING, description: "Tenant ID" },
        subject: { type: SchemaType.STRING, description: "Short subject line" },
        description: { type: SchemaType.STRING, description: "Detailed description of the issue" },
        priority: { type: SchemaType.STRING, enum: ["low", "medium", "high", "emergency"], description: "Priority level" },
      },
      required: ["tenantId", "subject", "description"],
    },
  },
  {
    name: "getWorkOrderStatus",
    description: "Get the current status of a work order",
    parameters: {
      type: SchemaType.OBJECT,
      properties: {
        workOrderId: { type: SchemaType.STRING, description: "Work order ID" },
      },
      required: ["workOrderId"],
    },
  },
  {
    name: "updateWorkOrder",
    description: "Update the status or details of an existing work order",
    parameters: {
      type: SchemaType.OBJECT,
      properties: {
        workOrderId: { type: SchemaType.STRING, description: "Work order ID" },
        status: { type: SchemaType.STRING, enum: ["open", "in_progress", "completed", "cancelled"] },
        description: { type: SchemaType.STRING, description: "Updated description or notes" },
      },
      required: ["workOrderId"],
    },
  },
];
Expected output: getLlmService() returns a singleton that sends messages to Gemini 2.5 Flash. The model can call lookupTenant, createWorkOrder, getWorkOrderStatus, and updateWorkOrder as function tools.
Step 5: Create the session service with Upstash Redis
The session service persists conversation state across call legs using Upstash Redis and the @reaatech/session-continuity package. It implements the IStorageAdapter interface and adds a token counter for budget management.
Create src/services/session-service.ts:
ts
import { SessionManager, type IStorageAdapter, type Session } from "@reaatech/session-continuity";
import { Redis } from "@upstash/redis";
import type { Message, SessionId, MessageId, HealthStatus, SessionFilters, UpdateSessionOptions } from "@reaatech/session-continuity";

// Session lifetime in seconds, read from the environment (default: 1 hour)
const TTL = parseInt(process.env.SESSION_TTL_SECONDS ?? "3600", 10);

export class RedisStorageAdapter implements IStorageAdapter {
  private redis: Redis;

  constructor(redisUrl: string, redisToken: string) {
    this.redis = new Redis({ url: redisUrl, token: redisToken });
  }

  // The remaining IStorageAdapter methods are not shown in this excerpt.
}
The SimpleTokenCounter provides a rough token estimate for budget enforcement.
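A counter of this kind might look like the sketch below. The four-characters-per-token ratio is a common rule of thumb for English text and is an assumption here, as are the method names; the recipe's actual SimpleTokenCounter may use a different heuristic.

```typescript
// Rough token estimator: assumes about 4 characters per token (a rule of thumb,
// not the tokenizer Gemini actually uses).
class SimpleTokenCounter {
  private readonly charsPerToken: number;

  constructor(charsPerToken = 4) {
    this.charsPerToken = charsPerToken;
  }

  // Estimate tokens for a single string, rounding up so short texts still count.
  count(text: string): number {
    return Math.ceil(text.length / this.charsPerToken);
  }

  // Sum the estimates over a list of messages.
  countMessages(messages: { content: string }[]): number {
    return messages.reduce((total, m) => total + this.count(m.content), 0);
  }
}
```

An estimate like this trades accuracy for speed: it avoids a tokenizer round trip on every turn, which matters when the pipeline is working against a latency budget.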
Expected output: Sessions are stored in Upstash Redis with a configurable TTL (default 1 hour). The SessionManager enforces a 4000-token budget with sliding_window compression.
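To illustrate what a sliding-window budget means in practice, the sketch below keeps only the most recent turns whose estimated tokens fit within the budget. It is not the @reaatech/session-continuity implementation, whose internals are not shown in this tutorial; the Turn type, estimateTokens, and slidingWindow are hypothetical names, and the estimate assumes roughly 4 characters per token.

```typescript
interface Turn {
  role: string;
  content: string;
}

// Rough estimate: about 4 characters per token (an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk backwards from the newest turn and keep turns until the budget is exhausted.
function slidingWindow(turns: Turn[], budget = 4000): Turn[] {
  const kept: Turn[] = [];
  let used = 0;
  for (const turn of [...turns].reverse()) {
    const cost = estimateTokens(turn.content);
    if (used + cost > budget) break; // older turns beyond this point are dropped
    kept.unshift(turn); // restore chronological order
    used += cost;
  }
  return kept;
}
```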
Step 6: Build the cost telemetry service
The cost telemetry service records LLM, STT, and TTS costs using the @reaatech/llm-cost-telemetry package. Each cost span includes the provider, model, token counts, and computed cost in USD.
Expected output: createCostTelemetry() returns a service that produces CostSpan objects validated by CostSpanSchema. STT costs are $0.0059/minute (Deepgram Nova-2), and TTS costs are $0.000015/character (Cartesia Sonic).
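The arithmetic behind those two rates is simple enough to sketch directly. The function names below are hypothetical and do not come from @reaatech/llm-cost-telemetry; only the per-minute and per-character rates are taken from the text above.

```typescript
const STT_USD_PER_MINUTE = 0.0059; // Deepgram Nova-2 rate quoted above
const TTS_USD_PER_CHARACTER = 0.000015; // Cartesia Sonic rate quoted above

// Speech-to-text is billed by audio duration.
function sttCostUsd(audioSeconds: number): number {
  return (audioSeconds / 60) * STT_USD_PER_MINUTE;
}

// Text-to-speech is billed by the number of characters synthesized.
function ttsCostUsd(text: string): number {
  return text.length * TTS_USD_PER_CHARACTER;
}
```

At these rates, 90 seconds of caller audio costs about $0.00885 to transcribe, and a 200-character spoken reply costs about $0.003 to synthesize.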
Step 7: Wire the observability and instrumentation
The observability module initializes OpenTelemetry tracing and Langfuse at startup. It’s called from src/instrumentation.ts, which Next.js runs during server startup.
Create src/services/observability.ts:
ts
import { initializeObservability } from "@reaatech/voice-agent-core";

let initialized = false;

export async function initObservability(): Promise<void> {
  if (initialized) return;
  initialized = true;

  await initializeObservability({
    serviceName: "buildium-voice-agent",
    serviceVersion: "1.0.0",
    enabled: true,
    otlpEndpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT,
  });

  // langfuse is imported and initialized lazily; the SDK reads env vars automatically
  const { Langfuse } = await import("langfuse");
  const langfuse = new Langfuse({
    publicKey: process.env.LANGFUSE_PUBLIC_KEY ?? "",
    secretKey: process.env.LANGFUSE_SECRET_KEY ?? "",
    baseUrl: process.env.LANGFUSE_BASE_URL ?? "https://cloud.langfuse.com",
  });
  langfuse.trace({ name: "voice-agent-init" });
}
Expected output: The register() function runs only in the Node.js runtime (not Edge), dynamically imports the observability module to avoid Edge-compatibility issues, and initializes OpenTelemetry and Langfuse once.
Step 8: Build the voice agent service
The voice agent service is the core orchestrator. It creates a @reaatech/voice-agent-core pipeline that connects Deepgram STT, Vertex AI LLM, and Cartesia TTS. It manages in-memory session state, dispatches audio processing, and handles tool calls against Buildium.
Create src/services/voice-agent-service.ts:
ts
import { createPipeline, createLatencyBudget, LatencyBudgetEnforcer, defineConfig, createCostTracker, SessionManager, type MCPClient } from "@reaatech/voice-agent-core";
import { DeepgramSTTProvider, STTProviderInterface } from "@reaatech/voice-agent-stt";
import { CartesiaTTSProvider, TTSProviderInterface } from "@reaatech/voice-agent-tts";
import { createCostTelemetry, type CostTelemetryService } from "./cost-telemetry.js";
import { getLlmService, BUILDUM_TOOLS, type LlmMessage } from "./llm-service.js";
import { createBuildiumClient, type BuildiumApiClient } from "../lib/buildium-api.js";

export interface VoiceAgentService {
  startCallSession(callSid: string, streamSid: string, tenantContext: { phone: string }): Promise<{ sessionId: string }>;
  endCallSession(sessionId: string): Promise<void>;
  processAudio(sessionId: string, chunk: Buffer, sampleRate: number): Promise<void>;
  getCostTelemetry(): CostTelemetryService;
  handleBargeIn(sessionId: string): void;
}

interface SessionState {
  callSid: string;
  tenantName?: string;
  propertyName?: string;
  unit?: string;
  messages: LlmMessage[];
}
The LlmMcpAdapter bridges the voice pipeline’s MCP interface to the Vertex AI LLM.
Expected output: The createVoiceAgent() singleton creates a pipeline with 800ms latency target (1200ms hard cap), connects Deepgram Nova-2 for STT and Cartesia Sonic for TTS, and dispatches Buildium tool calls when Gemini requests them.
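Dispatching the tool calls that Gemini requests could be sketched as below. The tool names and argument names come from the function declarations in Step 4; everything else is an assumption made for illustration, including the WorkOrderClient interface and its method names, which are not the real BuildiumApiClient surface.

```typescript
// Hypothetical client surface; the real BuildiumApiClient methods are not shown here.
interface WorkOrderInput {
  tenantId: string;
  subject: string;
  description: string;
  priority?: string | undefined;
}
interface WorkOrderPatch {
  status?: string | undefined;
  description?: string | undefined;
}
interface WorkOrderClient {
  lookupTenantByPhone(phone: string): Promise<unknown>;
  createWorkOrder(input: WorkOrderInput): Promise<unknown>;
  getWorkOrder(workOrderId: string): Promise<unknown>;
  updateWorkOrder(workOrderId: string, patch: WorkOrderPatch): Promise<unknown>;
}

// A function call as returned by the model: a tool name plus untyped arguments.
interface FunctionCall {
  name: string;
  args: Record<string, unknown>;
}

// Model output is untrusted, so validate each argument before using it.
function requireString(args: Record<string, unknown>, key: string): string {
  const value = args[key];
  if (typeof value !== "string" || value === "") {
    throw new Error(`Missing string argument: ${key}`);
  }
  return value;
}

function optionalString(args: Record<string, unknown>, key: string): string | undefined {
  const value = args[key];
  return typeof value === "string" ? value : undefined;
}

// Route a function call to the matching client method.
async function dispatchToolCall(client: WorkOrderClient, call: FunctionCall): Promise<unknown> {
  const { name, args } = call;
  switch (name) {
    case "lookupTenant":
      return client.lookupTenantByPhone(requireString(args, "phone"));
    case "createWorkOrder":
      return client.createWorkOrder({
        tenantId: requireString(args, "tenantId"),
        subject: requireString(args, "subject"),
        description: requireString(args, "description"),
        priority: optionalString(args, "priority"),
      });
    case "getWorkOrderStatus":
      return client.getWorkOrder(requireString(args, "workOrderId"));
    case "updateWorkOrder":
      return client.updateWorkOrder(requireString(args, "workOrderId"), {
        status: optionalString(args, "status"),
        description: optionalString(args, "description"),
      });
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
```

Throwing on unknown tools and missing arguments lets the caller turn the error into a function response, so the model can recover instead of the call failing silently.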
Step 9: Create the Twilio call route handler
The POST /api/calls route receives Twilio’s voice webhook and returns TwiML that directs Twilio to stream audio to the WebSocket endpoint.
Expected output: Twilio sends a POST webhook with CallSid and From form fields. The returned TwiML connects the call to wss://<host>/api/twilio/stream, passing the call SID and caller’s number as parameters. Twilio Media Streams accept only secure WebSocket (wss://) URLs.
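The TwiML for this step can be assembled as a plain string. The sketch below is an illustration rather than the recipe's route handler: buildStreamTwiml and escapeXml are hypothetical helpers, and the callSid parameter name is an assumption. The fromNumber parameter name matches what the stream handler in Step 10 reads from customParameters. The URL uses wss:// because Twilio Media Streams require a secure WebSocket.

```typescript
// Escape values before placing them in XML attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build TwiML that connects the call to a bidirectional media stream and
// forwards the call SID and caller number as custom parameters.
function buildStreamTwiml(host: string, callSid: string, fromNumber: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="wss://${escapeXml(host)}/api/twilio/stream">`,
    `      <Parameter name="callSid" value="${escapeXml(callSid)}" />`,
    `      <Parameter name="fromNumber" value="${escapeXml(fromNumber)}" />`,
    "    </Stream>",
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

A route handler would return this string with a Content-Type of text/xml. Custom parameters are used because Twilio does not pass query-string parameters on the stream URL through to the WebSocket server.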
Step 10: Create the Twilio Media Streams WebSocket handler
The WebSocket handler at /api/twilio/stream manages the real-time audio streaming session. It listens for call start, audio chunks, barge-in events, and call end.
Create app/api/twilio/stream/route.ts:
ts
import { createTwilioHandler } from "@reaatech/voice-agent-telephony";
import { type NextRequest, NextResponse } from "next/server";
import { createVoiceAgent } from "../../../../src/services/voice-agent-service.js";

export const runtime = "nodejs";

const agent = createVoiceAgent();

export function GET(req: NextRequest): NextResponse {
  // Reject plain HTTP requests; this endpoint only serves WebSocket upgrades
  const upgrade = req.headers.get("upgrade")?.toLowerCase();
  if (upgrade !== "websocket") {
    return new NextResponse(null, { status: 426, statusText: "Upgrade Required" });
  }

  const handler = createTwilioHandler({
    bargeInEnabled: true,
    minSpeechDuration: 300,
    confidenceThreshold: 0.7,
    silenceThreshold: 0.3,
  });

  // WebSocket acceptConnection should be called when a socket is obtained
  // from the platform upgrade: await handler.acceptConnection(ws);

  handler.on("call:start", ({ callSid, streamSid, customParameters }: { callSid: string; streamSid: string; customParameters?: Record<string, string> }) => {
    const fromNumber = customParameters?.fromNumber ?? "";
    handler.setTTSPlaying(false);
    void agent.startCallSession(callSid, streamSid, { phone: fromNumber });
  });

  handler.on("audio:received", (chunk: { buffer: Buffer; sampleRate: number }) => {
    void agent.processAudio(handler.getCallSid() ?? "", chunk.buffer, chunk.sampleRate);
  });

  handler.on("barge-in:detected", () => {
    agent.handleBargeIn(handler.getCallSid() ?? "");
    void handler.clearAudio();
  });

  handler.on("call:end", () => {
    const callSid = handler.getCallSid();
    if (callSid) {
      void agent.endCallSession(callSid);
      void handler.close();
    }
  });

  // In a real deployment, the pipeline:tts:chunk event would set TTS state:
  // handler.setTTSPlaying(true);
  // TTS chunk emission: const markName = await handler.sendMark();

  // Caveat: the Fetch Response constructor only accepts statuses 200-599, so
  // constructing this 101 response throws a RangeError as written. The actual
  // upgrade has to be completed by the platform or a custom server.
  return new NextResponse(null, { status: 101, statusText: "Switching Protocols" });
}
Expected output: When Twilio establishes a WebSocket connection, the handler creates a createTwilioHandler instance, attaches event listeners for the call lifecycle, and delegates audio processing, barge-in handling, and session cleanup to the voice agent.
Step 11: Create the health check and work orders API routes
The health check endpoint reports the status of every configured integration.
Expected output: GET /api/health returns { status: "ok", services: { buildium: "configured|missing", ... } }. GET /api/work-orders?tenantId=x&status=open returns work orders from Buildium.
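One way to produce the health payload is to check which environment variables are set for each integration. The sketch below assumes that approach; the recipe's real route may probe the services differently. The variable names come from .env.example in Step 1, while the grouping into service keys and the buildHealthReport name are assumptions.

```typescript
type ServiceStatus = "configured" | "missing";

// Env vars each integration needs, using the names from .env.example.
// The grouping into service keys is illustrative.
const REQUIRED_ENV: Record<string, string[]> = {
  buildium: ["BUILDIUM_CLIENT_ID", "BUILDIUM_CLIENT_SECRET"],
  twilio: ["TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN"],
  deepgram: ["DEEPGRAM_API_KEY"],
  cartesia: ["CARTESIA_API_KEY"],
  vertexAi: ["GOOGLE_CLOUD_PROJECT"],
  redis: ["UPSTASH_REDIS_URL", "UPSTASH_REDIS_TOKEN"],
};

// A service counts as configured only when every one of its variables is non-empty.
function buildHealthReport(env: Record<string, string | undefined>): {
  status: "ok";
  services: Record<string, ServiceStatus>;
} {
  const services: Record<string, ServiceStatus> = {};
  for (const [name, keys] of Object.entries(REQUIRED_ENV)) {
    const allSet = keys.every((key) => (env[key] ?? "").trim() !== "");
    services[name] = allSet ? "configured" : "missing";
  }
  return { status: "ok", services };
}
```

In a route handler this would be called as buildHealthReport(process.env) and serialized as JSON. Taking the environment as a parameter keeps the function pure and easy to test with mocked values.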
Step 12: Create the public entry point
src/index.ts re-exports the key modules and types for use by other packages or testing:
ts
export { createVoiceAgent } from "./services/voice-agent-service.js";
export { createBuildiumClient } from "./lib/buildium-api.js";
export { createSessionService } from "./services/session-service.js";
export { createCostTelemetry } from "./services/cost-telemetry.js";
export type { BuildiumApiClient, Tenant, WorkOrder, WorkOrderCreateParams, WorkOrderUpdateParams } from "./lib/buildium-api.js";
export type { VoiceAgentService } from "./services/voice-agent-service.js";
export { createTwilioHandler } from "@reaatech/voice-agent-telephony";
Expected output: Other modules can import from src/index.ts with named exports for all major services and types.
Step 13: Run the tests
The test suite covers every module with mocked external dependencies. Run it with:
terminal
pnpm test
The Buildium API test, for example, verifies the OAuth2 flow, tenant lookup, work-order creation, and error handling.
All tests mock external services so they run offline.
Next steps
Multi-property support — Extend the voice agent to recognize which property a caller is associated with by checking the caller’s phone number against multiple Buildium properties. You could use a property lookup table in Redis to cache the mapping.
SMS follow-up — After a call ends, send a text message to the tenant with their work order number and a link to track its status. The TwilioAppClient.sendSms() method is already wired up for this.
Voice agent dashboard — Add a real-time dashboard showing active calls, cost-per-session breakdowns, and work order creation rates using the OpenTelemetry traces and Langfuse integration.
return `You are a 24/7 voice receptionist for ${propertyName}. The caller is ${tenantName}, Unit ${unit}. Your job: take maintenance requests seriously, ask clarifying questions about the issue, create or update work orders in Buildium, and provide status updates. Be concise — this is a voice conversation.`;