Azure AI Phone Agent for Cal.com Appointment Scheduling
An AI voice agent that answers calls, converses with customers, and books appointments directly into Cal.com, using Azure OpenAI and REAA's orchestration toolkit.
A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.
This recipe builds an AI phone agent that answers calls, understands what the caller needs, and books appointments directly into Cal.com. The pipeline connects Twilio inbound calls to LiveKit audio rooms, transcribes speech with Deepgram, runs intent classification and booking logic through Azure OpenAI, creates real Cal.com v2 bookings, and reads confirmation back through Cartesia TTS. Cost tracking runs per-call through a budget engine, and a circuit breaker isolates Cal.com API failures so one bad slot lookup cannot take down the whole system.
Prerequisites
Node.js 22+
pnpm 10+
Azure OpenAI resource (endpoint, deployment name, API key, API version)
Cal.com account with an API key and at least one event type
LiveKit Cloud account (host URL, API key, API secret)
IntentResult maps directly to the RoutingDecision shape produced by @reaatech/confidence-router. The three type values — ROUTE, CLARIFY, FALLBACK — drive the voice agent’s branching logic.
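As a rough sketch of that shape (the three `type` values come from the text above; the exact field names beyond `type`, `label`, and `confidence` are assumptions, not the package's confirmed API):

```ts
// Sketch of the IntentResult shape described above. Field names are
// assumptions drawn from the surrounding text, not confirmed API.
export type IntentDecision = "ROUTE" | "CLARIFY" | "FALLBACK";

export interface IntentResult {
  type: IntentDecision; // drives the voice agent's branching
  label: string;        // e.g. "booking", "cancellation"
  confidence: number;   // 0..1, compared against router thresholds
}

export function branchFor(result: IntentResult): string {
  // Mirrors the three-way branch the voice agent performs.
  switch (result.type) {
    case "ROUTE":
      return `handle:${result.label}`;
    case "CLARIFY":
      return "ask-clarifying-question";
    case "FALLBACK":
      return "human-handoff";
  }
}
```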
Step 4: Build the Cal.com REST client
src/lib/calcom.ts wraps the Cal.com v2 API with typed methods. All requests go through a private request() helper that attaches Bearer auth, Content-Type, and the required cal-api-version header.
Expected output: The four methods (getAvailability, createBooking, cancelBooking, getBooking) throw CalcomError on any non-2xx response, with status and body attached.
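A minimal sketch of the header construction and error shape the `request()` helper relies on — the header names come from the description above, but the `cal-api-version` value shown is a placeholder assumption (check Cal.com's v2 docs for the current one):

```ts
// Sketch of the request() helper's headers and error type. The
// "2024-08-13" version string is a placeholder assumption.
export class CalcomError extends Error {
  constructor(
    message: string,
    public status?: number,
    public body?: unknown,
  ) {
    super(message);
    this.name = "CalcomError";
  }
}

export function buildCalcomHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    "cal-api-version": "2024-08-13", // placeholder — verify against Cal.com docs
  };
}
```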
Step 5: Build the Azure OpenAI provider
src/lib/azure-llm.ts wraps the openai base package (which the @azure/openai compatibility layer uses). It exposes a chat() method for general completions and a classify() method that forces json_object output for structured intent classification.
```ts
import { AzureOpenAI } from "openai";
import type {
  ChatCompletionCreateParamsNonStreaming,
  ChatCompletionMessageParam,
} from "openai/resources/chat/completions";

export class ConfigurationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ConfigurationError";
  }
}

export class AzureLLMError extends Error {
  status?: number;
  body?: unknown;

  constructor(message: string, status?: number, body?: unknown) {
    super(message);
    this.name = "AzureLLMError";
    this.status = status;
    this.body = body;
  }
}
```
Expected output: classify() always returns { label, confidence } parsed from a JSON response. estimateCost() uses the pricing map to return a dollar figure based on both input and output token counts.
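A sketch of how `classify()` might validate the model's `json_object` output before trusting it — the `{ label, confidence }` shape comes from the expected output above, but the validation rules here (clamping, defaulting a missing confidence to 0) are assumptions:

```ts
// Sketch of validating the model's JSON response. Clamping and the
// zero default are assumptions, not the recipe's confirmed behavior.
export interface Classification {
  label: string;
  confidence: number;
}

export function parseClassification(raw: string): Classification {
  const parsed = JSON.parse(raw) as Partial<Classification>;
  if (typeof parsed.label !== "string") {
    throw new Error("classification missing label");
  }
  // Clamp confidence into [0, 1]; default to 0 if the model omitted it.
  const confidence =
    typeof parsed.confidence === "number"
      ? Math.min(1, Math.max(0, parsed.confidence))
      : 0;
  return { label: parsed.label, confidence };
}
```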
Step 6: Build the pricing provider and spend store
The budget engine from @reaatech/agent-budget-engine requires two injected dependencies: a PricingProvider and a SpendStore. src/lib/pricing-provider.ts implements the PricingProvider interface, and src/lib/spend-store.ts implements the SpendStore interface.
src/lib/pricing-provider.ts:
```ts
export class PricingError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "PricingError";
  }
}

function loadRates(): Map<string, { input: number; output: number }> {
  const rates = new Map<string, { input: number; output: number }>();
  const defaults: Record<string, { input: number; output: number }> = {
    "gpt-4o-mini": { input: 0.15, output: 0.60 },
    "gpt-4o": { input: 2.50, output: 10.00 },
    "gpt-4": { input: 30.00, output: 60.00 },
  };
  for (const [model, rate] of Object.entries(defaults)) {
    rates.set(model, rate);
  }
  const envPricing = process.env.AZURE_LLM_PRICING_JSON;
  if (envPricing) {
    try {
      const parsed = JSON.parse(envPricing) as Record<
        string,
        { input: number; output: number }
      >;
      for (const [model, rate] of Object.entries(parsed)) {
        rates.set(model, rate);
      }
    } catch {
      // env var contains invalid JSON, skip
    }
  }
  return rates;
}

export class AzurePricingProvider {
  private rates: Map<string, { input: number; output: number }>;

  constructor() {
    this.rates = loadRates();
  }

  estimateCost(modelId: string, estimatedInputTokens: number): number {
    const rates = this.rates.get(modelId);
    if (!rates) {
      throw new PricingError(`Unknown model: ${modelId}`);
    }
    return (estimatedInputTokens / 1_000_000) * rates.input;
  }
}
```
Expected output: AzurePricingProvider.estimateCost() accepts modelId and estimatedInputTokens, then multiplies by the input token rate from the pricing map. InMemorySpendStore accumulates spend keyed by scopeType:scopeKey.
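A minimal sketch of a spend store keyed by `scopeType:scopeKey`, as described above — the method names here are assumptions based on that description, not the `@reaatech/agent-budget-engine` package's confirmed interface:

```ts
// Sketch of an in-memory spend store keyed by `scopeType:scopeKey`.
// Method names are assumptions, not the package's confirmed API.
export class InMemorySpendStoreSketch {
  private spend = new Map<string, number>();

  private key(scopeType: string, scopeKey: string): string {
    return `${scopeType}:${scopeKey}`;
  }

  addSpend(scopeType: string, scopeKey: string, amount: number): void {
    const k = this.key(scopeType, scopeKey);
    this.spend.set(k, (this.spend.get(k) ?? 0) + amount);
  }

  getSpend(scopeType: string, scopeKey: string): number {
    return this.spend.get(this.key(scopeType, scopeKey)) ?? 0;
  }

  resetSpend(scopeType: string, scopeKey: string): void {
    this.spend.delete(this.key(scopeType, scopeKey));
  }
}
```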
Step 7: Build the intent classifier
src/services/intent-classifier.ts combines a keyword classifier (fast, no LLM call) with an Azure OpenAI LLM fallback for ambiguous input. It uses @reaatech/confidence-router with routeThreshold: 0.8 and fallbackThreshold: 0.3.
Expected output: A transcript like “I want to book an appointment” hits the keyword classifier and returns ROUTE with label: "booking" in one pass. A vague transcript like “so I was thinking maybe” falls through to LLM classification and then to router.decide().
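The two-stage flow can be sketched as a keyword pass followed by threshold-based routing. Only the 0.8 and 0.3 thresholds come from the recipe; the keyword list and the 0.9 confidence value on a keyword hit are illustrative assumptions:

```ts
// Sketch of the keyword-first classification stage. Keywords and the
// 0.9 confidence on a hit are illustrative assumptions.
const KEYWORDS: Record<string, string[]> = {
  booking: ["book", "appointment", "schedule"],
  cancellation: ["cancel", "reschedule"],
};

export function keywordClassify(
  transcript: string,
): { label: string; confidence: number } | null {
  const lower = transcript.toLowerCase();
  for (const [label, words] of Object.entries(KEYWORDS)) {
    if (words.some((w) => lower.includes(w))) {
      return { label, confidence: 0.9 }; // high confidence on a keyword hit
    }
  }
  return null; // no keyword hit — caller falls through to the LLM
}

export function decide(confidence: number): "ROUTE" | "CLARIFY" | "FALLBACK" {
  if (confidence >= 0.8) return "ROUTE"; // routeThreshold: 0.8
  if (confidence >= 0.3) return "CLARIFY";
  return "FALLBACK"; // below fallbackThreshold: 0.3
}
```

A transcript with a clear keyword routes in one pass; a vague one returns `null` here and goes to the LLM before `decide()` runs on the LLM's confidence.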
Step 8: Build the booking agent and budget service
src/services/booking-agent.ts handles the three booking operations. It uses HybridCompressor and SimpleTokenCounter from @reaatech/agent-handoff-compression to compress conversation history before passing it downstream, and emits typed events via TypedEventEmitter.
src/services/budget-service.ts wraps BudgetController from @reaatech/agent-budget-engine and BudgetAwareStrategy from @reaatech/agent-budget-llm-router-plugin. Each call gets its own budget scope keyed by callSid. When spend hits the hard cap, the hard-stop event fires.
Key methods in BudgetService:
```ts
// Define a per-call budget (soft cap at 80%, hard cap at 100% of the limit)
defineCallBudget(callSid: string, limit?: number): Promise<void>

// Check whether the next LLM call is allowed under the remaining budget
checkBudget(callSid: string, estimatedCost: number, modelId: string): Promise<{ allowed: boolean; suggestedModel?: string }>

// Record actual spend after an LLM call (input + output tokens)
recordSpend(callSid: string, requestId: string, cost: number, inputTokens: number, outputTokens: number, modelId: string): Promise<void>

// Snapshot current spend state
getBudgetState(callSid: string): Promise<{ spent: number; remaining: number; state: BudgetStateEnum }>
```
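The cap logic behind those methods can be sketched as pure functions — the 80%/100% split comes from the comments above, while the state names and the pre-flight check are assumptions about how the budget engine behaves:

```ts
// Sketch of the soft/hard cap logic. State names are assumptions;
// the 80% soft cap and 100% hard cap come from the recipe.
export type BudgetState = "ok" | "soft-capped" | "hard-stopped";

export function budgetState(limit: number, spent: number): BudgetState {
  if (spent >= limit) return "hard-stopped"; // hard cap: 100% of limit
  if (spent >= limit * 0.8) return "soft-capped"; // soft cap: 80% of limit
  return "ok";
}

export function allowCall(
  limit: number,
  spent: number,
  estimatedCost: number,
): boolean {
  // Allow only if the call would not push spend past the hard cap.
  return spent + estimatedCost <= limit;
}
```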
Step 9: Wire up the API routes and middleware
The project uses three API routes under app/api/:
app/api/chat/route.ts — the main conversational endpoint. It parses the transcript with a Zod schema, builds all dependencies from env vars, calls voiceAgent.onTranscript(), and maps errors to the correct HTTP status codes.
```ts
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { VoiceAgent } from "@/src/voice/agent";
import { IntentClassifier } from "@/src/services/intent-classifier";
import { BookingAgent } from "@/src/services/booking-agent";
import { BudgetService } from "@/src/services/budget-service";
import { AzureLLMProvider } from "@/src/lib/azure-llm";
import { CalcomClient } from "@/src/lib/calcom";
import { AzurePricingProvider } from "@/src/lib/pricing-provider";
import { InMemorySpendStore } from "@/src/lib/spend-store";
import { calcomCircuitBreaker } from "@/src/services/circuit-breaker-setup";
import { CircuitOpenError } from "@reaatech/circuit-breaker-agents";

export class BudgetExceededError extends Error {
  constructor(msg?: string) {
    super(msg ?? "Budget exceeded");
    this.name = "BudgetExceededError";
  }
}

const chatSchema = z.object({
  transcript: z.string().min(1).max(5000),
  callSid: z.string().min(1),
});

function buildVoiceAgent(): VoiceAgent {
  const llm = new AzureLLMProvider({
    endpoint: process.env.AZURE_OPENAI_ENDPOINT,
    apiKey: process.env.AZURE_OPENAI_API_KEY,
    deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
    apiVersion: process.env.AZURE_OPENAI_API_VERSION,
  });
  const calcomClient = new CalcomClient({
    apiKey: process.env.CALCOM_API_KEY,
    baseUrl: process.env.CALCOM_BASE_URL,
  });
  const pricingProvider = new AzurePricingProvider();
  const spendStore = new InMemorySpendStore();
  const budgetService = new BudgetService({ pricingProvider, spendStore });
  const intentClassifier = new IntentClassifier({ azureLLM: llm });
  const bookingAgent = new BookingAgent({ calcomClient, azureLLM: llm });
  return new VoiceAgent({
    intentClassifier,
    bookingAgent,
    budgetService,
    circuitBreaker: calcomCircuitBreaker,
  });
}

export async function POST(req: NextRequest): Promise<NextResponse> {
  try {
    const body: unknown = await req.json();
    const parsed = chatSchema.safeParse(body);
    if (!parsed.success) {
      return NextResponse.json(
        { error: "Invalid request body" },
        { status: 400 }
      );
    }
    const { transcript, callSid } = parsed.data;
    const voiceAgent = buildVoiceAgent();
    const response = await voiceAgent.onTranscript(transcript, callSid);
    return NextResponse.json({ response });
  } catch (err) {
    if (err instanceof BudgetExceededError) {
      return NextResponse.json(
        { error: "Budget exceeded for this session" },
        { status: 402 }
      );
    }
    if (err instanceof CircuitOpenError) {
      return NextResponse.json(
        { error: "Service temporarily unavailable" },
        { status: 503, headers: { "Retry-After": "30" } }
      );
    }
    return NextResponse.json(
      { error: "Internal server error" },
      { status: 500 }
    );
  }
}

export function GET(): NextResponse {
  return NextResponse.json({ status: "ok" });
}
```
Error responses map to specific HTTP codes: BudgetExceededError → 402, CircuitOpenError → 503 with Retry-After: 30, everything else → 500.
app/api/webhook/twilio/route.ts — handles inbound Twilio calls. Parses CallSid, From, and To from form data, creates a LiveKit room for the call, and returns TwiML XML with a <Connect><Stream> element pointing to the LiveKit WebSocket.
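The TwiML response that route returns can be sketched as a small builder — the `<Connect><Stream>` structure comes from the description above, while the exact URL shape passed in is an assumption:

```ts
// Sketch of the TwiML the Twilio webhook returns: a <Connect><Stream>
// pointing at the LiveKit WebSocket. The URL shape is an assumption.
export function buildTwiml(streamUrl: string): string {
  return [
    `<?xml version="1.0" encoding="UTF-8"?>`,
    `<Response>`,
    `  <Connect>`,
    `    <Stream url="${streamUrl}" />`,
    `  </Connect>`,
    `</Response>`,
  ].join("\n");
}
```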
app/api/webhook/livekit/route.ts — receives LiveKit room lifecycle events. Verifies the webhook signature with WebhookReceiver, then handles three event types:
room_started — logs the event
participant_joined — logs and queues a welcome TTS prompt
room_finished — calls spendStore.resetSpend() and voiceRoomManager.deleteRoom() to clean up
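The three-way dispatch above can be sketched as a plain switch that maps each event name to an action tag; the real handler performs the side effects described (logging, queueing TTS, cleanup), and the action names here are assumptions:

```ts
// Sketch of the LiveKit webhook's event dispatch. Action names are
// assumptions; the event names come from the recipe.
export function handleLiveKitEvent(
  event: string,
): "log" | "welcome-tts" | "cleanup" | "ignore" {
  switch (event) {
    case "room_started":
      return "log";
    case "participant_joined":
      return "welcome-tts";
    case "room_finished":
      return "cleanup"; // resetSpend() + deleteRoom()
    default:
      return "ignore"; // unrecognized events are dropped
  }
}
```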
middleware.ts at the project root enforces two rules across all /api/* routes:
/api/chat requires an Authorization header (returns 401 if missing)
/api/webhook/* is exempt from auth — webhooks cannot add headers
All responses get CORS headers based on ALLOWED_ORIGINS (comma-separated list)
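The middleware's two decisions can be sketched as pure functions — the path rules and the comma-separated origin list come from the bullets above, while the function names and the exact matching behavior are assumptions:

```ts
// Sketch of the middleware rules: webhooks skip auth, other /api/* routes
// require it, and CORS origins come from a comma-separated string.
export function requiresAuth(pathname: string): boolean {
  if (pathname.startsWith("/api/webhook/")) return false; // webhooks are exempt
  return pathname.startsWith("/api/");
}

export function corsOrigin(
  requestOrigin: string,
  allowedOrigins: string,
): string | null {
  // Echo the origin back only if it appears in the allow-list.
  const allowed = allowedOrigins.split(",").map((o) => o.trim());
  return allowed.includes(requestOrigin) ? requestOrigin : null;
}
```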
The test suite is fully mocked — no live HTTP calls, no real credentials. MSW intercepts fetch at the network layer, and all external packages are stubbed via vi.mock.
```terminal
pnpm test
```
Expected output: All tests pass with numFailedTests: 0. Coverage thresholds (lines, branches, functions, statements all at 90%+) pass on the runtime code under src/ and app/api/.
To run type checking and linting independently:
```terminal
pnpm typecheck
pnpm lint
```
To invoke the full quality gate:
```terminal
node /home/rick/solutions-worker/bin/preflight.js
```
Expected output: {"ok": true, ...} — zero findings, exit code 0.
Next steps
Add a human handoff fallback — wire FALLBACK intent in the voice agent to a Twilio Conference or webhook to a CRM so callers always reach a live agent when the model cannot decide.
Implement confirmation SMS — after booking-completed, use Twilio to send a text message with the appointment details using the AppointmentResult.startTime and .title from the Cal.com response.
Persist budget state to Redis — swap InMemorySpendStore for a Redis-backed implementation so budgets survive server restarts and can be queried across multiple Next.js instances.
Add OpenTelemetry tracing — instrument AzureLLMProvider.chat() and each Cal.com fetch call with spans so you can trace a caller’s transcript through to the booked slot in production.