OpenAI Voice Agent for Aircall Small Business Support

An AI-powered voice receptionist that answers calls on Aircall numbers, handles common inquiries, and escalates to human agents when needed, reducing hold times for SMBs.

openai voice-agent aircall deepgram nextjs small-business call-handling agent-handoff

The problem

Small businesses miss after-hours calls and struggle to handle peak-time call volumes, leading to lost revenue and customer frustration.

Intro

This recipe builds an AI-powered voice receptionist for Aircall phone numbers using Next.js, Deepgram for speech-to-text and text-to-speech, and OpenAI for natural language understanding. When a customer calls, the agent greets them, answers FAQs about hours, pricing, and services, and escalates to a human agent when it can’t help. The result is shorter hold times and fewer missed after-hours calls for small businesses.
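The greet, answer, and escalate flow described above can be pictured as a small routing step. The sketch below is illustrative only: the four action names mirror what the recipe's prompt asks the model to return, while `ReceptionistDecision`, `NextStep`, and `routeAction` are hypothetical names that do not come from the recipe's code.

```typescript
// Hypothetical types for illustration; the recipe's own types live in its types module.
type ReceptionistAction = "greet" | "answer" | "escalate" | "goodbye";

interface ReceptionistDecision {
  action: ReceptionistAction;
  message: string;
  escalationReason?: string;
}

type NextStep =
  | { kind: "speak"; text: string }
  | { kind: "transfer"; text: string; reason: string }
  | { kind: "hangup"; text: string };

// Map the model's decision onto what the call should do next.
function routeAction(decision: ReceptionistDecision): NextStep {
  switch (decision.action) {
    case "greet":
    case "answer":
      // Keep talking to the caller.
      return { kind: "speak", text: decision.message };
    case "escalate":
      // Say the handoff line, then transfer to a human.
      return {
        kind: "transfer",
        text: decision.message,
        reason: decision.escalationReason ?? "unspecified",
      };
    case "goodbye":
      // Say the closing line, then end the call.
      return { kind: "hangup", text: decision.message };
  }
}
```

Modelling the next step as a discriminated union keeps the transfer and hangup paths explicit, so a new action cannot be added without the compiler flagging the missing case.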

Prerequisites

Node.js >= 22 with pnpm 10.x
An OpenAI API key (for voice NLU via gpt-5.2-mini)
A Deepgram API key (for STT and TTS)
An Aircall account with API credentials (key and secret) and a phone number
(Optional) A Langfuse account for tracing and cost observability
Familiarity with TypeScript and Next.js App Router basics
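Before starting, it can help to confirm that the credentials listed above are actually present. The following is a minimal sketch, not the recipe's configuration loader: the environment variable names and the `findMissingEnv` helper are assumptions chosen for illustration.

```typescript
// Assumed variable names for illustration; use whatever names your own config expects.
const REQUIRED_ENV: readonly string[] = [
  "OPENAI_API_KEY",
  "DEEPGRAM_API_KEY",
  "AIRCALL_API_KEY",
  "AIRCALL_API_SECRET",
];

// Return the names of required variables that are unset or blank.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node.js process this could be called as `findMissingEnv(process.env)` at startup, failing fast with the list of missing names instead of surfacing an authentication error in the middle of a call.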

Step 1: Create the project scaffold

Create a new Next.js project and install all dependencies at exact pinned versions.

terminal

npx create-next-app@latest openai-voice-agent-aircall \
  --typescript --eslint --app --src-dir \
  --import-alias "@/*" --use-pnpm
cd openai-voice-agent-aircall

Example artifact

A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.


162 kB·70 tests·97.4% coverage·vitest passing

SHA-256: f5607469a1a97528096de06cbd24af002f8adb44a6c728e8a076f9f4fe31f588


import OpenAI from "openai";
import pino from "pino";
import type { NLUResponse } from "./types.js";
import { repairLlmResponse } from "./repair.js";

// System prompt: constrains the model to a four-action JSON contract.
const SMB_SUPPORT_PROMPT = `You are an AI receptionist for a small business. Your job is to:
1. Greet callers warmly and identify their needs
2. Answer frequently asked questions about hours, pricing, and services
3. Escalate to a human agent when you are unsure or the caller asks for something beyond your scope

You must ALWAYS respond with valid JSON matching this schema:
{ "action": "greet" | "answer" | "escalate" | "goodbye", "message": string, "escalationReason"?: string }

- Use "greet" for initial greetings
- Use "answer" when you can answer the question
- Use "escalate" when you need to transfer to a human
- Use "goodbye" to end the conversation`;

export function createOpenAIClient(apiKey: string): OpenAI {
  return new OpenAI({ apiKey });
}

export class VoiceNLUClient {
  private client: OpenAI;
  private logger: pino.Logger;

  constructor(client: OpenAI, logger: pino.Logger) {
    this.client = client;
    this.logger = logger;
  }

  // Ask the model which action to take for the latest caller utterance.
  async decideAction(
    transcript: string,
    context: { callerName?: string; history: string[] }
  ): Promise<NLUResponse> {
    const messages: Array<{ role: "developer" | "user" | "assistant"; content: string }> = [
      { role: "developer", content: SMB_SUPPORT_PROMPT },
    ];
    // Prior turns are replayed as user messages, followed by the new transcript.
    for (const entry of context.history) {
      messages.push({ role: "user", content: entry });
    }
    messages.push({ role: "user", content: transcript });

    try {
      const completion = await this.client.chat.completions.create({
        model: "gpt-5.2-mini",
        messages,
        max_tokens: 256,
        temperature: 0.3,
      });
      const rawContent = completion.choices[0]?.message?.content || "";
      this.logger.info({ rawContent }, "decideAction raw response");
      // Coerce the raw text into a valid NLUResponse.
      return await repairLlmResponse(rawContent);
    } catch (error) {
      // Any failure in the request or the repair step falls back to a human handoff.
      this.logger.error({ error, transcript }, "decideAction failed");
      return {
        action: "escalate",
        message: "Let me connect you with someone who can help.",
        escalationReason: "NLU service error",
      };
    }
  }

  // Produce a free-form spoken reply (no JSON contract) for the given business.
  async generateReply(
    transcript: string,
    context: { callerName?: string; businessName: string }
  ): Promise<string> {
    const messages: Array<{ role: "developer" | "user"; content: string }> = [
      {
        role: "developer",
        content: `You are a receptionist at ${context.businessName}. Provide a friendly, concise spoken reply.`,
      },
      { role: "user", content: transcript },
    ];

    try {
      const completion = await this.client.chat.completions.create({
        model: "gpt-5.2-mini",
        messages,
        max_tokens: 150,
        temperature: 0.5,
      });
      return completion.choices[0]?.message?.content || "";
    } catch (error) {
      this.logger.error({ error, transcript }, "generateReply failed");
      return "I'm sorry, I'm having trouble responding right now.";
    }
  }
}
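The client above hands the raw model output to `repairLlmResponse`, whose implementation is not shown here. As a rough idea of what such a repair step might do, the sketch below extracts the outermost JSON object, validates it against the schema in the prompt, and falls back to an escalation when anything is off. The names `repairSketch` and `NLUResponseSketch` are hypothetical, the function is synchronous for simplicity (the recipe awaits its version), and the real utility may behave differently.

```typescript
type Action = "greet" | "answer" | "escalate" | "goodbye";

// Hypothetical stand-in for the recipe's NLUResponse type.
interface NLUResponseSketch {
  action: Action;
  message: string;
  escalationReason?: string;
}

const ACTIONS: readonly string[] = ["greet", "answer", "escalate", "goodbye"];

// Safe default when the model output cannot be trusted.
const FALLBACK: NLUResponseSketch = {
  action: "escalate",
  message: "Let me connect you with someone who can help.",
  escalationReason: "unparseable model output",
};

function repairSketch(raw: string): NLUResponseSketch {
  // Keep only the outermost {...} span, dropping any chatter or fencing around the JSON.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return { ...FALLBACK };

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return { ...FALLBACK };
  }
  if (typeof parsed !== "object" || parsed === null) return { ...FALLBACK };

  // Validate the two required fields against the prompt's schema.
  const obj = parsed as Record<string, unknown>;
  if (
    typeof obj.action !== "string" ||
    !ACTIONS.includes(obj.action) ||
    typeof obj.message !== "string"
  ) {
    return { ...FALLBACK };
  }

  const result: NLUResponseSketch = { action: obj.action as Action, message: obj.message };
  if (typeof obj.escalationReason === "string") {
    result.escalationReason = obj.escalationReason;
  }
  return result;
}
```

Falling back to `escalate` rather than throwing keeps the caller on a safe path: a malformed model reply results in a handoff to a human instead of silence on the line.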

import {
  createPipeline,
  createLatencyBudget,
  initializeSessionManager,
  LatencyBudgetEnforcer,
} from "@reaatech/voice-agent-core";
import type { VoiceAgentKitConfig } from "@reaatech/voice-agent-core";
import { DeepgramSTTProvider } from "@reaatech/voice-agent-stt";
import { DeepgramTTSProvider, TTSProviderInterface } from "@reaatech/voice-agent-tts";
import { OpenAIMCPAdapter } from "./openai-mcp-adapter.js";
import { VoiceNLUClient, createOpenAIClient } from "../openai.js";
import { createLogger } from "../config.js";
import type { VoiceAgentConfig } from "../types.js";

export async function createVoiceAgent(config: VoiceAgentConfig) {
  const logger = createLogger("voice-agent");

  // Speech-to-text: 8 kHz mu-law is the standard narrowband telephony format.
  const stt = new DeepgramSTTProvider();
  await stt.connect({
    provider: "deepgram",
    apiKey: config.deepgramApiKey,
    model: "nova-2",
    language: "en",
    sampleRate: 8000,
    encoding: "mulaw",
    smartFormat: true,
    interimResults: true,
    endpointing: 300,
  });

  const tts = new DeepgramTTSProvider();

  const sessionManager = initializeSessionManager({
    defaultTTL: 3600,
    maxTurns: 20,
    maxTokens: 4000,
  });

  // Stage budgets (200 + 400 + 200) add up to the 800 target; 1200 is the hard cap.
  const latencyEnforcer = new LatencyBudgetEnforcer(
    createLatencyBudget({
      target: 800,
      hardCap: 1200,
      stt: 200,
      mcp: 400,
      tts: 200,
    })
  );

  const voiceConfig: VoiceAgentKitConfig = {
    mcp: { endpoint: "inline", timeout: 400 },
    stt: { provider: "deepgram", model: "nova-2", language: "en", sampleRate: 8000 },
    tts: { provider: "deepgram", voice: "asteria", model: "aura" },
    latency: {
      total: { target: 800, hardCap: 1200 },
      stages: { stt: 200, mcp: 400, tts: 200 },
    },
    session: {
      ttl: 3600,
      history: { maxTurns: 20, maxTokens: 4000 },
    },
    bargeIn: {
      enabled: true,
      minSpeechDuration: 300,
      confidenceThreshold: 0.7,
      silenceThreshold: 0.3,
    },
  };

  // The adapter sits in the pipeline's MCP client slot and delegates to the OpenAI NLU client.
  const openaiClient = createOpenAIClient(config.openaiApiKey);
  const nluClient = new VoiceNLUClient(openaiClient, logger);
  const adapter = new OpenAIMCPAdapter(nluClient, sessionManager);

  const pipeline = createPipeline({
    sessionManager,
    latencyEnforcer,
    sttProvider: stt,
    ttsProvider: tts,
    mcpClient: adapter,
    config: voiceConfig,
  });

  // Log per-turn metrics, pipeline errors, and final transcripts.
  pipeline.on("pipeline:turn:end", (event: { data?: { metrics?: Record<string, unknown> } }) => {
    logger.info(event.data?.metrics, "Turn complete");
  });
  pipeline.on("pipeline:error", (event: unknown) => {
    logger.error(event, "Pipeline error");
  });
  pipeline.on("pipeline:stt:final", (event: { data?: { utterance?: { transcript?: string } } }) => {
    logger.info({ transcript: event.data?.utterance?.transcript }, "STT final");
  });

  return { pipeline, sessionManager, stt, tts, ttsInterface: TTSProviderInterface };
}
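The latency settings above split a per-turn target of 800 across three stages (200 for STT, 400 for the NLU call, 200 for TTS), with 1200 as the hard cap. The units are presumably milliseconds, though the code does not say so. To make the arithmetic concrete, here is a standalone sketch of how measured stage timings could be compared against that budget. It does not use or reproduce the `LatencyBudgetEnforcer` API; `checkTurn` and its types are hypothetical.

```typescript
interface StageTimes {
  stt: number;
  mcp: number;
  tts: number;
}

interface TurnBudget {
  target: number;
  hardCap: number;
  stages: StageTimes;
}

// Same numbers as the pipeline configuration.
const BUDGET: TurnBudget = {
  target: 800,
  hardCap: 1200,
  stages: { stt: 200, mcp: 400, tts: 200 },
};

type Verdict = "ok" | "over-target" | "over-hard-cap";

// Compare one turn's measured stage timings with the budget.
function checkTurn(
  measured: StageTimes,
  budget: TurnBudget = BUDGET
): { total: number; verdict: Verdict; slowStages: string[] } {
  const total = measured.stt + measured.mcp + measured.tts;
  // Stages that individually exceeded their own allowance.
  const slowStages = (Object.keys(budget.stages) as Array<keyof StageTimes>).filter(
    (stage) => measured[stage] > budget.stages[stage]
  );
  const verdict: Verdict =
    total > budget.hardCap ? "over-hard-cap" : total > budget.target ? "over-target" : "ok";
  return { total, verdict, slowStages };
}
```

For example, a turn measured at 150 / 650 / 180 totals 980: over the 800 target but under the 1200 cap, with the NLU stage as the only one over its allowance.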

import { generateId, now, calculateCostFromTokens, CostSpanSchema } from "@reaatech/llm-cost-telemetry";
import type { CostSpan } from "@reaatech/llm-cost-telemetry";
import type { CallCostReport } from "../types.js";

// Unit prices in USD used for the per-call estimates.
const DEEPGRAM_STT_PRICE_PER_MIN = 0.0059;
const DEEPGRAM_TTS_PRICE_PER_CHAR = 0.000015;
const GPT52_MINI_PRICE_PER_MILLION = 1.5;

export class CallCostTracker {
  // Cost spans grouped by call ID.
  private spans: Map<string, CostSpan[]> = new Map();

  recordSTTUsage(callSid: string, audioDurationMs: number): void {
    if (audioDurationMs < 0) {
      audioDurationMs = 0;
    }
    const minutes = audioDurationMs / 60000;
    const costUsd = minutes * DEEPGRAM_STT_PRICE_PER_MIN;
    // Note: the span is tagged provider "openai" although the usage is Deepgram;
    // getCallSummary classifies spans by model, not by provider.
    const span: CostSpan = {
      id: generateId(),
      provider: "openai",
      model: "nova-2",
      inputTokens: 0,
      outputTokens: 0,
      costUsd,
      timestamp: now(),
    };
    this.addSpan(callSid, span);
  }

  recordTTSUsage(callSid: string, characterCount: number): void {
    if (characterCount < 0) {
      characterCount = 0;
    }
    const costUsd = characterCount * DEEPGRAM_TTS_PRICE_PER_CHAR;
    const span: CostSpan = {
      id: generateId(),
      provider: "openai",
      model: "aura",
      inputTokens: 0,
      outputTokens: 0,
      costUsd,
      timestamp: now(),
    };
    this.addSpan(callSid, span);
  }

  recordLLMUsage(callSid: string, inputTokens: number, outputTokens: number): void {
    // Input and output tokens are summed and priced at a single blended rate.
    const totalTokens = (inputTokens < 0 ? 0 : inputTokens) + (outputTokens < 0 ? 0 : outputTokens);
    const costUsd = calculateCostFromTokens(totalTokens, GPT52_MINI_PRICE_PER_MILLION);
    const span: CostSpan = {
      id: generateId(),
      provider: "openai",
      model: "gpt-5.2-mini",
      inputTokens: Math.max(0, inputTokens),
      outputTokens: Math.max(0, outputTokens),
      costUsd,
      timestamp: now(),
    };
    this.addSpan(callSid, span);
  }

  getCallSummary(callSid: string): CallCostReport {
    const callSpans = this.spans.get(callSid) || [];
    let sttCostUsd = 0;
    let ttsCostUsd = 0;
    let llmCostUsd = 0;
    // Bucket by model name: "nova-2" is STT, "aura" is TTS, anything else counts as LLM.
    for (const span of callSpans) {
      if (span.model === "nova-2") {
        sttCostUsd += span.costUsd;
      } else if (span.model === "aura") {
        ttsCostUsd += span.costUsd;
      } else {
        llmCostUsd += span.costUsd;
      }
    }
    return {
      callSid,
      sttCostUsd,
      ttsCostUsd,
      llmCostUsd,
      totalCostUsd: sttCostUsd + ttsCostUsd + llmCostUsd,
      // Call duration is not tracked here, so it is always reported as 0.
      durationSeconds: 0,
      timestamp: new Date().toISOString(),
    };
  }

  reset(): void {
    this.spans.clear();
  }

  private addSpan(callSid: string, span: CostSpan): void {
    // Validate the span before storing it; parse throws on an invalid span.
    CostSpanSchema.parse(span);
    const existing = this.spans.get(callSid) || [];
    existing.push(span);
    this.spans.set(callSid, existing);
  }
}
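To see what these rates mean for a single call, the arithmetic can be worked through in isolation. The sketch below reuses the three price constants from the tracker but is otherwise standalone. It assumes that `calculateCostFromTokens(tokens, pricePerMillion)` computes `tokens / 1,000,000 × price`, which the source does not confirm, and `estimateCallCost` is a hypothetical helper.

```typescript
// Same unit prices (USD) as the tracker above.
const STT_PRICE_PER_MIN = 0.0059;
const TTS_PRICE_PER_CHAR = 0.000015;
const LLM_PRICE_PER_MILLION_TOKENS = 1.5;

interface CallCostEstimate {
  stt: number;
  tts: number;
  llm: number;
  total: number;
}

// Estimate the cost of one call; negative inputs are clamped to zero as in the tracker.
function estimateCallCost(
  audioMs: number,
  ttsChars: number,
  inputTokens: number,
  outputTokens: number
): CallCostEstimate {
  const stt = (Math.max(0, audioMs) / 60000) * STT_PRICE_PER_MIN;
  const tts = Math.max(0, ttsChars) * TTS_PRICE_PER_CHAR;
  // Assumption: one blended per-million rate applied to input plus output tokens.
  const tokens = Math.max(0, inputTokens) + Math.max(0, outputTokens);
  const llm = (tokens / 1_000_000) * LLM_PRICE_PER_MILLION_TOKENS;
  return { stt, tts, llm, total: stt + tts + llm };
}
```

For a three-minute call with 600 synthesized characters and 2,000 total tokens, this comes to roughly $0.0177 + $0.0090 + $0.0030 ≈ $0.0297, so speech-to-text dominates the bill at these rates.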

import { NextRequest, NextResponse } from "next/server";
import type { AudioChunk } from "@reaatech/voice-agent-core";

// Built once on the first request and reused afterwards.
let handlerSingleton: {
  handleWebhook: (body: unknown, signature?: string) => Promise<{ status: number; body: unknown }>;
} | null = null;

async function getWebhookHandler() {
  if (!handlerSingleton) {
    // Load the heavier modules lazily via dynamic import.
    const { loadConfig, createLogger } = await import("../../../../src/config.js");
    const { createVoiceAgent } = await import("../../../../src/voice/agent.js");
    const { CallSessionService } = await import("../../../../src/voice/session-service.js");
    const { AircallWebhookHandler } = await import("../../../../src/aircall/webhook-handler.js");

    const config = loadConfig();
    const logger = createLogger("webhook");
    const { pipeline, sessionManager, stt } = await createVoiceAgent(config);
    const sessionService = new CallSessionService(sessionManager, logger);

    // Expose only the pipeline and STT methods the handler needs.
    handlerSingleton = new AircallWebhookHandler({
      pipeline: {
        startSession: (s: { sessionId: string; status: string }) => pipeline.startSession(s),
        endSession: (id: string) => pipeline.endSession(id),
        processAudioChunk: async (id: string, chunk: AudioChunk) => {
          await pipeline.processAudioChunk(id, chunk);
        },
      },
      sessionService,
      stt: {
        streamAudio: (chunk: AudioChunk) => {
          stt.streamAudio(chunk);
        },
      },
      tts: {},
      logger,
      config,
    });
  }
  return handlerSingleton;
}

export async function POST(req: NextRequest) {
  try {
    const body = await req.text();
    const signature = req.headers.get("x-aircall-signature") || "";
    const handler = await getWebhookHandler();
    const result = await handler.handleWebhook(JSON.parse(body), signature);
    return NextResponse.json(result.body, { status: result.status });
  } catch {
    // Any thrown error, including a JSON parse failure, is reported as a bad payload.
    return NextResponse.json({ error: "invalid webhook payload" }, { status: 400 });
  }
}

export function GET() {
  return NextResponse.json({ status: "webhook endpoint is POST-only" }, { status: 405 });
}
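The route passes the `x-aircall-signature` header to the handler, but how the handler validates it is not shown. As a generic illustration, a shared-secret check is often done with an HMAC over the raw request body and a constant-time comparison. The scheme below (HMAC-SHA256, hex-encoded) is an assumption and not a description of Aircall's actual verification mechanism, so check the provider's documentation before relying on it. `isValidSignature` is a hypothetical helper.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: hex-encoded HMAC-SHA256 of the raw body, keyed with a shared secret.
function isValidSignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so reject mismatched lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(expected, received);
}
```

Whatever the scheme, a check like this has to run against the raw body string before `JSON.parse`, because re-serializing the parsed object may not reproduce the bytes that were signed.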

Intro

Prerequisites

Step 1: Create the project scaffold

Step 2: Configure environment variables

Step 3: Create shared domain types

Step 4: Build the configuration loader

Step 5: Create the OpenAI NLU client

Step 6: Create the structured output repair utility

Step 7: Create the MCP adapter

Step 8: Create the session service wrapper

Step 9: Assemble the voice agent pipeline

Step 10: Handle Aircall webhooks

Step 11: Build the transfer handler

Step 12: Add per-call cost tracking

Step 13: Wire up Langfuse observability

Step 14: Set up Next.js instrumentation

Step 15: Create the API routes

Step 16: Run the tests

Next steps