vLLM Voice Agent for After-Hours Small Business Support

A self-hosted voice agent that answers after-hours calls using your own vLLM inference, with customizable workflows for appointment booking and FAQs.

The problem

Small service businesses miss after-hours calls and lose customers because they can't afford a 24/7 receptionist. Many existing AI voice solutions depend on expensive cloud LLM APIs and send sensitive call data off-site.

Intro

This recipe builds a self-hosted voice agent that answers after-hours calls for small businesses using your own vLLM inference server. The agent handles appointment booking and FAQ queries over Twilio phone calls with no cloud LLM API calls, so prompts and completions stay on your infrastructure (call audio still passes through Twilio and, if you use their hosted APIs, Deepgram and Cartesia). You’ll wire up Deepgram STT (speech-to-text), Cartesia TTS (text-to-speech), a vLLM-powered intent router, and Redis-backed session storage into a pipeline that runs on an Express server with Twilio Media Streams.
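
As a rough illustration of the intent-routing step, the sketch below builds a chat-completion request for an OpenAI-compatible endpoint and maps the model's reply onto one of three intents. The intent labels, the prompt wording, the model name, the base URL, and the function names buildIntentRequest, parseIntent, and classifyIntent are all assumptions made for this example, not the recipe's actual code.

```typescript
// Hypothetical intent labels; the recipe's real router may use different ones.
type Intent = "book_appointment" | "faq" | "unknown"

interface ChatMessage {
  role: "system" | "user"
  content: string
}

interface ChatRequest {
  model: string
  messages: ChatMessage[]
  temperature: number
  max_tokens: number
}

// Build a request body in the OpenAI chat-completions shape.
function buildIntentRequest(transcript: string, model: string): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "system",
        content:
          "Classify the caller's request. Reply with exactly one word: book_appointment, faq, or unknown.",
      },
      { role: "user", content: transcript },
    ],
    temperature: 0, // keep classification as deterministic as possible
    max_tokens: 8,
  }
}

// Map free-form model output onto a known intent, defaulting to "unknown".
function parseIntent(reply: string): Intent {
  const text = reply.trim().toLowerCase()
  if (text.includes("book_appointment")) return "book_appointment"
  if (text.includes("faq")) return "faq"
  return "unknown"
}

// Send the request to the inference server.
// baseUrl is an assumption, for example "http://localhost:8000/v1".
async function classifyIntent(
  transcript: string,
  baseUrl: string,
  model: string
): Promise<Intent> {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildIntentRequest(transcript, model)),
  })
  if (!res.ok) return "unknown"
  const data = (await res.json()) as {
    choices?: Array<{ message?: { content?: string } }>
  }
  return parseIntent(data.choices?.[0]?.message?.content ?? "")
}
```

Falling back to "unknown" on any unexpected reply or HTTP error lets the call handler ask the caller to rephrase instead of guessing.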

Prerequisites

Node.js 22+ and pnpm 10+ installed
A vLLM server running with an OpenAI-compatible endpoint
Twilio account with a phone number that supports voice
Deepgram API key for speech-to-text
Cartesia API key for text-to-speech
Redis instance (local or remote) for session storage
Langfuse account (optional, for tracing)
Familiarity with TypeScript, Express, WebSockets, and Next.js App Router
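
The prerequisites above translate into a handful of settings the server needs at startup. Here is a minimal sketch of validating them up front. Every environment variable name (VLLM_BASE_URL, TWILIO_ACCOUNT_SID, and so on) and the loadConfig function itself are hypothetical choices for illustration; the recipe's own configuration schema may differ.

```typescript
// All names below are hypothetical; adapt them to your own setup.
interface AgentConfig {
  vllmBaseUrl: string
  twilioAccountSid: string
  twilioAuthToken: string
  deepgramApiKey: string
  cartesiaApiKey: string
  redisUrl: string
  langfuseSecretKey?: string // optional: only needed for tracing
}

const REQUIRED = [
  "VLLM_BASE_URL",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "DEEPGRAM_API_KEY",
  "CARTESIA_API_KEY",
  "REDIS_URL",
] as const

type RequiredName = (typeof REQUIRED)[number]
type Env = Record<string, string | undefined>

// Validate the environment once and return a typed config object.
function loadConfig(env: Env): AgentConfig {
  const missing = REQUIRED.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error("Missing required environment variables: " + missing.join(", "))
  }
  // Safe after the check above: every required name has a non-empty value.
  const get = (name: RequiredName): string => env[name] as string
  const langfuse = env["LANGFUSE_SECRET_KEY"]
  return {
    vllmBaseUrl: get("VLLM_BASE_URL"),
    twilioAccountSid: get("TWILIO_ACCOUNT_SID"),
    twilioAuthToken: get("TWILIO_AUTH_TOKEN"),
    deepgramApiKey: get("DEEPGRAM_API_KEY"),
    cartesiaApiKey: get("CARTESIA_API_KEY"),
    redisUrl: get("REDIS_URL"),
    ...(langfuse ? { langfuseSecretKey: langfuse } : {}),
  }
}
```

Calling loadConfig(process.env) once at startup makes a missing key fail fast, rather than surfacing in the middle of a live call.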

Step 1: Scaffold the project and install dependencies

Start with a Next.js 16 project using the App Router. If you don’t have one yet, scaffold it with npx create-next-app@latest . (choose TypeScript, App Router, and src/ directory). Then pin all dependencies:

Example artifact

A complete, working implementation of this recipe, available as a zip download or browsable file by file. It is generated by our build pipeline and tested before publishing.

175 kB · 101 tests · 97.4% coverage · vitest passing

SHA-256: 3e57caa4ff411534995d970168b75ac8c9adc4e9c124226338b7ee98fbec1ffd

The AppointmentScheduler class below keeps each day's slots in a Redis sorted set (score 0 = open, 1 = booked) and books with WATCH/MULTI so that two callers cannot take the same slot:

import { Redis } from "ioredis"
import crypto from "node:crypto"

export class AppointmentScheduler {
  readonly seedDefaultSlots: (date: string) => Promise<void>
  readonly getAvailableSlots: (date: string) => Promise<Array<{ time: string; available: boolean }>>
  readonly bookSlot: (
    date: string,
    time: string,
    customerName: string
  ) => Promise<{ success: boolean; confirmationId?: string; error?: string }>
  readonly generateId: () => string

  constructor(redis: Redis) {
    // Seed hourly slots from 09:00 through 17:00 with score 0 (open).
    // Note: plain ZADD overwrites scores, so re-seeding a date reopens booked slots.
    this.seedDefaultSlots = async (date) => {
      const key = "appointments:" + date
      const members: Array<{ score: number; value: string }> = []
      for (let hour = 9; hour <= 17; hour++) {
        const hh = hour.toString().padStart(2, "0")
        members.push({ score: 0, value: date + "T" + hh + ":00" })
      }
      await redis.zadd(key, ...members.flatMap(m => [m.score, m.value]))
    }

    // WITHSCORES returns a flat array: [member, score, member, score, ...].
    this.getAvailableSlots = async (date) => {
      const key = "appointments:" + date
      const result = await redis.zrange(key, 0, -1, "WITHSCORES")
      const slots: Array<{ time: string; available: boolean }> = []
      for (let i = 0; i < result.length; i += 2) {
        const time = result[i]
        const score = Number(result[i + 1])
        slots.push({ time, available: score === 0 })
      }
      return slots
    }

    // Optimistic locking: WATCH the day's key, check the slot, then write in a
    // MULTI/EXEC transaction. EXEC returns null if the key changed after WATCH.
    // WATCH state is per connection, so concurrent bookings should not share one.
    this.bookSlot = async (date, time, customerName) => {
      const key = "appointments:" + date
      const slotTime = date + "T" + time
      const confirmationId = this.generateId()

      await redis.watch(key)
      const currentScore = await redis.zscore(key, slotTime)
      if (currentScore !== null && Number(currentScore) === 1) {
        await redis.unwatch()
        return { success: false, error: "Slot already booked" }
      }

      const multi = redis.multi()
      multi.zadd(key, 1, slotTime) // mark the slot as booked
      multi.set("booking:" + confirmationId, JSON.stringify({ date, time: slotTime, customerName }))
      const results = await multi.exec()
      if (results === null) {
        await redis.unwatch()
        return { success: false, error: "Concurrent booking conflict, please retry" }
      }
      return { success: true, confirmationId }
    }

    this.generateId = () => {
      return crypto.randomUUID()
    }
  }
}
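
To make the score convention concrete, here is a small standalone sketch that converts a ZRANGE ... WITHSCORES reply into slot objects without needing a Redis connection. It mirrors the loop in getAvailableSlots; the function name slotsFromReply and the sample date are made up for this illustration.

```typescript
type Slot = { time: string; available: boolean }

// With WITHSCORES, the reply is a flat array that alternates member and score:
// [member1, score1, member2, score2, ...], with every entry a string.
function slotsFromReply(reply: string[]): Slot[] {
  const slots: Slot[] = []
  for (let i = 0; i + 1 < reply.length; i += 2) {
    const time = reply[i]
    const score = reply[i + 1]
    if (time === undefined || score === undefined) break
    // Score 0 means the slot is open; score 1 means it is booked.
    slots.push({ time, available: Number(score) === 0 })
  }
  return slots
}

// Hypothetical reply for a day with one open and one booked slot.
const sample = ["2030-01-07T09:00", "0", "2030-01-07T10:00", "1"]
const open = slotsFromReply(sample)
  .filter((s) => s.available)
  .map((s) => s.time)
console.log(open) // [ '2030-01-07T09:00' ]
```

Because a sorted set orders members by score first, booked slots (score 1) come after open ones in a real reply, so a caller-facing list may need re-sorting by time.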

Recipe outline

Intro
Prerequisites
Step 1: Scaffold the project and install dependencies
Step 2: Define types and configuration schema
Step 3: Build the vLLM client
Step 4: Create the intent router
Step 5: Implement the FAQ service
Step 6: Build the appointment scheduler
Step 7: Create the Redis session storage adapter
Step 8: Build the TwiML handler
Step 9: Wire the voice call handler
Step 10: Create the Express server
Step 11: Add the Next.js health endpoint and landing page
Step 12: Run tests, typecheck, and lint
Next steps