# vLLM Voice Agent for After-Hours Small Business Support
 
> A self-hosted voice agent that answers after-hours calls using your own vLLM inference, with customizable workflows for appointment booking and FAQs.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com), demonstrating how to build production-grade voice AI systems with the `@reaatech/*` package family.
 
## Overview
 
Small service businesses miss after-hours calls and lose customers because they can't afford a 24/7 receptionist. Existing AI voice solutions typically depend on costly cloud LLM APIs and send the content of calls to a third-party model provider. This recipe keeps LLM inference on your own infrastructure, with vLLM serving the model on-premises; speech-to-text and text-to-speech go through Deepgram and Cartesia, as shown in the architecture below.
 
## Architecture
 
```
Twilio PSTN → Express Webhook (TwiML) → WebSocket Media Stream → Deepgram STT
                                                                     ↓
                                                              vLLM (OpenAI SDK)
                                                                     ↓
Twilio PSTN ← Express Webhook (TwiML) ← WebSocket Media Stream ← Cartesia TTS
```
 
- **Voice Agent Core** (`@reaatech/voice-agent-core`) orchestrates the STT → MCP → TTS pipeline
- **Telephony** (`@reaatech/voice-agent-telephony`) handles Twilio Media Stream WebSocket protocol
- **STT** (`@reaatech/voice-agent-stt`) transcribes audio via Deepgram
- **TTS** (`@reaatech/voice-agent-tts`) synthesizes speech via Cartesia
- **Session Continuity** (`@reaatech/session-continuity`) manages conversation context with Redis storage
- **LLM** runs on vLLM via the OpenAI SDK chat completions endpoint
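
The flow in the diagram above can be sketched as one function per caller turn. This is a minimal illustration, not the recipe's code: the `SpeechToText`, `LanguageModel`, and `TextToSpeech` interfaces and the `handleTurn` function are hypothetical names, and the real `@reaatech/*` packages define their own APIs. The in-memory stand-ins let the sketch run without Deepgram, vLLM, or Cartesia.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical stage interfaces (assumptions for this sketch only).
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>;
}
interface LanguageModel {
  reply(messages: ChatMessage[]): Promise<string>;
}
interface TextToSpeech {
  synthesize(text: string): Promise<Uint8Array>;
}

// Runs one caller turn: audio in -> transcript -> LLM reply -> audio out.
// The messages array is mutated so the next turn sees this exchange.
async function handleTurn(
  audio: Uint8Array,
  messages: ChatMessage[],
  stt: SpeechToText,
  llm: LanguageModel,
  tts: TextToSpeech,
): Promise<Uint8Array> {
  const transcript = await stt.transcribe(audio);
  messages.push({ role: "user", content: transcript });
  const answer = await llm.reply(messages);
  messages.push({ role: "assistant", content: answer });
  return tts.synthesize(answer);
}

// In-memory stand-ins for the three external services.
const fakeStt: SpeechToText = {
  transcribe: async () => "What are your hours?",
};
const fakeLlm: LanguageModel = {
  reply: async (messages) => {
    const last = messages[messages.length - 1];
    return `You asked: ${last ? last.content : ""}`;
  },
};
const fakeTts: TextToSpeech = {
  // Returns one zero byte per character instead of real audio.
  synthesize: async (text) => new Uint8Array(text.length),
};
```

Keeping each stage behind a small interface is one way to swap a provider, or substitute fakes in tests, without touching the orchestration logic.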
 
## Prerequisites
 
- Twilio account with a voice-capable phone number that can use Media Streams
- Deepgram API key
- Cartesia API key
- Redis instance
- Langfuse account (for observability)
- A running vLLM server serving an OpenAI-compatible endpoint
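
For orientation, an OpenAI-compatible endpoint accepts a `POST` to `/chat/completions` under the base URL. The recipe talks to vLLM through the OpenAI SDK; the sketch below only shows the shape of the underlying request, which can be handy for a quick connectivity check. `buildChatRequest` is a hypothetical helper, and the `max_tokens` and `temperature` values are illustrative assumptions, not settings taken from the recipe.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    messages: ChatMessage[];
    max_tokens: number;
    temperature: number;
  };
}

// Builds the HTTP request for an OpenAI-compatible chat completions call.
// endpoint is the base URL, e.g. http://localhost:8000/v1
function buildChatRequest(
  endpoint: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatCompletionRequest {
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only send Authorization when a key is configured.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${base}/chat/completions`,
    headers,
    // Assumed values: short, fairly deterministic replies suit a phone call.
    body: { model, messages, max_tokens: 150, temperature: 0.3 },
  };
}
```

The result can be passed to `fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) })` to confirm that the server responds before wiring up the rest of the pipeline.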
 
## Environment Variables
 
| Variable | Description |
|---|---|
| `VLLM_ENDPOINT` | vLLM server URL (e.g. `http://localhost:8000/v1`) |
| `VLLM_API_KEY` | API key for vLLM (often left empty for self-hosted servers) |
| `VLLM_MODEL` | Model name on vLLM |
| `DEEPGRAM_API_KEY` | Deepgram API key |
| `CARTESIA_API_KEY` | Cartesia API key |
| `TWILIO_ACCOUNT_SID` | Twilio account SID |
| `TWILIO_AUTH_TOKEN` | Twilio auth token |
| `TWILIO_PHONE_NUMBER` | Twilio phone number |
| `REDIS_URL` | Redis connection URL |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key |
| `LANGFUSE_BASE_URL` | Langfuse base URL |
| `PORT` | Express server port (default 8080) |
| `WS_URL` | WebSocket URL for Twilio media streams |
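
The recipe validates these variables with Zod in `src/config.ts`. As a dependency-free illustration of the same idea, the sketch below fails fast when a variable is missing. Which variables are mandatory is an assumption here: everything except `VLLM_API_KEY` and `PORT` is treated as required, and `loadConfig` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: every variable in the table except VLLM_API_KEY and PORT is required.
const REQUIRED = [
  "VLLM_ENDPOINT",
  "VLLM_MODEL",
  "DEEPGRAM_API_KEY",
  "CARTESIA_API_KEY",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "REDIS_URL",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_BASE_URL",
  "WS_URL",
] as const;

type RequiredKey = (typeof REQUIRED)[number];
type AppConfig = Record<RequiredKey, string> & {
  VLLM_API_KEY: string;
  PORT: number;
};

// Reads and validates configuration, reporting all missing variables at once.
function loadConfig(env: Env): AppConfig {
  const missing = REQUIRED.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // PORT falls back to the documented default of 8080.
  const port = Number(env["PORT"] || "8080");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  const values = {} as Record<RequiredKey, string>;
  for (const key of REQUIRED) values[key] = env[key] as string;
  return { ...values, VLLM_API_KEY: env["VLLM_API_KEY"] ?? "", PORT: port };
}
```

In a Node.js process this would be called once at startup as `loadConfig(process.env)`, so a misconfigured deployment stops before it accepts a call.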
 
## Quick Start
 
```bash
# Install dependencies
pnpm install
 
# Copy and fill in environment variables
cp .env.example .env
 
# Run the Express server
pnpm run dev
```
 
Configure your Twilio phone number's voice webhook URL to point to `https://your-server.com/api/twilio/voice`.
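
When Twilio calls that webhook, the response is a TwiML document that connects the call to the media stream WebSocket. A minimal sketch of such a response is shown below. `buildStreamTwiml` and `escapeXml` are hypothetical helpers rather than the contents of `twiml-handler.ts`, and the example URL simply combines the placeholder host above with the `/api/twilio/stream` path.

```typescript
// Escapes characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds TwiML that opens a bidirectional media stream to wsUrl.
function buildStreamTwiml(wsUrl: string): string {
  // Twilio Media Streams connect over secure WebSockets.
  if (!/^wss:\/\//.test(wsUrl)) {
    throw new Error("Stream URL must start with wss://");
  }
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    `<Stream url="${escapeXml(wsUrl)}" />` +
    "</Connect></Response>"
  );
}

// Example: the value of WS_URL for the placeholder host used above.
const exampleTwiml = buildStreamTwiml("wss://your-server.com/api/twilio/stream");
```

The webhook would send this string with a `text/xml` content type. The official `twilio` Node library also provides a `VoiceResponse` builder that produces equivalent markup.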
 
## API Endpoints
 
| Endpoint | Method | Description |
|---|---|---|
| `/api/twilio/voice` | POST | Twilio voice webhook — returns TwiML with Connect/Stream |
| `/api/twilio/stream` | WebSocket | Twilio Media Stream — real-time audio processing |
| `/health` | GET | Health check |
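
On the stream endpoint, Twilio sends JSON text frames whose `event` field distinguishes `connected`, `start`, `media`, and `stop` messages, with audio carried as a base64 string in `media.payload`. The sketch below shows one way to turn a raw frame into a typed event. `parseStreamMessage` and the `StreamEvent` type are hypothetical and are not the API of `@reaatech/voice-agent-telephony`.

```typescript
type StreamEvent =
  | { kind: "start"; streamSid: string; callSid: string }
  | { kind: "media"; streamSid: string; payloadBase64: string }
  | { kind: "stop"; streamSid: string }
  | { kind: "ignored" };

const IGNORED: StreamEvent = { kind: "ignored" };

function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Parses one WebSocket text frame; malformed or irrelevant frames are ignored.
function parseStreamMessage(raw: string): StreamEvent {
  let msg: Record<string, any>;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return IGNORED;
    msg = parsed as Record<string, any>;
  } catch {
    return IGNORED;
  }
  const streamSid = asString(msg["streamSid"]);
  // The initial "connected" message carries no streamSid.
  if (streamSid === undefined) return IGNORED;
  switch (msg["event"]) {
    case "start": {
      const callSid = asString(msg["start"]?.callSid);
      return callSid === undefined ? IGNORED : { kind: "start", streamSid, callSid };
    }
    case "media": {
      const payloadBase64 = asString(msg["media"]?.payload);
      return payloadBase64 === undefined
        ? IGNORED
        : { kind: "media", streamSid, payloadBase64 };
    }
    case "stop":
      return { kind: "stop", streamSid };
    default:
      return IGNORED; // "mark", "dtmf", and anything unrecognized
  }
}
```

A handler can then switch on `kind`: open a session on `start`, forward each decoded `media` payload to the speech-to-text stage, and clean up on `stop`.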
 
## Testing
 
```bash
pnpm test            # vitest run with coverage
pnpm typecheck       # TypeScript type checking
pnpm lint            # ESLint
```
 
## Project layout
 
```
app/                  Next.js App Router pages + API routes
src/
  config.ts           Zod-validated configuration
  types.ts            Shared TypeScript types
  services/           Voice agent services
    simple-token-counter.ts     Token counter for SessionManager
    vllm-client.ts              OpenAI SDK wrapper for vLLM
    redis-storage-adapter.ts    IStorageAdapter over ioredis
    intent-router.ts            Intent classification service
    appointment-scheduler.ts    Appointment booking via Redis
    faq-service.ts              FAQ answering service
    twiml-handler.ts            TwiML generation and validation
    voice-call-handler.ts       Main pipeline orchestrator
  server.ts           Express server bootstrap
tests/                Vitest suite (mirrors src/)
packages/             API references for every dependency
DEV_PLAN.md           Build plan for this recipe
```
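
To make the role of a service such as `intent-router.ts` concrete, here is a deliberately simple keyword-based classifier for the two workflows this recipe covers. It is only a sketch: the intent names, keyword lists, and `classifyIntent` function are assumptions, and the actual service may well classify intents differently (for example, by asking the LLM).

```typescript
type Intent = "book_appointment" | "faq" | "unknown";
type KnownIntent = Exclude<Intent, "unknown">;

// Hypothetical keyword lists; a real deployment would tune these per business.
const INTENT_KEYWORDS: Record<KnownIntent, string[]> = {
  book_appointment: ["appointment", "book", "schedule", "reschedule"],
  faq: ["hours", "open", "price", "cost", "location", "address"],
};

// Picks the intent with the most keyword hits; ties go to the first one listed.
function classifyIntent(transcript: string): Intent {
  const words = transcript
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((word) => word.length > 0);
  let best: Intent = "unknown";
  let bestScore = 0;
  for (const intent of Object.keys(INTENT_KEYWORDS) as KnownIntent[]) {
    const keywords = INTENT_KEYWORDS[intent];
    const score = words.filter((word) => keywords.indexOf(word) !== -1).length;
    if (score > bestScore) {
      best = intent;
      bestScore = score;
    }
  }
  return best;
}
```

A router along these lines would hand `book_appointment` transcripts to the appointment scheduler, `faq` transcripts to the FAQ service, and fall back to a general LLM reply for `unknown`.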
 
## License
 
MIT — see [LICENSE](./LICENSE).