# OpenAI Voice Agent for Auto-Repair Estimates
> Answer after-hours calls, capture vehicle details, and provide instant repair estimates for auto shops — all via natural voice conversation.
Small auto-repair shops lose revenue when after-hours callers need an estimate but can't reach anyone. Voicemail intake is slow, error-prone, and rarely converts to a booked job. This voice agent answers calls 24/7, collects vehicle info through a guided conversation, generates an instant repair estimate via OpenAI, and routes the caller to booking, callback, or human handoff. No database is required.
## Architecture
```
PSTN → Twilio → Express WebSocket → Telephony Media Handler
→ STT (OpenAI Whisper) → Confidence Router + Estimate FSM
→ OpenAI (gpt-4o-mini) → TTS (Deepgram Aura) → Twilio audio
```
- **@reaatech/voice-agent-telephony**: Handles Twilio Media Streams WebSocket protocol
- **@reaatech/voice-agent-stt**: OpenAI Whisper speech-to-text
- **@reaatech/voice-agent-tts**: Deepgram Aura text-to-speech
- **@reaatech/voice-agent-core**: Session manager, latency budget enforcer, cost tracker
- **@reaatech/confidence-router** + **@reaatech/confidence-router-classifiers**: Intent classification (get estimate, schedule, talk to human, end call)
- **openai**: Estimate generation via chat completions
- **langfuse**: Tracing and observability
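
To make the intent-classification step concrete, here is a minimal, dependency-free sketch of confidence-threshold routing over the four intents listed above. It does not use the `@reaatech/confidence-router` API; the `classify` and `route` functions, the keyword lists, and the `0.25` threshold are all hypothetical and only illustrate the idea of falling back to a clarifying question when confidence is low.

```typescript
// Hypothetical sketch: NOT the @reaatech/confidence-router API.
type Intent = "get_estimate" | "schedule" | "talk_to_human" | "end_call";

interface Classification {
  intent: Intent;
  confidence: number; // fraction of keywords matched, 0..1
}

// Illustrative keyword lists; a real classifier would be far more robust.
const KEYWORDS: Record<Intent, string[]> = {
  get_estimate: ["estimate", "quote", "how much", "cost"],
  schedule: ["schedule", "appointment", "book"],
  talk_to_human: ["human", "person", "representative", "someone"],
  end_call: ["goodbye", "bye", "hang up", "that's all"],
};

// Score each intent by the share of its keywords found in the utterance.
function classify(utterance: string): Classification {
  const text = utterance.toLowerCase();
  let best: Classification = { intent: "talk_to_human", confidence: 0 };
  for (const intent of Object.keys(KEYWORDS) as Intent[]) {
    const hits = KEYWORDS[intent].filter((k) => text.includes(k)).length;
    const confidence = hits / KEYWORDS[intent].length;
    if (confidence > best.confidence) best = { intent, confidence };
  }
  return best;
}

// Act on the intent only when confidence clears the threshold;
// otherwise signal that the agent should ask a clarifying question.
function route(utterance: string, threshold = 0.25): Intent | "clarify" {
  const { intent, confidence } = classify(utterance);
  return confidence >= threshold ? intent : "clarify";
}
```

With this sketch, an utterance such as "How much would a quote cost?" routes to `get_estimate`, while an utterance that matches no keywords returns `"clarify"`.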
## Prerequisites
- Node.js >= 22
- pnpm 10.x
- A Twilio phone number with Voice / Media Streams enabled
- OpenAI API key
- Deepgram API key
## Setup
```bash
pnpm install
cp .env.example .env # fill in all values
```
Configure your Twilio phone number's voice webhook URL to `https://your-host/twilio-voice`.
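
As a rough illustration of what the webhook could return, the sketch below builds TwiML that asks Twilio to open a Media Stream over WebSocket using the `<Connect><Stream>` verbs. The `buildStreamTwiml` helper and the `/media-stream` WebSocket path are assumptions made for this example; the actual route and path live in `app.ts`.

```typescript
// Escape characters that are special inside XML attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML telling Twilio to connect the call to a WebSocket media stream.
// The default "/media-stream" path is a hypothetical placeholder.
function buildStreamTwiml(host: string, path = "/media-stream"): string {
  const url = `wss://${host}${path}`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(url)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

In an Express handler for `/twilio-voice`, the returned string would typically be sent with a `text/xml` content type.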
## Usage
```bash
pnpm dev # starts Express + WebSocket server
```
Call your Twilio number. The voice agent will answer and walk through the estimate intake process.
## Project Structure
```
src/
  types.ts                  Shared types (VehicleInfo, EstimateResult, EstimateState)
  config.ts                 Zod-validated environment config
  index.ts                  Entry point: boots server, Langfuse, graceful shutdown
  app.ts                    Express + WebSocket server, routes, TwiML generation
  services/
    voice-engine.ts         SessionManager, STT, TTS, cost tracker, recording lifecycle
    telephony-handler.ts    Twilio Media Streams handler wiring
    estimate-collector.ts   Finite-state machine for make/model/year/symptom intake
    estimate-composer.ts    OpenAI chat completions for repair cost estimation
    intent-router.ts        Confidence-router based caller intent classification
    call-orchestrator.ts    Main turn loop: wires STT → routing → FSM → TTS
tests/                      Vitest suite (mirrors src/)
```
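
The intake in `estimate-collector.ts` is described as a finite-state machine over make, model, year, and symptom. The sketch below shows one way such a machine could look. The shapes of `VehicleInfo` and `EstimateState`, the `Collector` type, and the `advance` function are assumptions for illustration, not the project's actual definitions.

```typescript
// Assumed shapes; the real definitions live in src/types.ts.
type EstimateState = "make" | "model" | "year" | "symptom" | "complete";

interface VehicleInfo {
  make?: string;
  model?: string;
  year?: number;
  symptom?: string;
}

interface Collector {
  state: EstimateState;
  vehicle: VehicleInfo;
}

// Consume one caller answer and return the next collector state.
// An empty or unparseable answer leaves the state unchanged so the
// agent can re-ask the same question.
function advance(c: Collector, answer: string): Collector {
  const text = answer.trim();
  if (text === "") return c;
  switch (c.state) {
    case "make":
      return { state: "model", vehicle: { ...c.vehicle, make: text } };
    case "model":
      return { state: "year", vehicle: { ...c.vehicle, model: text } };
    case "year": {
      // Accept a four-digit year starting with 19 or 20 anywhere in the answer.
      const match = text.match(/\b(19|20)\d{2}\b/);
      if (!match) return c;
      return { state: "symptom", vehicle: { ...c.vehicle, year: Number(match[0]) } };
    }
    case "symptom":
      return { state: "complete", vehicle: { ...c.vehicle, symptom: text } };
    case "complete":
      return c;
  }
}
```

Starting from `{ state: "make", vehicle: {} }`, each transcribed answer is fed to `advance` until the state reaches `"complete"`, at which point the collected `vehicle` could be handed to the estimate composer.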
## Environment Variables
| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | Yes | OpenAI API key (for Whisper STT + chat completions) |
| DEEPGRAM_API_KEY | Yes | Deepgram API key (for Aura TTS) |
| TWILIO_ACCOUNT_SID | Yes | Twilio account identifier |
| TWILIO_AUTH_TOKEN | Yes | Twilio authentication token |
| TWILIO_PHONE_NUMBER | Yes | Twilio phone number for incoming calls |
| PORT | No | HTTP server port (default 3000) |
| LANGFUSE_PUBLIC_KEY | Yes | Langfuse project public key |
| LANGFUSE_SECRET_KEY | Yes | Langfuse project secret key |
| LANGFUSE_HOST | Yes | Langfuse API host URL |
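
The table above can be expressed as a fail-fast startup check. The project's `config.ts` uses Zod for this; the sketch below deliberately uses plain TypeScript so it runs without dependencies. The `loadConfig` function and `AppConfig` type are hypothetical names for this example.

```typescript
// Variables marked "Yes" in the Required column of the table.
const REQUIRED = [
  "OPENAI_API_KEY",
  "DEEPGRAM_API_KEY",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
] as const;

type RequiredName = (typeof REQUIRED)[number];
type AppConfig = Record<RequiredName, string> & { PORT: number };

// Validate an environment-like object and return a typed config.
// Throws one error listing every missing variable.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // PORT is optional and defaults to 3000.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  const values = {} as Record<RequiredName, string>;
  for (const name of REQUIRED) values[name] = env[name] as string;
  return { ...values, PORT: port };
}
```

At startup this would be called as `loadConfig(process.env)`, so a misconfigured deployment fails immediately instead of on the first call.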
## License
MIT — see [LICENSE](./LICENSE).