# OpenAI Voice Agent for Auto-Repair Estimates
 
> Answer after-hours calls, capture vehicle details, and provide instant repair estimates for auto shops — all via natural voice conversation.
 
Small auto-repair shops lose revenue when after-hours callers want an estimate and cannot reach anyone. Voicemail intake is slow, error-prone, and rarely converts to a booked job. This voice agent answers calls around the clock, collects vehicle details (make, model, year, symptom) through a guided conversation, generates a repair estimate with OpenAI chat completions, and routes the caller to booking, a callback, or a human handoff. No database is required.
 
## Architecture
 
```
PSTN → Twilio → Express WebSocket → Telephony Media Handler
    → STT (OpenAI Whisper) → Confidence Router + Estimate FSM
    → OpenAI (gpt-4o-mini) → TTS (Deepgram Aura) → Twilio audio
```
 
- **@reaatech/voice-agent-telephony**: Handles Twilio Media Streams WebSocket protocol
- **@reaatech/voice-agent-stt**: OpenAI Whisper speech-to-text
- **@reaatech/voice-agent-tts**: Deepgram Aura text-to-speech
- **@reaatech/voice-agent-core**: Session manager, latency budget enforcer, cost tracker
- **@reaatech/confidence-router** + **@reaatech/confidence-router-classifiers**: Intent classification (get estimate, schedule, talk to human, end call)
- **openai**: Estimate generation via chat completions
- **langfuse**: Tracing and observability
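
The intent classification step can be pictured as a threshold check: act on the top-scoring intent when its confidence is high enough, otherwise ask the caller to rephrase. The sketch below is illustrative only and does not use the `@reaatech/confidence-router` API. The `ScoredIntent` shape, the `route` function, and the 0.6 default threshold are assumptions made for this example.

```typescript
// Hypothetical sketch: not the @reaatech/confidence-router API.
type Intent = "get_estimate" | "schedule" | "talk_to_human" | "end_call";

interface ScoredIntent {
  intent: Intent;
  confidence: number; // expected range: 0 to 1
}

type RouteDecision =
  | { action: "route"; intent: Intent }
  | { action: "clarify" };

// Pick the highest-confidence intent and route to it only when it clears
// the threshold; otherwise fall back to a clarifying question.
function route(scores: ScoredIntent[], threshold = 0.6): RouteDecision {
  if (scores.length === 0) return { action: "clarify" };
  const best = scores.reduce((a, b) => (b.confidence > a.confidence ? b : a));
  return best.confidence >= threshold
    ? { action: "route", intent: best.intent }
    : { action: "clarify" };
}
```

Raising the threshold in a setup like this trades fewer misrouted calls for more clarification prompts.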
 
## Prerequisites
 
- Node.js >= 22
- pnpm 10.x
- A Twilio phone number with Voice / Media Streams enabled
- OpenAI API key
- Deepgram API key
- A Langfuse project (public key, secret key, and host URL)
 
## Setup
 
```bash
pnpm install
cp .env.example .env   # fill in all values
```
 
Set your Twilio phone number's voice webhook URL to `https://your-host/twilio-voice`, replacing `your-host` with the public hostname of your server.
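
When Twilio requests the voice webhook, the server replies with TwiML. The following is a minimal sketch of such a response, assuming the call is bridged to a Media Streams WebSocket with `<Connect><Stream>`. The `buildStreamTwiml` helper and the `/media-stream` path are placeholders, since this README does not name the actual WebSocket route.

```typescript
// Escape the characters that are special inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that connects the call to a bidirectional Media Stream.
// The "/media-stream" default is a placeholder, not the project's real route.
function buildStreamTwiml(host: string, path = "/media-stream"): string {
  const url = escapeXmlAttr(`wss://${host}${path}`);
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Connect><Stream url="${url}" /></Connect></Response>`
  );
}
```

An Express handler for `POST /twilio-voice` would typically send this string with a `text/xml` content type.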
 
## Usage
 
```bash
pnpm dev               # starts Express + WebSocket server
```
 
Call your Twilio number. The voice agent will answer and walk through the estimate intake process.
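
The intake conversation can be modeled as a small finite-state machine that asks for make, model, year, and symptom in order. The sketch below shows the idea under stated assumptions: the `IntakeState` shape, the prompt wording, and the year-matching rule are invented for illustration, and the project's own `VehicleInfo` and `EstimateState` types may differ.

```typescript
// Hypothetical shapes for illustration only.
type IntakeStep = "make" | "model" | "year" | "symptom" | "done";

interface IntakeState {
  step: IntakeStep;
  make?: string;
  model?: string;
  year?: number;
  symptom?: string;
}

// What the agent says when it enters each step.
const PROMPTS: Record<IntakeStep, string> = {
  make: "What is the make of your vehicle?",
  model: "What model is it?",
  year: "What year is it?",
  symptom: "What problem are you noticing?",
  done: "Thanks, I have everything I need for an estimate.",
};

// Advance the machine by one caller utterance. Unusable input leaves the
// state unchanged, so the same question is asked again.
function advance(state: IntakeState, utterance: string): IntakeState {
  const text = utterance.trim();
  if (text === "") return state;
  switch (state.step) {
    case "make":
      return { ...state, make: text, step: "model" };
    case "model":
      return { ...state, model: text, step: "year" };
    case "year": {
      // Accept the first four-digit year starting with 19 or 20.
      const match = text.match(/\b(19|20)\d{2}\b/);
      if (!match) return state;
      return { ...state, year: Number(match[0]), step: "symptom" };
    }
    case "symptom":
      return { ...state, symptom: text, step: "done" };
    default:
      return state; // "done" is terminal
  }
}
```

After each turn, `PROMPTS[state.step]` gives the next line for the agent to speak.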
 
## Project Structure
 
```
src/
  types.ts               Shared types (VehicleInfo, EstimateResult, EstimateState)
  config.ts              Zod-validated environment config
  index.ts               Entry point — boots server, Langfuse, graceful shutdown
  app.ts                 Express + WebSocket server, routes, TwiML generation
  services/
    voice-engine.ts       SessionManager, STT, TTS, cost tracker, recording lifecycle
    telephony-handler.ts  Twilio Media Streams handler wiring
    estimate-collector.ts Finite-state machine for make/model/year/symptom intake
    estimate-composer.ts  OpenAI chat completions for repair cost estimation
    intent-router.ts      Confidence-router based caller intent classification
    call-orchestrator.ts  Main turn loop — wires STT → routing → FSM → TTS
tests/                   Vitest suite (mirrors src/)
```
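
As a rough illustration of what a Media Streams handler such as `telephony-handler.ts` deals with, the sketch below parses the JSON text frames Twilio sends over the WebSocket (`start`, `media`, `stop`). It is a hand-written example and does not reflect the `@reaatech/voice-agent-telephony` API. The `StreamEvent` type and `parseStreamMessage` function are hypothetical names.

```typescript
// Illustrative only: not the @reaatech/voice-agent-telephony API.
type StreamEvent =
  | { kind: "start"; streamSid: string; callSid: string }
  | { kind: "media"; streamSid: string; payloadBase64: string }
  | { kind: "stop"; streamSid: string }
  | { kind: "ignored" };

const isString = (v: unknown): v is string => typeof v === "string";

// Parse one JSON text frame from the Media Streams WebSocket.
// Malformed or unrecognized frames map to "ignored" instead of throwing.
function parseStreamMessage(raw: string): StreamEvent {
  let msg: any;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: "ignored" };
  }
  if (typeof msg !== "object" || msg === null || !isString(msg.streamSid)) {
    return { kind: "ignored" };
  }
  switch (msg.event) {
    case "start":
      // Call metadata arrives once, nested under "start".
      if (isString(msg.start?.callSid)) {
        return { kind: "start", streamSid: msg.streamSid, callSid: msg.start.callSid };
      }
      break;
    case "media":
      // Audio arrives base64-encoded under media.payload.
      if (isString(msg.media?.payload)) {
        return { kind: "media", streamSid: msg.streamSid, payloadBase64: msg.media.payload };
      }
      break;
    case "stop":
      return { kind: "stop", streamSid: msg.streamSid };
  }
  return { kind: "ignored" };
}
```

Returning a discriminated union lets the caller `switch` on `kind` and have the compiler check that each event type is handled.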
 
## Environment Variables
 
| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | Yes | OpenAI API key (for Whisper STT + chat completions) |
| DEEPGRAM_API_KEY | Yes | Deepgram API key (for Aura TTS) |
| TWILIO_ACCOUNT_SID | Yes | Twilio account identifier |
| TWILIO_AUTH_TOKEN | Yes | Twilio authentication token |
| TWILIO_PHONE_NUMBER | Yes | Twilio phone number for incoming calls |
| PORT | No | HTTP server port (default 3000) |
| LANGFUSE_PUBLIC_KEY | Yes | Langfuse project public key |
| LANGFUSE_SECRET_KEY | Yes | Langfuse project secret key |
| LANGFUSE_HOST | Yes | Langfuse API host URL |
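
The project validates these variables with Zod in `config.ts`. To show the same checks without any dependency, the sketch below uses plain TypeScript instead. The `loadConfig` function and `AppConfig` type are hypothetical, and the actual schema may enforce more than presence and a valid port.

```typescript
// Dependency-free stand-in for a schema validator; names come from the table above.
const REQUIRED_VARS = [
  "OPENAI_API_KEY",
  "DEEPGRAM_API_KEY",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<RequiredVar, string> & { PORT: number };

// Collect every missing variable before failing, so one error lists them all.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const values = {} as Record<RequiredVar, string>;
  const missing: string[] = [];
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value) values[name] = value;
    else missing.push(name);
  }
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  // PORT is optional and falls back to the documented default of 3000.
  const rawPort = env.PORT ?? "3000";
  const port = Number(rawPort);
  if (!/^\d+$/.test(rawPort) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer from 1 to 65535, got "${rawPort}"`);
  }
  return { ...values, PORT: port };
}
```

Called once at startup with `process.env`, a check like this makes a misconfigured deployment fail before the first call arrives.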
 
## License
 
MIT — see [LICENSE](./LICENSE).