# Voice quote assistant for auto-repair shops
> Convert inbound 'how much to fix X' calls into structured estimates without tying up your service advisor.
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI voice agents with the `@reaatech/*` package family and OpenAI.
## Problem
When a customer calls asking "How much to fix my brake noise?", the service advisor has to drop everything, walk to the bay, and interrupt a mechanic. That hurts bay productivity and leaves the customer waiting on hold. Worse, the advisor often has no part-pricing data at hand, so the caller gets a rough ballpark quote that is later disputed.
This recipe solves that by providing an AI voice agent that:
1. Answers inbound calls via Twilio telephony
2. Transcribes the customer's speech in real time (Deepgram STT)
3. Interprets the repair issue using the OpenAI Responses API
4. Looks up parts and pricing from a catalog
5. Generates a structured quote and reads it back (Deepgram TTS)
6. Remembers customer context across calls (AgentMemory)
7. Caches frequent queries for efficiency (CacheEngine)
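Steps 3 and 4 rely on tool calling. The following is a minimal sketch of how the two tools could be declared and how a `lookup_parts` call could be routed locally, assuming the Responses API function-tool shape (`type: "function"` plus a JSON Schema `parameters` object). The tool names come from this recipe; the parameter names, the in-memory `CATALOG`, and `dispatchToolCall` are hypothetical stand-ins, not the recipe's actual code.

```typescript
// Function-tool declarations in the shape the OpenAI Responses API accepts.
// Parameter names are illustrative assumptions, not the recipe's real schema.
interface FunctionTool {
  type: "function";
  name: string;
  description: string;
  strict: boolean;
  parameters: Record<string, unknown>;
}

const tools: FunctionTool[] = [
  {
    type: "function",
    name: "lookup_parts",
    description: "Find catalog parts that match a described repair issue.",
    strict: true,
    parameters: {
      type: "object",
      properties: { issue: { type: "string" } },
      required: ["issue"],
      additionalProperties: false,
    },
  },
  {
    type: "function",
    name: "generate_quote",
    description: "Build a structured estimate from part IDs and labor hours.",
    strict: true,
    parameters: {
      type: "object",
      properties: {
        partIds: { type: "array", items: { type: "string" } },
        laborHours: { type: "number" },
      },
      required: ["partIds", "laborHours"],
      additionalProperties: false,
    },
  },
];

// Hypothetical in-memory catalog standing in for the real pricing store.
interface CatalogPart {
  id: string;
  name: string;
  priceCents: number;
  keywords: string[];
}

const CATALOG: CatalogPart[] = [
  { id: "bp-100", name: "Front brake pad set", priceCents: 8900, keywords: ["brake", "squeal"] },
  { id: "ro-200", name: "Front brake rotor", priceCents: 12400, keywords: ["brake", "vibration"] },
  { id: "bt-300", name: "12V battery", priceCents: 15900, keywords: ["battery", "start"] },
];

// Naive keyword match; a real lookup would likely be more forgiving.
function lookupParts(issue: string): CatalogPart[] {
  const text = issue.toLowerCase();
  return CATALOG.filter((part) => part.keywords.some((k) => text.includes(k)));
}

// Route a tool call (name + JSON-encoded arguments) to a local handler and
// return the JSON string that would be sent back as the tool output.
// Only lookup_parts is handled here; generate_quote would be routed the same way.
function dispatchToolCall(name: string, argsJson: string): string {
  const args = JSON.parse(argsJson) as Record<string, unknown>;
  switch (name) {
    case "lookup_parts":
      return JSON.stringify(lookupParts(String(args.issue)));
    default:
      return JSON.stringify({ error: `unknown tool: ${name}` });
  }
}
```

With `strict: true`, every property has to appear in `required` and `additionalProperties` has to be `false`, which is why both schemas are written that way.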
## Architecture
```
Twilio Phone
    │
    ▼ (HTTP webhook)
Next.js /api/twilio/voice → returns TwiML with <Stream> pointing to Fastify WS
    │
    ▼ (WebSocket Media Stream)
Fastify /media-stream
    │
    ├─ TwilioMediaStreamHandler (@reaatech/voice-agent-telephony)
    │   ├─ receives audio chunks
    │   ├─ emits audio:received / barge-in:detected / call:start / call:end
    │   └─ sends TTS audio back to Twilio
    │
    ├─ Deepgram STT (@reaatech/voice-agent-stt)
    │   └─ streaming transcription → interim + final utterances
    │
    ├─ Pipeline (@reaatech/voice-agent-core)
    │   ├─ session management (SessionManager)
    │   ├─ latency budget enforcement (LatencyBudgetEnforcer)
    │   └─ staged: STT → MCP → TTS
    │
    ├─ QuoteMCPClient (custom)
    │   └─ OpenAI Responses API with tool calling:
    │       • lookup_parts → query part catalog
    │       • generate_quote → build structured estimate
    │
    ├─ Deepgram TTS (@reaatech/voice-agent-tts)
    │   └─ streaming synthesis → mulaw 8kHz audio
    │
    ├─ AgentMemory (@reaatech/agent-memory)
    │   └─ extract + store + retrieve customer memories
    │
    └─ CacheEngine (@reaatech/llm-cache)
        └─ exact + semantic cache for LLM calls
```
Observability is provided by Langfuse (the `langfuse` package) and OpenTelemetry (via `initializeObservability` from `@reaatech/voice-agent-core`).
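The first hop in the diagram is the voice webhook answering with TwiML that tells Twilio to open a Media Stream. Below is a dependency-free sketch of what that response body might look like; `buildStreamTwiml` and the example URL are hypothetical, and the recipe's own route may well build the document differently (the `twilio` SDK's `VoiceResponse` helper can produce the same XML).

```typescript
// Escape characters that may not appear verbatim inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that asks Twilio to open a bidirectional Media Stream.
// `wsUrl` is an assumed value: the public wss:// address of the Fastify server.
function buildStreamTwiml(wsUrl: string): string {
  if (!wsUrl.startsWith("wss://")) {
    throw new Error("Twilio Media Streams require a wss:// URL");
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXmlAttr(wsUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

console.log(buildStreamTwiml("wss://example.ngrok.app/media-stream"));
```

A route handler would return this string with a `Content-Type: text/xml` header. `<Connect><Stream>` is used rather than `<Start><Stream>` because the agent has to send synthesized audio back over the same socket.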
## Packages used
| Package | Role |
|---|---|
| `@reaatech/voice-agent-core` | Pipeline orchestration, session management, latency budget |
| `@reaatech/voice-agent-stt` | Streaming STT via Deepgram (nova-2) |
| `@reaatech/voice-agent-tts` | Streaming TTS via Deepgram (asteria voice) |
| `@reaatech/voice-agent-telephony` | Twilio Media Streams WebSocket handler |
| `@reaatech/agent-memory` | Long-term memory with semantic search |
| `@reaatech/llm-cache` | Exact + semantic LLM response cache |
| `openai` | OpenAI Responses API (quote intelligence) |
| `twilio` | Twilio SDK for webhook auth |
| `zod` | Schema validation for config and types |
| `langfuse` | LLM tracing and observability |
| `fastify` / `@fastify/websocket` | WebSocket server for Twilio Media Streams |
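To make the "exact + semantic" cache role concrete, here is a minimal two-tier lookup sketch: a normalized exact-match map first, then a cosine-similarity scan over stored embeddings. This is not the `@reaatech/llm-cache` API; `TwoTierCache`, the `0.9` threshold, and the toy embedding function are all assumptions made for illustration.

```typescript
type Embed = (text: string) => number[];

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Lowercase, drop punctuation, and collapse whitespace so trivial
// variations of the same question share one exact-match key.
function normalizeQuery(query: string): string {
  return query.toLowerCase().replace(/[^a-z0-9\s]/g, "").replace(/\s+/g, " ").trim();
}

interface CacheHit {
  value: string;
  tier: "exact" | "semantic";
}

class TwoTierCache {
  private exact = new Map<string, string>();
  private entries: { vector: number[]; value: string }[] = [];
  private embed: Embed;
  private threshold: number;

  constructor(embed: Embed, threshold = 0.9) {
    this.embed = embed;
    this.threshold = threshold;
  }

  set(query: string, value: string): void {
    const key = normalizeQuery(query);
    this.exact.set(key, value);
    this.entries.push({ vector: this.embed(key), value });
  }

  get(query: string): CacheHit | undefined {
    const key = normalizeQuery(query);
    const exactHit = this.exact.get(key);
    if (exactHit !== undefined) return { value: exactHit, tier: "exact" };

    // Fall back to the most similar stored entry above the threshold.
    const vector = this.embed(key);
    let best: { value: string; score: number } | undefined;
    for (const entry of this.entries) {
      const score = cosine(vector, entry.vector);
      if (score >= this.threshold && (!best || score > best.score)) {
        best = { value: entry.value, score };
      }
    }
    return best ? { value: best.value, tier: "semantic" } : undefined;
  }
}

// Toy bag-of-words embedding, only to make the sketch runnable.
const VOCAB = ["brake", "noise", "squeal", "battery", "oil"];
const toyEmbed: Embed = (text) => {
  const words = text.split(" ");
  return VOCAB.map((w) => (words.includes(w) ? 1 : 0));
};
```

A real deployment would swap `toyEmbed` for an embedding model and the linear scan for a vector index; the two-tier order (cheap exact check before the similarity search) is the part worth keeping.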
## Running locally
### Prerequisites
- Node.js >= 22
- pnpm 10.x
- Ngrok (or a public URL for Twilio webhooks)
- Accounts: Twilio, Deepgram, OpenAI, (optional) Langfuse
### Setup
```bash
# 1. Install dependencies
pnpm install

# 2. Copy and fill in env vars
cp .env.example .env
# Edit .env with your API keys

# 3. Start the Fastify WebSocket server
pnpm tsx src/server/ws-server.ts

# 4. In another terminal, start Next.js
pnpm dev

# 5. Expose your local servers via ngrok (one tunnel per port, each in its own terminal)
ngrok http 3000   # Next.js (for the Twilio webhook)
ngrok http 3001   # Fastify WebSocket server (for Twilio Media Streams)

# 6. Configure your Twilio phone number:
#    - Voice webhook:   https://<ngrok-url>/api/twilio/voice  (POST)
#    - Status callback: https://<ngrok-url>/api/twilio/status (POST)
```
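A missing key in `.env` is much easier to diagnose at startup than mid-call. The sketch below shows one way to fail fast; the variable names are guesses (check `.env.example` for the real ones), and it uses plain TypeScript instead of the `zod` schema the recipe lists, purely to stay dependency-free.

```typescript
// Hypothetical variable names; the authoritative list is in .env.example.
const REQUIRED_VARS = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "DEEPGRAM_API_KEY",
  "OPENAI_API_KEY",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type EnvConfig = Record<RequiredVar, string>;

// Validate a key/value source (for example process.env) and return a typed
// config object, or throw one error that names every missing variable.
function loadEnv(source: Record<string, string | undefined>): EnvConfig {
  const missing = REQUIRED_VARS.filter((name) => !source[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  const config = {} as EnvConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = source[name] as string;
  }
  return config;
}
```

Calling `loadEnv(process.env)` once at server startup reports all missing keys in a single error instead of one failure per restart.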
### Making a test call
Call your Twilio number. The voice assistant will answer, ask about the repair issue, and generate a quote. Speak naturally, for example: "How much to fix my brake noise? It's a 2018 Honda Civic."
## Project layout
```
app/api/
  twilio/voice/route.ts     Twilio voice webhook (TwiML)
  twilio/status/route.ts    Twilio status callback
  quotes/route.ts           Quote CRUD API
  quotes/[id]/route.ts      Single quote API
  parts/route.ts            Parts catalog API
  calls/route.ts            Call history API
app/page.tsx                Dashboard server component
src/
  config/
    env.ts                  Environment config loader
    pipeline-config.ts      Voice agent kit config
  lib/
    openai-client.ts        OpenAI client singleton
    twilio-client.ts        Twilio client singleton
    telemetry.ts            Langfuse + OTel observability
  services/
    pricing-store.ts        Part catalog data
    part-pricing.ts         Pricing lookup and calculations
    openai-tools.ts         OpenAI tool definitions
    quote-engine.ts         Quote generation orchestration
    quote-mcp-client.ts     MCP client for voice pipeline
    voice-pipeline.ts       Voice pipeline orchestration
    memory-service.ts       AgentMemory wrapper
    cache-service.ts        CacheEngine wrapper
  server/
    ws-server.ts            Fastify WebSocket server
  types/
    quote.ts                Part, Quote, QuoteLineItem, QuoteInput schemas
    call.ts                 CallRecord, ConversationTurn schemas
    config.ts               AppConfig schema
tests/                      Vitest suite (63 tests, 90%+ coverage)
```
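As a rough illustration of what the `Quote` and `QuoteLineItem` types in `src/types/quote.ts` might carry, here is a sketch of a structured estimate with totals. The field names, the tax handling, and `buildQuote` are assumptions; the recipe's actual schemas are defined with `zod` and may differ.

```typescript
// Field names are assumptions; the real schemas live in src/types/quote.ts.
interface QuoteLineItem {
  description: string;
  quantity: number;
  unitPriceCents: number;
}

interface Quote {
  lineItems: QuoteLineItem[];
  subtotalCents: number;
  taxCents: number;
  totalCents: number;
}

// Sum the line items and apply a flat tax rate. Amounts are kept in integer
// cents so repeated additions do not accumulate floating-point error.
function buildQuote(lineItems: QuoteLineItem[], taxRate: number): Quote {
  const subtotalCents = Math.round(
    lineItems.reduce((sum, item) => sum + item.quantity * item.unitPriceCents, 0),
  );
  const taxCents = Math.round(subtotalCents * taxRate);
  return { lineItems, subtotalCents, taxCents, totalCents: subtotalCents + taxCents };
}

// Format cents as a dollar string suitable for reading back to the caller.
function formatCents(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

const exampleQuote = buildQuote(
  [
    { description: "Front brake pad set", quantity: 1, unitPriceCents: 8900 },
    { description: "Labor (hours)", quantity: 1.5, unitPriceCents: 12000 },
  ],
  0.08,
);
console.log(formatCents(exampleQuote.totalCents));
```

Keeping the line items alongside the totals lets the same object be read aloud over TTS and returned from the quotes API without recomputing anything.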
## Development
```bash
pnpm test # vitest run with coverage
pnpm typecheck # tsc --noEmit
pnpm lint # eslint
pnpm build # next build
```
## License
MIT — see [LICENSE](./LICENSE).