# Voice quote assistant for auto-repair shops
 
> Convert inbound 'how much to fix X' calls into structured estimates without tying up your service advisor.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI voice agents with the `@reaatech/*` package family and OpenAI.
 
## Problem
 
When a customer calls asking "How much to fix my brake noise?", the service advisor has to drop everything, walk to the bay, and interrupt a mechanic. That hurts bay productivity and leaves the customer waiting on hold. Worse, the advisor often has no part-pricing data at hand, so the quote is a rough ballpark that later gets disputed.
 
This recipe solves that by providing an AI voice agent that:
1. Answers inbound calls via Twilio telephony
2. Transcribes the customer's speech in real time (Deepgram STT)
3. Interprets the repair issue using the OpenAI Responses API
4. Looks up parts and pricing from a catalog
5. Generates a structured quote and reads it back (Deepgram TTS)
6. Remembers customer context across calls (AgentMemory)
7. Caches frequent queries for efficiency (CacheEngine)
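
Steps 4 and 5 can be pictured with a small sketch. It is not the code in `src/services/`: the `Part`, `QuoteLineItem`, and `Quote` shapes, the catalog entries, the keyword matching, and the default labor rate are all assumptions made up for illustration.

```typescript
// Hypothetical shapes; the repo's real schemas live in src/types/quote.ts.
interface Part {
  sku: string;
  name: string;
  unitPrice: number; // USD
  keywords: string[];
}

interface QuoteLineItem {
  description: string;
  quantity: number;
  unitPrice: number;
  total: number;
}

interface Quote {
  vehicle: string;
  lineItems: QuoteLineItem[];
  laborHours: number;
  laborTotal: number;
  total: number;
}

// Made-up catalog entries, for illustration only.
const CATALOG: Part[] = [
  { sku: "BRK-PAD-01", name: "Front brake pad set", unitPrice: 60, keywords: ["brake", "pad"] },
  { sku: "BRK-ROT-01", name: "Front brake rotor", unitPrice: 45, keywords: ["brake", "rotor"] },
  { sku: "BAT-12V-01", name: "12V battery", unitPrice: 120, keywords: ["battery"] },
];

// Step 4: find catalog parts whose keywords appear in the issue description.
function lookupParts(issue: string, catalog: Part[] = CATALOG): Part[] {
  const words = issue.toLowerCase().split(/\W+/);
  return catalog.filter((part) => part.keywords.some((k) => words.includes(k)));
}

// Step 5: turn selected parts plus a labor estimate into a structured quote.
function generateQuote(
  vehicle: string,
  items: { part: Part; quantity: number }[],
  laborHours: number,
  laborRate = 100, // assumed hourly rate in USD
): Quote {
  const lineItems: QuoteLineItem[] = items.map(({ part, quantity }) => ({
    description: part.name,
    quantity,
    unitPrice: part.unitPrice,
    total: part.unitPrice * quantity,
  }));
  const partsTotal = lineItems.reduce((sum, item) => sum + item.total, 0);
  const laborTotal = laborHours * laborRate;
  return { vehicle, lineItems, laborHours, laborTotal, total: partsTotal + laborTotal };
}
```

In the voice flow, the model would presumably choose the parts and labor hours through tool calls, and the structured `Quote` would then be turned into a sentence for TTS.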
 
## Architecture
 
```
Twilio Phone
    │
    ▼ (HTTP webhook)
Next.js /api/twilio/voice → returns TwiML with <Stream> pointing to Fastify WS
    │
    ▼ (WebSocket Media Stream)
Fastify /media-stream
    │
    ├─ TwilioMediaStreamHandler (@reaatech/voice-agent-telephony)
    │   ├─ receives audio chunks
    │   ├─ emits audio:received / barge-in:detected / call:start / call:end
    │   └─ sends TTS audio back to Twilio
    │
    ├─ Deepgram STT (@reaatech/voice-agent-stt)
    │   └─ streaming transcription → interim + final utterances
    │
    ├─ Pipeline (@reaatech/voice-agent-core)
    │   ├─ session management (SessionManager)
    │   ├─ latency budget enforcement (LatencyBudgetEnforcer)
    │   └─ staged: STT → MCP → TTS
    │
    ├─ QuoteMCPClient (custom)
    │   └─ OpenAI Responses API with tool calling:
    │       • lookup_parts → query part catalog
    │       • generate_quote → build structured estimate
    │
    ├─ Deepgram TTS (@reaatech/voice-agent-tts)
    │   └─ streaming synthesis → mulaw 8kHz audio
    │
    ├─ AgentMemory (@reaatech/agent-memory)
    │   └─ extract + store + retrieve customer memories
    │
    └─ CacheEngine (@reaatech/llm-cache)
        └─ exact + semantic cache for LLM calls
```
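
To make the `/media-stream` leg more concrete, here is a minimal sketch of the JSON messages that Twilio Media Streams exchange over the WebSocket, and how they might map onto the event names in the diagram. The `start`, `media`, `stop`, and `clear` message shapes follow Twilio's Media Streams protocol. `parseTwilioMessage`, `outboundAudio`, and `clearPlayback` are hypothetical helpers, not the API of `TwilioMediaStreamHandler`, which may handle this differently.

```typescript
// Events a handler might emit, named after the diagram above (mapping is assumed).
type InboundEvent =
  | { kind: "call:start"; streamSid: string; callSid: string }
  | { kind: "audio:received"; streamSid: string; payloadBase64: string }
  | { kind: "call:end"; streamSid: string }
  | { kind: "ignored" };

// Parse one raw WebSocket text frame sent by Twilio.
function parseTwilioMessage(raw: string): InboundEvent {
  let msg: any;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: "ignored" }; // not JSON, nothing to do
  }
  switch (msg?.event) {
    case "start": {
      const streamSid = msg.start?.streamSid;
      const callSid = msg.start?.callSid;
      if (typeof streamSid === "string" && typeof callSid === "string") {
        return { kind: "call:start", streamSid, callSid };
      }
      return { kind: "ignored" };
    }
    case "media": {
      // media.payload is base64-encoded audio (mulaw, 8 kHz for phone calls).
      const payload = msg.media?.payload;
      if (typeof msg.streamSid === "string" && typeof payload === "string") {
        return { kind: "audio:received", streamSid: msg.streamSid, payloadBase64: payload };
      }
      return { kind: "ignored" };
    }
    case "stop":
      return typeof msg.streamSid === "string"
        ? { kind: "call:end", streamSid: msg.streamSid }
        : { kind: "ignored" };
    default:
      return { kind: "ignored" }; // e.g. "connected" or "mark"
  }
}

// Frame that plays one chunk of synthesized audio back to the caller.
function outboundAudio(streamSid: string, payloadBase64: string): string {
  return JSON.stringify({ event: "media", streamSid, media: { payload: payloadBase64 } });
}

// Frame that drops audio Twilio has buffered, useful when the caller barges in.
function clearPlayback(streamSid: string): string {
  return JSON.stringify({ event: "clear", streamSid });
}
```

Keeping the parser as a pure function makes it easy to unit test without a live call; the WebSocket handler then only has to forward each frame and act on the returned event.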
 
Observability is provided by Langfuse (`langfuse`) and OpenTelemetry, the latter set up through `initializeObservability` from `@reaatech/voice-agent-core`.
 
## Packages used
 
| Package | Role |
|---|---|
| `@reaatech/voice-agent-core` | Pipeline orchestration, session management, latency budget |
| `@reaatech/voice-agent-stt` | Streaming STT via Deepgram (nova-2) |
| `@reaatech/voice-agent-tts` | Streaming TTS via Deepgram (asteria voice) |
| `@reaatech/voice-agent-telephony` | Twilio Media Streams WebSocket handler |
| `@reaatech/agent-memory` | Long-term memory with semantic search |
| `@reaatech/llm-cache` | Exact + semantic LLM response cache |
| `openai` | OpenAI Responses API (quote intelligence) |
| `twilio` | Twilio SDK for webhook auth |
| `zod` | Schema validation for config and types |
| `langfuse` | LLM tracing and observability |
| `fastify` / `@fastify/websocket` | WebSocket server for Twilio Media Streams |
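
As a rough illustration of what "exact + semantic" caching means, the sketch below checks a normalized-string key first and falls back to a nearest-neighbor match on embeddings. It does not use or mirror the `@reaatech/llm-cache` API: `TwoTierCache`, the `Embed` function type, and the 0.9 similarity threshold are assumptions for this example only.

```typescript
// Hypothetical embedding function; a real one would call an embedding model.
type Embed = (text: string) => number[];

interface SemanticEntry {
  embedding: number[];
  response: string;
}

interface CacheHit {
  response: string;
  tier: "exact" | "semantic";
}

// Collapse case and whitespace so trivially different prompts share a key.
function normalize(prompt: string): string {
  return prompt.trim().toLowerCase().replace(/\s+/g, " ");
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class TwoTierCache {
  private readonly exact = new Map<string, string>();
  private readonly semantic: SemanticEntry[] = [];
  private readonly embed: Embed;
  private readonly threshold: number;

  constructor(embed: Embed, threshold = 0.9) {
    this.embed = embed;
    this.threshold = threshold;
  }

  set(prompt: string, response: string): void {
    this.exact.set(normalize(prompt), response);
    this.semantic.push({ embedding: this.embed(prompt), response });
  }

  get(prompt: string): CacheHit | undefined {
    // Tier 1: exact match on the normalized prompt.
    const exactHit = this.exact.get(normalize(prompt));
    if (exactHit !== undefined) return { response: exactHit, tier: "exact" };

    // Tier 2: most similar stored prompt at or above the threshold.
    const query = this.embed(prompt);
    let best: SemanticEntry | undefined;
    let bestScore = this.threshold;
    for (const entry of this.semantic) {
      const score = cosineSimilarity(query, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? { response: best.response, tier: "semantic" } : undefined;
  }
}
```

The exact tier is cheap and never returns a wrong answer for a different question. The semantic tier trades some of that safety for a higher hit rate, so the threshold deserves tuning against real call transcripts.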
 
## Running locally
 
### Prerequisites
 
- Node.js >= 22
- pnpm 10.x
- ngrok (or another public URL for Twilio webhooks)
- Accounts: Twilio, Deepgram, OpenAI, and (optionally) Langfuse
 
### Setup
 
```bash
# 1. Install dependencies
pnpm install
 
# 2. Copy and fill in env vars
cp .env.example .env
# Edit .env with your API keys
 
# 3. Start the Fastify WebSocket server
pnpm tsx src/server/ws-server.ts
 
# 4. In another terminal, start Next.js
pnpm dev
 
# 5. Expose your local server via ngrok
ngrok http 3000   # Next.js (for Twilio webhook)
# Also expose the WebSocket port 3001 via a separate ngrok tunnel
 
# 6. Configure your Twilio phone number:
#    - Voice webhook: https://<ngrok-url>/api/twilio/voice (POST)
#    - Status callback: https://<ngrok-url>/api/twilio/status (POST)
```
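
The voice webhook from step 6 has to answer with TwiML whose `<Stream>` URL points at the public address of the WebSocket tunnel, not the Next.js one. Below is a minimal sketch of that response body, built by hand so it has no dependencies. `buildStreamTwiml` and `escapeXmlAttr` are hypothetical names, and the repo's `route.ts` may build its TwiML differently, for example with the `twilio` SDK's `VoiceResponse` helper.

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that tells Twilio to open a Media Stream to the given URL.
function buildStreamTwiml(wsUrl: string): string {
  // Twilio Media Streams only connect to secure WebSocket (wss://) endpoints.
  if (!wsUrl.startsWith("wss://")) {
    throw new Error(`Expected a wss:// URL, got: ${wsUrl}`);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXmlAttr(wsUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

A route handler would return this string with a `Content-Type: text/xml` header, passing something like `wss://<ws-ngrok-host>/media-stream` as the URL. `<Connect><Stream>` opens a bidirectional stream, which is what allows TTS audio to be sent back to the caller on the same socket.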
 
### Making a test call
 
Call your Twilio number. The voice assistant will answer, ask about the repair issue, and generate a quote. Speak naturally, for example: "How much to fix my brake noise? It's a 2018 Honda Civic."
 
## Project layout
 
```
app/api/
  twilio/voice/route.ts       Twilio voice webhook (TwiML)
  twilio/status/route.ts      Twilio status callback
  quotes/route.ts             Quote CRUD API
  quotes/[id]/route.ts        Single quote API
  parts/route.ts              Parts catalog API
  calls/route.ts              Call history API
app/page.tsx                  Dashboard server component
src/
  config/
    env.ts                    Environment config loader
    pipeline-config.ts        Voice agent kit config
  lib/
    openai-client.ts          OpenAI client singleton
    twilio-client.ts          Twilio client singleton
    telemetry.ts              Langfuse + OTel observability
  services/
    pricing-store.ts          Part catalog data
    part-pricing.ts           Pricing lookup and calculations
    openai-tools.ts           OpenAI tool definitions
    quote-engine.ts           Quote generation orchestration
    quote-mcp-client.ts       MCP client for voice pipeline
    voice-pipeline.ts         Voice pipeline orchestration
    memory-service.ts         AgentMemory wrapper
    cache-service.ts          CacheEngine wrapper
  server/
    ws-server.ts              Fastify WebSocket server
  types/
    quote.ts                  Part, Quote, QuoteLineItem, QuoteInput schemas
    call.ts                   CallRecord, ConversationTurn schemas
    config.ts                 AppConfig schema
tests/                        Vitest suite (63 tests, 90%+ coverage)
```
 
## Development
 
```bash
pnpm test          # vitest run with coverage
pnpm typecheck     # tsc --noEmit
pnpm lint          # eslint
pnpm build         # next build
```
 
## License
 
MIT — see [LICENSE](./LICENSE).