When a customer calls asking 'How much to fix my brake noise?', the service advisor has to drop everything, walk to the bay, and interrupt a mechanic. That kills bay productivity and makes the customer wait on hold. Worse, the advisor often lacks the part-pricing data at their fingertips, leading to rough ballpark quotes that later get disputed. The shop owner sees advisor burnout and lost revenue from calls that never converted.
Convert inbound “how much to fix X” phone calls into structured repair estimates without tying up your service advisor. This tutorial builds a voice quote assistant for auto-repair shops using Next.js, Twilio telephony, Deepgram STT/TTS, the OpenAI Responses API, and the REAA voice agent packages. By the end, you’ll have a working system that accepts Twilio calls, transcribes speech, looks up parts and pricing, generates a quote, and reads it back — with caching, agent memory, and Langfuse observability.
Prerequisites
Node.js 22+ and pnpm 10 installed
An OpenAI API key with access to gpt-5.2-mini and text-embedding-3-small
A Twilio account with an inbound phone number
A Deepgram API key (for STT and TTS)
(Optional) A Langfuse project for observability
Step 1: Scaffold the Next.js project and install dependencies
Create a fresh Next.js project with the App Router, then install all the dependencies this recipe needs.
Expected output: Running pnpm typecheck on an empty scaffold passes. All deps are pinned to exact versions (no ^ or ~ prefixes).
Step 2: Define the data types with Zod
Create the src/types/ directory and define the core schemas. These types represent the part catalog, repair quotes, call records, and app configuration — every module in the system reads from these.
src/types/quote.ts — Part, QuoteLineItem, Quote, and QuoteInput:
A barrel file (assumed here to be src/types/index.ts, since it imports from ./quote.js, ./call.js, and ./config.js) re-exports the types and schemas:
```ts
export type { Part, Quote, QuoteLineItem, QuoteInput } from './quote.js';
export type { CallRecord, ConversationTurn } from './call.js';
export type { AppConfig } from './config.js';
export {
  PartSchema,
  QuoteSchema,
  QuoteLineItemSchema,
  QuoteInputSchema,
} from './quote.js';
export { CallRecordSchema, ConversationTurnSchema } from './call.js';
export { AppConfigSchema } from './config.js';
```
Expected output: pnpm typecheck passes with no errors.
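As a dependency-free illustration of the data these schemas describe, the sketch below spells out a Part shape and a hand-written runtime check. The field names are inferred from how PricingService reads parts later in this tutorial; the PartSketch and isPart names are hypothetical, and the tutorial's own schemas use Zod rather than a manual guard.

```typescript
// Field names inferred from PricingService usage; this is not the tutorial's Zod schema.
interface PartSketch {
  id: string;
  name: string;
  price: number; // part price in dollars
  laborHours: number;
  laborCost: number; // labor charge in dollars
}

// True for finite numbers that are zero or positive.
function isNonNegative(n: unknown): n is number {
  return typeof n === 'number' && Number.isFinite(n) && n >= 0;
}

// Hand-written runtime check standing in for a Zod parse.
function isPart(value: unknown): value is PartSketch {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' && v.id.length > 0 &&
    typeof v.name === 'string' && v.name.length > 0 &&
    isNonNegative(v.price) &&
    isNonNegative(v.laborHours) &&
    isNonNegative(v.laborCost)
  );
}
```

A guard like this rejects malformed catalog rows before they reach quote arithmetic, which is the same job the Zod schemas do with less code.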
Step 3: Environment config and client factories
Create a config loader that validates environment variables at startup, catching misconfiguration early.
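The loader's source is not shown here, so the following is only a minimal sketch of fail-fast validation. It assumes the variable names from .env.example, takes the environment as a parameter instead of reading process.env, and uses plain checks in place of the Zod-based AppConfigSchema. The names loadConfigFrom, deepgramApiKey, fastifyPort, and fastifyHost are assumptions.

```typescript
// Hypothetical config shape; only some of these field names appear in the tutorial's code.
interface AppConfigSketch {
  openaiApiKey: string;
  twilioAccountSid: string;
  twilioAuthToken: string;
  deepgramApiKey: string;
  langfusePublicKey?: string;
  langfuseSecretKey?: string;
  fastifyPort: number;
  fastifyHost: string;
}

type Env = Record<string, string | undefined>;

const REQUIRED_VARS = [
  'OPENAI_API_KEY',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
  'DEEPGRAM_API_KEY',
];

function loadConfigFrom(env: Env): AppConfigSketch {
  // Report every missing variable at once so a misconfigured deploy fails with one clear message.
  const missing = REQUIRED_VARS.filter((key) => (env[key] ?? '').trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const port = Number(env['FASTIFY_PORT'] ?? '3001');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('FASTIFY_PORT must be an integer between 1 and 65535');
  }
  return {
    openaiApiKey: env['OPENAI_API_KEY'] as string,
    twilioAccountSid: env['TWILIO_ACCOUNT_SID'] as string,
    twilioAuthToken: env['TWILIO_AUTH_TOKEN'] as string,
    deepgramApiKey: env['DEEPGRAM_API_KEY'] as string,
    langfusePublicKey: env['LANGFUSE_PUBLIC_KEY'], // optional: observability only
    langfuseSecretKey: env['LANGFUSE_SECRET_KEY'],
    fastifyPort: port,
    fastifyHost: env['FASTIFY_HOST'] ?? '0.0.0.0',
  };
}
```

Passing the environment in as an argument keeps the loader easy to unit test; a real entry point would call it once with process.env at startup.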
Now create the OpenAI and Twilio client factories using a singleton pattern.
src/lib/openai-client.ts:
```ts
import OpenAI from 'openai';
import type { AppConfig } from '../types/config.js';
import { loadConfig } from '../config/env.js';

let _client: OpenAI | undefined;

export function createOpenAIClient(config: AppConfig): OpenAI {
  return new OpenAI({ apiKey: config.openaiApiKey });
}

export function getOpenAIClient(): OpenAI {
  if (!_client) {
    _client = createOpenAIClient(loadConfig());
  }
  return _client;
}
```
src/lib/twilio-client.ts:
```ts
import twilio from 'twilio';
import type { AppConfig } from '../types/config.js';
import { loadConfig } from '../config/env.js';

let _client: twilio.Twilio | undefined;

export function createTwilioClient(config: AppConfig) {
  return twilio(config.twilioAccountSid, config.twilioAuthToken);
}

export function getTwilioClient() {
  if (!_client) {
    _client = createTwilioClient(loadConfig());
  }
  return _client;
}
```
Add a telemetry module that wires up Langfuse and OpenTelemetry.
src/lib/telemetry.ts:
```ts
import Langfuse from 'langfuse';
import { initializeObservability } from '@reaatech/voice-agent-core';
import type { AppConfig } from '../types/config.js';

export async function initObservability(config: AppConfig): Promise<void> {
  if (config.langfusePublicKey && config.langfuseSecretKey) {
    new Langfuse({
      publicKey: config.langfusePublicKey,
      secretKey: config.langfuseSecretKey,
    });
  }
  await initializeObservability({
    serviceName: 'openai-quote-assistant',
    serviceVersion: '0.1.0',
    enabled: true,
    otlpEndpoint: process.env.OTLP_ENDPOINT || 'http://localhost:4318/v1/traces',
  });
}
```
Create .env.example with placeholder values:
```env
# Env vars used by openai-quote-assistant.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
OPENAI_API_KEY=<your-openai-api-key>
TWILIO_ACCOUNT_SID=<your-twilio-account-sid>
TWILIO_AUTH_TOKEN=<your-twilio-auth-token>
DEEPGRAM_API_KEY=<your-deepgram-api-key>
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
FASTIFY_PORT=3001
FASTIFY_HOST=0.0.0.0
NODE_ENV=development
```
Step 4: Build the part pricing catalog and service
Auto-repair shops need a lookup table of common parts and labor costs. Create a catalog and a service that matches customer issues to parts.
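The catalog file itself is not reproduced here. One plausible shape, consistent with how PricingService iterates over it, is a map from category name to a list of parts. In the sketch below the part IDs, prices, hours, and the hourly labor rate are all hypothetical placeholders, and deriving laborCost from laborHours times a flat rate is an assumption.

```typescript
interface CatalogPart {
  id: string;
  name: string;
  price: number; // part price in dollars
  laborHours: number;
  laborCost: number; // derived: laborHours * hourly rate, rounded to cents
}

// Category name -> parts in that category.
type PartCatalogSketch = Record<string, CatalogPart[]>;

// Hypothetical shop rate in dollars per hour.
const HOURLY_LABOR_RATE = 120;

function makePart(id: string, name: string, price: number, laborHours: number): CatalogPart {
  const laborCost = Math.round(laborHours * HOURLY_LABOR_RATE * 100) / 100;
  return { id, name, price, laborHours, laborCost };
}

// Placeholder data for illustration only; real prices come from the shop.
const SAMPLE_CATALOG: PartCatalogSketch = {
  brakes: [
    makePart('brake-pads-front', 'Front brake pads', 89.99, 1.5),
    makePart('brake-rotors-front', 'Front brake rotors', 149.99, 2),
  ],
  battery: [makePart('battery-standard', 'Standard battery', 159.99, 0.5)],
};
```

Keying the catalog by category lets a spoken issue such as "brakes" map to every part in that group with a simple string match.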
src/services/part-pricing.ts — the PricingService class:
```ts
import { DEFAULT_PART_CATALOG, type PartCatalog } from './pricing-store.js';
import type { Part, QuoteLineItem } from '../types/quote.js';

export class PricingService {
  private catalog: PartCatalog;

  constructor(catalog: PartCatalog = DEFAULT_PART_CATALOG) {
    this.catalog = catalog;
  }

  // Returns every part in any category whose name overlaps a word of the issue.
  // The make and model arguments are accepted but not used yet.
  lookupParts(issue: string, _make?: string, _model?: string): Part[] {
    void _make;
    void _model;
    if (!issue || issue.trim().length === 0) {
      return [];
    }
    // Trim before splitting: leading or trailing whitespace would otherwise
    // produce an empty token, and '' is a substring of every category name.
    // Very short words (such as "a") can still over-match via includes().
    const lowerIssue = issue.trim().toLowerCase();
    const tokens = lowerIssue.split(/\s+/);
    const matched = new Map<string, Part>();
    for (const [category, parts] of Object.entries(this.catalog)) {
      const catLower = category.toLowerCase();
      const matches = tokens.some(t => catLower.includes(t) || t.includes(catLower));
      if (matches) {
        for (const part of parts) {
          matched.set(part.id, part);
        }
      }
    }
    return Array.from(matched.values());
  }

  getPartById(id: string): Part | undefined {
    for (const parts of Object.values(this.catalog)) {
      const found = parts.find(p => p.id === id);
      if (found) return found;
    }
    return undefined;
  }

  // Subtotal is part price plus labor cost, rounded to cents.
  calculateQuoteLineItem(part: Part): QuoteLineItem {
    const subtotal = Math.round((part.price + part.laborCost) * 100) / 100;
    return {
      partId: part.id,
      partName: part.name,
      partPrice: part.price,
      laborCost: part.laborCost,
      laborHours: part.laborHours,
      subtotal,
    };
  }

  calculateGrandTotal(lineItems: QuoteLineItem[]): {
    totalParts: number;
    totalLabor: number;
    grandTotal: number;
  } {
    const totalParts = Math.round(lineItems.reduce((sum, li) => sum + li.partPrice, 0) * 100) / 100;
    const totalLabor = Math.round(lineItems.reduce((sum, li) => sum + li.laborCost, 0) * 100) / 100;
    const grandTotal = Math.round((totalParts + totalLabor) * 100) / 100;
    return { totalParts, totalLabor, grandTotal };
  }
}
```
Expected output: lookupParts('brakes') returns the four brake-related parts with calculated labor costs.
Step 5: Build the QuoteEngine
The QuoteEngine uses the OpenAI Responses API to interpret a customer’s spoken issue (extracting vehicle info) and generates a structured quote from the pricing catalog.
src/services/quote-engine.ts:
```ts
import OpenAI from 'openai';
import { z } from 'zod';
import type { Quote, QuoteInput } from '../types/quote.js';
import { PricingService } from './part-pricing.js';
import type { CacheService } from './cache-service.js';

const InterpretedIssueSchema = z.object({
  issue: z.string(),
  make: z.string().nullable().optional(),
  model: z.string().nullable().optional(),
  year: z.number().nullable().optional(),
```
Expected output: Calling generateQuote({ issue: 'brakes', customerName: 'Alice' }) returns a Quote with populated line items and calculated totals.
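The interpretation step relies on the model returning JSON with the keys issue, make, model, and year. As a hedged sketch of how that text could be parsed defensively without Zod, the hypothetical parseInterpretedIssue helper below falls back to the caller's raw issue when the output is not usable JSON.

```typescript
interface InterpretedIssue {
  issue: string;
  make: string | null;
  model: string | null;
  year: number | null;
}

// Returns a trimmed non-empty string, or null for anything else.
function toNullableString(value: unknown): string | null {
  return typeof value === 'string' && value.trim() !== '' ? value.trim() : null;
}

// Parses the model's JSON text; never throws, so a bad completion cannot crash a call.
function parseInterpretedIssue(raw: string, fallbackIssue: string): InterpretedIssue {
  const fallback: InterpretedIssue = { issue: fallbackIssue, make: null, model: null, year: null };
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return fallback;
  }
  if (typeof data !== 'object' || data === null) return fallback;
  const d = data as Record<string, unknown>;
  const year = typeof d.year === 'number' && Number.isInteger(d.year) ? d.year : null;
  return {
    issue: toNullableString(d.issue) ?? fallbackIssue,
    make: toNullableString(d.make),
    model: toNullableString(d.model),
    year,
  };
}
```

Falling back to the original utterance keeps the pricing lookup working even when extraction fails, at the cost of losing the vehicle details for that call.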
Step 6: Add semantic caching and agent memory
The CacheService wraps the REAA llm-cache package to avoid redundant OpenAI calls. The MemoryService stores extracted facts and preferences from conversations.
Expected output: Both services import cleanly and their initialize() methods accept an AppConfig object.
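To make the idea of semantic caching concrete, here is a minimal in-memory sketch. It is not the REAA llm-cache API, whose interface is not shown in this tutorial. The class name, the injected embed function, and the 0.9 similarity threshold are assumptions; a real setup would compute embeddings asynchronously, for example with text-embedding-3-small.

```typescript
type Embed = (text: string) => number[];

// Cosine similarity of two vectors; returns 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class SemanticCacheSketch<V> {
  private entries: Array<{ vector: number[]; value: V }> = [];
  private embed: Embed;
  private threshold: number;

  constructor(embed: Embed, threshold = 0.9) {
    this.embed = embed;
    this.threshold = threshold;
  }

  set(query: string, value: V): void {
    this.entries.push({ vector: this.embed(query), value });
  }

  // Returns the value of the most similar stored query if it clears the threshold.
  get(query: string): V | undefined {
    const vector = this.embed(query);
    let best: { score: number; value: V } | undefined;
    for (const entry of this.entries) {
      const score = cosineSimilarity(vector, entry.vector);
      if (score >= this.threshold && (!best || score > best.score)) {
        best = { score, value: entry.value };
      }
    }
    return best?.value;
  }
}
```

Unlike an exact-match cache, this lets "noise from my brakes" reuse the answer stored for "brake noise", provided the embedding places the two phrasings close together.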
Step 7: Create OpenAI function tools and the MCP client
The function tools tell the model what operations it can perform. The QuoteMCPClient is a tool-calling loop that alternates between the model and your functions until the model produces a final response.
src/services/openai-tools.ts:
```ts
export const lookupPartsTool = {
  type: 'function' as const,
  name: 'lookup_parts',
  description: 'Look up auto-repair parts and pricing for a given issue and vehicle.',
  parameters: {
    type: 'object',
    properties: {
      issue: { type: 'string' },
      make: { type: 'string' },
      model: { type: 'string' },
      year: { type: 'number' },
    },
    required: ['issue'],
  },
  strict: false,
};

export const generateQuoteTool = {
  type: 'function' as const,
  name: 'generate_quote',
  description: 'Generate a formal repair quote from selected parts and customer info.',
  parameters: {
    type: 'object',
    properties: {
      lineItems: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            partName: { type: 'string' },
            partPrice: { type: 'number' },
            laborCost: { type: 'number' },
          },
        },
      },
      customerName: { type: 'string' },
    },
    required: ['lineItems'],
  },
  strict: false,
};
```
src/services/quote-mcp-client.ts:
```ts
import OpenAI from 'openai';
import type { ResponseInput } from 'openai/resources/responses/responses';
import { lookupPartsTool, generateQuoteTool } from './openai-tools.js';
import { PricingService } from './part-pricing.js';
import { QuoteEngine } from './quote-engine.js';
import type { Quote, QuoteInput } from '../types/quote.js';

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
  result?: Record<string, unknown>;
}
```
Expected output: The client loops up to 5 times, calling the model with the two tools and routing tool outputs back into the conversation.
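The loop described above can be sketched independently of the OpenAI SDK. In this simplified version the model is an injected function and everything is synchronous so the control flow is easy to follow; the real client awaits the Responses API instead. All names here (runToolLoop, ModelTurn, LoopMessage) are hypothetical, and only the five-iteration cap comes from the tutorial.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => unknown;

// One model turn: either a request to call a tool or a final answer.
type ModelTurn =
  | { type: 'tool_call'; name: string; args: ToolArgs }
  | { type: 'final'; text: string };

interface LoopMessage {
  role: 'user' | 'tool';
  name?: string;
  content: string;
}

type CallModel = (history: LoopMessage[]) => ModelTurn;

function runToolLoop(
  callModel: CallModel,
  tools: Record<string, ToolHandler>,
  userText: string,
  maxIterations = 5,
): { text: string; iterations: number } {
  const history: LoopMessage[] = [{ role: 'user', content: userText }];
  for (let i = 1; i <= maxIterations; i++) {
    const turn = callModel(history);
    if (turn.type === 'final') {
      return { text: turn.text, iterations: i };
    }
    // Run the requested tool and feed its output back to the model on the next turn.
    const handler = tools[turn.name];
    const result = handler ? handler(turn.args) : { error: `Unknown tool: ${turn.name}` };
    history.push({ role: 'tool', name: turn.name, content: JSON.stringify(result) });
  }
  // The cap prevents a model that keeps requesting tools from holding the call open forever.
  return { text: 'Sorry, I could not complete the quote.', iterations: maxIterations };
}
```

Returning an error object for an unknown tool, rather than throwing, gives the model a chance to recover on the next turn.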
Step 8: Wire the voice pipeline and Fastify WebSocket server
Bring together STT, TTS, the MCP client, latency budgets, and session management into a VoicePipelineService. Then create a Fastify WebSocket server to handle Twilio Media Streams.
src/services/voice-pipeline.ts:
```ts
import {
  createPipeline,
  createLatencyBudget,
  initializeSessionManager,
  LatencyBudgetEnforcer,
  type AudioChunk,
  type PipelineEvent,
  type Pipeline,
  type SessionManager,
  type STTProvider,
  type TTSProvider,
  type Session,
} from '@reaatech/voice-agent-core';
import { createSTTProvider } from '@reaatech/voice-agent-stt';
import { createTTSProvider, TTSProviderInterface } from '@reaatech/voice-agent-tts';
import { createVoiceAgentKitConfig } from '../config/pipeline-config.js';
import type { AppConfig } from '../types/config.js';
import { QuoteMCPClient } from './quote-mcp-client.js';

export class VoicePipelineService {
  public pipeline!: Pipeline
```
src/server/ws-server.ts — the Fastify WebSocket server:
```ts
import Fastify, { type FastifyInstance } from 'fastify';
import fastifyWebsocket from '@fastify/websocket';
import type WebSocket from 'ws';
import { createTwilioHandler } from '@reaatech/voice-agent-telephony';
import type { AudioChunk } from '@reaatech/voice-agent-core';
import type { AppConfig } from '../types/config.js';
import type { VoicePipelineService } from '../services/voice-pipeline.js';
import type { MemoryService } from '../services/memory-service.js';
import type { CallRecord } from '../types/call.js';

export class WebSocketServer
```
Expected output: The Fastify server starts on port 3001, serves a /health endpoint, and accepts WebSocket connections at /media-stream for Twilio Media Streams.
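Twilio Media Streams sends JSON text frames over the WebSocket, each tagged with an event field such as connected, start, media, or stop; media frames carry base64-encoded audio in media.payload. In this tutorial that parsing is presumably handled by createTwilioHandler, so the following is only an illustrative sketch of what a handler has to distinguish. The StreamEvent type and parseTwilioStreamMessage function are hypothetical names.

```typescript
type StreamEvent =
  | { kind: 'start'; streamSid: string; callSid: string }
  | { kind: 'media'; streamSid: string; payloadBase64: string }
  | { kind: 'stop'; streamSid: string }
  | { kind: 'ignored'; reason: string };

// Reads obj[key] when obj is an object and the value is a string.
function readString(obj: unknown, key: string): string | undefined {
  if (typeof obj !== 'object' || obj === null) return undefined;
  const value = (obj as Record<string, unknown>)[key];
  return typeof value === 'string' ? value : undefined;
}

function parseTwilioStreamMessage(raw: string): StreamEvent {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: 'ignored', reason: 'invalid JSON' };
  }
  const event = readString(msg, 'event');
  const streamSid = readString(msg, 'streamSid');
  // The initial "connected" frame has no streamSid, so it lands here.
  if (!event || !streamSid) {
    return { kind: 'ignored', reason: event ?? 'missing event' };
  }
  // Each frame nests its details under a key named after the event.
  const body = (msg as Record<string, unknown>)[event];
  if (event === 'start') {
    const callSid = readString(body, 'callSid');
    return callSid ? { kind: 'start', streamSid, callSid } : { kind: 'ignored', reason: 'start without callSid' };
  }
  if (event === 'media') {
    const payload = readString(body, 'payload');
    return payload ? { kind: 'media', streamSid, payloadBase64: payload } : { kind: 'ignored', reason: 'media without payload' };
  }
  if (event === 'stop') {
    return { kind: 'stop', streamSid };
  }
  return { kind: 'ignored', reason: event };
}
```

A server would typically open a session on start, forward each decoded media payload to the STT provider, and tear the session down on stop.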
Step 9: Create the Next.js API routes
These route handlers serve as the HTTP interface for quote CRUD, part catalog queries, call tracking, and Twilio voice webhooks.
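The voice webhook route is not reproduced in this section. Its core job is to answer Twilio's incoming-call request with TwiML that connects the call audio to the /media-stream WebSocket. The sketch below builds that TwiML as a plain string; buildStreamTwiml and escapeXml are hypothetical helper names, and the host in the usage note is a placeholder.

```typescript
// Escapes the five XML special characters so the URL is safe inside an attribute.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Builds TwiML that opens a bidirectional Media Stream to the given WebSocket URL.
function buildStreamTwiml(streamUrl: string): string {
  // Twilio only connects Media Streams to secure WebSocket endpoints.
  if (!streamUrl.startsWith('wss://')) {
    throw new Error('Media Stream URL must start with wss://');
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Connect>',
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    '  </Connect>',
    '</Response>',
  ].join('\n');
}
```

A route handler could return buildStreamTwiml('wss://your-host.example/media-stream') with a Content-Type of text/xml; during local development the host is usually a tunnel that forwards to the Fastify server on port 3001.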
app/api/twilio/status/route.ts — Twilio status callback that records call outcomes:
```ts
import { type NextRequest, NextResponse } from 'next/server';

export const callRecords: Array<{
  callSid: string;
  status: string;
  startedAt?: string;
  endedAt?: string;
  durationSeconds?: number;
}> = [];

export async function POST(req: NextRequest) {
  const formData = await req.formData();
  const callSid = formData.get('CallSid') as string;
  const callStatus = formData.get('CallStatus') as string;
  if (callSid) {
    const existing = callRecords.find(r => r.callSid === callSid);
    if (existing) {
      existing.status = callStatus;
      existing.endedAt = new Date().toISOString();
      if (existing.startedAt) {
        existing.durationSeconds = Math.floor(
          (new Date(existing.endedAt).getTime() - new Date(existing.startedAt).getTime()) / 1000,
        );
      }
    } else {
      const now = new Date().toISOString();
      callRecords.push({
        callSid,
        status: callStatus,
        startedAt: now,
        endedAt: now,
        durationSeconds: 0,
      });
    }
  }
  return NextResponse.json({ ok: true });
}
```
app/api/calls/route.ts — list and reset call records:
```ts
import { NextResponse } from 'next/server';
import { callRecords } from '../twilio/status/route.js';

export function GET() {
  const sorted = [...callRecords].sort(
    (a, b) => (b.startedAt || '').localeCompare(a.startedAt || ''),
  );
  return NextResponse.json(sorted);
}

export function DELETE() {
  callRecords.length = 0;
  return NextResponse.json({ ok: true });
}
```
src/index.ts — barrel export of all reusable modules:
```ts
export { loadConfig } from './config/env.js';
export { createOpenAIClient, getOpenAIClient } from './lib/openai-client.js';
export { PricingService } from './services/part-pricing.js';
export { QuoteEngine } from './services/quote-engine.js';
export { VoicePipelineService } from './services/voice-pipeline.js';
export { WebSocketServer } from './server/ws-server.js';
export { MemoryService } from './services/memory-service.js';
export { CacheService } from './services/cache-service.js';
export type { Part, Quote, QuoteLineItem, QuoteInput } from './types/quote.js';
export type { CallRecord, ConversationTurn } from './types/call.js';
export type { AppConfig } from './types/config.js';
```
Expected output: All route handlers compile. pnpm typecheck passes.
Step 10: Build the dashboard page and run the tests
Replace the placeholder app/page.tsx with a dashboard that shows live call and quote counts:
Expected output: All tests pass, numFailedTests=0, and coverage for src/**/*.ts and app/**/route.ts meets the 90% thresholds.
Next steps
Add a persistent database — replace the in-memory quotes[] and callRecords[] arrays with PostgreSQL, SQLite, or any database adapter so data survives restarts.
Integrate with shop management software — when a quote is accepted, push line items and customer info to the shop’s existing inventory or invoicing system via webhooks.
Add multi-language support — configure Deepgram’s language detection and the OpenAI model to handle calls in Spanish, French, or Vietnamese for multilingual shops.
Implement a human escalation workflow — when the customer asks for a manager or the confidence score drops below a threshold, transfer the call to a live service advisor.
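As a starting point for the escalation idea, the check below combines both triggers: an explicit request for a person and a low confidence score. The phrase list, the 0.6 threshold, and the source of the confidence value (for example, the STT transcript confidence) are all assumptions to tune for a real shop.

```typescript
// Hypothetical trigger phrases; extend these from real call transcripts.
const ESCALATION_PHRASES = ['manager', 'real person', 'speak to someone', 'talk to someone'];

// True when the caller asks for a human or the confidence score is below the threshold.
function shouldEscalate(utterance: string, confidence: number, threshold = 0.6): boolean {
  const text = utterance.toLowerCase();
  const askedForHuman = ESCALATION_PHRASES.some((phrase) => text.includes(phrase));
  return askedForHuman || confidence < threshold;
}
```

When this returns true, the call handler would stop the quote flow and transfer the caller to a live service advisor.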
'You extract auto-repair issues and vehicle info from customer speech. Return JSON with keys "issue", "make", "model", "year". Use null for unknown vehicle fields.',
'You are a friendly auto-repair service advisor. Read the following quote to the customer in a conversational tone. Do NOT read the quote ID or internal fields.',