Small businesses lose revenue when after-hours callers reach voicemail or an overwhelmed front desk. This tutorial walks you through building a phone-based lead intake agent that answers calls, transcribes speech with Deepgram, uses xAI Grok for natural-language understanding, qualifies callers with a confidence router, and syncs results to a CRM — all backed by Redis for multi-turn session continuity.
You’ll write TypeScript modules under src/, App Router route handlers under app/api/, and a test suite under tests/. The @reaatech/* package family provides the real-time audio pipelines, telephony WebSocket integration, MCP agent orchestration, and session lifecycle management. By the end, you’ll have a complete, testable voice lead intake system.
Prerequisites
Node.js >=22 and pnpm installed globally
Redis running locally on port 6379 (or a remote instance)
Twilio account with a phone number that supports Media Streams
Deepgram API key (for STT transcription and TTS synthesis)
xAI API key (for Grok language model access)
Langfuse account (optional, for observability — or leave placeholders)
Familiarity with TypeScript, Next.js App Router, and basic async/await patterns
Step 1: Scaffold the project
Start from an empty directory. Create a Next.js 16 project with the App Router and install all the dependencies.
You’ll pin every dependency to an exact version so builds are reproducible. The @reaatech/* packages provide the voice pipeline, telephony, MCP client, confidence router, session continuity, and Redis storage. Third-party packages include @ai-sdk/xai for Grok access, @deepgram/sdk for speech, twilio for telephony, and langfuse for observability.
Expected output: pnpm install prints a success summary with no warnings. The node_modules/ directory and pnpm-lock.yaml are created.
Step 2: Define shared types
Create src/types.ts to hold the domain types used across all your services. These interfaces describe leads, lead forms, qualification results, CRM updates, and a session-manager handle type that abstracts the @reaatech/session-continuity API.
Expected output: tsc --noEmit passes with no errors. The types are self-contained, with no imports from external packages yet.
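As a rough sketch of what src/types.ts might contain: the LeadForm fields below mirror the LeadFormSchema used by the xAI service, while the shapes of QualificationResult and Lead and the isCompleteLeadForm helper are assumptions made for illustration, not the tutorial's actual definitions.

```typescript
// Sketch of shared domain types. LeadForm mirrors the zod LeadFormSchema
// from the xAI service; the other shapes are assumed for illustration.
export interface LeadForm {
  name?: string;
  phone: string;
  email?: string;
  company?: string;
  notes?: string;
}

// Assumed shape: a score in [0, 1] plus a short summary.
export interface QualificationResult {
  score: number;
  summary: string;
}

// Assumed shape: a lead ties a call to its form and optional qualification.
export interface Lead {
  callSid: string;
  form: LeadForm;
  qualification?: QualificationResult;
}

// Hypothetical helper: a form is usable once it has a non-empty phone number.
export function isCompleteLeadForm(form: LeadForm): boolean {
  return form.phone.trim().length > 0;
}
```

Keeping phone as the only required field matches the fallback value { phone: "" } that the xAI service returns when extraction fails.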
Step 3: Set up configuration and logger
Create src/lib/config.ts to read environment variables for all external services. Each function returns a typed config object built from process.env. The MCP endpoint uses the envConfig helper from @reaatech/mcp-server-core for the default port fallback.
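A minimal sketch of two such readers follows. The function names match the imports used elsewhere in this tutorial, but the returned shapes, the default values, and the explicit env parameter (used here so the functions are easy to test) are assumptions. The sketch also leaves out the envConfig helper, whose signature is not shown in this tutorial.

```typescript
// Sketch of env-backed config readers. Names follow the tutorial's imports;
// the returned shapes and defaults are assumed for illustration.
type Env = Record<string, string | undefined>;

export function getRedisConfig(env: Env): { url: string } {
  return { url: env.REDIS_URL ?? "redis://localhost:6379" };
}

export function getMcpConfig(env: Env): {
  endpoint: string;
  timeout: number;
  retryAttempts: number;
} {
  return {
    endpoint: env.MCP_ENDPOINT ?? "http://localhost:3001/mcp",
    // Number(undefined) is NaN, so fall back to a default string first.
    timeout: Number(env.MCP_TIMEOUT ?? "400"),
    retryAttempts: Number(env.MCP_RETRY_ATTEMPTS ?? "2"),
  };
}
```

In the application you would pass process.env as the argument, for example getMcpConfig(process.env).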
Create src/lib/logger.ts to initialize the Langfuse singleton and expose createTrace / logEvent convenience helpers. Every service module uses these for observability.
ts
import Langfuse from "langfuse";
import { getLangfuseConfig } from "./config.js";

const { secretKey, publicKey, baseUrl } = getLangfuseConfig();

export const langfuse = new Langfuse({ secretKey, publicKey, baseUrl });

export function createTrace(sessionId: string, name: string) {
  return langfuse.trace({ id: sessionId, name });
}

export function logEvent(
  trace: { event: (params: { name: string; input: unknown }) => void },
  eventName: string,
  data: Record<string, unknown>
) {
  trace.event({ name: eventName, input: data });
}
Now create .env.example at the project root with every variable referenced above:
env
# Env vars used by xai-grok-voice-lead-intake-for-small-business.
# Keep placeholders only; never commit real values.
NODE_ENV=development

# Twilio
TWILIO_ACCOUNT_SID=<your-twilio-account-sid>
TWILIO_AUTH_TOKEN=<your-twilio-auth-token>
TWILIO_PHONE_NUMBER=<your-twilio-phone-number>

# Deepgram
DEEPGRAM_API_KEY=<your-deepgram-api-key>

# xAI Grok
XAI_API_KEY=<your-xai-api-key>

# Redis
REDIS_URL=redis://localhost:6379

# Langfuse observability
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_BASE_URL=https://cloud.langfuse.com

# MCP agent server
MCP_ENDPOINT=http://localhost:3001/mcp
MCP_TIMEOUT=400
MCP_RETRY_ATTEMPTS=2

# WebSocket server for Twilio Media Streams
NEXT_PUBLIC_WS_URL=wss://your-domain.com/ws
WS_PORT=8081
Expected output: Copy .env.example to .env and fill in your API keys. tsc --noEmit passes and @reaatech/mcp-server-core and langfuse resolve correctly.
Step 4: Build the xAI Grok service
Create src/services/xai-service.ts. This module wraps the Vercel AI SDK’s generateText function with the xAI Grok model. It provides four operations: extractLeadForm (parse contact info from raw speech), classifyLeadIntent (determine caller intent), qualifyLead (score lead quality), and generateResponse (produce a natural-language reply in multi-turn conversation).
Each function wraps the Grok call in a try/catch and returns a safe fallback on failure — the voice pipeline must never crash from an API error.
ts
import { generateText, Output } from "ai";
import { xai } from "@ai-sdk/xai";
import { z } from "zod";
import type { LeadForm } from "../types.js";
import { createTrace, logEvent } from "../lib/logger.js";

export class XaiServiceError extends Error {
  constructor(message: string, cause?: Error) {
    super(message);
    this.name = "XaiServiceError";
    if (cause) this.cause = cause;
  }
}

const ClassifyLeadIntentSchema = z.
Expected output: pnpm typecheck passes. The zod, ai, and @ai-sdk/xai imports resolve to their exact-pinned versions.
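The "safe fallback on failure" pattern described in this step can be factored into a small generic wrapper. The sketch below is one way to express it; withFallback and its onError callback are hypothetical names and are not part of the tutorial's xai-service module.

```typescript
// Generic "never throw" wrapper: run an async operation and return a
// caller-supplied fallback if it rejects. withFallback is a hypothetical helper.
export async function withFallback<T>(
  operation: () => Promise<T>,
  fallback: T,
  onError?: (err: unknown) => void
): Promise<T> {
  try {
    return await operation();
  } catch (err) {
    // Optional hook for logging; it should not throw itself.
    onError?.(err);
    return fallback;
  }
}
```

A caller could wrap a model call as withFallback(() => callModel(transcript), { phone: "" }), where callModel stands in for whatever function performs the request, so an API failure yields an empty form instead of crashing the call.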
Step 5: Build the lead service with confidence routing
Create src/services/lead-service.ts. This is the orchestration layer that combines the confidence router with the xAI Grok service. It creates a ConfidenceRouter with keyword classifiers for booking, quote requests, support, cancellations, and general info — then provides evaluateLead which processes a transcript through the router and decides whether to route, clarify, or fall back.
ts
import { ConfidenceRouter, KeywordClassifier } from "@reaatech/confidence-router";
import { type Lead, type LeadForm, type QualificationResult } from "../types.js";
import { qualifyLead, extractLeadForm } from "./xai-service.js";

export function createLeadRouter(): ConfidenceRouter {
  const router = new ConfidenceRouter({
    routeThreshold: 0.8,
    fallbackThreshold: 0.3,
    clarificationEnabled: true,
  });
  router.registerClassifier(
    new KeywordClassifier([
      { label: "booking", keywords: ["book"
Expected output: pnpm typecheck passes. The @reaatech/confidence-router import resolves. The ConfidenceRouter constructor accepts { routeThreshold, fallbackThreshold, clarificationEnabled }.
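To make the route / clarify / fall back split concrete, here is a small sketch of how two thresholds can partition a confidence score. The comparison rules are an assumption about how such a router typically behaves; the real @reaatech/confidence-router logic may differ, and decide is a hypothetical name.

```typescript
type Decision = "ROUTE" | "CLARIFY" | "FALLBACK";

interface Thresholds {
  routeThreshold: number;
  fallbackThreshold: number;
  clarificationEnabled: boolean;
}

// Assumed semantics: at or above routeThreshold -> ROUTE; below
// fallbackThreshold -> FALLBACK; anything in between -> CLARIFY when
// clarification is enabled, otherwise FALLBACK.
export function decide(score: number, t: Thresholds): Decision {
  if (score >= t.routeThreshold) return "ROUTE";
  if (score < t.fallbackThreshold) return "FALLBACK";
  return t.clarificationEnabled ? "CLARIFY" : "FALLBACK";
}
```

With the values used above (0.8 and 0.3), a score of 0.5 would land in the clarification band under these assumed rules.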
Step 6: Set up session continuity with Redis
Create src/services/session-continuity.ts. This module wires Redis-backed session storage to @reaatech/session-continuity. It creates both a redis (node-redis v4) client for the RedisAdapter and a separate ioredis client for manual Redis operations. The SessionManager is configured with a token budget of 4,096 tokens and a sliding-window compression strategy.
ts
import { SessionManager } from "@reaatech/session-continuity";
import { RedisAdapter } from "@reaatech/session-continuity-storage-redis";
import { TiktokenTokenizer } from "@reaatech/session-continuity-tokenizers";
import { createClient, type RedisClientType } from "redis";
import { Redis } from "ioredis";
import type { Session, Message } from "@reaatech/session-continuity";
import type { SessionManagerHandle } from "../types.js";
import { getRedisConfig } from "../lib/config.js";
import { langfuse } from "../lib/logger.js";

export async function createSessionManager(): Promise<SessionManagerHandle> {
  const config = getRedisConfig();
  const redisClient = createClient({ url: config.url }) as RedisClientType;
  await redisClient.connect();
  const redisAdapter = new RedisAdapter({ client: redisClient, ttlSeconds: 3600 });
  const manager = new SessionManager({
    storage: redisAdapter,
    tokenCounter: new TiktokenTokenizer("gpt-4"),
    tokenBudget: {
      maxTokens: 4096,
      reserveTokens: 500,
      overflowStrategy: "compress",
    },
    compression: {
      strategy: "sliding_window",
      targetTokens: 3500,
    },
    sessionTTL: 3600,
    cleanupInterval: 300,
  });
  manager.on("session:created", (payload) => {
    langfuse.trace({ name: "session_created", input: { sessionId: payload.sessionId } });
  });
  manager.on("session:ended", (payload) => {
    langfuse.trace({ name: "session_ended", input: { sessionId: payload.sessionId } });
  });
  return manager as SessionManagerHandle;
}

let ioRedisClient: Redis | null = null;

export function getIoRedis(): Redis {
  if (!ioRedisClient) {
    ioRedisClient = new Redis(getRedisConfig().url);
  }
  return ioRedisClient;
}

export async function getOrCreateSession(
  manager: SessionManagerHandle,
  callSid: string
): Promise<Session> {
  const existing = await manager.listSessions({ tags: [callSid] });
  if (existing.length > 0) {
    return existing[0] as Session;
  }
  const session = await manager.createSession({
    userId: callSid,
    metadata: {
      title: `Call ${callSid}`,
      tags: [callSid],
    },
  });
  return session as Session;
}

export async function addTurnToSession(
  manager: SessionManagerHandle,
  sessionId: string,
  userText: string,
  agentText: string
): Promise<void> {
  await manager.addMessage(sessionId, { role: "user", content: userText });
  await manager.addMessage(sessionId, { role: "assistant", content: agentText });
}

export async function getConversation(
  manager: SessionManagerHandle,
  sessionId: string,
  maxTurns?: number
): Promise<Message[]> {
  const context = await manager.getConversationContext(sessionId);
  if (maxTurns && context.length > maxTurns) {
    return context.slice(-maxTurns) as Message[];
  }
  return context as Message[];
}

export async function closeSession(
  manager: SessionManagerHandle,
  sessionId: string
): Promise<void> {
  await manager.endSession(sessionId);
}
Expected output: pnpm typecheck passes. The SessionManager, RedisAdapter, and TiktokenTokenizer constructors accept the documented parameters.
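For intuition about what a sliding-window compression strategy does, the sketch below keeps only the newest messages that fit a target token count. It is an illustration under stated assumptions: the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and slidingWindow is a hypothetical function, not the algorithm used inside @reaatech/session-continuity.

```typescript
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Crude stand-in for a real tokenizer: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk from newest to oldest and keep messages until the next one
// would push the running estimate past targetTokens.
export function slidingWindow(
  messages: ChatMessage[],
  targetTokens: number
): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > targetTokens) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}
```

The trade-off of this approach is that the oldest turns are dropped outright rather than summarized, which is cheap but can lose details the caller gave early in the conversation.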
Step 7: Create the CRM MCP tools
Create src/services/crm-service.ts. This implements two MCP tools — get_lead_form (returns the form schema) and update_crm (validates a lead and logs it to Langfuse). Each returns a ToolResponse using the helpers from @reaatech/mcp-server-core.
Expected output: pnpm typecheck passes. The textContent and errorResponse helpers from @reaatech/mcp-server-core are available.
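The general shape of a validating tool handler might look like the following sketch. The ToolResponse interface follows the MCP convention of a content array with an optional isError flag; textResponse and errorResult are local stand-ins written for this example, since the exact signatures of the @reaatech/mcp-server-core helpers are not shown here, and the validation rule is an assumption.

```typescript
interface ToolResponse {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Local stand-ins for illustration; the real helpers may differ.
function textResponse(text: string): ToolResponse {
  return { content: [{ type: "text", text }] };
}

function errorResult(message: string): ToolResponse {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Hypothetical update_crm handler: reject leads without a phone number,
// otherwise acknowledge the lead as logged.
export function updateCrm(lead: { phone?: string; name?: string }): ToolResponse {
  if (!lead.phone || lead.phone.trim() === "") {
    return errorResult("Lead is missing a phone number");
  }
  return textResponse(
    JSON.stringify({ status: "logged", phone: lead.phone, name: lead.name ?? null })
  );
}
```

Returning an error response rather than throwing keeps a bad lead from interrupting the agent's turn.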
Step 8: Build Deepgram STT and TTS adapters
Create src/services/deepgram-stt-provider.ts. This is a factory that wraps a DeepgramClient instance into an STT provider contract — connect, streamAudio, onUtterance, onEndOfSpeech, and close. Internally it opens a Deepgram Live Transcription connection with the Nova-3 model.
ts
import { DeepgramClient } from "@deepgram/sdk";
import type { AudioChunk, Utterance } from "@reaatech/voice-agent-core";

export function createDeepgramSTTProvider(dgClient: DeepgramClient) {
  let socket: Awaited<ReturnType<typeof dgClient.listen.v1.connect>> | null = null;
  let utteranceCallback: ((utterance: Utterance) => void) | null = null;
  let endOfSpeechCallback: (() => void) | null = null;

  return {
    name: "deepgram-stt" as const,
    async connect() {
      const connection = await dgClient.listen.v1.connect({
        model: "nova-3",
        language: "en",
        interim_results: "true",
        encoding: "linear16",
        sample_rate: 16000,
        vad_events: "true",
        utterance_end_ms: "1000",
        Authorization: "",
      });
      connection.on("message", (data) => {
        if (data.type === "Results") {
          for (const alt of data.channel.alternatives) {
            utteranceCallback?.({
              transcript: alt.transcript,
              confidence: alt.confidence,
              isFinal: data.is_final === true,
              timestamp: Date.now(),
            });
            break;
          }
        }
        if (data.type === "UtteranceEnd") {
          endOfSpeechCallback?.();
        }
      });
      socket = connection;
    },
    streamAudio(chunk: AudioChunk) {
      if (socket) {
        socket.sendMedia(chunk.buffer);
      }
    },
    onUtterance(cb: (utterance: Utterance) => void) {
      utteranceCallback = cb;
    },
    onEndOfSpeech(cb: () => void) {
      endOfSpeechCallback = cb;
    },
    close(): Promise<void> {
      if (socket) {
        socket.close();
        socket = null;
      }
      return Promise.resolve();
    },
  };
}
Create src/services/deepgram-tts-provider.ts. This factory wraps a DeepgramClient instance into a TTS provider contract — synthesize returns an AsyncIterable<AudioChunk> and cancel stops generation. It uses the Aura-2 Thalia voice with linear16 WAV encoding.
Expected output: pnpm typecheck passes. The @deepgram/sdk types resolve correctly for both live transcription and TTS synthesis.
Step 9: Create the MCP agent server
Create src/agent-server.ts. This module creates an MCPClient from @reaatech/voice-agent-mcp-client, connects it, discovers available tools, and exposes a handleRequest function that routes utterances through the lead service. On a ROUTE decision, it sends a full MCP request; on CLARIFY or FALLBACK, it returns the decision text directly without an MCP round-trip.
ts
import { MCPClient } from "@reaatech/voice-agent-mcp-client";
import type { MCPTool } from "@reaatech/voice-agent-mcp-client";
import { evaluateLead, buildLeadForm } from "./services/lead-service.js";
import { registerCrmTools } from "./services/crm-service.js";

export interface AgentServer {
  client: MCPClient;
  tools: MCPTool[];
  handleRequest: (
    sessionId: string,
    turnId: string,
    utterance: string,
    history: Array<{ role: string; content: string }>
  ) => Promise<{ text: string; toolCalls: unknown[]; confidence: number }>;
}

export async function createAgentServer(config: {
  endpoint: string;
  timeout: number;
  retryAttempts: number;
}): Promise<AgentServer> {
  const client = new MCPClient({
    endpoint: config.endpoint,
    timeout: config.timeout,
    retryAttempts: config.retryAttempts,
    maxHistoryTurns: 20,
  });
  await client.connect();
  await client.discoverTools();

  return {
    client,
    tools: client.getDiscoveredTools(),
    async handleRequest(sessionId, turnId, utterance, history) {
      try {
        const form = await buildLeadForm(utterance);
        const result = await evaluateLead(utterance, form);
        if (result.decision.type === "ROUTE") {
          const mcpResponse = await client.sendRequest({
            sessionId,
            turnId: turnId || sessionId,
            utterance,
            history,
            tools: client.getDiscoveredTools(),
          });
          return {
            text: mcpResponse.text,
            toolCalls: mcpResponse.toolCalls,
            confidence: mcpResponse.confidence ?? 0.95,
          };
        }
        const fallbackText =
          result.decision.type === "CLARIFY"
            ? (result.decision.prompt ?? "Could you tell me more about what you're looking for?")
            : (result.decision.fallbackMessage ?? "Let me connect you to a team member.");
        return { text: fallbackText, toolCalls: [], confidence: result.score };
      } catch {
        return {
          text: "I'm sorry, I had trouble processing that. Could you try again?",
          toolCalls: [],
          confidence: 0,
        };
      }
    },
  };
}

export function registerAgentTools(server: {
  registerTool: (name: string, handler: unknown) => void;
}): void {
  registerCrmTools(server);
}
Expected output: pnpm typecheck passes. The @reaatech/voice-agent-mcp-client API (the MCPClient constructor, connect(), discoverTools(), getDiscoveredTools(), and sendRequest()) resolves.
Step 10: Wire the voice pipeline
Create src/voice-pipeline.ts. This function assembles every component you’ve built into a unified voice pipeline using @reaatech/voice-agent-core. It creates Deepgram clients, instantiates the STT and TTS adapters, creates the agent server, sets up a latency budget enforcer with an 800ms target, initializes the session manager, and wires pipeline events to Langfuse for observability.
ts
import {
  createPipeline,
  createLatencyBudget,
  initializeSessionManager,
  LatencyBudgetEnforcer,
} from "@reaatech/voice-agent-core";
import type { Pipeline, AudioChunk } from "@reaatech/voice-agent-core";
import { DeepgramClient } from "@deepgram/sdk";
import { createDeepgramSTTProvider } from "./services/deepgram-stt-provider.js";
import { createDeepgramTTSProvider } from "./services/deepgram-tts-provider.js";
import { createAgentServer } from "./agent-server.js";
import { getDeepgramConfig, getMcpConfig } from "./lib/config.js";
import { langfuse } from "./lib/logger.js";

export async function createVoicePipeline
Expected output: pnpm typecheck passes. The createPipeline, createLatencyBudget, initializeSessionManager, and LatencyBudgetEnforcer imports from @reaatech/voice-agent-core all resolve.
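The idea behind a latency budget can be illustrated without the library. The sketch below tracks how much of an 800 ms per-turn budget each pipeline stage has consumed. LatencyTracker and its method names are hypothetical and are not the LatencyBudgetEnforcer API from @reaatech/voice-agent-core; the stage names are examples only.

```typescript
// Hypothetical tracker: records per-stage durations for one turn and
// reports whether their sum exceeds the budget.
export class LatencyTracker {
  private readonly budgetMs: number;
  private readonly stages = new Map<string, number>();

  constructor(budgetMs: number) {
    this.budgetMs = budgetMs;
  }

  // Add a measured duration to a stage (repeated calls accumulate).
  record(stage: string, durationMs: number): void {
    this.stages.set(stage, (this.stages.get(stage) ?? 0) + durationMs);
  }

  totalMs(): number {
    let total = 0;
    this.stages.forEach((ms) => {
      total += ms;
    });
    return total;
  }

  // Time left before the budget is exhausted, never negative.
  remainingMs(): number {
    return Math.max(0, this.budgetMs - this.totalMs());
  }

  isOverBudget(): boolean {
    return this.totalMs() > this.budgetMs;
  }
}
```

A pipeline could consult remainingMs() before starting an optional stage, or log isOverBudget() at the end of each turn to spot slow calls.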
Step 11: Handle Twilio webhooks
Create src/webhooks/call.ts. This module handles incoming Twilio voice calls by building TwiML that connects the call to a WebSocket Media Stream. It also provides lifecycle hooks: initCallSession creates a Redis-backed session, handleCallStatus closes the session on completed calls or logs failures, and handleRecording stores recording URLs in session metadata.
Expected output: pnpm typecheck passes. The TwiML builder from twilio produces valid <Connect><Stream> XML.
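To show the markup the webhook is expected to return, here is a hand-built version of the <Connect><Stream> document. It is a sketch for illustration only: buildStreamTwiml is a hypothetical function, the callSid custom parameter is an assumption, and in the tutorial's module the twilio helper library generates the XML instead.

```typescript
// Escape the five XML special characters for use in attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hand-built TwiML that connects the call to a WebSocket Media Stream
// and passes the call SID along as a custom <Parameter>.
export function buildStreamTwiml(wsUrl: string, callSid: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "<Connect>",
    `<Stream url="${escapeXml(wsUrl)}">`,
    `<Parameter name="callSid" value="${escapeXml(callSid)}"/>`,
    "</Stream>",
    "</Connect>",
    "</Response>",
  ].join("");
}
```

Custom values travel as <Parameter> elements because they arrive in the stream's start message, which lets the WebSocket handler associate the audio with the right session.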
Step 12: Create the App Router API routes
Create three route handlers under app/api/.
First, app/api/twilio/voice/route.ts — the Twilio voice webhook. It receives form-encoded call data, builds TwiML via handleIncomingCall, initializes a Redis session, and returns text/xml:
ts
import { type NextRequest, NextResponse } from "next/server";
import { handleIncomingCall, initCallSession } from "../../../../src/webhooks/call.js";
import { createSessionManager } from "../../../../src/services/session-continuity.js";

let sessionManager: Awaited<ReturnType<typeof createSessionManager>> | null = null;

async function getManager() {
  if (!sessionManager) {
    sessionManager = await createSessionManager();
  }
  return sessionManager;
}

export async function POST(req: NextRequest): Promise<NextResponse> {
  try {
    const formData = await req.formData();
    const callSid = formData.get("CallSid") as string | null;
    const from = formData.get("From") as string | null;
    const to = formData.get("To") as string | null;
    if (!callSid) {
      return NextResponse.json({ error: "missing CallSid" }, { status: 400 });
    }
    const manager = await getManager();
    const twiml = handleIncomingCall({
      CallSid: callSid,
      From: from ?? "",
      To: to ?? "",
      wsUrl: process.env.NEXT_PUBLIC_WS_URL ?? "",
    });
    await initCallSession(manager, callSid, from ?? "");
    return new NextResponse(twiml, {
      status: 200,
      headers: { "Content-Type": "text/xml" },
    });
  } catch (err) {
    return NextResponse.json({ error: String(err) }, { status: 500 });
  }
}
Second, app/api/twilio/status/route.ts — the Twilio call status callback:
Third, app/api/leads/route.ts — list sessions (GET) and manually qualify leads from transcript text (POST):
ts
import { type NextRequest, NextResponse } from "next/server";
import { evaluateLead, buildLeadForm } from "../../../src/services/lead-service.js";
import { createSessionManager } from "../../../src/services/session-continuity.js";

let sessionManager: Awaited<ReturnType<typeof createSessionManager>> | null = null;

export async function GET(): Promise<NextResponse> {
  try {
    if (!sessionManager) {
      sessionManager = await createSessionManager();
    }
    const sessions = await sessionManager.listSessions({ status: "active" as const });
    return NextResponse.json({ leads: sessions });
  } catch {
    return NextResponse.json({ leads: [] });
  }
}

export async function POST(req: NextRequest): Promise<NextResponse> {
  try {
    const body = (await req.json()) as { transcript?: string };
    if (!body.transcript) {
      return NextResponse.json({ error: "invalid body" }, { status: 400 });
    }
    const form = await buildLeadForm(body.transcript);
    const result = await evaluateLead(body.transcript, form);
    return NextResponse.json(result);
  } catch {
    return NextResponse.json({ error: "invalid body" }, { status: 400 });
  }
}
Expected output: pnpm typecheck passes. Each route handler is an App Router route.ts with named GET and/or POST exports.
Step 13: Add WebSocket instrumentation
Create src/instrumentation.ts. Next.js calls the register() export once at server startup; the instrumentation file has been stable since Next.js 15, so the older experimental.instrumentationHook flag is not needed on Next.js 16. Inside register(), guard against non-Node runtimes, then start a WebSocketServer on the configured port. Each connection creates a telephony handler from @reaatech/voice-agent-telephony that manages Twilio Media Streams: accepting connections, processing incoming audio, detecting barge-in, and handling call-end signals.
Now update src/index.ts to re-export the public API of your modules:
ts
export { qualifyLead, classifyLeadIntent, generateResponse } from "./services/xai-service.js";
export {
  evaluateLead,
  createLeadRouter,
  buildLeadForm,
  validateLead,
} from "./services/lead-service.js";
export {
  createSessionManager,
  getOrCreateSession,
  addTurnToSession,
  getConversation,
  closeSession,
} from "./services/session-continuity.js";
Expected output: pnpm typecheck passes. The instrumentation.ts guard prevents the WebSocket server from starting in edge runtime contexts.
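For a sense of what arrives on the WebSocket, the sketch below parses the JSON text frames that Twilio Media Streams sends. The event and field names (event, streamSid, start.callSid, media.payload) follow Twilio's documented message format; the StreamEvent union and parseStreamMessage are hypothetical, and in this tutorial the @reaatech/voice-agent-telephony handler takes care of the parsing.

```typescript
// Simplified view of Twilio Media Streams messages. The StreamEvent
// union is a hypothetical type for this sketch.
type StreamEvent =
  | { kind: "start"; streamSid: string; callSid: string }
  | { kind: "media"; streamSid: string; payloadBase64: string }
  | { kind: "stop"; streamSid: string }
  | { kind: "ignored" };

const isString = (v: unknown): v is string => typeof v === "string";

export function parseStreamMessage(raw: string): StreamEvent {
  let msg: any;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: "ignored" }; // not valid JSON
  }
  if (msg === null || typeof msg !== "object" || !isString(msg.streamSid)) {
    return { kind: "ignored" }; // e.g. the initial "connected" message
  }
  if (msg.event === "start" && isString(msg.start?.callSid)) {
    return { kind: "start", streamSid: msg.streamSid, callSid: msg.start.callSid };
  }
  if (msg.event === "media" && isString(msg.media?.payload)) {
    // The payload is base64-encoded audio; decoding is left to the caller.
    return { kind: "media", streamSid: msg.streamSid, payloadBase64: msg.media.payload };
  }
  if (msg.event === "stop") {
    return { kind: "stop", streamSid: msg.streamSid };
  }
  return { kind: "ignored" };
}
```

Returning an "ignored" variant for unknown or malformed frames keeps one bad message from tearing down the call.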
Step 14: Run the tests
Create your test suite under tests/. Here’s the structure you’ll build — one file per module with mocked external dependencies.
Create tests/services/xai-service.test.ts that mocks ai and @ai-sdk/xai at module level with vi.mock. It tests the four exported functions — happy paths, API failure recovery, empty transcript edge cases, multi-turn conversation, and the XaiServiceError custom error class.
Similarly create test files for each service module (lead-service.test.ts, session-continuity.test.ts, crm-service.test.ts, deepgram-stt-provider.test.ts, deepgram-tts-provider.test.ts), plus integration tests for the route handlers (api/twilio-voice.test.ts, api/leads.test.ts).
Now run everything:
terminal
pnpm typecheck
pnpm lint
pnpm test
Expected output: pnpm typecheck exits 0 with no errors. pnpm lint exits 0 with no warnings. pnpm test prints a JSON report with numFailedTests: 0, numTotalTests >= 3, and coverage of at least 90% across lines, branches, functions, and statements.
Next steps
Add SMS fallback — extend the webhook handler to send an SMS via Twilio’s Messaging API when a call is not answered, with a link to a web lead form.
Integrate a real CRM — replace the updateCrm MCP tool’s Langfuse logging with an actual CRM API call (HubSpot, Salesforce, or Pipedream).
Add sentiment analysis — pipe the Deepgram STT utterance confidence scores into the qualification pipeline to flag frustrated callers for priority routing.
Deploy with a WebSocket proxy — deploy behind a TLS-terminating proxy (Caddy, Nginx, or Cloudflare) and configure Twilio’s wss:// endpoint pointing to your WebSocket server’s public URL.
Build a dashboard — add a Next.js page at app/leads/page.tsx that fetches GET /api/leads and renders lead cards with status badges and transcript excerpts.
The rest of src/services/xai-service.ts from Step 4, with omitted lines marked // …:
ts
const ClassifyLeadIntentSchema = z.object({
  intent: z.string(),
  confidence: z.number(),
});

const LeadFormSchema = z.object({
  name: z.string().optional(),
  phone: z.string(),
  email: z.string().optional(),
  company: z.string().optional(),
  notes: z.string().optional(),
});

export async function extractLeadForm(transcript: string): Promise<LeadForm> {
  try {
    const result = await generateText({
      // …
      system: "Extract lead contact information from the transcript. Always try to find a phone number. Return partial results if information is incomplete.",
      prompt: `Transcript: ${transcript}`,
    });
    return result.output;
  } catch {
    return { phone: "" };
  }
}

const QualifyLeadSchema = z.object({
  score: z.number().min(0).max(1),
  summary: z.string(),
});

export async function classifyLeadIntent(
  transcript: string
): Promise<{ intent: string; confidence: number }> {
  // …
      system: "You are a lead classification assistant. Classify the caller intent from their transcript into one of: booking, quote_request, support, general_info, cancel, unknown. Return a JSON object with 'intent' and 'confidence' fields.",
  // …

// … (lead qualification prompt)
      system: "You are a lead qualification assistant. Evaluate the quality of the lead based on the conversation transcript and form data. Score from 0 (poor) to 1 (excellent). Provide a brief summary. Return a JSON object with 'score' (number 0-1) and 'summary' (string) fields.",
  // …

// … (receptionist response prompt)
      system: "You are a professional receptionist for a small business. Your role is to qualify callers as leads, ask relevant questions, and collect contact information. Be polite, professional, and conversational. Keep responses brief (2-3 sentences).",
  // …