# xAI Grok Voice Agent for After-Hours Customer Support
 
> Deploy an AI receptionist powered by xAI Grok that answers calls, qualifies leads, and routes urgent issues—without 24/7 staffing.
 
Small businesses lose potential customers when calls go unanswered after hours. This recipe demonstrates how to build an AI-powered voice agent that uses xAI Grok for natural conversation, LiveKit for real-time media, Deepgram for speech recognition, and Cartesia for speech synthesis.
 
## Architecture
 
```
Caller → Twilio → LiveKit Room → Webhook → VoiceAgentOrchestrator
                                              ├── ConfidenceRouter (intent classification)
                                              ├── BudgetController (per-call spend limits)
                                              ├── LlmCache (semantic response cache)
                                              └── AgentHandoff (Twilio SMS / callback)
```
 
The pipeline:
1. **LiveKit** dispatches an agent when a call comes in
2. **Deepgram** transcribes caller speech (via LiveKit plugin)
3. **xAI Grok** (via `@ai-sdk/xai`) generates natural responses
4. **Cartesia** synthesises speech output (via LiveKit plugin)
5. **ConfidenceRouter** classifies intent: inquiry, booking, or escalation
6. **BudgetController** enforces per-call spending limits
7. **LlmCache** caches common prompt-response pairs to reduce API cost
8. **AgentHandoff** triggers Twilio SMS/callbacks when human escalation is needed
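
As a rough illustration of how items 5 through 8 might interact on a single caller turn, the sketch below plans one turn from an intent classification, the call's spend so far, and a response cache. It does not use the `@reaatech/*` package APIs, which this README does not document. `planTurn`, `CallState`, `TurnPlan`, and the 0.6 confidence threshold are hypothetical, the exact-match `Map` is a simplified stand-in for the semantic cache, and handing off on low confidence or an exhausted budget is an assumption about the intended behaviour.

```typescript
// Hypothetical stand-ins; not the @reaatech/* package APIs.
type Intent = "inquiry" | "booking" | "escalation";

interface Classification {
  intent: Intent;
  confidence: number; // 0..1
}

interface CallState {
  spentUsd: number;         // spend so far on this call
  budgetUsd: number;        // per-call limit
  estimatedTurnUsd: number; // estimated cost of one more model call
}

type TurnPlan =
  | { action: "handoff"; reason: string }
  | { action: "reply_from_cache"; reply: string }
  | { action: "call_llm" };

// Exact-match stand-in for the semantic cache: normalise, then look up.
function cacheKey(utterance: string): string {
  return utterance.trim().toLowerCase().replace(/\s+/g, " ");
}

function planTurn(
  utterance: string,
  classification: Classification,
  state: CallState,
  cache: Map<string, string>,
  minConfidence = 0.6, // assumed threshold
): TurnPlan {
  // Escalations and low-confidence classifications go to a human.
  if (classification.intent === "escalation") {
    return { action: "handoff", reason: "escalation intent" };
  }
  if (classification.confidence < minConfidence) {
    return { action: "handoff", reason: "low confidence" };
  }
  // A cache hit costs nothing, so it is checked before the budget.
  const cached = cache.get(cacheKey(utterance));
  if (cached !== undefined) {
    return { action: "reply_from_cache", reply: cached };
  }
  // Refuse another model call if it would exceed the per-call limit.
  if (state.spentUsd + state.estimatedTurnUsd > state.budgetUsd) {
    return { action: "handoff", reason: "budget exceeded" };
  }
  return { action: "call_llm" };
}
```

Keeping the decision a pure function makes the routing order easy to unit test without any network calls. The caller would then perform the chosen action, for example invoking Grok when the plan is `call_llm`.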
 
## Prerequisites
 
- [xAI API key](https://console.x.ai) — `XAI_API_KEY`
- [LiveKit Cloud](https://cloud.livekit.io) account — `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`
- [Deepgram API key](https://console.deepgram.com) — `DEEPGRAM_API_KEY`
- [Cartesia API key](https://cartesia.ai) — `CARTESIA_API_KEY`
- [Twilio account](https://console.twilio.com) — `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`
- [Langfuse account](https://langfuse.com) — `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`
- [OpenAI API key](https://platform.openai.com/api-keys) — `OPENAI_API_KEY` (for semantic cache embeddings)
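
Because the recipe depends on eleven environment variables, a startup check can surface a missing key before the first call arrives. The following is a minimal sketch: the variable names come from the list above, while `REQUIRED_ENV_VARS` and `missingEnvVars` are hypothetical helpers that are not part of this repository.

```typescript
// Names taken from the prerequisites list; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = [
  "XAI_API_KEY",
  "LIVEKIT_API_KEY",
  "LIVEKIT_API_SECRET",
  "DEEPGRAM_API_KEY",
  "CARTESIA_API_KEY",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "OPENAI_API_KEY",
] as const;

// Returns the names that are unset or blank, in the order listed above.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Calling `missingEnvVars(process.env)` once at startup and failing fast when the result is non-empty is one way to use it.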
 
## Quick Start
 
```bash
cp .env.example .env
# Fill in your API keys in .env
 
pnpm install
pnpm dev
```
 
Then configure your LiveKit webhook to send `agent_dispatch` events to:
```
POST https://your-host/api/webhook/voice
```
 
## API Endpoints
 
| Method | Path                    | Description                                   |
|--------|-------------------------|-----------------------------------------------|
| POST   | `/api/webhook/voice`    | Receives LiveKit `agent_dispatch` webhooks    |
| POST   | `/api/twilio/callback`  | Receives Twilio SMS delivery status callbacks |
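
Twilio sends message status callbacks as `application/x-www-form-urlencoded` POST bodies that include `MessageSid` and `MessageStatus`, plus `ErrorCode` on failure. The sketch below shows one way the callback route might interpret such a body. `parseSmsStatusCallback`, `isDeliveryFailure`, and `SmsStatusUpdate` are hypothetical names, not the code in `twilio/callback/route.ts`, and signature verification is omitted.

```typescript
// Hypothetical helpers for reading a Twilio SMS status callback body.
interface SmsStatusUpdate {
  messageSid: string;
  status: string;           // e.g. "sent", "delivered", "undelivered", "failed"
  errorCode: string | null; // present only when Twilio reports an error
}

// Parses the raw form-encoded body; returns null if required fields are absent.
function parseSmsStatusCallback(rawBody: string): SmsStatusUpdate | null {
  const params = new URLSearchParams(rawBody);
  const messageSid = params.get("MessageSid");
  const status = params.get("MessageStatus");
  if (!messageSid || !status) {
    return null;
  }
  return { messageSid, status, errorCode: params.get("ErrorCode") };
}

// "failed" and "undelivered" mean the SMS did not reach the recipient.
function isDeliveryFailure(update: SmsStatusUpdate): boolean {
  return update.status === "failed" || update.status === "undelivered";
}
```

A production handler would also verify the `X-Twilio-Signature` header before trusting the body.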
 
## Project Structure
 
```
app/
  api/
    webhook/voice/route.ts        LiveKit agent dispatch webhook
    twilio/callback/route.ts      Twilio status callback
src/
  lib/
    types.ts                      Shared types and Zod schemas
    grok.ts                       xAI Grok client (Vercel AI SDK)
    pricing-provider.ts           PricingProvider for BudgetController
    langfuse.ts                   Langfuse observability helpers
  services/
    voice-agent.service.ts        Main orchestrator
    confidence-router.service.ts  Intent classification wrapper
    budget-engine.service.ts      Budget enforcement wrapper
    llm-cache.service.ts          Semantic cache wrapper
    agent-handoff.service.ts      Escalation / Twilio SMS wrapper
tests/                            Vitest suite with MSW mocks
```
 
## Packages Used
 
| Package                          | Purpose                        |
|----------------------------------|--------------------------------|
| `@reaatech/confidence-router`    | Intent classification routing  |
| `@reaatech/agent-handoff`        | Handoff protocol types/events  |
| `@reaatech/agent-budget-engine`  | LLM spending budget controller |
| `@reaatech/llm-cache`            | Semantic + exact-match caching |
| `@livekit/agents`                | Real-time voice agent pipeline |
| `@deepgram/sdk`                  | Speech-to-text                 |
| `@cartesia/cartesia-js`          | Speech synthesis               |
| `@ai-sdk/xai`                    | xAI Grok provider for AI SDK   |
| `ai`                             | Vercel AI SDK (generateText)   |
| `twilio`                         | SMS and callback notifications |
| `langfuse`                       | LLM observability / tracing    |
| `livekit-server-sdk`             | LiveKit server API + webhooks  |
| `zod`                            | Schema validation              |
 
## License
 
MIT — see [LICENSE](./LICENSE).