# Mistral AI Voice Agent for After-Hours Customer Support
Small and medium-sized businesses (SMBs) lose business when customers call outside business hours. This voice agent picks up after-hours calls, answers FAQs, and books appointments, using Mistral's LLM to generate its responses.
## Architecture
Twilio phone call → WebSocket Media Stream → Deepgram STT → Mistral Large (chat completion) → ElevenLabs TTS → spoken response
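To make the flow above concrete, here is a minimal sketch of a single conversational turn, starting from a final transcript. It is an illustration under stated assumptions, not the code in this repository: the names `ChatModel`, `TextToSpeech`, `TurnDeps`, and `runTurn` are hypothetical, and the provider calls are hidden behind interfaces so that no vendor SDK signature is assumed.

```typescript
// Hypothetical types and names for illustration only; they are not the API
// of any package in this repository or of any provider SDK.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatModel {
  // Stands in for a chat completion call to Mistral Large.
  complete(messages: ChatMessage[]): Promise<string>;
}

interface TextToSpeech {
  // Stands in for an ElevenLabs synthesis call; returns raw audio bytes.
  synthesize(text: string): Promise<Uint8Array>;
}

interface TurnDeps {
  llm: ChatModel;
  tts: TextToSpeech;
  systemPrompt: string;
}

interface TurnResult {
  audio: Uint8Array;
  history: ChatMessage[];
}

// Runs one turn: transcript -> chat completion -> synthesized audio.
// The input history is not mutated; an updated copy is returned.
async function runTurn(
  transcript: string,
  history: ChatMessage[],
  deps: TurnDeps,
): Promise<TurnResult> {
  const userMessage: ChatMessage = { role: "user", content: transcript };
  const reply = await deps.llm.complete([
    { role: "system", content: deps.systemPrompt },
    ...history,
    userMessage,
  ]);
  const audio = await deps.tts.synthesize(reply);
  return {
    audio,
    history: [...history, userMessage, { role: "assistant", content: reply }],
  };
}
```

Keeping the stages behind small interfaces means each provider can be replaced with a stub in tests, and the turn logic stays independent of any one vendor.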
## Environment Variables
| Variable | Description |
|----------|-------------|
| `TWILIO_ACCOUNT_SID` | Twilio account SID |
| `TWILIO_AUTH_TOKEN` | Twilio auth token |
| `DEEPGRAM_API_KEY` | Deepgram API key |
| `ELEVENLABS_API_KEY` | ElevenLabs API key |
| `MISTRAL_API_KEY` | Mistral AI API key |
| `GOOGLE_CALENDAR_CREDENTIALS` | Path to the Google service account key JSON file |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key |
| `LANGFUSE_HOST` | Langfuse host URL |
| `FASTIFY_PORT` | Fastify server port (default: 3000) |
| `SESSION_TTL_SECONDS` | Session time-to-live in seconds (default: 3600) |
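
One way to validate these variables at startup is sketched below. The defaults (3000 and 3600) come from the table above. Treating every other variable as required is an assumption, and `loadConfig`, `AppConfig`, and `parsePositiveInt` are hypothetical names, not part of this project.

```typescript
// Hypothetical config loader for illustration; not the project's own code.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  sessionTtlSeconds: number;
  secrets: Record<string, string>;
}

// Assumption: every variable without a documented default is required.
const REQUIRED = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "DEEPGRAM_API_KEY",
  "ELEVENLABS_API_KEY",
  "MISTRAL_API_KEY",
  "GOOGLE_CALENDAR_CREDENTIALS",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
] as const;

// Parses a positive integer, falling back to the default when unset or empty.
function parsePositiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  // Report all missing variables at once rather than failing on the first.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const secrets: Record<string, string> = {};
  for (const name of REQUIRED) {
    secrets[name] = env[name] as string;
  }
  return {
    port: parsePositiveInt("FASTIFY_PORT", env.FASTIFY_PORT, 3000),
    sessionTtlSeconds: parsePositiveInt("SESSION_TTL_SECONDS", env.SESSION_TTL_SECONDS, 3600),
    secrets,
  };
}
```

In a Node process this would typically be called once at startup as `loadConfig(process.env)`, so that a missing key fails fast instead of surfacing in the middle of a call.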
## Setup
```bash
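# Install dependencies
pnpm install
# Create a local env file, then fill in the variables listed above
cp .env.example .env
# Start the server in development mode
pnpm dev:server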
```
## How It Works
1. **Incoming call** — Twilio forwards the call to the Fastify webhook endpoint, which returns TwiML with a WebSocket Media Stream connection.
2. **Speech-to-text** — Deepgram Nova-2 transcribes the caller's audio in real time and emits final utterances.
3. **AI processing** — Mistral Large receives the transcript along with session memory and generates an appropriate response (FAQ answer, appointment booking confirmation, or escalation notice).
4. **Text-to-speech** — ElevenLabs Turbo v2.5 synthesizes the response and streams audio chunks back through the WebSocket.
5. **Response delivery** — The audio is delivered to the caller over the active Twilio call. Multiple turns continue until the caller hangs up.
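
As an illustration of step 1, the TwiML returned by the webhook could be built as follows. This is a sketch rather than the project's actual handler: `buildStreamTwiml` and `escapeXml` are hypothetical helper names, and the WebSocket URL is a placeholder. The `<Connect><Stream>` verbs are the TwiML that Twilio documents for bidirectional Media Streams, which require a `wss://` URL.

```typescript
// Hypothetical helpers for illustration; not the project's own code.

// Escapes characters that are unsafe inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds TwiML that connects the call audio to a WebSocket Media Stream.
function buildStreamTwiml(streamUrl: string): string {
  if (!streamUrl.startsWith("wss://")) {
    throw new Error("Media Stream URL must use the wss:// scheme");
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

// Placeholder URL; a real deployment would use its own public host.
const twimlExample = buildStreamTwiml("wss://example.com/media-stream");
```

The webhook would send this string back with a `text/xml` content type, after which Twilio opens the WebSocket and starts streaming the caller's audio.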
## REAA Packages
| Package | Role |
|---------|------|
| `@reaatech/voice-agent-core` | Pipeline orchestration, latency budgeting, session lifecycle |
| `@reaatech/voice-agent-telephony` | Twilio WebSocket handler for Media Stream protocol |
| `@reaatech/voice-agent-stt` | Deepgram STT provider integration |
| `@reaatech/voice-agent-tts` | ElevenLabs TTS provider integration |
| `@reaatech/session-continuity` | Session state management and conversation history |
| `@reaatech/agent-memory` | Cross-call memory extraction and retrieval |
## License
MIT — see [LICENSE](./LICENSE).