# Google Gemini Voice Agent for Clinic Appointment Scheduling
 
> Answer calls, book appointments, and send SMS reminders for medical and dental clinics using a voice AI agent powered by Google Gemini.
 
A voice receptionist that handles real-time phone calls via Twilio. Google Gemini powers the conversation logic, Deepgram provides speech-to-text, and Cartesia delivers text-to-speech. The REAA voice-agent-core pipeline orchestrates the full STT → MCP → TTS flow with latency enforcement and session management.
 
## Architecture
 
```
Phone Call → Twilio → WebSocket → Telephony Handler → Pipeline
                                                         ├── STT (Deepgram Nova-3)
                                                         ├── MCP Client → EHR/Calendar Tools
                                                         ├── LLM (Gemini 2.5 Flash)
                                                         └── TTS (Cartesia Sonic-3.5)
                                            → SMS Reminder (Twilio)
                                            → Observability (Langfuse + OTel)
```
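
As a rough illustration of the flow in the diagram, the sketch below chains three stub stages and times each one against a latency budget. It is not the voice-agent-core API: `Stage`, `runTurn`, the stub functions, and the budget values are all hypothetical, and it assumes MCP tool calls happen inside the LLM stage. The real pipeline is asynchronous and streams audio; strings and synchronous calls are used here only to keep the example short.

```typescript
// Hypothetical sketch: none of these names come from voice-agent-core.
type Clock = () => number;

interface Stage {
  name: string;
  budgetMs: number; // assumed per-stage latency budget
  run: (input: string) => string;
}

interface StageTiming {
  name: string;
  elapsedMs: number;
  overBudget: boolean;
}

// Run the stages in order, timing each one against its budget.
function runTurn(
  stages: Stage[],
  input: string,
  now: Clock = Date.now,
): { output: string; timings: StageTiming[] } {
  let value = input;
  const timings: StageTiming[] = [];
  for (const stage of stages) {
    const start = now();
    value = stage.run(value);
    const elapsedMs = now() - start;
    timings.push({
      name: stage.name,
      elapsedMs,
      overBudget: elapsedMs > stage.budgetMs,
    });
  }
  return { output: value, timings };
}

// Stubs standing in for Deepgram, Gemini (with MCP tools), and Cartesia.
const stages: Stage[] = [
  { name: "stt", budgetMs: 300, run: (audio) => `transcript(${audio})` },
  { name: "llm", budgetMs: 800, run: (text) => `reply(${text})` },
  { name: "tts", budgetMs: 200, run: (text) => `audio(${text})` },
];

const turn = runTurn(stages, "caller-audio");
console.log(turn.output); // audio(reply(transcript(caller-audio)))
```

Passing the clock as a parameter keeps the timing logic testable without real delays. What to do when a stage exceeds its budget (log, fall back, or abort the turn) is left open here.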
 
## Prerequisites
 
- Node.js 22+
- pnpm 10
 
## Getting started
 
```bash
cp .env.example .env          # fill in your API keys
pnpm install
pnpm dev                      # next dev (localhost:3000)
```
 
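Set your Twilio phone number's voice webhook to `https://your-domain.com/api/voice`, replacing `your-domain.com` with your own host. Twilio must be able to reach this URL over the public internet, so `localhost` will not work without a tunnel.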
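
For orientation, a webhook like this typically answers Twilio with TwiML that opens a Media Stream to a WebSocket endpoint. The sketch below builds such a response by hand. `buildStreamTwiml` and the `wss://` path are hypothetical; the project's actual route may differ, for example by using Twilio's helper library.

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build TwiML asking Twilio to connect the call to a bidirectional Media Stream.
function buildStreamTwiml(wsUrl: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(wsUrl)}"/>`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

// Hypothetical WebSocket endpoint; substitute the one your server exposes.
const twiml = buildStreamTwiml("wss://your-domain.com/api/voice/stream");
console.log(twiml);
```

In a Next.js route handler, the string would be returned with a `Content-Type: text/xml` header so that Twilio parses it as TwiML.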
 
## Project structure
 
```
app/api/voice/route.ts        Twilio webhook → TwiML + WebSocket upgrade
src/instrumentation.ts        Server startup: wires WS server + pipeline
src/services/
├── pipeline-service.ts       createPipeline wrapper (STT → MCP → TTS)
├── mcp-client-service.ts     MCPClient for EHR/calendar tools
├── telephony-service.ts      Twilio Media Streams handler
└── session-service.ts        SessionManager for conversation context
src/lib/
├── ehr-adapter.ts            Clinic EHR integration (availability, booking)
├── sms-service.ts            Twilio SMS reminders
├── llm-service.ts            Gemini 2.5 Flash + tool calling
├── stt-service.ts            Deepgram Nova-3 live transcription
├── tts-service.ts            Cartesia Sonic-3.5 speech synthesis
├── observability.ts          Langfuse + OpenTelemetry tracing
├── config.ts                 Zod-validated env config
├── memory-storage-adapter.ts IStorageAdapter for session-continuity
└── character-token-counter.ts TokenCounter for session-budget enforcement
tests/                        Vitest test suite (mirrors src/)
packages/                     API references for all dependencies
```
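
To make the role of `character-token-counter.ts` more concrete, here is a minimal sketch of character-based token estimation used to keep a conversation history within a budget. The `TokenCounter` shape, the `trimToBudget` helper, and the ratio of four characters per token are assumptions for illustration. The ratio is a rough heuristic, not Gemini's tokenizer, and the real interface is defined by the project's dependencies.

```typescript
// Assumed interface; the real TokenCounter may have a different shape.
interface TokenCounter {
  count(text: string): number;
}

// Estimate tokens from character length (roughly 4 characters per token).
function characterTokenCounter(charsPerToken = 4): TokenCounter {
  return {
    count: (text) => Math.ceil(text.length / charsPerToken),
  };
}

// Keep the most recent messages that fit in the budget, dropping older ones.
function trimToBudget(
  messages: string[],
  budget: number,
  counter: TokenCounter,
): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const message of [...messages].reverse()) {
    const cost = counter.count(message);
    if (used + cost > budget) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}

const counter = characterTokenCounter();
const history = ["Hello, I need a cleaning.", "Sure, which day?", "Friday."];
console.log(trimToBudget(history, 6, counter)); // [ 'Sure, which day?', 'Friday.' ]
```

A character heuristic avoids a tokenizer dependency on the hot path, at the cost of accuracy. A stricter budget would need the model's own token counts.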
 
## Environment variables
 
See `.env.example` for the full list.
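
The project validates its environment with Zod in `src/lib/config.ts`. The dependency-free sketch below only illustrates the same fail-fast idea and is not that file. The variable names are hypothetical placeholders; the real names are in `.env.example`.

```typescript
// Hypothetical variable names; the real list lives in .env.example.
const REQUIRED_KEYS = [
  "GEMINI_API_KEY",
  "DEEPGRAM_API_KEY",
  "CARTESIA_API_KEY",
] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type Config = Record<RequiredKey, string>;

// Fail at startup with one message that lists every missing variable.
function loadConfig(env: Record<string, string | undefined>): Config {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as Config;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In a Node.js process this would be called once at startup as `loadConfig(process.env)`, so a missing key stops the server before any call is answered.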
 
## License
 
MIT — see [LICENSE](./LICENSE).