# Vertex AI Voice Agent for Cal.com Appointment Scheduling
> Let customers book appointments on Cal.com over the phone with a voice agent that understands natural language, verifies availability, and confirms bookings.
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
## Tech Stack
- **Runtime**: Node.js 22+, TypeScript 6
- **Framework**: Next.js 16 (App Router) + Express 5 (custom server)
- **Voice/Telephony**: Twilio (PSTN), Deepgram (STT), Cartesia (TTS)
- **AI**: Google Gemini via `@google/genai` (Vertex AI)
- **Intent Routing**: `@reaatech/confidence-router-core` + `@reaatech/confidence-router-classifiers`
- **Guardrails**: `@reaatech/guardrail-chain-guardrails` (PII redaction, prompt injection)
- **Budget**: `@reaatech/agent-budget-engine` + `@reaatech/agent-budget-spend-tracker`
- **Calendar**: Cal.com REST API + OAuth2 (JWT via `jose`)
- **Testing**: Vitest (v8 coverage, MSW), ESLint (typescript-eslint strict)
## Architecture
```
PSTN Call -> Twilio -> Express (/voice/incoming, /media-stream WebSocket)
                          |
                 Deepgram STT (WebSocket)
                          |
                 GuardrailChain (PII redaction)
                          |
                 IntentRouter (confidence-router -> DecisionEngine)
                          |
                 Gemini (Vertex AI, function calling)
                    |                 |
           CalendarService     Cartesia TTS (WebSocket)
           (Cal.com API)              |
                               Twilio stream -> Caller
```
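The first hop in the diagram can be illustrated with the TwiML that a `/voice/incoming` handler might return. Twilio's `<Connect><Stream>` verb asks Twilio to open a bidirectional WebSocket to a `wss://` URL, which here would be the `/media-stream` endpoint. This is a minimal sketch, not the repository's handler: `buildIncomingCallTwiml`, `escapeXmlAttr`, and the host name are hypothetical, and the real code may build TwiML differently (for example with the Twilio SDK).

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the TwiML for an incoming call (hypothetical helper).
// publicHost is the publicly reachable host of the Express server;
// streamPath defaults to the WebSocket route shown in the diagram.
function buildIncomingCallTwiml(
  publicHost: string,
  streamPath: string = "/media-stream",
): string {
  const url = `wss://${publicHost}${streamPath}`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXmlAttr(url)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

A handler would send this string with a `text/xml` content type, for example `res.type("text/xml").send(buildIncomingCallTwiml("voice.example.com"))`.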
## Environment Variables
| Variable | Description | Example |
|----------|-------------|---------|
| `GOOGLE_CLOUD_PROJECT` | GCP project ID | `my-project-123` |
| `GOOGLE_CLOUD_LOCATION` | GCP region | `us-central1` |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to service account key | `/path/to/key.json` |
| `TWILIO_ACCOUNT_SID` | Twilio account SID | `ACxxxxxxxxxx` |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | `xxxxxxxxxx` |
| `TWILIO_PHONE_NUMBER` | Twilio phone number (E.164) | `+15551234567` |
| `DEEPGRAM_API_KEY` | Deepgram API key | `xxxxxxxxxx` |
| `CARTESIA_API_KEY` | Cartesia API key | `xxxxxxxxxx` |
| `CALCOM_CLIENT_ID` | Cal.com OAuth2 client ID | `xxxxxxxxxx` |
| `CALCOM_CLIENT_SECRET` | Cal.com OAuth2 client secret | `xxxxxxxxxx` |
| `CALCOM_PRIVATE_KEY` | Cal.com JWT signing private key | `-----BEGIN PRIVATE KEY-----...` |
| `CALCOM_API_URL` | Cal.com API base URL | `https://api.cal.com` |
| `ANTHROPIC_API_KEY` | (optional) For LLMClassifier fallback | `sk-ant-...` |
| `OPENAI_API_KEY` | (optional) For LLMClassifier fallback | `sk-...` |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key | `sk-...` |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key | `pk-...` |
| `LANGFUSE_HOST` | Langfuse host URL | `https://cloud.langfuse.com` |
| `EXPRESS_PORT` | Express server port | `3001` |
| `MAX_CALL_COST_CENTS` | Per-call budget in cents | `50` |
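It can help to validate these variables once at startup and fail fast with a clear message. The sketch below is one possible approach, not the project's loader: `loadConfig`, `AgentConfig`, and `parsePositiveInt` are hypothetical names, and two things are assumptions: that every variable not marked "(optional)" in the table is required, and that the example values for `EXPRESS_PORT` and `MAX_CALL_COST_CENTS` are sensible defaults.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: anything the table does not mark "(optional)" is required.
const REQUIRED = [
  "GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION", "GOOGLE_APPLICATION_CREDENTIALS",
  "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER",
  "DEEPGRAM_API_KEY", "CARTESIA_API_KEY",
  "CALCOM_CLIENT_ID", "CALCOM_CLIENT_SECRET", "CALCOM_PRIVATE_KEY", "CALCOM_API_URL",
  "LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST",
] as const;

type RequiredName = (typeof REQUIRED)[number];

interface AgentConfig {
  required: Record<RequiredName, string>;
  expressPort: number;
  maxCallCostCents: number;
}

// E.164: a plus sign followed by up to 15 digits, the first one non-zero.
const E164 = /^\+[1-9]\d{1,14}$/;

function parsePositiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function loadConfig(env: Env): AgentConfig {
  // Report every missing variable at once rather than one per run.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const required = {} as Record<RequiredName, string>;
  for (const name of REQUIRED) required[name] = env[name] as string;

  if (!E164.test(required.TWILIO_PHONE_NUMBER)) {
    throw new Error("TWILIO_PHONE_NUMBER must be in E.164 format, e.g. +15551234567");
  }
  return {
    required,
    expressPort: parsePositiveInt("EXPRESS_PORT", env.EXPRESS_PORT, 3001),
    maxCallCostCents: parsePositiveInt("MAX_CALL_COST_CENTS", env.MAX_CALL_COST_CENTS, 50),
  };
}
```

In a Node.js entry point this would typically be called as `loadConfig(process.env)`.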
## Getting API Keys
- **Twilio**: Sign up at [twilio.com](https://twilio.com), buy a voice-capable phone number
- **Deepgram**: Register at [deepgram.com](https://deepgram.com), get an API key with STT access
- **Cartesia**: Create an account at [cartesia.ai](https://cartesia.ai) and generate an API key
- **Google Cloud Vertex AI**: Enable the Vertex AI API, create a service account, and download its JSON key
- **Cal.com OAuth2**: Create a developer client in Cal.com settings, generate a client ID/secret and a private key for JWT signing
## Conversation Flow
1. **Greeting**: The caller connects and the agent plays a welcome message via Cartesia TTS.
2. **Intent Detection**: The caller speaks, the audio is streamed to Deepgram STT, and the transcript is classified by IntentRouter (keyword match first, with an optional LLM fallback).
3. **Slot Filling**: Gemini extracts the appointment details (date, time, service) via function calling.
4. **Booking Confirmation**: The Cal.com API creates, reschedules, or cancels the booking.
5. **Summary**: The agent reads back the confirmation and the call ends.
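The keyword-first, fallback-second routing in step 2 can be sketched as follows. This does not use the `@reaatech/confidence-router-*` API and is only an illustration of the idea: the intent names, the keyword lists, the `0.7` threshold, and the functions `classifyByKeyword` and `routeIntent` are all assumptions made for the example.

```typescript
type Intent = "book" | "reschedule" | "cancel" | "unknown";

interface Classification {
  intent: Intent;
  confidence: number; // 0..1
  source: "keyword" | "fallback";
}

// Stand-in for an LLM-backed classifier (hypothetical signature).
type FallbackClassifier = (
  transcript: string,
) => Promise<{ intent: Intent; confidence: number }>;

// Illustrative keyword lists; a real deployment would tune these.
const KEYWORDS: Record<Exclude<Intent, "unknown">, string[]> = {
  book: ["book", "schedule", "reserve"],
  reschedule: ["reschedule", "move", "postpone"],
  cancel: ["cancel", "delete"],
};

function classifyByKeyword(transcript: string): Classification {
  // Match whole words so that "reschedule" does not also count as "schedule".
  const tokens = new Set(transcript.toLowerCase().split(/[^a-z]+/).filter(Boolean));
  let best: Intent = "unknown";
  let bestHits = 0;
  let totalHits = 0;
  for (const intent of Object.keys(KEYWORDS) as Array<keyof typeof KEYWORDS>) {
    const hits = KEYWORDS[intent].filter((word) => tokens.has(word)).length;
    totalHits += hits;
    if (hits > bestHits) {
      best = intent;
      bestHits = hits;
    }
  }
  // Confidence is the winning intent's share of all keyword hits.
  const confidence = totalHits === 0 ? 0 : bestHits / totalHits;
  return { intent: best, confidence, source: "keyword" };
}

async function routeIntent(
  transcript: string,
  fallback?: FallbackClassifier,
  threshold: number = 0.7,
): Promise<Classification> {
  const keyword = classifyByKeyword(transcript);
  if (keyword.confidence >= threshold) return keyword;
  if (!fallback) {
    // No fallback configured: report the low-confidence result as unknown.
    return { intent: "unknown", confidence: keyword.confidence, source: "keyword" };
  }
  const result = await fallback(transcript);
  return { intent: result.intent, confidence: result.confidence, source: "fallback" };
}
```

With this scoring, an utterance that mentions keywords from two intents equally often scores 0.5, falls below the threshold, and is handed to the fallback classifier when one is configured.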
## Running locally
```bash
pnpm install
cp .env.example .env # fill in your credentials
pnpm test # vitest run with coverage (all external services mocked)
pnpm typecheck # TypeScript type checking
pnpm lint # ESLint
pnpm dev # next dev (port 3000) + Express (port 3001) concurrently
pnpm start # Express production server only
```
> **Note**: All external services (Twilio, Deepgram, Cartesia, Gemini, Cal.com) are mocked in tests, so no real credentials are needed to run the test suite.
## Project layout
```
app/              Next.js App Router pages + API routes
src/              services, lib, adapters
  ai/             Gemini integration
  calcom/         Cal.com REST API client
  guardrails/     PII redaction + prompt injection
  repair/         Calendar payload validation (Zod)
  routing/        Intent classification
  telephony/      Twilio + Deepgram + Cartesia integration
  budget.ts       Per-call budget enforcement
  server.ts       Express server
  types.ts        Shared type definitions
tests/            vitest suite (mirrors src/)
packages/         API references for every dependency (read these first)
DEV_PLAN.md       build plan for this recipe
```
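As a rough illustration of what per-call budget enforcement (the role of `budget.ts`) can look like, the sketch below tracks spend in cents against a fixed limit such as `MAX_CALL_COST_CENTS`. It does not use the `@reaatech/agent-budget-engine` API; `CallBudget`, `tryCharge`, and the cost labels are hypothetical names chosen for the example.

```typescript
interface ChargeEntry {
  label: string;
  costCents: number;
}

class CallBudget {
  private readonly limitCents: number;
  private spentCents = 0;
  private readonly entries: ChargeEntry[] = [];

  constructor(limitCents: number) {
    if (!Number.isFinite(limitCents) || limitCents <= 0) {
      throw new Error("limitCents must be a positive number");
    }
    this.limitCents = limitCents;
  }

  // Record a cost if it fits within the limit. Returns false, and records
  // nothing, when the charge would push the call over budget.
  tryCharge(label: string, costCents: number): boolean {
    if (!Number.isFinite(costCents) || costCents < 0) {
      throw new Error("costCents must be a non-negative number");
    }
    if (this.spentCents + costCents > this.limitCents) return false;
    this.spentCents += costCents;
    this.entries.push({ label, costCents });
    return true;
  }

  get remainingCents(): number {
    return this.limitCents - this.spentCents;
  }

  // Copy of the charges recorded so far, useful for logging at call end.
  get charges(): ChargeEntry[] {
    return this.entries.slice();
  }
}
```

A caller might check `tryCharge("gemini", estimatedCents)` before each model request and end the call politely once it returns `false`. The choice to reject a charge up front, rather than record it and report an overrun afterwards, is a design decision of this sketch.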
## License
MIT — see [LICENSE](./LICENSE).