# OpenAI Voice Agent for OpenTable Reservation Management
> Voice agent that lets diners manage OpenTable reservations by phone using natural conversation, powered by OpenAI, Twilio, Deepgram, and Cartesia.
A tutorial-style reference solution from [reaatech.com](https://reaatech.com), demonstrating how to build production-grade AI systems with the `@reaatech/*` package family.
## Problem
Diners often call restaurants to book, modify, or cancel reservations. Handling these calls manually is time-consuming for staff and error-prone. This solution provides an AI-powered voice agent that handles the full reservation lifecycle over the phone.
## Architecture
```
Caller → Twilio → WebSocket → voice-agent-core pipeline
→ STT: Deepgram → LLM: OpenAI → MCP: OpenTable → TTS: Cartesia
→ back to caller
```
The pipeline chains speech-to-text (Deepgram), LLM-based decision making (OpenAI GPT), OpenTable API operations via MCP, and text-to-speech (Cartesia). Each stage runs under an enforced latency budget, with OpenTelemetry observability and Langfuse tracing.
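
To illustrate what enforcing a latency budget on a stage can look like, here is a minimal sketch that races a stage against a timer. The names `withBudget` and `BudgetExceededError` are hypothetical and are not taken from `voice-agent-core`; the real pipeline may enforce budgets differently.

```typescript
// Hypothetical sketch; these names are not the voice-agent-core API.

/** Raised when a stage takes longer than its allotted time. */
class BudgetExceededError extends Error {
  constructor(stage: string, budgetMs: number) {
    super(`${stage} exceeded its ${budgetMs} ms latency budget`);
    this.name = "BudgetExceededError";
  }
}

/** Runs `work`, rejecting if it has not settled within `budgetMs`. */
async function withBudget<T>(
  stage: string,
  budgetMs: number,
  work: () => Promise<T>,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new BudgetExceededError(stage, budgetMs)),
      budgetMs,
    );
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work(), timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A stage would then be wrapped as, for example, `withBudget("stt", 300, () => transcribe(chunk))`, where `transcribe` and the 300 ms figure are placeholders. Note that this sketch only stops waiting for the stage; it does not cancel the underlying work, which would need something like an `AbortSignal`.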
## Prerequisites
- Node.js >= 22
- pnpm 10.x
- Twilio account with a phone number that supports Media Streams
- Deepgram API key (STT)
- Cartesia API key (TTS)
- OpenAI API key (GPT)
- OpenTable API access
## Quick Start
```bash
pnpm install
cp .env.example .env
# Populate .env with your API keys and credentials
pnpm dev # starts Fastify WS server + Next.js
```
## Configuration
| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI API key for GPT | — |
| `TWILIO_ACCOUNT_SID` | Twilio account SID | — |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | — |
| `TWILIO_PHONE_NUMBER` | Twilio phone number | — |
| `DEEPGRAM_API_KEY` | Deepgram API key for STT | — |
| `CARTESIA_API_KEY` | Cartesia API key for TTS | — |
| `OPENTABLE_API_KEY` | OpenTable API key | — |
| `OPENTABLE_BASE_URL` | OpenTable API base URL | `https://platform.opentable.com` |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key (optional) | — |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key (optional) | — |
| `LANGFUSE_BASE_URL` | Langfuse base URL | `https://cloud.langfuse.com` |
| `OTLP_ENDPOINT` | OTLP traces endpoint (optional) | `http://localhost:4318/v1/traces` |
| `PORT` | Fastify WebSocket server port | `8080` |
| `MCP_ENDPOINT` | MCP server URL | — |
| `PUBLIC_URL` | Publicly routable URL for Twilio webhooks | — |
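
As a rough sketch of how these variables might be validated at startup, the loader below fails fast on missing values and applies the defaults from the table. Which variables are mandatory is an assumption inferred from the table (everything without a default and not marked optional), and `loadConfig` and `AppConfig` are hypothetical names, not the project's actual config module.

```typescript
// Assumption: the required/optional split is inferred from the table,
// not taken from the project's real config code.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS = [
  "OPENAI_API_KEY",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "DEEPGRAM_API_KEY",
  "CARTESIA_API_KEY",
  "OPENTABLE_API_KEY",
  "MCP_ENDPOINT",
  "PUBLIC_URL",
] as const;

// Defaults copied from the Default column of the table.
const DEFAULTS = {
  OPENTABLE_BASE_URL: "https://platform.opentable.com",
  LANGFUSE_BASE_URL: "https://cloud.langfuse.com",
  OTLP_ENDPOINT: "http://localhost:4318/v1/traces",
  PORT: "8080",
} as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type DefaultedVar = keyof typeof DEFAULTS;

interface AppConfig {
  values: Record<RequiredVar | DefaultedVar, string>;
  port: number;
  langfuseEnabled: boolean;
}

function loadConfig(env: Env): AppConfig {
  // Report every missing variable at once rather than one per restart.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const values = {} as Record<RequiredVar | DefaultedVar, string>;
  for (const name of REQUIRED_VARS) values[name] = env[name] as string;
  for (const name of Object.keys(DEFAULTS) as DefaultedVar[]) {
    values[name] = env[name] || DEFAULTS[name];
  }

  const port = Number(values.PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${values.PORT}"`);
  }

  // Langfuse tracing is optional: enable it only when both keys are set.
  const langfuseEnabled = Boolean(env["LANGFUSE_PUBLIC_KEY"] && env["LANGFUSE_SECRET_KEY"]);
  return { values, port, langfuseEnabled };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the servers start.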
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/twilio/webhook` | Twilio incoming call webhook — returns TwiML pointing to the WebSocket media stream |
| `GET` | `/media-stream` | WebSocket endpoint for Twilio Media Streams (Fastify) |
| `GET` | `/api/health` | Health check returning `{ status, timestamp }` |
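
For orientation, the TwiML returned by the webhook could be built roughly as follows, using Twilio's `<Connect><Stream>` verbs. This is a sketch under the assumption that the media stream is served at `/media-stream` on the same host as `PUBLIC_URL`; the function names are illustrative and not the project's actual route handler.

```typescript
// Assumption: /media-stream lives on the same host as PUBLIC_URL.
// Function names are illustrative only.

/** Escapes characters that are unsafe inside an XML attribute value. */
function escapeXmlAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Converts an http(s) base URL into the ws(s) URL of the media stream. */
function mediaStreamUrl(publicUrl: string): string {
  if (!/^https?:\/\//.test(publicUrl)) {
    throw new Error(`PUBLIC_URL must start with http:// or https://, got "${publicUrl}"`);
  }
  // "https" becomes "wss" and "http" becomes "ws"; trailing slashes are dropped.
  const base = publicUrl.replace(/^http/, "ws").replace(/\/+$/, "");
  return `${base}/media-stream`;
}

/** Builds the TwiML document that connects the call to the WebSocket. */
function buildStreamTwiml(publicUrl: string): string {
  const url = escapeXmlAttribute(mediaStreamUrl(publicUrl));
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${url}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

The webhook handler would return this string with a `Content-Type` of `text/xml` so that Twilio parses it as TwiML.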
## OpenTable Operations
The agent supports four reservation operations via natural language:
- **checkAvailability** — Check table availability by party size, date/time, and duration
- **createReservation** — Create a new reservation with customer details
- **updateReservation** — Modify an existing reservation (party size, time, special requests)
- **cancelReservation** — Cancel a reservation with optional reason
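
The four operations above could be exposed to the model as tool definitions in the OpenAI function-calling shape (`type: "function"` with a JSON Schema for `parameters`). The sketch below is illustrative: every parameter name (`partySize`, `dateTime`, `reservationId`, and so on) and the choice of required fields are assumptions, not the project's actual tool schemas.

```typescript
// Hypothetical schemas: parameter names and required fields are assumptions.
type ToolName =
  | "checkAvailability"
  | "createReservation"
  | "updateReservation"
  | "cancelReservation";

interface JsonProperty {
  type: "string" | "integer";
  description: string;
}

interface ToolDefinition {
  type: "function";
  function: {
    name: ToolName;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, JsonProperty>;
      required: string[];
    };
  };
}

const str = (description: string): JsonProperty => ({ type: "string", description });
const int = (description: string): JsonProperty => ({ type: "integer", description });

function defineTool(
  name: ToolName,
  description: string,
  properties: Record<string, JsonProperty>,
  required: string[],
): ToolDefinition {
  return { type: "function", function: { name, description, parameters: { type: "object", properties, required } } };
}

const reservationTools: ToolDefinition[] = [
  defineTool("checkAvailability", "Check table availability.", {
    partySize: int("Number of guests"),
    dateTime: str("Requested start time, ISO 8601"),
    durationMinutes: int("Expected length of the visit in minutes"),
  }, ["partySize", "dateTime"]),
  defineTool("createReservation", "Create a new reservation.", {
    partySize: int("Number of guests"),
    dateTime: str("Start time, ISO 8601"),
    customerName: str("Name the reservation is held under"),
    customerPhone: str("Contact phone number"),
  }, ["partySize", "dateTime", "customerName"]),
  defineTool("updateReservation", "Modify an existing reservation.", {
    reservationId: str("Identifier of the reservation to change"),
    partySize: int("New number of guests"),
    dateTime: str("New start time, ISO 8601"),
    specialRequests: str("Free-text requests for the restaurant"),
  }, ["reservationId"]),
  defineTool("cancelReservation", "Cancel a reservation.", {
    reservationId: str("Identifier of the reservation to cancel"),
    reason: str("Optional cancellation reason"),
  }, ["reservationId"]),
];

/** Lists required arguments the model left out, so the agent can ask for them. */
function missingArguments(name: ToolName, args: Record<string, unknown>): string[] {
  const tool = reservationTools.find((t) => t.function.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.function.parameters.required.filter(
    (key) => args[key] === undefined || args[key] === null,
  );
}
```

Checking for missing arguments before calling the tool lets the agent ask the caller a follow-up question (for example, the party size) instead of sending an incomplete request downstream.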
## Testing
```bash
pnpm test
```
Runs the Vitest suite with coverage reporting.
## Deployment
The WebSocket URL in the TwiML response must be publicly accessible. Configure your Twilio phone number's voice webhook to `POST` to:
```
https://<your-domain>/api/twilio/webhook
```
Ensure the Fastify server (port `8080` by default) is reachable at the same public domain for the WebSocket connection.
## Project Layout
```
app/          Next.js App Router pages + API routes
src/          Services, config, pipelines
  agent/      LLM orchestrator, tool schemas, tool executor
  services/   Pipeline orchestration, observability, conversation store
tests/        Vitest suite (mirrors src/)
packages/     API references for every dependency
```
## License
MIT — see [LICENSE](./LICENSE).