# vLLM Multi-Agent Handoff for E-commerce Support Routing
> Route customer queries across product, order, and returns agents hosted on vLLM, with compressed context handoff so no conversation gets lost.
E-commerce support teams that self-host cost-effective models on vLLM often struggle to coordinate multiple specialist agents. Misrouted questions frustrate customers and can send a conversation looping between agents. This recipe demonstrates an end-to-end approach using the `@reaatech/agent-handoff` protocol family, LangGraph state orchestration, and Upstash Redis session persistence.
## How it works
1. **Chat API** (`POST /api/chat`) accepts a `{ sessionId, message }` payload from the customer
2. A **LangGraph state machine** loads the conversation session from Redis
3. A **CapabilityBasedRouter** scores the message against three specialist agents (Product, Order, Returns) using weighted skill/domain/load scoring
4. The highest-scoring agent's vLLM model (via `@ai-sdk/openai-compatible`) generates a response
5. Before the conversation exceeds the token budget, **HybridCompressor** condenses the conversation history
6. The updated session is persisted back to Redis with a 30-minute TTL, with retries handled by `p-retry`
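
The weighted skill/domain/load scoring in step 3 can be pictured with a small dependency-free sketch. The `AgentProfile` shape, the keyword lists, and the weight values are assumptions made for illustration; this is not the `CapabilityBasedRouter` API from `@reaatech/agent-handoff-routing`.

```typescript
// Hypothetical types and weights for illustration only.
interface AgentProfile {
  name: string;
  skills: string[]; // keywords the agent handles well (assumed)
  domain: string;   // coarse topic label (assumed)
  load: number;     // current utilisation in [0, 1]
}

interface Weights {
  skill: number;
  domain: number;
  load: number;
}

const WEIGHTS: Weights = { skill: 0.6, domain: 0.3, load: 0.1 };

// Lowercase the message and split it into a set of alphanumeric words.
function tokenize(message: string): Set<string> {
  const words: string[] = message.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words);
}

// Weighted sum of skill overlap, domain match, and spare capacity.
function scoreAgent(agent: AgentProfile, message: string, w: Weights = WEIGHTS): number {
  const tokens = tokenize(message);
  const hits = agent.skills.filter((s) => tokens.has(s)).length;
  const skillScore = agent.skills.length === 0 ? 0 : hits / agent.skills.length;
  const domainScore = tokens.has(agent.domain) ? 1 : 0;
  const loadScore = 1 - Math.min(Math.max(agent.load, 0), 1);
  return w.skill * skillScore + w.domain * domainScore + w.load * loadScore;
}

// Pick the highest-scoring agent; on a tie the earlier agent wins.
function route(agents: AgentProfile[], message: string): AgentProfile {
  if (agents.length === 0) throw new Error("no agents registered");
  return agents.reduce((best, a) =>
    scoreAgent(a, message) > scoreAgent(best, message) ? a : best
  );
}

const agents: AgentProfile[] = [
  { name: "product", skills: ["size", "color", "stock", "price"], domain: "product", load: 0.2 },
  { name: "order", skills: ["tracking", "shipping", "delivery", "order"], domain: "order", load: 0.5 },
  { name: "returns", skills: ["refund", "return", "exchange"], domain: "returns", load: 0.1 },
];
```

Calling `route(agents, "Where is my order? The tracking page shows no delivery date")` selects the `order` profile here. Giving load a small weight means it mainly acts as a tie-breaker when no skill or domain keyword matches.
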
## Prerequisites
- A running vLLM endpoint with at least one deployed model
- An Upstash Redis database (or any Redis-compatible HTTP/REST service)
## Getting Started
```bash
pnpm install
cp .env.example .env
# Fill in VLLM_BASE_URL, VLLM_API_KEY, per-agent model IDs, and Upstash Redis credentials
pnpm dev
```
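
Failing fast on a missing or malformed variable makes a misconfigured `.env` easier to spot. The repository validates its config with Zod in `src/lib/config.ts`; the sketch below is a dependency-free stand-in, not that file. Only `VLLM_BASE_URL` and `VLLM_API_KEY` are named in this README; the Upstash variable names are assumptions, and the per-agent model ID variables are left out because their names are not given here.

```typescript
// Only VLLM_BASE_URL and VLLM_API_KEY come from this README.
// The two Upstash names below are assumed for illustration.
const REQUIRED_VARS = [
  "VLLM_BASE_URL",
  "VLLM_API_KEY",
  "UPSTASH_REDIS_REST_URL",   // assumed name
  "UPSTASH_REDIS_REST_TOKEN", // assumed name
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<RequiredVar, string>;

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  // Collect every missing or blank variable so the error lists them all at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // The URL constructor throws a TypeError if the endpoint is not a valid URL.
  new URL((env.VLLM_BASE_URL as string).trim());

  const config = {} as AppConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = (env[name] as string).trim();
  }
  return config;
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.
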
## API Reference
### `POST /api/chat`
**Request body:**
```json
{
  "sessionId": "string",
  "message": "string (min 1 char)",
  "language": "string (optional)"
}
```
**Success response (200):**
```json
{
  "reply": "string",
  "sessionId": "string",
  "routedTo": "string (agent name)"
}
```
**Error responses:**
- `400` — invalid request body (missing or empty message)
- `500` — internal server error
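
The request contract above can be expressed as a small validation function that decides between a usable payload and a `400`. This is a dependency-free sketch; the repository uses Zod schemas for validation, and the function name, result shape, and error strings here are hypothetical.

```typescript
interface ChatRequest {
  sessionId: string;
  message: string;
  language?: string;
}

// Hypothetical result shape: either a parsed request or a 400 with a reason.
type ParseResult =
  | { ok: true; value: ChatRequest }
  | { ok: false; status: 400; error: string };

function parseChatRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "body must be a JSON object" };
  }
  const { sessionId, message, language } = body as Record<string, unknown>;

  if (typeof sessionId !== "string") {
    return { ok: false, status: 400, error: "sessionId must be a string" };
  }
  // The message must be present and at least one character long.
  if (typeof message !== "string" || message.length < 1) {
    return { ok: false, status: 400, error: "message must be a non-empty string" };
  }
  // language is optional, but must be a string when supplied.
  if (language !== undefined && typeof language !== "string") {
    return { ok: false, status: 400, error: "language must be a string" };
  }

  const value: ChatRequest = { sessionId, message };
  if (language !== undefined) value.language = language;
  return { ok: true, value };
}
```

A route handler could call `parseChatRequest(await request.json())` and return the `400` branch directly, reserving `500` for failures that happen after validation.
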
## Tech Stack
| Package | Version | Role |
|---|---|---|
| `@reaatech/agent-handoff` | 0.1.0 | Core types, errors, utilities |
| `@reaatech/agent-handoff-routing` | 0.1.0 | Weighted scoring router |
| `@reaatech/agent-handoff-protocol` | 0.1.0 | HandoffManager orchestration |
| `@reaatech/agent-handoff-compression` | 0.1.0 | Context compression strategies |
| `@langchain/langgraph` | 1.4.4 | Stateful agent graph |
| `@ai-sdk/openai-compatible` | 2.0.51 | vLLM model adapter |
| `@upstash/redis` | 1.38.0 | Session storage |
| `p-retry` | 8.0.0 | Retry with backoff |
| `ai` | 6.0.208 | AI SDK generation |
| `zod` | 4.4.3 | Schema validation |
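
To make the "retry with backoff" role concrete, here is a minimal dependency-free sketch of the idea: retry a failing async operation a bounded number of times, waiting longer between attempts. The helper names and default values are hypothetical; this is not the `p-retry` API, and the recipe's actual retry settings are not stated in this README.

```typescript
// Hypothetical options and defaults for illustration.
interface RetryOptions {
  retries: number; // retries allowed after the first attempt
  baseMs: number;  // delay before the first retry
  factor: number;  // multiplier applied per retry
  maxMs: number;   // upper bound on any single delay
}

const DEFAULTS: RetryOptions = { retries: 3, baseMs: 100, factor: 2, maxMs: 2000 };

// Delay before retry number `attempt` (1-based), capped at maxMs.
function backoffDelay(attempt: number, o: RetryOptions = DEFAULTS): number {
  return Math.min(o.baseMs * Math.pow(o.factor, attempt - 1), o.maxMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run fn, retrying on rejection; rethrow the last error once retries run out.
async function withRetry<T>(fn: () => Promise<T>, o: RetryOptions = DEFAULTS): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= o.retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < o.retries) await sleep(backoffDelay(attempt + 1, o));
    }
  }
  throw lastError;
}
```

A session write would then be wrapped as `await withRetry(() => saveSession(session))`, where `saveSession` stands for whatever function performs the Redis write.
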
## Project Structure
```
app/api/chat/route.ts         Chat API endpoint (Next.js App Router)
src/lib/config.ts             Zod-validated environment config
src/lib/types.ts              Shared TypeScript types + Zod schemas
src/handoff/agents.ts         Agent registry (3 e-commerce specialists)
src/handoff/vllm-client.ts    vLLM model factory via @ai-sdk/openai-compatible
src/handoff/compression.ts    Context compression via HybridCompressor
src/handoff/session-store.ts  Redis-backed session persistence with p-retry
src/handoff/router.ts         CapabilityBasedRouter wrapper
src/handoff/manager.ts        HandoffManager orchestration
src/handoff/graph.ts          LangGraph state machine
src/index.ts                  Public API re-exports
```
## License
MIT — see [LICENSE](./LICENSE).