# vLLM Multi-Agent Handoff for E-commerce Support Routing
 
> Route customer queries across product, order, and returns agents hosted on vLLM, with compressed context handoff so no conversation gets lost.
 
E-commerce support teams that host cost-effective models on vLLM often struggle to coordinate multiple specialist agents: a misrouted question frustrates the customer and can bounce between agents in a loop. This recipe demonstrates an end-to-end solution built on the `@reaatech/agent-handoff` protocol family, LangGraph state orchestration, and Upstash Redis session persistence.
 
## How it works
 
1. **Chat API** (`POST /api/chat`) accepts a `{ sessionId, message }` payload, plus an optional `language`, from the customer
2. A **LangGraph state machine** loads the conversation session from Redis
3. A **CapabilityBasedRouter** scores the message against three specialist agents (Product, Order, Returns) using weighted skill/domain/load scoring
4. The highest-scoring agent's vLLM model (via `@ai-sdk/openai-compatible`) generates a response
5. Before the conversation exceeds the token budget, **HybridCompressor** condenses the conversation history
6. The updated session is persisted back to Redis with a 30-minute TTL, with writes retried via `p-retry`
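
Step 3 can be pictured with a small weighted-scoring sketch. This is not the `CapabilityBasedRouter` implementation: the `AgentProfile` shape, the keyword matching, the `scoreAgent` and `routeMessage` helpers, and the weight values are all assumptions chosen for illustration.

```typescript
// Hypothetical shapes; the real @reaatech/agent-handoff-routing types may differ.
interface AgentProfile {
  name: string;
  skills: string[];       // keywords the agent can handle
  domain: string;         // coarse topic label
  activeSessions: number; // current load
  maxSessions: number;    // capacity used to normalise load
}

interface ScoreWeights {
  skill: number;
  domain: number;
  load: number;
}

// Assumed weights; tune them to your own traffic.
const WEIGHTS: ScoreWeights = { skill: 0.6, domain: 0.3, load: 0.1 };

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Combine skill overlap, domain match, and spare capacity into one score in [0, 1].
function scoreAgent(message: string, agent: AgentProfile, w: ScoreWeights = WEIGHTS): number {
  const words = new Set(tokenize(message));
  const hits = agent.skills.filter((s) => words.has(s)).length;
  const skillScore = agent.skills.length > 0 ? hits / agent.skills.length : 0;
  const domainScore = words.has(agent.domain) ? 1 : 0;
  const utilisation = agent.maxSessions > 0 ? agent.activeSessions / agent.maxSessions : 1;
  const loadScore = 1 - Math.min(utilisation, 1);
  return w.skill * skillScore + w.domain * domainScore + w.load * loadScore;
}

// Pick the highest-scoring agent; on a tie the earlier agent in the list wins.
function routeMessage(message: string, agents: AgentProfile[]): AgentProfile {
  if (agents.length === 0) throw new Error("no agents registered");
  return agents.reduce((best, a) =>
    scoreAgent(message, a) > scoreAgent(message, best) ? a : best,
  );
}

const agents: AgentProfile[] = [
  { name: "product", skills: ["size", "color", "stock", "price"], domain: "product", activeSessions: 2, maxSessions: 10 },
  { name: "order", skills: ["tracking", "shipping", "delivery", "order"], domain: "order", activeSessions: 2, maxSessions: 10 },
  { name: "returns", skills: ["refund", "return", "exchange", "damaged"], domain: "returns", activeSessions: 2, maxSessions: 10 },
];
```

Calling `routeMessage("Where is my order? The tracking page shows no delivery date", agents)` selects the `order` agent here, because three of its four skill keywords and its domain label appear in the message while the other two agents score only on spare capacity.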
 
## Prerequisites
 
- A running vLLM endpoint with at least one deployed model
- An Upstash Redis database (or any Redis-compatible HTTP/REST service)
 
## Getting Started
 
```bash
pnpm install
cp .env.example .env
# Fill in VLLM_BASE_URL, VLLM_API_KEY, per-agent model IDs, and Upstash Redis credentials
pnpm dev
```
 
## API Reference
 
### `POST /api/chat`
 
**Request body:**
```json
{
  "sessionId": "string",
  "message": "string (min 1 char)",
  "language": "string (optional)"
}
```
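
The request shape above could be checked along these lines. The project structure lists Zod schemas in `src/lib/types.ts`; this dependency-free sketch only mirrors the documented fields, and the `parseChatRequest` helper and its error messages are hypothetical. It also assumes a non-string `sessionId` is rejected, which the README does not state explicitly.

```typescript
interface ChatRequest {
  sessionId: string;
  message: string;
  language?: string;
}

type ParseResult =
  | { ok: true; value: ChatRequest }
  | { ok: false; error: string };

// Validate an already-parsed JSON body against the documented request shape.
function parseChatRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const { sessionId, message, language } = body as Record<string, unknown>;
  if (typeof sessionId !== "string") {
    return { ok: false, error: "sessionId must be a string" };
  }
  if (typeof message !== "string" || message.length < 1) {
    return { ok: false, error: "message must be a non-empty string" };
  }
  if (language !== undefined && typeof language !== "string") {
    return { ok: false, error: "language must be a string when present" };
  }
  // Only include the optional field when the caller supplied it.
  const value: ChatRequest =
    language === undefined ? { sessionId, message } : { sessionId, message, language };
  return { ok: true, value };
}
```

A route handler would typically answer with status `400` whenever `ok` is `false`.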
 
**Success response (200):**
```json
{
  "reply": "string",
  "sessionId": "string",
  "routedTo": "string (agent name)"
}
```
 
**Error responses:**
- `400` — invalid request body (missing or empty message)
- `500` — internal server error
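
A client for this endpoint might look like the sketch below. The `sendChat` helper and the `FetchLike` type are hypothetical names, not part of the recipe; the fetch function is passed in so the sketch does not depend on any particular runtime's typings.

```typescript
interface ChatReply {
  reply: string;
  sessionId: string;
  routedTo: string;
}

// Minimal structural type, compatible with the standard fetch function.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST one customer message and return the parsed reply.
async function sendChat(
  baseUrl: string,
  sessionId: string,
  message: string,
  fetchImpl: FetchLike,
): Promise<ChatReply> {
  const res = await fetchImpl(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sessionId, message }),
  });
  if (res.status === 400) {
    throw new Error("Invalid request body (400)");
  }
  if (!res.ok) {
    throw new Error(`Chat request failed with status ${res.status}`);
  }
  // The response is trusted to match the documented shape; validate it if you need stricter guarantees.
  return (await res.json()) as ChatReply;
}
```

In Node 18+ or a browser, the global `fetch` can be passed as `fetchImpl`, for example `sendChat("http://localhost:3000", "session-123", "Where is my order?", fetch)`. The base URL assumes the dev server is listening on port 3000, which may differ in your setup.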
 
## Tech Stack
 
| Package | Version | Role |
|---|---|---|
| `@reaatech/agent-handoff` | 0.1.0 | Core types, errors, utilities |
| `@reaatech/agent-handoff-routing` | 0.1.0 | Weighted scoring router |
| `@reaatech/agent-handoff-protocol` | 0.1.0 | HandoffManager orchestration |
| `@reaatech/agent-handoff-compression` | 0.1.0 | Context compression strategies |
| `@langchain/langgraph` | 1.4.4 | Stateful agent graph |
| `@ai-sdk/openai-compatible` | 2.0.51 | vLLM model adapter |
| `@upstash/redis` | 1.38.0 | Session storage |
| `p-retry` | 8.0.0 | Retry with backoff |
| `ai` | 6.0.208 | AI SDK generation |
| `zod` | 4.4.3 | Schema validation |
 
## Project Structure
 
```
app/api/chat/route.ts         Chat API endpoint (Next.js App Router)
src/lib/config.ts             Zod-validated environment config
src/lib/types.ts              Shared TypeScript types + Zod schemas
src/handoff/agents.ts         Agent registry (3 e-commerce specialists)
src/handoff/vllm-client.ts    vLLM model factory via @ai-sdk/openai-compatible
src/handoff/compression.ts    Context compression via HybridCompressor
src/handoff/session-store.ts  Redis-backed session persistence with p-retry
src/handoff/router.ts         CapabilityBasedRouter wrapper
src/handoff/manager.ts        HandoffManager orchestration
src/handoff/graph.ts          LangGraph state machine
src/index.ts                  Public API re-exports
```
 
## License
 
MIT — see [LICENSE](./LICENSE).