A multi-agent system that handles order inquiries, shipment tracking, and returns for SMB e‑commerce stores, powered by a vLLM‑hosted model and orchestrated with REAA agent-mesh.
Small online retailers handle repetitive customer queries about order status, shipping updates, and return policies by hand. Delegating all of these tasks to a single LLM agent leads to handoffs that lose context and to inconsistent responses.
This tutorial walks you through building a multi-agent e-commerce order management system powered by a vLLM-hosted LLM and orchestrated with the REAA agent-mesh framework. You’ll create a Next.js App Router application where specialist agents handle order inquiries, shipment tracking, and returns — with intent classification via @reaatech/agent-mesh-router, multi-turn session continuity backed by Upstash Redis, structured output repair for vLLM responses, and per-session cost telemetry.
Prerequisites
Node.js 22+ and pnpm 10+
A vLLM endpoint running an OpenAI-compatible model (e.g., Qwen 2.5 7B Instruct)
An Upstash Redis instance (for session storage)
Familiarity with TypeScript, Next.js App Router, and basic LLM concepts
Step 1: Scaffold the Next.js project and install dependencies
Create a new Next.js project with TypeScript and the App Router, then install the REAA orchestration packages:
Expected output: The dependencies are written to package.json and pnpm-lock.yaml is generated.
Step 2: Configure environment variables
Create .env.example with the variables the system needs at runtime:
env
# Env vars used by vllm-agent-mesh-for-e-commerce-order-management.
# Keep placeholders only; never commit real values.
NODE_ENV=development

# vLLM endpoint
VLLM_API_KEY=<your-vllm-api-key>
VLLM_BASE_URL=<your-vllm-endpoint>
VLLM_MODEL=<model-name>

# Upstash Redis (session storage)
UPSTASH_REDIS_URL=<your-upstash-redis-url>
UPSTASH_REDIS_TOKEN=<your-upstash-redis-token>

# Agent registry
AGENT_REGISTRY_DIR=./agents
MCP_REQUEST_TIMEOUT_MS=30000
MCP_MAX_RETRIES=3

# Session
SESSION_TTL_SECONDS=3600

# Cost telemetry
BUDGET_DAILY_LIMIT=10.00
Copy it to .env.local and fill in your real values:
terminal
cp .env.example .env.local
Expected output: All environment variables are available at process.env.* at runtime.
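Reading these variables through one typed loader makes a missing or malformed value fail at startup rather than in the middle of a request. The following is a minimal sketch, not part of the published recipe: `AppConfig`, `loadConfig`, and the helper names are hypothetical, and the numeric defaults simply mirror the placeholders in `.env.example`.

```typescript
// Hypothetical typed view of the variables listed in .env.example.
interface AppConfig {
  vllmBaseUrl: string;
  vllmApiKey: string;
  vllmModel: string;
  upstashRedisUrl: string;
  upstashRedisToken: string;
  sessionTtlSeconds: number;
  budgetDailyLimit: number;
}

type Env = Record<string, string | undefined>;

// Read a required variable, failing fast with the variable name in the message.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Read a numeric variable, falling back to a default when it is unset.
function numeric(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed < 0) {
    throw new Error(`Environment variable ${key} must be a non-negative number`);
  }
  return parsed;
}

// Build the config from any env-like object (for example process.env).
function loadConfig(env: Env): AppConfig {
  return {
    vllmBaseUrl: required(env, "VLLM_BASE_URL"),
    vllmApiKey: required(env, "VLLM_API_KEY"),
    vllmModel: required(env, "VLLM_MODEL"),
    upstashRedisUrl: required(env, "UPSTASH_REDIS_URL"),
    upstashRedisToken: required(env, "UPSTASH_REDIS_TOKEN"),
    sessionTtlSeconds: numeric(env, "SESSION_TTL_SECONDS", 3600),
    budgetDailyLimit: numeric(env, "BUDGET_DAILY_LIMIT", 10),
  };
}
```

Passing the environment in as an argument, instead of reading `process.env` inside the function, keeps the loader easy to exercise in tests.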
Step 3: Define the domain types with Zod
Create src/types/index.ts to re-export the core REAA schemas and define your domain-specific types for e-commerce operations:
ts
import { z } from "zod";
import {
  IncomingRequestSchema,
  type IncomingRequest,
  AgentResponseSchema,
  type AgentResponse,
  ContextPacketSchema,
  type ContextPacket,
  ClassifierOutputSchema,
  type ClassifierOutput,
  AgentConfigSchema,
  type AgentConfig,
  TurnEntrySchema,
  type TurnEntry,
  HealthStatusSchema,
  type HealthStatus,
} from "@reaatech/agent-mesh";

const OrderInquirySchema = z.object({
  orderId: z.string().min(1),
  customerEmail: z.email(),
  inquiryType: z.enum(["status", "modification", "cancel", "other"]),
});
type OrderInquiry = z.infer<typeof OrderInquirySchema>;

const ShipmentTrackingSchema = z.object({
  trackingNumber: z.string().min(1),
  carrier: z.string().min(1),
  status: z.enum(["pending", "in_transit", "delivered", "exception"]),
  estimatedDelivery: z.iso.datetime(),
});
type ShipmentTracking = z.infer<typeof ShipmentTrackingSchema>;

const ReturnRequestSchema = z.object({
  orderId: z.string().min(1),
  itemId: z.string().min(1),
  reason: z.string().min(1),
  refundMethod: z.enum(["original", "store_credit", "exchange"]),
});
type ReturnRequest = z.infer<typeof ReturnRequestSchema>;

const ChatMessageSchema = z.object({
  id: z.string().min(1),
  role: z.enum(["user", "assistant", "system"]),
  content: z.string(),
  timestamp: z.iso.datetime(),
});
type ChatMessage = z.infer<typeof ChatMessageSchema>;

export {
  IncomingRequestSchema,
  type IncomingRequest,
  AgentResponseSchema,
  type AgentResponse,
  ContextPacketSchema,
  type ContextPacket,
  ClassifierOutputSchema,
  type ClassifierOutput,
  AgentConfigSchema,
  type AgentConfig,
  TurnEntrySchema,
  type TurnEntry,
  HealthStatusSchema,
  type HealthStatus,
  OrderInquirySchema,
  type OrderInquiry,
  ShipmentTrackingSchema,
  type ShipmentTracking,
  ReturnRequestSchema,
  type ReturnRequest,
  ChatMessageSchema,
  type ChatMessage,
};
Expected output: The @reaatech/agent-mesh schemas are re-exported alongside OrderInquiry, ShipmentTracking, ReturnRequest, and ChatMessage for use throughout the app.
Step 4: Create the vLLM client
Create src/lib/vllm-client.ts to wrap the OpenAI SDK and point it at your vLLM endpoint. This client handles timeout control, error classification, and both synchronous and streaming chat:
Expected output: The vllmClient singleton wraps the OpenAI SDK, pointed at your vLLM endpoint. chat() sends messages and returns the response text, while chatStream() yields tokens for streaming UIs. Custom error types (auth, rate limit, timeout, connection) make error handling specific.
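The point of distinct error types is that callers can react differently to each failure. As an illustration only, the sketch below maps each error class to an HTTP status and a retry hint that an API route might return. The error classes here are simplified stand-ins that reuse the names from this step, and `toErrorResponse`, `ErrorResponse`, and the specific status choices are assumptions rather than part of the recipe.

```typescript
// Simplified stand-ins for the error classes named in this step.
class VllmError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "VllmError";
    // Keeps instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class VllmAuthError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmAuthError";
  }
}
class VllmRateLimitError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmRateLimitError";
  }
}
class VllmTimeoutError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmTimeoutError";
  }
}
class VllmConnectionError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmConnectionError";
  }
}

// Hypothetical shape for what an API route reports back to the browser.
interface ErrorResponse {
  status: number;
  retryable: boolean;
  message: string;
}

// Map a thrown value to a status code and a retry hint (assumed policy).
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof VllmAuthError) {
    // Bad server-side credentials are not the end user's fault: report 502.
    return { status: 502, retryable: false, message: "Model backend rejected our credentials." };
  }
  if (err instanceof VllmRateLimitError) {
    return { status: 429, retryable: true, message: "Model backend is rate limiting requests." };
  }
  if (err instanceof VllmTimeoutError) {
    return { status: 504, retryable: true, message: "Model backend timed out." };
  }
  if (err instanceof VllmConnectionError) {
    return { status: 503, retryable: true, message: "Model backend is unreachable." };
  }
  return { status: 500, retryable: false, message: "Unexpected error." };
}
```

Checking the subclasses before falling through to the generic case matters, because every specific error is also an instance of `VllmError`.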
Step 5: Build session state management with Upstash Redis
Create src/lib/state.ts to manage multi-turn conversations using @reaatech/session-continuity backed by Upstash Redis:
ts
import {
  SessionManager,
  SlidingWindowStrategy,
  type TokenCounter,
  type IStorageAdapter,
  type Session,
  type SessionId,
  type Message,
  type MessageId,
  type UpdateSessionOptions,
  type SessionFilters,
  type MessageQueryOptions,
  type HealthStatus,
  SessionNotFoundError,
  ConcurrencyError,
} from "@reaatech/session-continuity";
import { Redis } from "@upstash/redis";

function dateReviver(_key: string, value: unknown): unknown {
  if (typeof value === "string" && /^\d{4}-\d{2}-\d{2}T\d{2}:\d
Expected output: The sessionManager singleton stores sessions and messages in Upstash Redis with TTL-based expiry, token budget enforcement using a sliding window strategy, and optimistic concurrency via version checks.
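To make the sliding window idea concrete, here is a dependency-free sketch of token budget enforcement: keep the newest messages that fit within the budget and drop the oldest. It does not use `SlidingWindowStrategy` from `@reaatech/session-continuity`, whose internals are not shown in this tutorial. The four-characters-per-token estimate and the names `estimateTokens` and `slidingWindow` are assumptions for illustration.

```typescript
interface WindowMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Rough heuristic (an assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages whose combined estimate fits the budget.
function slidingWindow(messages: readonly WindowMessage[], maxTokens: number): WindowMessage[] {
  const kept: WindowMessage[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokens) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}
```

Stopping at the first message that does not fit, instead of skipping it and continuing, keeps the retained history contiguous, so the model never sees a conversation with a hole in the middle.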
Step 6: Implement cost telemetry
Create src/lib/cost-telemetry.ts to track API spend per call using @reaatech/llm-cost-telemetry:
Expected output: costTelemetry records every LLM call with provider, model, token counts, and calculated cost. getDailyCost() and getMonthlyCost() provide windowed totals, and checkBudget() returns a BudgetStatus object with threshold comparisons.
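The bookkeeping behind per-call cost tracking can be sketched without the `@reaatech/llm-cost-telemetry` package. The in-memory tracker below is hypothetical: the `Pricing` rates, the `BudgetStatus` fields, and the class name are assumptions, and the real package may define them differently. It shows how a call's cost is derived from token counts and how a daily total is compared against `BUDGET_DAILY_LIMIT`.

```typescript
interface CallRecord {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  timestamp: Date;
}

// Assumed pricing model: USD per 1,000 input and output tokens.
interface Pricing {
  inputPer1k: number;
  outputPer1k: number;
}

// Assumed shape; the real BudgetStatus type may differ.
interface BudgetStatus {
  limit: number;
  spent: number;
  remaining: number;
  exceeded: boolean;
}

class InMemoryCostTracker {
  private readonly records: CallRecord[] = [];
  private readonly pricing: Pricing;
  private readonly dailyLimit: number;

  constructor(pricing: Pricing, dailyLimit: number) {
    this.pricing = pricing;
    this.dailyLimit = dailyLimit;
  }

  // Store one call and return the record with its computed cost.
  record(provider: string, model: string, inputTokens: number, outputTokens: number, timestamp: Date = new Date()): CallRecord {
    const costUsd =
      (inputTokens / 1000) * this.pricing.inputPer1k +
      (outputTokens / 1000) * this.pricing.outputPer1k;
    const entry: CallRecord = { provider, model, inputTokens, outputTokens, costUsd, timestamp };
    this.records.push(entry);
    return entry;
  }

  // Sum the cost of all calls made on the same UTC calendar day as `now`.
  getDailyCost(now: Date = new Date()): number {
    const day = now.toISOString().slice(0, 10);
    return this.records
      .filter((r) => r.timestamp.toISOString().slice(0, 10) === day)
      .reduce((sum, r) => sum + r.costUsd, 0);
  }

  // Compare today's spend with the configured daily limit.
  checkBudget(now: Date = new Date()): BudgetStatus {
    const spent = this.getDailyCost(now);
    return {
      limit: this.dailyLimit,
      spent,
      remaining: Math.max(0, this.dailyLimit - spent),
      exceeded: spent > this.dailyLimit,
    };
  }
}
```

For a self-hosted vLLM endpoint there is no per-token invoice, so the rates would typically be an internal estimate of GPU cost per token rather than a provider price list.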
Step 7: Create the agent registry
Create src/services/agent-registry.ts to manage specialist agents via @reaatech/agent-mesh-registry:
Expected output: The agentRegistry loads agent definitions from YAML files in AGENT_REGISTRY_DIR, supports hot-reload via SIGHUP, and exposes getAgent(), getDefaultAgent(), and getAllAgentIds().
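The lookup surface described here is small enough to sketch in memory. The class below is a hypothetical stand-in, not the `@reaatech/agent-mesh-registry` API: `AgentDefinition` and `load()` are assumed names, and parsing the YAML files is left out. It illustrates one property worth having in any hot-reloadable registry, namely that a failed reload leaves the previous set of agents in place.

```typescript
// Assumed minimal agent definition; real YAML files likely carry more fields.
interface AgentDefinition {
  id: string;
  description: string;
  isDefault?: boolean;
}

class InMemoryAgentRegistry {
  private agents = new Map<string, AgentDefinition>();
  private defaultId: string | undefined;

  // Validate the whole batch first, then swap it in, so a bad reload changes nothing.
  load(definitions: readonly AgentDefinition[]): void {
    const next = new Map<string, AgentDefinition>();
    let nextDefault: string | undefined;
    for (const def of definitions) {
      if (next.has(def.id)) {
        throw new Error(`Duplicate agent id: ${def.id}`);
      }
      next.set(def.id, def);
      if (def.isDefault) {
        if (nextDefault !== undefined) {
          throw new Error("Only one agent may be marked as default");
        }
        nextDefault = def.id;
      }
    }
    if (nextDefault === undefined) {
      throw new Error("Exactly one agent must be marked as default");
    }
    this.agents = next;
    this.defaultId = nextDefault;
  }

  getAgent(id: string): AgentDefinition | undefined {
    return this.agents.get(id);
  }

  getDefaultAgent(): AgentDefinition {
    const agent = this.defaultId === undefined ? undefined : this.agents.get(this.defaultId);
    if (agent === undefined) {
      throw new Error("Registry has not been loaded");
    }
    return agent;
  }

  getAllAgentIds(): string[] {
    return Array.from(this.agents.keys());
  }
}
```

A signal handler or an admin endpoint would call `load()` again with freshly parsed definitions to perform the reload.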
Step 8: Build the handoff manager
Create src/services/handoff-manager.ts to transfer conversation context between agents with retry logic:
Expected output: executeHandoff() transfers a session from one agent to another, retrying up to 3 times with exponential backoff. On SessionNotFoundError it fails immediately; on ConcurrencyError it retries once with linear backoff. Lifecycle events let you observe handoffs for logging.
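The retry rules described above can be written as a single decision function that takes the kind of failure and the number of retries already made, and returns how long to wait or whether to give up. This is a sketch of that policy only, not the handoff manager itself. The `FailureKind` labels, the function name, and the 200 ms base delay are assumptions.

```typescript
// Hypothetical labels for the three failure cases described in this step.
type FailureKind = "session_not_found" | "concurrency" | "transient";

// Returns the delay in milliseconds before the next attempt, or null to stop retrying.
function nextRetryDelayMs(kind: FailureKind, retriesSoFar: number, baseMs: number = 200): number | null {
  switch (kind) {
    case "session_not_found":
      // Retrying cannot make a missing session appear: fail immediately.
      return null;
    case "concurrency":
      // One retry with linear backoff: baseMs * (retry number).
      return retriesSoFar < 1 ? baseMs * (retriesSoFar + 1) : null;
    case "transient":
      // Up to three retries with exponential backoff: baseMs * 2^retriesSoFar.
      return retriesSoFar < 3 ? baseMs * 2 ** retriesSoFar : null;
  }
}
```

With the assumed base of 200 ms, a transient failure waits 200, 400, and then 800 ms before the policy gives up. Keeping the decision in a pure function makes it straightforward to unit test separately from the code that actually sleeps and retries.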
Step 9: Build the core agent router
Create src/services/agent-router.ts — the central orchestration service that classifies intents via vLLM, routes to the right agent, dispatches with retry, repairs structured outputs, and records costs:
ts
import {
  dispatchToAgent,
  buildTurnEntry,
  formatAgentResponse,
  shouldCloseSession,
  getUpdatedWorkflowState,
  mcpClientFactory,
} from "@reaatech/agent-mesh-router";
import { repair, repairOutput } from "@reaatech/structured-repair-core";
void repair;
import {
  createHandoffConfig,
  TypedEventEmitter,
  withRetry,
  pickDefined,
  HandoffError,
} from "@reaatech/agent-handoff";
import {
  AgentResponseSchema,
  type AgentResponse,
  type AgentConfig,
  type ClassifierOutput,
  type TurnEntry,
} from "@reaatech/agent-mesh";
import { vllmClient, getVllmModel } from "../lib/vllm-client.js";
import { agentRegistry } from "./agent-registry.js";
import { costTelemetry } from "../lib/cost-telemetry.js";
import { sessionManager } from "../lib/state.js";
Expected output: handleTurn() is the main entry point. It loads conversation history, classifies the intent via vLLM, routes to the best-matching agent (falling back to the default on low confidence), dispatches with retry, repairs the output with @reaatech/structured-repair-core, records cost, and returns the formatted result.
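The fallback rule in the routing step, where the router uses the classified agent only when the classifier is confident and otherwise uses the default, can be isolated into a few lines. The sketch below is illustrative: `Classification` is a simplified stand-in for `ClassifierOutput`, and `selectAgent`, the `reason` labels, and the 0.7 threshold are assumptions rather than values taken from the recipe.

```typescript
// Simplified stand-in for the classifier's output.
interface Classification {
  agentId: string;
  confidence: number; // expected range: 0 to 1
}

type RoutingReason = "classified" | "low_confidence" | "unknown_agent";

interface RoutingDecision {
  agentId: string;
  reason: RoutingReason;
}

// Choose the agent for this turn, falling back to the default when unsure.
function selectAgent(
  classification: Classification,
  registeredIds: ReadonlySet<string>,
  defaultAgentId: string,
  minConfidence: number = 0.7, // assumed threshold
): RoutingDecision {
  // A model can name an agent that does not exist; never dispatch to it.
  if (!registeredIds.has(classification.agentId)) {
    return { agentId: defaultAgentId, reason: "unknown_agent" };
  }
  if (classification.confidence < minConfidence) {
    return { agentId: defaultAgentId, reason: "low_confidence" };
  }
  return { agentId: classification.agentId, reason: "classified" };
}
```

Returning the reason alongside the agent id gives logs and metrics a way to distinguish a deliberate route from a fallback, which helps when tuning the threshold.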
Step 10: Wire the Chat API route
Create app/api/chat/route.ts as the Next.js App Router endpoint that accepts chat messages and returns agent responses:
Expected output: POST /api/admin triggers an agent registry reload from disk. GET /api/admin returns system status including registered agents, daily/monthly spend, and budget status.
Step 12: Create the Zustand store and chat component
Create src/lib/store.ts for the Zustand-powered chat state:
Expected output: The Zustand store manages messages, loading state, and errors. The Chat component renders a message list and input form. When the user sends a message, it posts to /api/chat and displays the agent’s response.
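The state changes the store goes through during one round trip (send, reply, or failure) can be expressed as pure functions, independent of Zustand. The sketch below is not the tutorial's store: `ChatState`, the transition names, and the field names `isLoading` and `error` are assumptions chosen to match the description above.

```typescript
interface StoreMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
  timestamp: string; // ISO 8601
}

// Assumed store shape: messages, loading flag, and last error.
interface ChatState {
  messages: StoreMessage[];
  isLoading: boolean;
  error: string | null;
}

const initialChatState: ChatState = { messages: [], isLoading: false, error: null };

// The user pressed send: show their message at once and mark the request in flight.
function sendStarted(state: ChatState, userMessage: StoreMessage): ChatState {
  return { messages: [...state.messages, userMessage], isLoading: true, error: null };
}

// The API responded: append the agent's reply and clear the loading flag.
function replyReceived(state: ChatState, assistantMessage: StoreMessage): ChatState {
  return { messages: [...state.messages, assistantMessage], isLoading: false, error: null };
}

// The request failed: keep the history, stop loading, and surface the reason.
function requestFailed(state: ChatState, reason: string): ChatState {
  return { ...state, isLoading: false, error: reason };
}
```

In a Zustand store, each transition would typically be applied through the updater form of `set`, for example `set((state) => sendStarted(state, message))`. Every function returns a new object instead of mutating its input, which is what lets subscribed components detect the change.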
Step 13: Wire up the home page and layout
Replace app/layout.tsx with the root layout applying Geist fonts:
tsx
import type { Metadata } from "next";
import { Geist, Geist_Mono } from "next/font/google";
import "./globals.css";

const geistSans = Geist({
  variable: "--font-geist-sans",
  subsets: ["latin"],
});

const geistMono = Geist_Mono({
  variable: "--font-geist-mono",
  subsets: ["latin"],
});

export const metadata: Metadata = {
  title: "vLLM Agent Mesh — E‑commerce Order Management",
  description: "Multi-agent order management system powered by vLLM and agent-mesh orchestrator",
};

export default function RootLayout({
  children,
}: Readonly<{
  children: React.ReactNode;
}>) {
  return (
    <html lang="en" className={`${geistSans.variable} ${geistMono.variable}`}>
      <body>{children}</body>
    </html>
  );
}
Replace app/page.tsx with the home page that mounts the Chat component:
Expected output: The layout applies Geist fonts and page metadata. The home page renders a header bar with the app title and a main area containing the interactive chat component.
Step 14: Create the barrel export file
Create src/index.ts as a single entry point that re-exports every public module:
ts
export const SCAFFOLD_VERSION = "0.1.0" as const;

export {
  IncomingRequestSchema,
  type IncomingRequest,
  AgentResponseSchema,
  type AgentResponse,
  ContextPacketSchema,
  type ContextPacket,
  ClassifierOutputSchema,
  type ClassifierOutput,
  AgentConfigSchema,
  type AgentConfig,
  TurnEntrySchema,
  type TurnEntry,
  HealthStatusSchema,
  type HealthStatus,
  OrderInquirySchema,
  type OrderInquiry,
  ShipmentTrackingSchema,
  type ShipmentTracking,
  ReturnRequestSchema,
  type ReturnRequest,
  ChatMessageSchema,
  type ChatMessage,
} from "./types/index.js";

export {
  VllmClient,
  vllmClient,
  getVllmModel,
  VllmError,
  VllmAuthError,
  VllmRateLimitError,
  VllmTimeoutError,
  VllmConnectionError,
} from "./lib/vllm-client.js";

export { AgentRegistryService, agentRegistry } from "./services/agent-registry.js";

export { CostTelemetryService, costTelemetry } from "./lib/cost-telemetry.js";

export {
  SimpleTokenCounter,
  UpstashRedisAdapter,
  SessionStateManager,
  sessionManager,
} from "./lib/state.js";

export { HandoffManager, handoffManager } from "./services/handoff-manager.js";

export { AgentRouterService, agentRouter } from "./services/agent-router.js";

export { useChatStore } from "./lib/store.js";
Expected output: All public modules are accessible from a single import path.
Step 15: Configure Vitest and the test setup
Create vitest.config.ts with 90% coverage thresholds on runtime code:
Expected output: Vitest is configured with 90% coverage thresholds on src/**/*.ts and app/**/route.ts. React components and config files are excluded. The setup file primes all required environment variables for the REAA packages before any module is imported.
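A configuration matching that description might look like the sketch below. It is a reconstruction from the description, not the recipe's actual file: the setup file path `./tests/setup.ts`, the `node` environment, the `v8` provider (which requires the `@vitest/coverage-v8` package), and the exact exclude globs are assumptions.

```typescript
// vitest.config.ts (illustrative sketch; paths and globs are assumptions)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "node",
    // Runs before test files are imported, so env vars are set in time.
    setupFiles: ["./tests/setup.ts"],
    coverage: {
      provider: "v8",
      // Measure runtime code only: library modules and API route handlers.
      include: ["src/**/*.ts", "app/**/route.ts"],
      // Leave React components and config files out of the totals.
      exclude: ["**/*.tsx", "**/*.config.*"],
      // The run fails if any metric drops below 90%.
      thresholds: {
        lines: 90,
        branches: 90,
        functions: 90,
        statements: 90,
      },
    },
  },
});
```

Priming environment variables in a setup file matters here because several modules in this recipe create singletons at import time and would otherwise read unset variables.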
Step 16: Run the tests
Run the full test suite with coverage:
terminal
pnpm vitest run --coverage
Expected output: All tests pass with at least 90% line, branch, function, and statement coverage.
Start the development server (pnpm dev in a standard Next.js scaffold), then open http://localhost:3000 in your browser and type “Where is my order?” into the chat input. The multi-agent mesh will classify the intent, route to the order-status specialist, and return a response.
Next steps
Add specialist agent definitions — Create YAML files in AGENT_REGISTRY_DIR for order-status, shipping, and returns agents, then hot-reload with POST /api/admin.
Add streaming responses — Wire VllmClient.chatStream() into a server-sent events endpoint for token-by-token UI updates.
Build a cost dashboard — Expose a /api/admin/dashboard endpoint using the costTelemetry service for daily and monthly spend charts.
Extend context compression — Switch to the summarization or hybrid compression strategies from @reaatech/session-continuity for longer conversation histories.
Add failure monitoring — Subscribe to AgentRouterService.events to detect recurring routing failures and alert operators.
;
}
}
class VllmTimeoutError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmTimeoutError";
  }
}

class VllmConnectionError extends VllmError {
  constructor(message: string) {
    super(message);
    this.name = "VllmConnectionError";
  }
}

interface ChatCompletionMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

interface ChatOptions {
  signal?: AbortSignal;
}

function classifyError(cause: unknown): VllmError {
  if (cause instanceof Error) {
    if ("status" in cause && typeof (cause as Record<string, unknown>).status === "number") {
      const status = (cause as Record<string, number>).status;
      if (status === 401) {
        return new VllmAuthError(`Authentication failed: ${cause.message}`);
      }
      if (status === 429) {
        return new VllmRateLimitError(`Rate limit exceeded: ${cause.message}`);
      }
    }
    if (cause.name === "APIConnectionTimeoutError" || cause.name === "AbortError") {
      return new VllmTimeoutError(`Request timed out: ${cause.message}`);
    }
    if (cause.name === "APIConnectionError") {
      return new VllmConnectionError(`Connection failed: ${cause.message}`);
    }
  }
  return new VllmError(cause instanceof Error ? cause.message : String(cause));