# OpenRouter Budget Guardrails for BigCommerce SMB Customer Support
 
> A cost-control layer that enforces budget limits, routes to cheaper models when needed, and tracks spending for BigCommerce support chatbots.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
 
## Architecture
 
This recipe wraps an existing chatbot's LLM calls with a four-module cost governance layer:
 
1. **Budget Middleware** (`src/lib/budgetMiddleware.ts`) — Uses `@reaatech/agent-budget-engine` with a `BudgetController` to enforce per-tenant spend caps. Implements a state machine: Active → Warned → Degraded → Stopped, with auto-downgrade to cheaper models when thresholds are breached.
 
2. **Cost Tracker** (`src/lib/costTracker.ts`) — Uses `@reaatech/llm-cost-telemetry` to record token usage and cost per LLM call, logging telemetry to Helicone for observability.
 
3. **Model Router** (`src/lib/modelRouter.ts`) — Uses `@reaatech/llm-router-engine` to dynamically select the optimal model from OpenRouter's offerings based on cost, capability, and remaining budget. Falls back to cheaper models when budgets tighten.
 
4. **Semantic Cache** (`src/lib/cache.ts`) — Uses `@reaatech/llm-cache` with an `InMemoryAdapter` and `OpenAIEmbedder` to store answers to frequently asked product questions. Incoming prompts are matched against stored entries by cosine similarity, so a close match returns the cached answer without an LLM call.
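
To make the cosine-similarity matching in the Semantic Cache concrete, here is a minimal dependency-free sketch. It is not the `@reaatech/llm-cache` API: `createSemanticCache`, the `0.9` threshold, and the precomputed embeddings are all assumptions for illustration. In the real module the embeddings would come from the embedder rather than being passed in by hand.

```typescript
// Hypothetical stand-in for a semantic cache; names and threshold are illustrative only.
type CacheEntry = { embedding: number[]; answer: string };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function createSemanticCache(threshold: number) {
  const entries: CacheEntry[] = [];
  return {
    // Store an answer under the embedding of the question that produced it.
    set(embedding: number[], answer: string): void {
      entries.push({ embedding, answer });
    },
    // Return the best-matching answer, or undefined if nothing clears the threshold.
    get(queryEmbedding: number[]): string | undefined {
      let best: CacheEntry | undefined;
      let bestScore = -Infinity;
      for (const entry of entries) {
        const score = cosineSimilarity(queryEmbedding, entry.embedding);
        if (score > bestScore) {
          bestScore = score;
          best = entry;
        }
      }
      return best !== undefined && bestScore >= threshold ? best.answer : undefined;
    },
  };
}
```

A linear scan like this is fine for a small FAQ set. A larger store would likely want an approximate nearest-neighbour index instead.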
 
All four modules are orchestrated by a Next.js App Router API route: `app/api/chat/route.ts`.
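
As a rough illustration of how a route handler could chain the four modules, the sketch below wires them together through an injected dependency object. Everything here is an assumption: the `GuardrailDeps` interface, the `handleChat` function, the response shape, and the ordering (cache lookup first, then budget check) are hypothetical and may differ from what `app/api/chat/route.ts` actually does.

```typescript
// Hypothetical orchestration sketch; not the actual route handler.
type BudgetState = "Active" | "Warned" | "Degraded" | "Stopped";
type ChatRequest = { prompt: string; customerId: string; tenantId: string };
type ChatResult =
  | { status: 200; answer: string; cached: boolean; model?: string }
  | { status: 429; error: string };

// Each member stands in for one of the four modules (all names are illustrative).
interface GuardrailDeps {
  cacheGet(prompt: string): Promise<string | undefined>;
  cacheSet(prompt: string, answer: string): Promise<void>;
  budgetState(tenantId: string): Promise<BudgetState>;
  pickModel(state: BudgetState): string;
  callModel(model: string, prompt: string): Promise<{ answer: string; costUsd: number }>;
  recordCost(tenantId: string, model: string, costUsd: number): Promise<void>;
}

async function handleChat(req: ChatRequest, deps: GuardrailDeps): Promise<ChatResult> {
  // 1. Semantic cache: a hit costs nothing, so try it first (assumed ordering).
  const cached = await deps.cacheGet(req.prompt);
  if (cached !== undefined) {
    return { status: 200, answer: cached, cached: true };
  }

  // 2. Budget middleware: refuse when the tenant is in the Stopped state.
  const state = await deps.budgetState(req.tenantId);
  if (state === "Stopped") {
    return { status: 429, error: "Budget exhausted for this tenant" };
  }

  // 3. Model router: choose a model that fits the current budget state.
  const model = deps.pickModel(state);
  const { answer, costUsd } = await deps.callModel(model, req.prompt);

  // 4. Cost tracker: record spend, then cache the answer for future requests.
  await deps.recordCost(req.tenantId, model, costUsd);
  await deps.cacheSet(req.prompt, answer);
  return { status: 200, answer, cached: false, model };
}
```

Injecting the modules as an interface keeps the handler easy to unit-test with stubs, which is one plausible way to reach a high coverage threshold without calling a real LLM.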
 
## Quick Start
 
```bash
pnpm install
pnpm dev              # starts Next.js at http://localhost:3000
```
 
Send a chat request:
 
```bash
curl -X POST http://localhost:3000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"prompt":"How do I track my order?","customerId":"cust-1","tenantId":"store-1"}'
```
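
The same request can be issued from TypeScript. The sketch below only builds the request. `buildChatRequest` is a hypothetical helper, and the default base URL simply mirrors the `curl` example above. The response shape is not documented here, so the sketch makes no assumption about it.

```typescript
// Hypothetical client helper mirroring the curl example; not part of the repository.
type ChatRequestBody = { prompt: string; customerId: string; tenantId: string };

function buildChatRequest(body: ChatRequestBody, baseUrl = "http://localhost:3000") {
  return {
    url: `${baseUrl}/api/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage, in a runtime with a global fetch (for example Node 18+ or a browser):
//   const { url, init } = buildChatRequest({
//     prompt: "How do I track my order?",
//     customerId: "cust-1",
//     tenantId: "store-1",
//   });
//   const res = await fetch(url, init);
//   console.log(res.status, await res.text());
```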
 
## Environment Variables
 
| Variable | Description | Default |
|---|---|---|
| `OPENROUTER_API_KEY` | OpenRouter API key (from https://openrouter.ai/keys) | — |
| `HELICONE_API_KEY` | Helicone API key for cost telemetry | — |
| `DEFAULT_DAILY_BUDGET` | Per-tenant daily budget in USD | 10.0 |
| `DEFAULT_MONTHLY_BUDGET` | Per-tenant monthly budget in USD | 100.0 |
| `DEFAULT_MODEL` | Default model ID for OpenRouter | openai/gpt-5.2-mini |
| `OTEL_SERVICE_NAME` | OpenTelemetry service name | openrouter-budget-guardrails |
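
For illustration, the table above could be turned into a typed config object roughly as follows. This is a sketch, not the contents of `src/config.ts`: `loadConfig`, `GuardrailsConfig`, and the helper names are hypothetical, and treating the two API keys as required is an assumption based on their lack of a default.

```typescript
// Hypothetical config loader; the real src/config.ts may differ.
type GuardrailsConfig = {
  openRouterApiKey: string;
  heliconeApiKey: string;
  defaultDailyBudget: number;   // USD
  defaultMonthlyBudget: number; // USD
  defaultModel: string;
  otelServiceName: string;
};

type Env = Record<string, string | undefined>;

// Assumption: variables without a default in the table are mandatory.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

// Parse a USD amount, falling back to the documented default when unset.
function parseUsd(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0) {
    throw new Error(`${name} must be a non-negative number, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): GuardrailsConfig {
  return {
    openRouterApiKey: requireVar(env, "OPENROUTER_API_KEY"),
    heliconeApiKey: requireVar(env, "HELICONE_API_KEY"),
    defaultDailyBudget: parseUsd(env, "DEFAULT_DAILY_BUDGET", 10.0),
    defaultMonthlyBudget: parseUsd(env, "DEFAULT_MONTHLY_BUDGET", 100.0),
    defaultModel: env["DEFAULT_MODEL"] || "openai/gpt-5.2-mini",
    otelServiceName: env["OTEL_SERVICE_NAME"] || "openrouter-budget-guardrails",
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`, so that a missing key fails fast instead of surfacing on the first chat request.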
 
## Cost Governance
 
The budget lifecycle follows four states:
 
```
Active ──(soft cap 80%)──▶ Warned ──(auto-downgrade)──▶ Degraded ──(hard cap)──▶ Stopped
```
 
- **Active**: Full model selection with no restrictions.
- **Warned**: Budget usage has reached 80% (the soft cap). The routing strategy switches to `cost-optimized`, which picks the cheapest capable model.
- **Degraded**: Budget usage has reached 100%. Every request is forced onto the cheapest workhorse model.
- **Stopped**: The hard cap has been reached. All requests return HTTP 429.
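
The four states can be derived from a usage ratio, as in the sketch below. It is not the `BudgetController` implementation from `@reaatech/agent-budget-engine`. The function and type names are hypothetical, and `hardCapRatio` in particular is an assumed parameter, since the value of the hard cap is not specified here.

```typescript
// Hypothetical threshold-based state derivation; not the BudgetController API.
type BudgetState = "Active" | "Warned" | "Degraded" | "Stopped";

type Thresholds = {
  softCapRatio: number; // 0.8 per the lifecycle above
  degradeRatio: number; // 1.0 per the lifecycle above
  hardCapRatio: number; // assumption: caller-supplied, value not documented
};

function budgetState(spentUsd: number, budgetUsd: number, t: Thresholds): BudgetState {
  // A non-positive budget leaves nothing to spend, so treat it as exhausted.
  if (budgetUsd <= 0) return "Stopped";
  const usage = spentUsd / budgetUsd;
  // Check the strictest threshold first so the most restrictive state wins.
  if (usage >= t.hardCapRatio) return "Stopped";
  if (usage >= t.degradeRatio) return "Degraded";
  if (usage >= t.softCapRatio) return "Warned";
  return "Active";
}

// Only the Stopped state rejects requests; the others still serve traffic.
function httpStatusFor(state: BudgetState): number {
  return state === "Stopped" ? 429 : 200;
}

// Example with an illustrative hard cap of 120% of the budget.
const exampleThresholds: Thresholds = { softCapRatio: 0.8, degradeRatio: 1.0, hardCapRatio: 1.2 };
const exampleState = budgetState(9, 10, exampleThresholds); // 90% usage
console.log(exampleState, httpStatusFor(exampleState));
```

Deriving the state from current spend on every request keeps this sketch stateless. A controller that must never move back from Degraded to Warned within a budget period would need to persist the last state as well.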
 
## Running Tests
 
```bash
pnpm typecheck        # TypeScript type checking
pnpm lint             # ESLint
pnpm test             # vitest run with coverage (≥90% threshold)
```
 
## Project Layout
 
```
app/api/chat/route.ts       Next.js API route handler
src/
  config.ts                 Env-var loading via @reaatech/llm-cost-telemetry
  index.ts                  Programmatic entry point (OpenRouterGuardrails)
  instrumentation.ts        Next.js instrumentation hook (register)
  lib/
    types.ts                Shared Zod schemas and TypeScript interfaces
    budgetMiddleware.ts     BudgetController wrapper
    costTracker.ts          CostSpan telemetry + Helicone logging
    modelRouter.ts          LLMRouter + OpenRouter integration
    cache.ts                CacheEngine + semantic cache
tests/                      vitest suite (48 tests, ≥90% coverage)
```
 
## Packages Used
 
- `@reaatech/agent-budget-engine` — Budget enforcement and state machine
- `@reaatech/agent-budget-spend-tracker` — Spend tracking data layer
- `@reaatech/agent-budget-types` — Budget types and error classes
- `@reaatech/llm-cost-telemetry` — Cost telemetry types and utilities
- `@reaatech/llm-router-engine` — Config-driven model router
- `@reaatech/llm-cache` — Exact-match and semantic cache
- `@helicone/async` — Helicone async telemetry (OpenTelemetry auto-instrumentation)
- `openai` — OpenAI SDK (pointed at OpenRouter)
- `zod` — Runtime validation
 
## License
 
MIT — see [LICENSE](./LICENSE).