# OpenRouter Cost Control for SMB API Spend Management
> Enforce daily budgets, track per‑model spend, and automatically downgrade to cheaper fallback models when your SMB API budget is at risk.
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build a cost-aware LLM proxy using the `@reaatech/*` package family.
## Problem
Small businesses using OpenRouter often see unpredictable LLM bills because one expensive model call can blow their monthly budget. Without granular cost tracking and automatic throttling, spend control is reactive at best.
## Architecture
```
┌─────────────┐     ┌────────────────┐     ┌──────────────┐
│ Client App  │────▶│  ProxyService  │────▶│  OpenRouter  │
└─────────────┘     └────────────────┘     └──────────────┘
                            │
                      ┌─────┴──────┐
                      │ BudgetCtrl │
                      │ (reaatech) │
                      └─────┬──────┘
                            │
                      ┌─────┴──────┐
                      │ Cost Sink  │
                      │ (exporters)│
                      └────────────┘
```
## Features
- **Budget enforcement** — per-tenant daily/monthly caps with soft/hard policy
- **Automatic downgrade** — falls back to cheaper models when budget tightens
- **State machine** — Active → Warned → Degraded → Stopped per scope
- **Cost telemetry** — every API call recorded as a CostSpan
- **Fallback chains** — ordered model lists with circuit breakers
- **Cost export** — aggregated spend pushed to CloudWatch or Loki/Phoenix
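As a rough illustration of the cost telemetry and per-model tracking listed above, the sketch below records one span per call and sums spend by model. The `CostSpan` shape, the `priceTable` entries, and the helper names are assumptions made for this example. They are not the types exported by `@reaatech/llm-cost-telemetry`, and the prices and model IDs are placeholders rather than real OpenRouter rates.

```typescript
// Hypothetical span shape for illustration only; the real CostSpan type
// lives in @reaatech/llm-cost-telemetry and may differ.
interface CostSpan {
  tenantId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Placeholder prices in USD per million tokens (not real OpenRouter rates).
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

const priceTable: Record<string, ModelPrice> = {
  "example/large-model": { inputPerMTok: 2, outputPerMTok: 8 },
  "example/small-model": { inputPerMTok: 0.5, outputPerMTok: 1 },
};

// Build one span for a finished call, pricing input and output tokens separately.
function createCostSpan(
  tenantId: string,
  model: string,
  inputTokens: number,
  outputTokens: number,
): CostSpan {
  const price = priceTable[model];
  if (!price) throw new Error(`No price configured for model ${model}`);
  const costUsd =
    (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
  return { tenantId, model, inputTokens, outputTokens, costUsd };
}

// Sum recorded cost per model, the kind of aggregate an exporter could push.
function spendByModel(spans: CostSpan[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const span of spans) {
    totals.set(span.model, (totals.get(span.model) ?? 0) + span.costUsd);
  }
  return totals;
}
```

Keeping the price table separate from span creation means a rate change only touches one place, and the aggregation step never needs to know how a cost was computed.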
## Prerequisites
- Node.js >= 22
- pnpm 10.x
- An [OpenRouter API key](https://openrouter.ai/keys)
## Setup
```bash
pnpm install
pnpm dev
```
## Environment Variables
See `.env.example` for the full list. Key variables:
| Variable | Description |
|----------|-------------|
| `OPENROUTER_API_KEY` | OpenRouter API key (required) |
| `DEFAULT_DAILY_BUDGET` | Default daily budget per tenant in USD |
| `PRIMARY_MODEL` | Default model (e.g. `openai/gpt-5.2`) |
| `FALLBACK_MODEL_CHAIN` | Comma-separated fallback model IDs |
| `TENANT_BUDGETS` | Per-tenant budget overrides as JSON |
| `LOKI_HOST` | Loki/Phoenix push endpoint for cost export |
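The variables above could be parsed along the following lines. This is a minimal sketch, not the contents of `src/lib/config.ts`: the `ProxyConfig` shape and `loadConfig` name are hypothetical, treating `PRIMARY_MODEL` and `DEFAULT_DAILY_BUDGET` as required is a choice made for the example, and `TENANT_BUDGETS` is assumed to be a flat JSON object mapping tenant IDs to daily budgets in USD.

```typescript
// Hypothetical config shape; field names are chosen for this example.
interface ProxyConfig {
  apiKey: string;
  defaultDailyBudget: number;
  primaryModel: string;
  fallbackModels: string[];
  tenantBudgets: Record<string, number>;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ProxyConfig {
  const apiKey = env["OPENROUTER_API_KEY"];
  if (!apiKey) throw new Error("OPENROUTER_API_KEY is required");

  // Assumption: this sketch treats PRIMARY_MODEL as required.
  const primaryModel = env["PRIMARY_MODEL"];
  if (!primaryModel) throw new Error("PRIMARY_MODEL is required");

  const defaultDailyBudget = Number(env["DEFAULT_DAILY_BUDGET"]);
  if (!Number.isFinite(defaultDailyBudget) || defaultDailyBudget <= 0) {
    throw new Error("DEFAULT_DAILY_BUDGET must be a positive number");
  }

  // Comma-separated list; blank entries and surrounding whitespace are dropped.
  const fallbackModels = (env["FALLBACK_MODEL_CHAIN"] ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);

  // Assumed format: {"tenant-id": <daily budget in USD>, ...}
  const tenantBudgets: Record<string, number> = {};
  const rawBudgets = env["TENANT_BUDGETS"];
  if (rawBudgets) {
    const parsed: unknown = JSON.parse(rawBudgets);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("TENANT_BUDGETS must be a JSON object");
    }
    for (const [tenant, value] of Object.entries(parsed as Record<string, unknown>)) {
      if (typeof value !== "number" || !Number.isFinite(value) || value < 0) {
        throw new Error(`Invalid budget for tenant ${tenant}`);
      }
      tenantBudgets[tenant] = value;
    }
  }

  return { apiKey, defaultDailyBudget, primaryModel, fallbackModels, tenantBudgets };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Failing at startup on a malformed budget is deliberate in this sketch, since a silently ignored override would leave a tenant on the wrong cap.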
## API Endpoints
### `POST /api/v1/chat/completions`
OpenAI-compatible chat completion endpoint with budget enforcement.
Request body:
```json
{
"model": "openai/gpt-5.2",
"messages": [{ "role": "user", "content": "Hello!" }]
}
```
Headers:
- `X-Tenant-Id` — tenant identifier for budget tracking (optional, defaults to "default")
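A client call to this endpoint might be assembled as in the sketch below. The `buildChatRequest` helper and the `ProxyRequest` type are hypothetical names introduced for illustration, and the base URL is whatever address the proxy is served from.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical container for a URL plus fetch-compatible options.
interface ProxyRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  tenantId?: string,
): ProxyRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Omitting the header lets the proxy fall back to the "default" tenant.
  if (tenantId) headers["X-Tenant-Id"] = tenantId;

  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/chat/completions`,
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Building the request separately from sending it keeps the tenant header logic easy to unit test without a running server.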
### `GET /api/health`
Returns `{ "status": "ok", "timestamp": "..." }`.
### `GET /api/usage/:tenant`
Returns current budget status for a tenant:
```json
{
"withinBudget": true,
"dailyPercentage": 45,
"monthlyPercentage": 12
}
```
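One plausible way to derive a response of this shape from raw spend figures is sketched below. The `toUsageStatus` helper, its parameters, and the rounding to whole percentages are assumptions for illustration; the route handler may compute these fields differently.

```typescript
interface UsageStatus {
  withinBudget: boolean;
  dailyPercentage: number;
  monthlyPercentage: number;
}

// Hypothetical helper: turns spend and caps (all in USD) into the status payload.
function toUsageStatus(
  dailySpentUsd: number,
  dailyCapUsd: number,
  monthlySpentUsd: number,
  monthlyCapUsd: number,
): UsageStatus {
  // Assumption: a non-positive cap counts as fully used.
  const percentage = (spent: number, cap: number): number =>
    cap > 0 ? Math.round((spent / cap) * 100) : 100;

  // Within budget only while both caps still have headroom.
  const withinBudget =
    dailyCapUsd > 0 &&
    monthlyCapUsd > 0 &&
    dailySpentUsd < dailyCapUsd &&
    monthlySpentUsd < monthlyCapUsd;

  return {
    withinBudget,
    dailyPercentage: percentage(dailySpentUsd, dailyCapUsd),
    monthlyPercentage: percentage(monthlySpentUsd, monthlyCapUsd),
  };
}
```

The `withinBudget` flag is computed from the raw amounts rather than the rounded percentages, so a tenant at 99.6% of a cap is still reported as within budget even though the percentage displays as 100.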
## Budget Configuration
Budgets are configured via environment variables. Each tenant has:
- **Daily cap** — hard limit in USD (`DEFAULT_DAILY_BUDGET` or `TENANT_BUDGETS`)
- **Monthly cap** — hard limit in USD (`DEFAULT_MONTHLY_BUDGET`)
- **Soft cap** — at 80% utilization, the system issues a warning and may suggest a downgrade to a cheaper model
- **Hard cap** — at 100% utilization, requests are blocked with HTTP 429
The state machine transitions: `Active → Warned → Degraded → Stopped`
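The mapping from utilization to state can be sketched as follows. The 80% and 100% thresholds come from the caps described above; the 90% threshold for `Degraded` is an assumption, since the README does not say where that state begins. This sketch derives the state directly from current utilization instead of modelling transitions, and is not the implementation in `@reaatech/agent-budget-engine`.

```typescript
type BudgetState = "Active" | "Warned" | "Degraded" | "Stopped";

const SOFT_CAP = 0.8; // from the README: warning at 80% utilization
const DEGRADE_AT = 0.9; // assumed threshold, not stated in the README
const HARD_CAP = 1.0; // from the README: requests blocked at 100%

// Map spend against a cap (both in USD) to a budget state.
function stateFor(spentUsd: number, capUsd: number): BudgetState {
  // A missing or non-positive cap is treated as exhausted in this sketch.
  if (capUsd <= 0) return "Stopped";

  const utilization = spentUsd / capUsd;
  if (utilization >= HARD_CAP) return "Stopped";
  if (utilization >= DEGRADE_AT) return "Degraded";
  if (utilization >= SOFT_CAP) return "Warned";
  return "Active";
}
```

For a 10 USD daily cap, this yields `Active` at 5 USD, `Warned` at 8 USD, `Degraded` at 9 USD, and `Stopped` at 10 USD.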
Fallback models are tried in order when the primary model is blocked by budget or fails.
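A minimal sketch of that ordering is shown below, assuming the caller supplies a budget check and a function that performs the actual request. The `callWithFallback` name and its parameters are hypothetical, and circuit breakers are left out; `@reaatech/llm-router-fallback` provides the real fallback chain.

```typescript
// Performs one request against the given model; rejects on failure.
type Attempt<T> = (model: string) => Promise<T>;

// Try each model in order, skipping any that the budget check blocks and
// moving on when a call fails. Returns the first successful result.
async function callWithFallback<T>(
  models: string[],
  isBlocked: (model: string) => boolean,
  attempt: Attempt<T>,
): Promise<{ model: string; result: T }> {
  const errors: string[] = [];

  for (const model of models) {
    if (isBlocked(model)) {
      errors.push(`${model}: blocked by budget`);
      continue;
    }
    try {
      return { model, result: await attempt(model) };
    } catch (err) {
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }

  // Every candidate was either blocked or failed.
  throw new Error(`All models failed: ${errors.join("; ")}`);
}
```

The `models` array would typically be the primary model followed by the entries of `FALLBACK_MODEL_CHAIN`. Collecting the per-model reasons makes the final error useful when the whole chain is exhausted.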
## Testing
```bash
pnpm typecheck # TypeScript type checking
pnpm lint # ESLint
pnpm test # Vitest with coverage
```
## Project layout
```
app/
  api/
    v1/chat/completions/route.ts   Cost-aware proxy endpoint
    health/route.ts                Health check
    usage/[tenant]/route.ts        Budget status
src/
  index.ts                         Service wiring
  types.ts                         Shared types
  lib/
    config.ts                      Configuration loader
    telemetry.ts                   Cost span creation
    budget-check.ts                Budget enforcement
    fallback.ts                    Fallback chain
    cost-sink.ts                   Cost aggregation + export
  services/
    proxy-service.ts               Core proxy orchestrator
tests/                             Vitest suite
packages/                          API references for dependencies
DEV_PLAN.md                        Build plan
```
## Dependencies
- `@reaatech/llm-cost-telemetry` — Core types, schemas, and utilities
- `@reaatech/llm-cost-telemetry-calculator` — Cost calculation engine
- `@reaatech/agent-budget-engine` — Budget enforcement with state machine
- `@reaatech/llm-cost-telemetry-aggregation` — Cost collection and aggregation
- `@reaatech/llm-cost-telemetry-exporters` — CloudWatch/Phoenix export
- `@reaatech/llm-router-fallback` — Fallback chains and circuit breakers
- `openai` — OpenAI SDK (pointed at OpenRouter)
## License
MIT — see [LICENSE](./LICENSE).