# OpenRouter Cost Control for SMB API Spend Management
 
> Enforce daily budgets, track per-model spend, and automatically downgrade to cheaper fallback models when your SMB API budget is at risk.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build a cost-aware LLM proxy using the `@reaatech/*` package family.
 
## Problem
 
Small businesses using OpenRouter often see unpredictable LLM bills because calls to a single expensive model can exhaust their monthly budget. Without granular cost tracking and automatic throttling, spend control is reactive at best.
 
## Architecture
 
```
┌─────────────┐     ┌────────────────┐     ┌──────────────┐
│  Client App │────▶│   ProxyService │────▶│   OpenRouter │
└─────────────┘     └────────────────┘     └──────────────┘
                          │
                    ┌─────┴──────┐
                    │ BudgetCtrl │
                    │ (reaatech) │
                    └─────┬──────┘
                          │
                    ┌─────┴──────┐
                    │ Cost Sink  │
                    │ (exporters)│
                    └────────────┘
```
 
## Features
 
- **Budget enforcement** — per-tenant daily/monthly caps with soft/hard policy
- **Automatic downgrade** — falls back to cheaper models when budget tightens
- **State machine** — Active → Warned → Degraded → Stopped per scope
- **Cost telemetry** — every API call recorded as a CostSpan
- **Fallback chains** — ordered model lists with circuit breakers
- **Cost export** — aggregated spend pushed to CloudWatch or Loki/Phoenix
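
To make the cost telemetry feature concrete, here is a minimal sketch of how a per-call cost record could be derived from token counts. The `CostSpanSketch` shape, the `PriceTable` type, and the prices are hypothetical placeholders; they are not the actual `CostSpan` schema from `@reaatech/llm-cost-telemetry`, nor real OpenRouter prices.

```typescript
// Hypothetical per-call cost record; the real CostSpan type may differ.
interface CostSpanSketch {
  tenantId: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
}

// USD per one million tokens, keyed by model ID (placeholder values only).
type PriceTable = Record<string, { inputPerMTok: number; outputPerMTok: number }>;

const examplePrices: PriceTable = {
  "example/large-model": { inputPerMTok: 2, outputPerMTok: 8 },
  "example/small-model": { inputPerMTok: 0.25, outputPerMTok: 1 },
};

// Price one call: tokens times the per-million-token rate for each direction.
function recordCost(
  prices: PriceTable,
  tenantId: string,
  model: string,
  promptTokens: number,
  completionTokens: number,
): CostSpanSketch {
  const price = prices[model];
  if (!price) {
    throw new Error(`No price configured for model ${model}`);
  }
  const costUsd =
    (promptTokens * price.inputPerMTok + completionTokens * price.outputPerMTok) / 1_000_000;
  return { tenantId, model, promptTokens, completionTokens, costUsd };
}
```

Summing `costUsd` per tenant over a day or month gives the spend figure that the budget checks compare against the caps.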
 
## Prerequisites
 
- Node.js >= 22
- pnpm 10.x
- An [OpenRouter API key](https://openrouter.ai/keys)
 
## Setup
 
```bash
pnpm install
pnpm dev
```
 
## Environment Variables
 
See `.env.example` for the full list. Key variables:
 
| Variable | Description |
|----------|-------------|
| `OPENROUTER_API_KEY` | OpenRouter API key (required) |
| `DEFAULT_DAILY_BUDGET` | Default daily budget per tenant in USD |
| `DEFAULT_MONTHLY_BUDGET` | Default monthly budget per tenant in USD |
| `PRIMARY_MODEL` | Default model (e.g. `openai/gpt-5.2`) |
| `FALLBACK_MODEL_CHAIN` | Comma-separated fallback model IDs |
| `TENANT_BUDGETS` | Per-tenant budget overrides as JSON |
| `LOKI_HOST` | Loki/Phoenix push endpoint for cost export |
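
As a rough illustration of how these variables might be parsed, the sketch below reads them from a plain key/value object. The `BudgetConfig` shape and both function names are hypothetical, and it assumes `TENANT_BUDGETS` is a JSON object mapping tenant IDs to daily USD caps (for example `{"acme": 25}`); the project's real loader in `src/lib/config.ts` may use a different format.

```typescript
// Hypothetical config shape, for illustration only.
interface BudgetConfig {
  defaultDailyBudgetUsd: number;
  primaryModel: string;
  fallbackChain: string[];
  tenantDailyBudgetsUsd: Record<string, number>;
}

type Env = Record<string, string | undefined>;

// Read a variable and fail loudly when it is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`${name} is required`);
  }
  return value.trim();
}

function loadBudgetConfig(env: Env): BudgetConfig {
  requireVar(env, "OPENROUTER_API_KEY"); // documented as required

  // ASSUMPTION: this sketch also treats the budget and primary model as required.
  const defaultDailyBudgetUsd = Number(requireVar(env, "DEFAULT_DAILY_BUDGET"));
  if (!Number.isFinite(defaultDailyBudgetUsd) || defaultDailyBudgetUsd < 0) {
    throw new Error("DEFAULT_DAILY_BUDGET must be a non-negative number");
  }
  const primaryModel = requireVar(env, "PRIMARY_MODEL");

  // Comma-separated list; surrounding whitespace and empty entries are dropped.
  const fallbackChain = (env["FALLBACK_MODEL_CHAIN"] ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);

  // ASSUMPTION: JSON object of tenant ID to daily USD cap.
  const tenantDailyBudgetsUsd = JSON.parse(env["TENANT_BUDGETS"] ?? "{}") as Record<string, number>;

  return { defaultDailyBudgetUsd, primaryModel, fallbackChain, tenantDailyBudgetsUsd };
}

// A tenant-specific override wins over the default budget.
function dailyBudgetFor(config: BudgetConfig, tenantId: string): number {
  return config.tenantDailyBudgetsUsd[tenantId] ?? config.defaultDailyBudgetUsd;
}
```

In a Node.js process this would typically be called once at startup as `loadBudgetConfig(process.env)`.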
 
## API Endpoints
 
### `POST /api/v1/chat/completions`
 
OpenAI-compatible chat completion endpoint with budget enforcement.
 
Request body:
```json
{
  "model": "openai/gpt-5.2",
  "messages": [{ "role": "user", "content": "Hello!" }]
}
```
 
Headers:
- `X-Tenant-Id` — tenant identifier for budget tracking (optional, defaults to "default")
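
A client call against this endpoint could look like the sketch below. The helper name `buildChatRequest` is hypothetical, and the base URL `http://localhost:3000` in the usage comment is an assumption about where `pnpm dev` serves the app; adjust it to your deployment.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch options for a tenant-scoped chat completion call.
function buildChatRequest(
  baseUrl: string,
  tenantId: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Tenant-Id": tenantId, // omit to fall back to the "default" tenant
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (assumes the dev server listens on http://localhost:3000):
//   const { url, init } = buildChatRequest("http://localhost:3000", "acme",
//     "openai/gpt-5.2", [{ role: "user", content: "Hello!" }]);
//   const response = await fetch(url, init);
```

Keeping request construction separate from `fetch` makes the tenant header easy to unit test without a running server.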
 
### `GET /api/health`
 
Returns `{ "status": "ok", "timestamp": "..." }`.
 
### `GET /api/usage/:tenant`
 
Returns current budget status for a tenant:
```json
{
  "withinBudget": true,
  "dailyPercentage": 45,
  "monthlyPercentage": 12
}
```
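
One plausible way to derive these fields from tracked spend is sketched below. The function names are hypothetical, and the sketch assumes the percentages are whole-number shares of each cap and that `withinBudget` is true only while both caps are unspent; the actual endpoint may compute them differently.

```typescript
interface UsageStatus {
  withinBudget: boolean;
  dailyPercentage: number;
  monthlyPercentage: number;
}

// Whole-number percentage of a cap that has been spent.
function toPercentage(spentUsd: number, capUsd: number): number {
  if (capUsd <= 0) {
    return 100; // ASSUMPTION: a zero or negative cap counts as fully used
  }
  return Math.round((spentUsd / capUsd) * 100);
}

function usageStatus(
  dailySpentUsd: number,
  dailyCapUsd: number,
  monthlySpentUsd: number,
  monthlyCapUsd: number,
): UsageStatus {
  return {
    // Compare raw amounts so rounding cannot flip the flag early.
    withinBudget: dailySpentUsd < dailyCapUsd && monthlySpentUsd < monthlyCapUsd,
    dailyPercentage: toPercentage(dailySpentUsd, dailyCapUsd),
    monthlyPercentage: toPercentage(monthlySpentUsd, monthlyCapUsd),
  };
}
```

With $4.50 spent against a $10 daily cap and $36 against a $300 monthly cap, this produces the example response shown above.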
 
## Budget Configuration
 
Budgets are configured via environment variables. Each tenant has:
 
- **Daily cap** — hard limit in USD (`DEFAULT_DAILY_BUDGET` or `TENANT_BUDGETS`)
- **Monthly cap** — hard limit in USD (`DEFAULT_MONTHLY_BUDGET`)
- **Soft cap** — at 80% utilization, the system warns and may suggest downgrade
- **Hard cap** — at 100%, requests are blocked (429)
 
Each scope's state machine moves through `Active → Warned → Degraded → Stopped` as budget utilization rises.
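
The progression can be sketched as a pure function of budget utilization. The 80% and 100% thresholds come from the soft and hard caps described above; the 90% threshold for `Degraded` and the per-state routing in `decide` are assumptions made for illustration, not the documented behavior of `@reaatech/agent-budget-engine`.

```typescript
type BudgetState = "Active" | "Warned" | "Degraded" | "Stopped";

const SOFT_CAP = 0.8; // soft cap: 80% utilization
const DEGRADE_AT = 0.9; // ASSUMPTION: the README does not state this threshold
const HARD_CAP = 1.0; // hard cap: 100% utilization

// Map utilization (spent / cap) to a state, checking the strictest rule first.
function stateFor(utilization: number): BudgetState {
  if (utilization >= HARD_CAP) return "Stopped";
  if (utilization >= DEGRADE_AT) return "Degraded";
  if (utilization >= SOFT_CAP) return "Warned";
  return "Active";
}

interface Decision {
  allowed: boolean; // false maps to an HTTP 429 response
  model: string | null;
  warn: boolean;
}

// ASSUMPTION: Warned keeps the primary model, Degraded switches to the fallback.
function decide(state: BudgetState, primaryModel: string, fallbackModel: string): Decision {
  switch (state) {
    case "Active":
      return { allowed: true, model: primaryModel, warn: false };
    case "Warned":
      return { allowed: true, model: primaryModel, warn: true };
    case "Degraded":
      return { allowed: true, model: fallbackModel, warn: true };
    case "Stopped":
      return { allowed: false, model: null, warn: true };
  }
}
```

Deriving the state from utilization on every request keeps the sketch stateless; a real engine would likely persist the state per scope and reset it when the budget window rolls over.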
 
Fallback models are tried in order when the primary model is blocked by budget or fails.
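
A minimal sketch of that ordering is shown below. The names `completeWithFallback`, `isBlocked`, and `attempt` are hypothetical, and circuit breakers are left out for brevity; `@reaatech/llm-router-fallback` may expose a different interface.

```typescript
interface FallbackResult<T> {
  model: string; // the model that finally served the request
  value: T;
  failures: { model: string; error: string }[];
}

// Try each model in order, skipping budget-blocked ones and moving on after errors.
async function completeWithFallback<T>(
  models: string[], // primary model first, then the fallback chain
  isBlocked: (model: string) => boolean,
  attempt: (model: string) => Promise<T>,
): Promise<FallbackResult<T>> {
  const failures: { model: string; error: string }[] = [];
  for (const model of models) {
    if (isBlocked(model)) {
      failures.push({ model, error: "blocked by budget" });
      continue;
    }
    try {
      const value = await attempt(model);
      return { model, value, failures };
    } catch (err) {
      failures.push({ model, error: err instanceof Error ? err.message : String(err) });
    }
  }
  const summary = failures.map((f) => `${f.model} (${f.error})`).join(", ");
  throw new Error(`All models failed or were blocked: ${summary}`);
}
```

Returning the list of skipped and failed models alongside the result lets the caller record why a downgrade happened.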
 
## Testing
 
```bash
pnpm typecheck    # TypeScript type checking
pnpm lint         # ESLint
pnpm test         # Vitest with coverage
```
 
## Project layout
 
```
app/
  api/
    v1/chat/completions/route.ts    Cost-aware proxy endpoint
    health/route.ts                 Health check
    usage/[tenant]/route.ts         Budget status
src/
  index.ts                          Service wiring
  types.ts                          Shared types
  lib/
    config.ts                       Configuration loader
    telemetry.ts                    Cost span creation
    budget-check.ts                 Budget enforcement
    fallback.ts                     Fallback chain
    cost-sink.ts                    Cost aggregation + export
  services/
    proxy-service.ts                Core proxy orchestrator
tests/                              Vitest suite
packages/                           API references for dependencies
DEV_PLAN.md                         Build plan
```
 
## Dependencies
 
- `@reaatech/llm-cost-telemetry` — Core types, schemas, and utilities
- `@reaatech/llm-cost-telemetry-calculator` — Cost calculation engine
- `@reaatech/agent-budget-engine` — Budget enforcement with state machine
- `@reaatech/llm-cost-telemetry-aggregation` — Cost collection and aggregation
- `@reaatech/llm-cost-telemetry-exporters` — CloudWatch/Phoenix export
- `@reaatech/llm-router-fallback` — Fallback chains and circuit breakers
- `openai` — OpenAI SDK (pointed at OpenRouter)
 
## License
 
MIT — see [LICENSE](./LICENSE).