SMBs adopting multiple AI agents across customer support, sales, and operations face unpredictable API costs. Without a provider-agnostic budget, it is easy to overspend or become locked into a single vendor.
This recipe wires three systems together: a BudgetController from @reaatech/agent-budget-engine backed by Redis, an LLMRouter from @reaatech/llm-router-engine that selects the cheapest capable model per request, and a telemetry pipeline from @reaatech/llm-cost-telemetry-aggregation that pushes cost data to Langfuse. The result is a Next.js API where every LLM call is checked against a spend limit, routed to the cheapest model, and logged for dashboarding — without locking into a single vendor.
Prerequisites
Node.js >= 22
pnpm 10.x
A running Redis instance (or use Docker: docker run -d -p 6379:6379 redis)
RedisSpendStore extends the SpendStore base from @reaatech/agent-budget-spend-tracker. It accumulates spend in a local Map for fast reads and asynchronously increments the Redis key for durability. The add method converts from the telemetry pipeline’s flat shape into the SpendEntry format the controller expects.
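The local-Map-plus-Redis pattern described above can be sketched roughly as follows. This is not the recipe's RedisSpendStore: `WriteThroughSpendStore` and `FloatCounterClient` are hypothetical names, the `spend:` key prefix is an assumption, and the only Redis call relied on is `incrbyfloat`, which ioredis exposes and which resolves to the new value as a string.

```typescript
// Hypothetical minimal client shape. An ioredis client satisfies it through
// its incrbyfloat(key, increment) command.
interface FloatCounterClient {
  incrbyfloat(key: string, increment: number): Promise<string>;
}

// Illustrative write-through store: reads come from a local Map,
// writes are mirrored to Redis without blocking the caller.
class WriteThroughSpendStore {
  private readonly local = new Map<string, number>();
  private readonly client: FloatCounterClient;
  private readonly prefix: string;
  // Last durability error, kept so callers can inspect or report it.
  lastError: unknown = undefined;

  constructor(client: FloatCounterClient, prefix = "spend:") {
    this.client = client;
    this.prefix = prefix; // assumed key prefix, not taken from the recipe
  }

  add(scopeKey: string, cost: number): void {
    // Update the in-memory total first so reads stay fast and synchronous.
    this.local.set(scopeKey, (this.local.get(scopeKey) ?? 0) + cost);
    // Fire-and-forget increment; a failure is recorded, not thrown.
    void this.client
      .incrbyfloat(this.prefix + scopeKey, cost)
      .catch((err: unknown) => {
        this.lastError = err;
      });
  }

  get(scopeKey: string): number {
    return this.local.get(scopeKey) ?? 0;
  }
}
```

One trade-off of this shape is that the local Map only reflects spend recorded by the current process, so a multi-instance deployment would probably need to re-read totals from Redis periodically.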
The factory createDefaultPricingProvider reads the ROUTER_CONFIG_MODELS JSON from the environment and seeds a pricing table. If a model is not found in the table, it falls back to the DEFAULT_INPUT_COST_PER_MILLION and DEFAULT_OUTPUT_COST_PER_MILLION environment variables.
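To make the per-million arithmetic and the fallback concrete, here is a minimal sketch. The JSON shape (an array of objects with `id`, `inputCostPerMillion`, and `outputCostPerMillion`) and the function names are assumptions for illustration; the real ROUTER_CONFIG_MODELS format may differ.

```typescript
type ModelPricing = {
  inputCostPerMillion: number;
  outputCostPerMillion: number;
};

// Assumed entry shape for the models JSON; hypothetical field names.
type PricedModel = ModelPricing & { id: string };

// Seed a pricing table from a JSON string such as the ROUTER_CONFIG_MODELS value.
function buildPricingTable(modelsJson: string): Map<string, ModelPricing> {
  const entries = JSON.parse(modelsJson) as PricedModel[];
  const table = new Map<string, ModelPricing>();
  for (const entry of entries) {
    table.set(entry.id, {
      inputCostPerMillion: entry.inputCostPerMillion,
      outputCostPerMillion: entry.outputCostPerMillion,
    });
  }
  return table;
}

// Cost in currency units: tokens times the per-million rate, divided by one million.
// Models missing from the table use the fallback rates.
function estimateCost(
  table: Map<string, ModelPricing>,
  fallback: ModelPricing,
  modelId: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const pricing = table.get(modelId) ?? fallback;
  return (
    (inputTokens * pricing.inputCostPerMillion +
      outputTokens * pricing.outputCostPerMillion) /
    1_000_000
  );
}
```

In the recipe the fallback rates would come from DEFAULT_INPUT_COST_PER_MILLION and DEFAULT_OUTPUT_COST_PER_MILLION; here they are passed in explicitly so the function stays pure.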
Step 5: Build the budget guard
Create src/lib/budget-guard.ts:
ts
import { BudgetController } from "@reaatech/agent-budget-engine";
import { BudgetInterceptor } from "@reaatech/agent-budget-middleware";
import { BudgetScope } from "@reaatech/agent-budget-types";
import { Redis } from "ioredis";
import { RedisSpendStore } from "./redis-spend-store.js";
import { createDefaultPricingProvider } from "./pricing-provider.js";
import { getEnvInt } from "./config.js";

export type BudgetGuard = {
  interceptor: BudgetInterceptor;
  controller: BudgetController;
};

export function createBudgetGuard(redisUrl: string): BudgetGuard {
  const redis = new Redis(redisUrl);
  const spendStore = new RedisSpendStore({ redis });
  const pricingProvider = createDefaultPricingProvider();
  const controller = new BudgetController({
    spendTracker: spendStore,
    pricing: pricingProvider,
  });
  const interceptor = new BudgetInterceptor({ controller });
  return { interceptor, controller };
}

export function defineDefaultBudget(
  controller: BudgetController,
  dailyLimit?: number
): void {
  const limit = dailyLimit ?? getEnvInt("DEFAULT_DAILY_BUDGET", 100);
  controller.defineBudget({
    scopeType: BudgetScope.User,
    scopeKey: "*",
    limit,
    policy: {
      softCap: 0.8,
      hardCap: 1.0,
      autoDowngrade: [],
      disableTools: [],
    },
  });
}
createBudgetGuard builds the full chain: Redis client → RedisSpendStore → BudgetController → BudgetInterceptor. The returned interceptor is what you call around every LLM call: beforeStep() before the request and afterStep() once it completes. defineDefaultBudget creates a wildcard budget for the user scope that applies to all users unless overridden by a specific scope key.
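The sketch below illustrates one plausible reading of the `softCap` and `hardCap` ratios and of the wildcard scope key. It is not the BudgetController's actual logic: the function names are hypothetical, and the choice to warn at or above the soft cap and block only above the hard cap is an assumption.

```typescript
type CapPolicy = { softCap: number; hardCap: number };
type BudgetState = "ok" | "soft-cap" | "hard-cap";

// Classify a request by the fraction of the limit that would be used
// if the estimated cost were spent. Boundary handling is an assumption.
function classifySpend(
  spent: number,
  estimatedCost: number,
  limit: number,
  policy: CapPolicy,
): BudgetState {
  const projectedRatio = (spent + estimatedCost) / limit;
  if (projectedRatio > policy.hardCap) return "hard-cap";
  if (projectedRatio >= policy.softCap) return "soft-cap";
  return "ok";
}

// Resolve a limit for a scope key: a specific entry wins, "*" is the default.
function resolveLimit(
  limits: Map<string, number>,
  scopeKey: string,
): number | undefined {
  return limits.get(scopeKey) ?? limits.get("*");
}
```

With a limit of 100 and the policy from the step above (0.8 and 1.0), a user who has spent 75 and requests a call estimated at 10 lands in the soft-cap band, while a user at 95 with the same request exceeds the hard cap.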
loadModelDefinitions reads ROUTER_CONFIG_MODELS and validates each entry against ModelDefinitionSchema from @reaatech/llm-router-core. createLLMRouter registers all models and creates the router with your executeModel callback.
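As an illustration of what "cheapest capable model" selection could look like, here is a minimal sketch. The `ModelDef` shape and the `pickCheapestCapable` function are hypothetical; the real ModelDefinitionSchema and routing strategies in @reaatech/llm-router-core may use different fields and logic.

```typescript
// Hypothetical model definition; field names are assumptions.
type ModelDef = {
  id: string;
  provider: string;
  inputCostPerMillion: number;
  outputCostPerMillion: number;
  capabilities: string[];
};

// Pick the model with the lowest estimated cost among those that
// offer every required capability. Returns undefined if none qualifies.
function pickCheapestCapable(
  models: ModelDef[],
  required: string[],
  inputTokens: number,
  outputTokens: number,
): ModelDef | undefined {
  const cost = (m: ModelDef): number =>
    (inputTokens * m.inputCostPerMillion +
      outputTokens * m.outputCostPerMillion) /
    1_000_000;
  const capable = models.filter((m) =>
    required.every((capability) => m.capabilities.includes(capability)),
  );
  if (capable.length === 0) return undefined;
  return capable.reduce((best, m) => (cost(m) < cost(best) ? m : best));
}
```

Estimating cost from expected token counts, rather than comparing input rates alone, matters when a model is cheap on input but expensive on output.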
Step 7: Build the provider-agnostic model executor
The executor checks model.provider and routes to the correct API shape — Anthropic uses a different endpoint and header format. If model.apiKeyEnv is set, that env var overrides the provider default.
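The provider branching can be sketched as a pure function that builds the request without sending it. This is not the recipe's execute-model.ts: it assumes OpenAI as the second provider, assumes OPENAI_API_KEY and ANTHROPIC_API_KEY as the default key variables, and uses hypothetical type and function names. The endpoints and headers are those of the public Chat Completions and Messages APIs.

```typescript
type Provider = "openai" | "anthropic";
type ExecModel = { id: string; provider: Provider; apiKeyEnv?: string };
type ProviderRequest = {
  url: string;
  headers: Record<string, string>;
  body: string;
};

// Assumed default key variable per provider.
const DEFAULT_KEY_ENV: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

function buildProviderRequest(
  model: ExecModel,
  prompt: string,
  env: Record<string, string | undefined>,
  maxTokens = 1024,
): ProviderRequest {
  // model.apiKeyEnv, when set, overrides the provider default.
  const keyName = model.apiKeyEnv ?? DEFAULT_KEY_ENV[model.provider];
  const apiKey = env[keyName];
  if (!apiKey) {
    throw new Error(`Missing API key: set ${keyName}`);
  }
  const messages = [{ role: "user", content: prompt }];

  if (model.provider === "anthropic") {
    // Anthropic authenticates with x-api-key, requires a version header,
    // and requires max_tokens in the body.
    return {
      url: "https://api.anthropic.com/v1/messages",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
      },
      body: JSON.stringify({ model: model.id, max_tokens: maxTokens, messages }),
    };
  }

  // OpenAI-style chat completions use a bearer token.
  return {
    url: "https://api.openai.com/v1/chat/completions",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: model.id, messages }),
  };
}
```

Passing `process.env` as the `env` argument keeps the function easy to test, since a plain object can stand in for the environment.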
Step 8: Build the telemetry pipeline
Create src/lib/telemetry-pipeline.ts:
ts
import { CostCollector, CostAggregator, BudgetManager } from "@reaatech/llm-cost-telemetry-aggregation";
import { generateId, now, CostSpanSchema, type CostSpan } from "@reaatech/llm-cost-telemetry";
import type { Langfuse } from "langfuse";
import { getEnvInt, getEnvJson } from "./config.js";

const aggregator = new CostAggregator({
  dimensions: ["tenant", "feature", "provider", "model"],
  timeWindows: ["hour", "day", "month"],
});

const tenantBudgets = getEnvJson<Record<string, { daily:
recordSpan takes a partial CostSpan, fills in the id and timestamp, validates it with CostSpanSchema.parse, and buffers it in the collector. When the collector auto-flushes (every 60 seconds or when the buffer hits 1000 spans), it pushes each span into the aggregator, the budget manager, and Langfuse.
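The buffer-then-flush behavior described above can be sketched independently of the library. `SpanBuffer` is a hypothetical class, not the CostCollector API; the defaults of 1000 spans and 60 seconds mirror the numbers quoted in the text.

```typescript
// Illustrative buffer that flushes when it reaches maxSize,
// on a timer, or on demand.
class SpanBuffer<T> {
  private buffer: T[] = [];
  private readonly onFlush: (batch: T[]) => void;
  private readonly maxSize: number;

  constructor(onFlush: (batch: T[]) => void, maxSize = 1000) {
    this.onFlush = onFlush;
    this.maxSize = maxSize;
  }

  record(span: T): void {
    this.buffer.push(span);
    // Size-based flush: hand off the batch as soon as the buffer is full.
    if (this.buffer.length >= this.maxSize) {
      this.flush();
    }
  }

  flush(): void {
    if (this.buffer.length === 0) return;
    // Swap in a fresh array before calling out, so spans recorded
    // during the callback go into the next batch.
    const batch = this.buffer;
    this.buffer = [];
    this.onFlush(batch);
  }

  // Time-based flush; returns a function that stops the timer.
  startTimer(intervalMs = 60_000): () => void {
    const timer = setInterval(() => this.flush(), intervalMs);
    return () => clearInterval(timer);
  }
}
```

In a pipeline like the one above, the `onFlush` callback would be the place to fan each span out to the aggregator, the budget manager, and Langfuse.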
Step 9: Wire the API routes
Create the usage webhook at app/api/webhooks/usage/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { CostSpanSchema } from "@reaatech/llm-cost-telemetry";
import Langfuse from "langfuse";
import { createTelemetryPipeline, recordSpan, getTelemetrySummary } from "../../../../src/lib/telemetry-pipeline";
import { getEnvVar } from "../../../../src/lib/config";

const langfuseClient = new Langfuse({
  publicKey: getEnvVar("LANGFUSE_PUBLIC_KEY", ""),
  secretKey: getEnvVar("LANGFUSE_SECRET_KEY", ""),
  baseUrl: getEnvVar("LANGFUSE_BASE_URL", "https://cloud.langfuse.com"),
});

createTelemetryPipeline(langfuseClient);

export async function POST(req: NextRequest) {
  let body: unknown;
  try {
    body = await req.json();
  } catch {
    return NextResponse.json({ error: "Request body must be valid JSON" }, { status: 400 });
  }

  const result = CostSpanSchema.safeParse(body);
  if (!result.success) {
    return NextResponse.json(
      { error: "Invalid cost span", details: result.error.issues },
      { status: 400 },
    );
  }

  const span = result.data;
  const spanId = recordSpan(span);
  return NextResponse.json({ ok: true, spanId }, { status: 202 });
}

export function GET(req: NextRequest) {
  const tenant = req.nextUrl.searchParams.get("tenant") ?? undefined;
  const period = req.nextUrl.searchParams.get("period") ?? undefined;
  const summary = getTelemetrySummary(tenant, period);
  return NextResponse.json(summary);
}
The POST endpoint validates the incoming cost span with CostSpanSchema.safeParse (returns a 400 with details on Zod errors) and records it. The GET endpoint returns the aggregated summary with optional tenant and period query params.
Create the budget pre-flight check at app/api/budget/check/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { BudgetScope, BudgetExceededError, BudgetValidationError } from "@reaatech/agent-budget-types";
import { createBudgetGuard, defineDefaultBudget } from "../../../../src/lib/budget-guard";
import { getEnvVar } from "../../../../src/lib/config";

let guardPromise: ReturnType<typeof createBudgetGuard> | null = null;

function getGuard() {
  if (!guardPromise) {
    guardPromise = createBudgetGuard(getEnvVar("REDIS_URL", "redis://localhost:6379"));
  }
  return guardPromise;
}

export async function POST(req: NextRequest) {
  let body: Record<string, unknown>;
  try {
    body = (await req.json()) as Record<string, unknown>;
  } catch {
    return NextResponse.json({ error: "Request body must be valid JSON" }, { status: 400 });
  }

  const scopeType = body.scopeType as string | undefined;
  const scopeKey = body.scopeKey as string | undefined;
  const estimatedCost = body.estimatedCost as number | undefined;
  const modelId = body.modelId as string | undefined;
  const tools = (body.tools as string[] | undefined) ?? [];

  if (!scopeType || !scopeKey || estimatedCost === undefined || !modelId) {
    return NextResponse.json({ error: "Missing required fields" }, { status: 400 });
  }
  if (!Object.values(BudgetScope).includes(scopeType as BudgetScope)) {
    return NextResponse.json({ error: `Invalid scope type: ${scopeType}` }, { status: 400 });
  }
  if (typeof estimatedCost !== "number" || estimatedCost < 0) {
    return NextResponse.json(
      { error: "estimatedCost must be a non-negative number" },
      { status: 400 },
    );
  }

  try {
    const guard = getGuard();
    defineDefaultBudget(guard.controller);
    const ctx = guard.interceptor.beforeStep({
      scope: { scopeType: scopeType as BudgetScope, scopeKey },
      modelId,
      tools,
      estimatedCost,
    });
    return NextResponse.json(
      {
        allowed: ctx.allowed,
        suggestedModel: ctx.modelId !== ctx.originalModelId ? ctx.modelId : undefined,
        disabledTools: ctx.originalTools.filter((t: string) => !ctx.tools.includes(t)),
      },
      { status: 200 },
    );
  } catch (err: unknown) {
    if (err instanceof BudgetExceededError) {
      return NextResponse.json({ allowed: false, reason: err.message }, { status: 402 });
    }
    if (err instanceof BudgetValidationError) {
      return NextResponse.json({ error: err.message }, { status: 400 });
    }
    return NextResponse.json({ error: "Internal server error" }, { status: 500 });
  }
}
POST validates the incoming request, calls guard.interceptor.beforeStep() to check the budget, and returns whether the request is allowed. If the budget is exceeded, it returns 402.
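A caller of this endpoint might wrap it as sketched below. The request and response fields come from the route handler above; the `checkBudget` and `interpretBudgetResponse` helpers, the `FetchLike` type, and the fallback reason strings are hypothetical. Injecting the fetch function keeps the sketch independent of any particular runtime.

```typescript
type BudgetCheckRequest = {
  scopeType: string;
  scopeKey: string;
  estimatedCost: number;
  modelId: string;
  tools?: string[];
};

type BudgetDecision =
  | { allowed: true; suggestedModel: string | undefined; disabledTools: string[] }
  | { allowed: false; reason: string };

// Minimal fetch-compatible signature; the global fetch can be passed here.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Map the endpoint's status code and JSON body onto a typed decision.
function interpretBudgetResponse(status: number, body: unknown): BudgetDecision {
  const data = (body ?? {}) as Record<string, unknown>;
  if (status === 402) {
    // Hard stop: the handler returns the library's error message as `reason`.
    const reason = typeof data.reason === "string" ? data.reason : "Budget exceeded";
    return { allowed: false, reason };
  }
  if (status !== 200) {
    throw new Error(`Budget check failed with status ${status}`);
  }
  if (data.allowed !== true) {
    return { allowed: false, reason: "Request not allowed by budget policy" };
  }
  return {
    allowed: true,
    suggestedModel:
      typeof data.suggestedModel === "string" ? data.suggestedModel : undefined,
    disabledTools: Array.isArray(data.disabledTools)
      ? data.disabledTools.map(String)
      : [],
  };
}

async function checkBudget(
  baseUrl: string,
  request: BudgetCheckRequest,
  fetchImpl: FetchLike,
): Promise<BudgetDecision> {
  const res = await fetchImpl(`${baseUrl}/api/budget/check`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(request),
  });
  return interpretBudgetResponse(res.status, await res.json());
}
```

A caller would typically switch to `suggestedModel` when it is present and drop any `disabledTools` before making the LLM call.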
Create the budget record endpoint at app/api/budget/record/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { BudgetScope, BudgetError } from "@reaatech/agent-budget-types";
import { createBudgetGuard, defineDefaultBudget } from "../../../../src/lib/budget-guard";
import { getEnvVar } from "../../../../src/lib/config";

let guardPromise: ReturnType<typeof createBudgetGuard> | null = null;

function getGuard() {
  if (!guardPromise) {
    guardPromise = createBudgetGuard(getEnvVar("REDIS_URL", "redis://localhost:6379"));
  }
  return guardPromise;
}

export async function POST(req: NextRequest) {
  let body: Record<string, unknown>;
  try {
    body = (await req.json()) as Record<string, unknown>;
  } catch {
    return NextResponse.json({ error: "Request body must be valid JSON" }, { status: 400 });
  }

  const scopeType = body.scopeType as string | undefined;
  const scopeKey = body.scopeKey as string | undefined;
  const actualCost = body.actualCost as number | undefined;
  const inputTokens = body.inputTokens as number | undefined;
  const outputTokens = body.outputTokens as number | undefined;
  const modelId = body.modelId as string | undefined;
  const requestId = body.requestId as string | undefined;

  if (
    !scopeType ||
    !scopeKey ||
    actualCost === undefined ||
    inputTokens === undefined ||
    outputTokens === undefined ||
    !modelId ||
    !requestId
  ) {
    return NextResponse.json({ error: "Missing required fields" }, { status: 400 });
  }
  if (!Object.values(BudgetScope).includes(scopeType as BudgetScope)) {
    return NextResponse.json({ error: `Invalid scope type: ${scopeType}` }, { status: 400 });
  }

  try {
    const guard = getGuard();
    defineDefaultBudget(guard.controller);
    guard.interceptor.afterStep({
      scope: { scopeType: scopeType as BudgetScope, scopeKey },
      allowed: true,
      modelId,
      tools: [],
      originalModelId: modelId,
      originalTools: [],
      actualCost,
      inputTokens,
      outputTokens,
      requestId,
    });
    return NextResponse.json({ ok: true }, { status: 200 });
  } catch (err) {
    const error = err as Error;
    if (error instanceof BudgetError) {
      return NextResponse.json({ error: error.message }, { status: 400 });
    }
    return NextResponse.json({ error: "Internal server error" }, { status: 500 });
  }
}
POST records actual spend after an LLM call completes. It validates the scope type against BudgetScope enum values, then calls guard.interceptor.afterStep() with the real token counts and cost.
Create the LLM router endpoint at app/api/router/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { RoutingRequestSchema } from "@reaatech/llm-router-core";
import { ZodError } from "zod";
import { createLLMRouter } from "../../../src/lib/router-config";
import { executeModel } from "../../../src/lib/execute-model";

const router = createLLMRouter(executeModel);

export async function POST(req: NextRequest) {
  let body: unknown;
  try {
    body = await req.json();
  } catch {
    return NextResponse.json({ error: "Request body must be valid JSON" }, { status: 400 });
  }

  try {
    const validated = RoutingRequestSchema.parse(body);
    const result = await router.route(validated);
    return NextResponse.json(
      {
        model: result.model.id,
        provider: result.model.provider,
        strategy: result.strategy,
        cost: result.cost,
        latencyMs: result.latencyMs,
        content: result.result.content,
      },
      { status: 200 },
    );
  } catch (err) {
    if (err instanceof ZodError) {
      return NextResponse.json(
        { error: "Invalid routing request", details: err.issues },
        { status: 400 },
      );
    }
    const error = err as Error;
    if (error.message.includes("budget") || error.message.includes("Budget")) {
      return NextResponse.json({ error: error.message }, { status: 429 });
    }
    if (
      error.message.includes("No matching model") ||
      error.message.includes("no models available")
    ) {
      return NextResponse.json({ error: error.message }, { status: 503 });
    }
    return NextResponse.json({ error: "Internal server error" }, { status: 500 });
  }
}

export function GET() {
  const models = router.getModels();
  return NextResponse.json({ models });
}
POST validates the routing request with RoutingRequestSchema, calls router.route(), and returns the model choice, provider, strategy, cost, latency, and content. GET returns the list of registered models for introspection.
Step 10: Run the tests
The test suite covers every module and route handler with mocks for all external dependencies (ioredis, langfuse, provider APIs). Run it with:
terminal
pnpm vitest run --coverage --reporter=json --outputFile=vitest-report.json
Expected output: all tests pass with numFailedTests=0 and coverage for runtime code (services, lib, route handlers) at 90% or above across lines, branches, functions, and statements. UI files in app/**/page.tsx, layout.tsx, and not-found.tsx are excluded from coverage scope.
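A coverage setup matching that description might look like the following vitest.config.ts sketch. The file contents are an assumption rather than the recipe's actual configuration: the v8 provider is a guess, and the globs are derived from the excluded files named above. A plain object is exported so the sketch needs no imports; wrapping it in `defineConfig` from `vitest/config` is the more common form.

```typescript
// vitest.config.ts (illustrative; not the recipe's actual file)
const vitestConfig = {
  test: {
    coverage: {
      // Assumed provider; "istanbul" is the other built-in option.
      provider: "v8" as const,
      // UI files excluded from coverage scope. Depending on the Vitest
      // version, setting `exclude` may replace the built-in defaults.
      exclude: [
        "app/**/page.tsx",
        "app/**/layout.tsx",
        "app/**/not-found.tsx",
      ],
      // Fail the run if any metric drops below 90%.
      thresholds: {
        lines: 90,
        branches: 90,
        functions: 90,
        statements: 90,
      },
    },
  },
};

export default vitestConfig;
```

With thresholds in the config, the coverage run itself fails when a metric falls short, so the JSON report does not have to be inspected by hand.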
Next steps
Replace the raw fetch calls in src/lib/execute-model.ts with real provider SDK clients to gain retry logic, token caching, and structured error handling.
Hook BudgetInterceptor events (threshold-breach, hard-stop) to a notification service so teams get Slack or email alerts when budgets hit 75% or 90%.
Add src/instrumentation.ts with a register() function that loads dotenv when the Next.js server starts, so env vars are in place before the first request is handled.
Extend TENANT_BUDGETS in .env with per-tenant overrides for your highest-volume customers, and read them in the webhook handler to tag incoming spans with tenant metadata.
Add a /api/budget/reset endpoint that calls controller.reset() for manual budget resets when a billing cycle rolls over.
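For the alerting idea in the list above, the core logic is detecting when a spend update crosses a threshold, so that each alert fires once rather than on every call. The sketch below shows only that detection. How to subscribe to BudgetInterceptor events is not shown in this recipe, so `crossedThresholds` is a hypothetical standalone helper.

```typescript
// Return the thresholds (as fractions of the limit) that were crossed
// by moving from previousSpend to currentSpend.
function crossedThresholds(
  previousSpend: number,
  currentSpend: number,
  limit: number,
  thresholds: number[] = [0.75, 0.9],
): number[] {
  const before = previousSpend / limit;
  const after = currentSpend / limit;
  // A threshold counts as crossed only if spend was below it before
  // the update and is at or above it afterwards.
  return thresholds.filter((t) => before < t && after >= t);
}
```

For example, with a limit of 100, moving from 70 to 80 crosses only the 75% threshold, while moving from 80 to 85 crosses none, so no duplicate alert would be sent.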