OpenAI Cost Control for SMB Agent Workflows

Cut OpenAI spending with semantic caching and budget enforcement for small business support teams.

openai cost-control nextjs redis langfuse caching budget-enforcement smb

The problem

Small businesses using OpenAI for customer-facing agents watch bills climb as the same questions are answered again and again, with no way to cap monthly spend. They need automatic cost tracking, caching of common answers, and hard budget limits.

Intro

Small businesses that run customer-facing agents on OpenAI watch their bills climb as repeated questions and unmonitored usage pile up. This recipe builds a cost-control system that wraps every OpenAI call in three layers: automatic token-cost tracking via @reaatech/llm-cost-telemetry, monthly budgets with hard-stop limits via @reaatech/agent-budget-engine, and cached responses for semantically similar prompts via @reaatech/llm-cache backed by Redis, which avoids paying for the same answer twice. A Next.js dashboard exposes four endpoints: budget state, cost aggregation, usage history, and cache health.
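
To make "cached responses for semantically similar prompts" concrete, here is a minimal in-memory sketch of the idea. It is not the @reaatech/llm-cache API, which this page does not show. The names SemanticCache and cosineSimilarity, the 0.95 default threshold, and the hand-written embedding vectors are all assumptions for illustration; a real setup would obtain embeddings from a model and store entries in Redis.

```typescript
// Hypothetical illustration only: an in-memory stand-in for a semantic cache.
type CacheEntry = { embedding: number[]; answer: string };

// Cosine similarity between two vectors; returns 0 for mismatched or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class SemanticCache {
  private entries: CacheEntry[] = [];
  private threshold: number;

  // threshold: minimum similarity for a stored answer to count as a hit (assumed value).
  constructor(threshold: number = 0.95) {
    this.threshold = threshold;
  }

  set(embedding: number[], answer: string): void {
    this.entries.push({ embedding, answer });
  }

  // Returns the closest stored answer if it clears the threshold, otherwise null.
  get(embedding: number[]): string | null {
    let best: CacheEntry | null = null;
    let bestScore = -1;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score > bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best !== null && bestScore >= this.threshold ? best.answer : null;
  }
}
```

A lower threshold serves more requests from cache but raises the chance of returning an answer to a question that only looks similar, so the value is worth tuning against real support traffic.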

Prerequisites

Node.js 22+ and pnpm 10+ installed
A Next.js project scaffolded with the App Router — or create one with npx create-next-app@latest
OpenAI API key — set as OPENAI_API_KEY
A running Redis instance — set as REDIS_URL (defaults to redis://localhost:6379)
A running PostgreSQL database — set as DATABASE_URL
A Langfuse account (optional) — set as LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY
Familiarity with TypeScript, Next.js App Router, and environment variable management
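
The environment variables listed above could be validated once at startup. The following is a sketch under assumptions: the readEnv helper and the AppEnv shape are hypothetical and not part of the recipe's packages. It treats OPENAI_API_KEY and DATABASE_URL as required, applies the stated default for REDIS_URL, and enables Langfuse only when both keys are present.

```typescript
// Hypothetical startup check for the variables named in the prerequisites.
type AppEnv = {
  openaiApiKey: string;
  redisUrl: string;
  databaseUrl: string;
  langfuse: { publicKey: string; secretKey: string } | null;
};

function readEnv(env: Record<string, string | undefined>): AppEnv {
  // Throws early with the variable name so misconfiguration is obvious.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  const publicKey = env["LANGFUSE_PUBLIC_KEY"];
  const secretKey = env["LANGFUSE_SECRET_KEY"];

  return {
    openaiApiKey: required("OPENAI_API_KEY"),
    // Default taken from the prerequisites list.
    redisUrl: env["REDIS_URL"] ?? "redis://localhost:6379",
    databaseUrl: required("DATABASE_URL"),
    // Langfuse is optional: only enabled when both keys are set.
    langfuse: publicKey && secretKey ? { publicKey, secretKey } : null,
  };
}
```

In a Next.js app this would typically be called as readEnv(process.env) from a server-only module.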

Step 1: Create the project and install dependencies

Scaffold a fresh Next.js project with the App Router, then install all the cost-control packages in one go.

terminal

npx create-next-app@latest

Example artifact

A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.

167 kB·84 tests·99.5% coverage·vitest passing

SHA-256: 4a47906c74e588006cbd8a539458c99a7c2ae76a670de4b79758809325b22aeb

Step 5: Build the OpenAI wrapper with cost telemetry

import OpenAI from "openai";
import {
  generateId,
  now,
  loadConfig,
  calculateCostFromTokens,
  CostSpanSchema,
  type CostSpan,
  type TelemetryContext,
} from "@reaatech/llm-cost-telemetry";
import type { OpenAIRequest, OpenAIResponse } from "./cost-types.js";
import { ApplicationError } from "./cost-types.js";
import { collector } from "./aggregation.js";

const appConfig = loadConfig();
const telemetryServiceName = appConfig.telemetry.serviceName;
// The service name is only a placeholder so the client can be constructed
// without a key; real requests still need OPENAI_API_KEY to be set.
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY ?? telemetryServiceName });

export async function makeOpenAICall(
  request: OpenAIRequest,
  telemetryContext: TelemetryContext,
): Promise<OpenAIResponse> {
  try {
    const response = await client.responses.create({
      model: request.model,
      instructions: request.instructions,
      input: request.input,
      temperature: request.temperature ?? null,
      max_output_tokens: request.maxTokens ?? null,
    });

    // Cost is the sum of two calculations: all tokens at rate 30,
    // plus output tokens at rate 8.
    const totalTokens =
      (response.usage?.input_tokens ?? 0) + (response.usage?.output_tokens ?? 0);
    const costFromTokens = calculateCostFromTokens(totalTokens, 30);
    const costFromOutput = calculateCostFromTokens(response.usage?.output_tokens ?? 0, 8);

    const costSpan: CostSpan = {
      id: generateId(),
      provider: "openai",
      model: request.model,
      inputTokens: response.usage?.input_tokens ?? 0,
      outputTokens: response.usage?.output_tokens ?? 0,
      costUsd: costFromTokens + costFromOutput,
      tenant: telemetryContext.tenant,
      feature: telemetryContext.feature,
      timestamp: now(),
    };

    // Validate the span before buffering it for aggregation.
    const validated = CostSpanSchema.parse(costSpan);
    collector.add(validated);

    return {
      outputText: response.output_text,
      costSpan: validated,
      cachedHit: false,
      cachedSavings: 0,
    };
  } catch (err) {
    if (err instanceof OpenAI.APIError) {
      console.error("OpenAI API error:", err.status, err.name);
      throw new ApplicationError(
        "openai_error",
        err.message,
        err.status as number | undefined,
      );
    }
    if (err instanceof ApplicationError) throw err;
    // Anything else is treated as a connection-level failure.
    throw new ApplicationError(
      "openai_connection_error",
      err instanceof Error ? err.message : String(err),
    );
  }
}
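
The wrapper turns token counts into a dollar figure. As a worked illustration of that arithmetic, here is a standalone sketch that prices input and output tokens separately at per-million-token rates. It does not reproduce calculateCostFromTokens, whose rate units are not stated on this page; estimateCostUsd, ModelPricing, and the example prices are hypothetical.

```typescript
// Hypothetical per-model price table entry, in USD per one million tokens.
type ModelPricing = { inputPerMillionUsd: number; outputPerMillionUsd: number };

// cost = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: ModelPricing,
): number {
  const inputCost = (inputTokens / 1e6) * pricing.inputPerMillionUsd;
  const outputCost = (outputTokens / 1e6) * pricing.outputPerMillionUsd;
  return inputCost + outputCost;
}

// Made-up rates for illustration; check the provider's current price list.
const examplePricing: ModelPricing = { inputPerMillionUsd: 2, outputPerMillionUsd: 8 };

// 1,000 input tokens and 500 output tokens:
// 0.001 * 2 + 0.0005 * 8 = 0.002 + 0.004 = 0.006 USD
const exampleCost = estimateCostUsd(1000, 500, examplePricing);
```

Keeping input and output rates separate matters because providers usually charge more for output tokens, so a single blended rate tends to misprice long answers.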

Step 6: Build the budget middleware

import {
  BudgetController,
  PolicyEvaluator,
  DowngradeEngine,
  ToolFilter,
} from "@reaatech/agent-budget-engine";
import { SpendStore } from "@reaatech/agent-budget-spend-tracker";
import type { BudgetScope } from "@reaatech/agent-budget-types";
import type { TenantBudgetConfig, BudgetCheckResult } from "./cost-types.js";

const store = new SpendStore();
const controller = new BudgetController({ spendTracker: store });
const policyEvaluator = new PolicyEvaluator();
const downgradeEngine = new DowngradeEngine();
const toolFilter = new ToolFilter();

// Log budget lifecycle events emitted by the controller.
controller.on("threshold-breach", (event) => {
  const evt = event as { threshold: number };
  console.warn("Budget at " + String(evt.threshold * 100) + "% for scope");
});

controller.on("hard-stop", (event) => {
  const evt = event as { spent: number };
  console.error("Budget exhausted for scope", evt.spent);
});

controller.on("budget-reset", (event) => {
  console.info("Budget reset for scope", event);
});

// Instantiated but not used directly in this module; `void` marks them as intentionally unused.
void policyEvaluator;
void downgradeEngine;
void toolFilter;

// Register a tenant's monthly budget and policy with the controller.
export function loadBudgetsForTenant(
  tenantId: string,
  config: TenantBudgetConfig,
): void {
  controller.defineBudget({
    scopeType: "user" as BudgetScope,
    scopeKey: tenantId,
    limit: config.monthlyBudget,
    policy: {
      softCap: config.softCap,
      hardCap: config.hardCap,
      autoDowngrade: config.autoDowngrade
        ? [{ from: ["gpt-5.2"], to: "gpt-5.2-mini" }]
        : [],
      disableTools: config.disabledTools,
    },
  });
}

// Check an estimated cost against the tenant's budget before making a call.
export function budgetGuard(
  tenantId: string,
  estimatedCost: number,
  modelId: string,
  tools?: string[],
): BudgetCheckResult {
  const r = controller.check({
    scopeType: "user" as BudgetScope,
    scopeKey: tenantId,
    estimatedCost,
    modelId,
    tools: tools ?? [],
  }) as Partial<{
    allowed: boolean;
    action: string;
    suggestedModel: string | null;
    disabledTools: string[];
    remaining: number;
  }>;

  const action = r.action as string;
  return {
    // Note: a missing `allowed` field defaults to true (fail-open).
    allowed: r.allowed ?? true,
    action: action === "hard-stop" ? "Block" : action === "allow" ? "Allow" : "Warn",
    suggestedModel: r.suggestedModel ?? null,
    disabledTools: r.disabledTools ?? [],
    remaining: r.remaining ?? 0,
  };
}

// Record actual spend after a call completes.
export function recordSpend(entry: {
  requestId: string;
  tenantId: string;
  cost: number;
  inputTokens: number;
  outputTokens: number;
  modelId: string;
  provider: string;
}): void {
  controller.record({
    requestId: entry.requestId,
    scopeType: "user" as BudgetScope,
    scopeKey: entry.tenantId,
    cost: entry.cost,
    inputTokens: entry.inputTokens,
    outputTokens: entry.outputTokens,
    modelId: entry.modelId,
    provider: entry.provider,
    timestamp: new Date(),
  });
}

export function getBudgetState(tenantId: string) {
  return controller.getState("user" as BudgetScope, tenantId);
}

export function getDisabledToolsForTenant(tenantId: string): string[] {
  return controller.getDisabledTools("user" as BudgetScope, tenantId);
}
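
The guard above maps the controller's result onto Allow, Warn, or Block. The decision logic inside the controller is not shown on this page, so the following is only a sketch of how a soft-cap and hard-cap check might work. It assumes both caps are fractions of the monthly limit; decideBudgetAction and BudgetPolicy are hypothetical names.

```typescript
type BudgetAction = "Allow" | "Warn" | "Block";

// Assumption: softCap and hardCap are fractions of limitUsd (for example 0.8 and 1.0).
type BudgetPolicy = { limitUsd: number; softCap: number; hardCap: number };

function decideBudgetAction(
  spentUsd: number,
  estimatedCostUsd: number,
  policy: BudgetPolicy,
): { action: BudgetAction; remainingUsd: number } {
  const hardLimitUsd = policy.limitUsd * policy.hardCap;
  const softLimitUsd = policy.limitUsd * policy.softCap;
  // Judge the request by where spend would land if it went through.
  const projectedUsd = spentUsd + estimatedCostUsd;

  let action: BudgetAction = "Allow";
  if (projectedUsd > hardLimitUsd) {
    action = "Block";
  } else if (projectedUsd >= softLimitUsd) {
    action = "Warn";
  }

  // Headroom left before the hard cap, never negative.
  const remainingUsd = Math.max(0, hardLimitUsd - spentUsd);
  return { action, remainingUsd };
}
```

Checking the projected spend rather than the current spend means the request that would cross the hard cap is the one that gets blocked, instead of the one after it.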

Step 8: Build the cost aggregation layer

import {
  CostCollector,
  CostAggregator,
  BudgetManager,
} from "@reaatech/llm-cost-telemetry-aggregation";
import type { CostSpan, BudgetStatus } from "@reaatech/llm-cost-telemetry";
import type { CostRecord, CostSummary } from "@reaatech/llm-cost-telemetry";

export const aggregator = new CostAggregator({
  dimensions: ["tenant", "feature", "provider", "model"],
  timeWindows: ["hour", "day", "month"],
});

// Global budgets default to 100 per day and 2000 per month unless overridden by env vars.
export const budget = new BudgetManager({
  global: {
    daily: Number(process.env.DEFAULT_DAILY_BUDGET ?? 100),
    monthly: Number(process.env.DEFAULT_MONTHLY_BUDGET ?? 2000),
  },
  alerts: [
    { threshold: 0.5, action: "log" },
    { threshold: 0.75, action: "notify" },
    { threshold: 0.9, action: "block" },
  ],
});

// Called by the collector on each flush: feed spans to the aggregator and the budget manager.
export function flushHandler(spans: CostSpan[]): void {
  for (const span of spans) {
    const tenantId = span.tenant ?? span.telemetry?.tenant ?? "unknown";
    aggregator.add(span);
    void budget.record({ tenant: tenantId, cost: span.costUsd });
  }
}

// Buffer up to 1000 spans and flush every 60 seconds.
export const collector = new CostCollector({
  maxBufferSize: 1000,
  flushIntervalMs: 60000,
  onFlush: flushHandler,
});

// Total a tenant's costs and break them down by provider and model.
// The period argument is accepted but currently ignored.
export function getTenantCosts(
  tenantId: string,
  _period?: string,
): { totalUsd: number; byProvider: Record<string, number>; byModel: Record<string, number> } {
  void _period;
  const records: CostRecord[] = aggregator.getByTenant(tenantId);
  const totalUsd = records.reduce((sum, r) => sum + (r.totalUsd ?? 0), 0);
  const byProvider: Record<string, number> = {};
  const byModel: Record<string, number> = {};
  for (const r of records) {
    if (r.key?.provider) {
      byProvider[r.key.provider] = (byProvider[r.key.provider] ?? 0) + (r.totalUsd ?? 0);
    }
    if (r.key?.model) {
      byModel[r.key.model] = (byModel[r.key.model] ?? 0) + (r.totalUsd ?? 0);
    }
  }
  return { totalUsd, byProvider, byModel };
}

export function getCostSummary(options?: {
  period?: string;
  groupBy?: string[];
}): CostSummary {
  return aggregator.getSummary({
    period: options?.period,
    groupBy: options?.groupBy,
  } as Parameters<typeof aggregator.getSummary>[0]);
}

export async function checkTenantBudget(
  tenantId: string,
  estimatedCost: number,
): Promise<BudgetStatus> {
  return await budget.check({ tenant: tenantId, estimatedCost });
}

export function getBudgetStatus(
  tenantId: string,
): BudgetStatus {
  return budget.getStatus(tenantId);
}
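
The aggregation layer groups cost spans by dimensions such as tenant, provider, and model. As a rough picture of what grouping by one dimension means, here is a self-contained sketch; it is not the CostAggregator implementation, and SimpleSpan, Dimension, and sumCostBy are hypothetical names.

```typescript
// Hypothetical, simplified span carrying only the fields needed for grouping.
type SimpleSpan = { tenant: string; provider: string; model: string; costUsd: number };
type Dimension = "tenant" | "provider" | "model";

// Sums costUsd per distinct value of the chosen dimension.
function sumCostBy(spans: SimpleSpan[], dimension: Dimension): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const span of spans) {
    const key = span[dimension];
    totals[key] = (totals[key] ?? 0) + span.costUsd;
  }
  return totals;
}

// Made-up sample data.
const sampleSpans: SimpleSpan[] = [
  { tenant: "acme", provider: "openai", model: "model-a", costUsd: 0.25 },
  { tenant: "acme", provider: "openai", model: "model-b", costUsd: 0.5 },
  { tenant: "globex", provider: "openai", model: "model-a", costUsd: 1 },
];

const costByTenant = sumCostBy(sampleSpans, "tenant");
const costByModel = sumCostBy(sampleSpans, "model");
```

A per-tenant total like costByTenant is what a budget check would compare against a limit, while the per-model view shows where a cheaper model might save the most.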

Step 10: Build the settings store (Drizzle-backed CRUD)

import { db, tenants, usageLog } from "./db/index.js";
import { eq, desc, gte, and } from "drizzle-orm";
import type { TenantBudgetConfig } from "./cost-types.js";

// Convert a raw tenants row into the typed budget config.
// disabledTools is not read from the row and is always returned empty.
function mapTenantRow(row: Record<string, unknown>): TenantBudgetConfig {
  return {
    tenantId: String(row.id),
    dailyBudget: Number(row.dailyBudget),
    monthlyBudget: Number(row.monthlyBudget),
    softCap: Number(row.softCap),
    hardCap: Number(row.hardCap),
    autoDowngrade: Boolean(row.autoDowngrade),
    disabledTools: [],
  };
}

export async function getAllTenantBudgets(): Promise<TenantBudgetConfig[]> {
  const rows = await db.select().from(tenants);
  return rows.map(mapTenantRow);
}

export async function getTenantBudget(
  tenantId: string,
): Promise<TenantBudgetConfig | null> {
  const rows = await db
    .select()
    .from(tenants)
    .where(eq(tenants.id, Number(tenantId)))
    .limit(1);
  if (rows.length === 0) return null;
  return mapTenantRow(rows[0]);
}

// Update only the fields that were provided; numeric values are written as strings.
export async function updateTenantBudget(
  tenantId: string,
  config: Partial<TenantBudgetConfig>,
): Promise<void> {
  const updateData: Record<string, unknown> = {};
  if (config.dailyBudget !== undefined) updateData.dailyBudget = String(config.dailyBudget);
  if (config.monthlyBudget !== undefined) updateData.monthlyBudget = String(config.monthlyBudget);
  if (config.softCap !== undefined) updateData.softCap = String(config.softCap);
  if (config.hardCap !== undefined) updateData.hardCap = String(config.hardCap);
  if (config.autoDowngrade !== undefined) updateData.autoDowngrade = config.autoDowngrade;
  await db
    .update(tenants)
    .set(updateData)
    .where(eq(tenants.id, Number(tenantId)));
}

export async function logUsage(entry: {
  tenantId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  cached: boolean;
}): Promise<void> {
  await db.insert(usageLog).values({
    tenantId: entry.tenantId,
    model: entry.model,
    inputTokens: entry.inputTokens,
    outputTokens: entry.outputTokens,
    costUsd: String(entry.costUsd),
    cached: entry.cached,
  });
}

// Return the 100 most recent usage rows for a tenant, optionally limited to rows since a date.
export async function getUsageHistory(
  tenantId: string,
  since?: Date,
): Promise<Array<Record<string, unknown>>> {
  const whereClause = since
    ? and(eq(usageLog.tenantId, tenantId), gte(usageLog.createdAt, since))
    : eq(usageLog.tenantId, tenantId);
  const rows = await db
    .select()
    .from(usageLog)
    .where(whereClause)
    .orderBy(desc(usageLog.createdAt))
    .limit(100);
  return rows;
}
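
Rows returned by getUsageHistory carry a cost stored as a string and a cached flag, which is enough to derive a cache hit rate for the dashboard. The sketch below shows one way to roll them up. UsageRow and summarizeUsage are hypothetical names, and the row shape is inferred from the fields that logUsage writes rather than from the actual schema, which is not shown here.

```typescript
// Assumed row shape, based on the fields written by logUsage.
type UsageRow = {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: string; // stored as a string, so it must be parsed before summing
  cached: boolean;
};

function summarizeUsage(rows: UsageRow[]): {
  requests: number;
  cacheHitRate: number;
  totalUsd: number;
} {
  let cacheHits = 0;
  let totalUsd = 0;
  for (const row of rows) {
    if (row.cached) cacheHits += 1;
    const cost = Number(row.costUsd);
    // Skip values that do not parse to a finite number instead of poisoning the total with NaN.
    if (Number.isFinite(cost)) totalUsd += cost;
  }
  return {
    requests: rows.length,
    cacheHitRate: rows.length === 0 ? 0 : cacheHits / rows.length,
    totalUsd,
  };
}
```

Because getUsageHistory returns at most 100 rows, a summary built this way describes recent activity only, not the full billing period.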

Intro

Prerequisites

Step 1: Create the project and install dependencies

Step 2: Configure environment variables

Step 3: Set up the database connection and schema

Step 4: Define shared types

Step 5: Build the OpenAI wrapper with cost telemetry

Step 6: Build the budget middleware

Step 7: Build the cache layer with Redis and semantic matching

Step 8: Build the cost aggregation layer

Step 9: Add Langfuse observability

Step 10: Build the settings store (Drizzle-backed CRUD)

Step 11: Build the orchestrator — the unified pipeline

Step 12: Create the Next.js API routes

Step 13: Write the tests

Next steps