SMBs adopting AWS Bedrock for AI features often see unpredictable costs because there are no native spend controls. A single mis-sized prompt or a traffic spike can burn through a month's budget in hours, with no granular insight into which model or user caused it.
In this tutorial, you’ll build a budget-aware proxy around AWS Bedrock LLM calls using Next.js and REAA’s budget engineering packages. Every Bedrock invocation passes through a BudgetInterceptor that estimates cost, checks remaining budget, optionally downgrades the model, records the spend, and pushes metrics to Helicone. You’ll also get a dashboard page that shows spend across all budget scopes, plus API routes to query state and invoke models with automatic enforcement.
Prerequisites
Node.js >= 22
pnpm 10.x (install with npm install -g pnpm)
AWS account with Bedrock access (access key ID, secret access key, and region)
Helicone account for AI observability (API key — free tier works)
Basic familiarity with TypeScript and Next.js App Router
Step 1: Initialize the project
Start by creating a fresh Next.js project with TypeScript. The App Router is required for this recipe’s route handler and instrumentation hook setup.
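One way to scaffold it (the project name comes from this recipe; the flags are standard create-next-app options, so adjust to taste):

```shell
pnpm create next-app@latest bedrock-cost-control --typescript --app --eslint
cd bedrock-cost-control
```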
Expected output: A scaffolded Next.js app at ./bedrock-cost-control/ with package.json, tsconfig.json, and the app/ directory.
Step 2: Install dependencies
The recipe uses six REAA budget packages alongside the AWS Bedrock runtime client, Helicone helpers, p-retry for retry logic, and Zod for schema validation.
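A single install command covering the packages this recipe references (the Helicone helper package name is an assumption; check the Helicone docs for the current package):

```shell
pnpm add @reaatech/agent-budget-types @reaatech/agent-budget-pricing \
  @reaatech/agent-budget-spend-tracker @reaatech/agent-budget-middleware \
  @reaatech/agent-budget-llm-router-plugin @reaatech/agent-budget-otel-bridge \
  @aws-sdk/client-bedrock-runtime @helicone/helpers p-retry zod
```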
Expected output: All packages added to node_modules and package.json.
Step 3: Set up environment variables
Create a .env file in the project root. The app reads AWS credentials, Helicone config, and budget defaults from environment variables. The .env.example file declares every variable with a comment describing its purpose, and the Zod schema in src/lib/env.ts enforces at startup that AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and HELICONE_API_KEY are non-empty strings.
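A minimal .env to start from (values are placeholders; the DEFAULT_BUDGET_LIMIT name comes from the budget defaults used later, and its unit of US dollars is an assumption):

```
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
HELICONE_API_KEY=your-helicone-api-key
DEFAULT_BUDGET_LIMIT=10
```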
Step 4: Create the environment schema
Create src/lib/env.ts. This file validates environment variables at import time using Zod, exporting a typed env object. If any required variable is missing, the app throws during startup rather than failing silently at runtime.
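The contract this file enforces can be sketched in plain TypeScript (the real file uses Zod; the DEFAULT_BUDGET_LIMIT fallback of 10 is an assumption):

```typescript
// Stand-in for src/lib/env.ts: validate required vars once, at import time.
const REQUIRED = [
  "AWS_REGION",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "HELICONE_API_KEY",
] as const;

function loadEnv(source: Record<string, string | undefined>) {
  const missing = REQUIRED.filter((k) => !source[k]);
  if (missing.length > 0) {
    // Fail fast at startup instead of failing silently at runtime.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    AWS_REGION: source.AWS_REGION!,
    AWS_ACCESS_KEY_ID: source.AWS_ACCESS_KEY_ID!,
    AWS_SECRET_ACCESS_KEY: source.AWS_SECRET_ACCESS_KEY!,
    HELICONE_API_KEY: source.HELICONE_API_KEY!,
    DEFAULT_BUDGET_LIMIT: Number(source.DEFAULT_BUDGET_LIMIT ?? "10"),
  };
}
```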
Step 5: Create the shared types
Create src/lib/types.ts. This file holds Zod schemas for request validation and TypeScript interfaces used across the services. The BedrockInvokeBodySchema is the contract for the /api/invoke route.
```ts
import { z } from "zod";

// Request body for the /api/invoke endpoint
export const BedrockInvokeBodySchema = z.object({
  modelId: z.string().min(1),
  prompt: z.string().min(1),
  maxTokens: z.number().int().positive().max(65536),
  tools: z.array(z.record(z.string(), z.unknown())).optional(),
  scopeType: z
    .enum(["task", "user", "session", "org"])
    .optional()
    .default("user"),
  scopeKey: z.string().optional().default("*"),
});

export type BedrockInvokeBody = z.infer<typeof BedrockInvokeBodySchema>;

// Response shape for budget state queries
export interface BudgetStateResponse {
  scopeKey: string;
  scopeType: string;
  limit: number;
  spent: number;
  remaining: number;
  state: string;
  suggestedModel?: string;
}

export type {
  BudgetScope,
  BudgetExceededError,
  BudgetValidationError,
} from "@reaatech/agent-budget-types";

// Helicone log payload interface
export interface HeliconeLogPayload {
  sessionId: string;
  userId?: string;
  modelId: string;
  inputTokens: number;
  outputTokens: number;
  cost: number;
  provider: string;
  timestamp: Date;
}

// LLM call result from Bedrock
export interface LLMCallResult {
  body: string;
  inputTokens: number;
  outputTokens: number;
  modelId: string;
  provider: string;
}
```
Step 6: Create the pricing service
Create src/services/pricing.ts. The PricingEngine from @reaatech/agent-budget-pricing normalizes model names and looks up token pricing with LRU caching. Both estimateCost (pre-flight) and computeCost (post-flight) delegate to it.
```ts
import { PricingEngine } from "@reaatech/agent-budget-pricing";

// Instantiate with caching
const pricingEngine = new PricingEngine({
  cacheTtlMs: 3_600_000, // 1 hour
});

/**
 * Estimate cost for a model and input token count.
 */
export function estimateCost(
  modelId: string,
  estimatedInputTokens: number
): number {
  return pricingEngine.estimateCost(modelId, estimatedInputTokens, "bedrock");
}

/**
 * Compute exact cost from token counts and model.
 */
export function computeCost(
  inputTokens: number,
  outputTokens: number,
  modelId: string
): number {
  return pricingEngine.computeCost(inputTokens, outputTokens, modelId, "bedrock");
}

export { pricingEngine };
```
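To make the underlying math concrete: LLM token pricing is typically quoted per million tokens, so a call's cost is input tokens and output tokens each scaled by their rate. A self-contained sketch with illustrative prices (real prices come from the PricingEngine, not this table):

```typescript
// Hypothetical per-million-token prices; the PricingEngine is the source of truth.
const PRICING: Record<string, { inputPerM: number; outputPerM: number }> = {
  "anthropic.claude-3-haiku-20240307-v1:0": { inputPerM: 0.25, outputPerM: 1.25 },
};

function computeCostLocal(
  inputTokens: number,
  outputTokens: number,
  modelId: string
): number {
  const p = PRICING[modelId];
  if (!p) throw new Error(`Unknown model: ${modelId}`);
  // cost = input/1M * inputPrice + output/1M * outputPrice
  return (
    (inputTokens / 1_000_000) * p.inputPerM +
    (outputTokens / 1_000_000) * p.outputPerM
  );
}

// 1,000 input and 500 output tokens on the illustrative Haiku rate is
// roughly $0.000875, which is why per-call costs only hurt in aggregate.
```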
Step 7: Create the spend-tracker service
Create src/services/spend-tracker.ts. This is a thin wrapper around SpendStore from @reaatech/agent-budget-spend-tracker, providing a lazy singleton that the budget controller references.
```ts
import { SpendStore } from "@reaatech/agent-budget-spend-tracker";

// Lazy singleton pattern
let _store: SpendStore | undefined;

export function getSpendStore(): SpendStore {
  if (!_store) {
    _store = new SpendStore();
  }
  return _store;
}

// Allow re-initialization for testing
export function resetSpendStore(): void {
  _store = undefined;
}
```
Step 8: Create the budget service
Create src/services/budget.ts. This service owns the BudgetController singleton and exposes defineBudget, getBudgetState, and listAllBudgets. The controller combines the spend store and pricing engine to enforce soft/hard caps and auto-downgrade policies.
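The soft/hard cap idea can be pictured with a self-contained sketch (the threshold ratio and state names here are illustrative, not the BudgetController's actual API):

```typescript
// Illustrative cap logic: below the soft threshold all is well, between soft
// and hard the controller may downgrade, at the hard cap it blocks the call.
type BudgetHealth = "healthy" | "warning" | "exceeded";

function classify(spent: number, limit: number, softRatio = 0.8): BudgetHealth {
  if (spent >= limit) return "exceeded"; // hard cap: reject the invocation
  if (spent >= limit * softRatio) return "warning"; // soft cap: consider downgrade
  return "healthy";
}
```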
Step 9: Create the budget interceptor
Create src/services/interceptor.ts. The BudgetInterceptor from @reaatech/agent-budget-middleware wraps the budget controller with a beforeStep/afterStep interface. beforeStep runs a pre-flight check and returns whether the request is allowed, possibly with a downgraded model or stripped tools. afterStep records the actual spend after the LLM call completes.
Step 10: Create the budget-aware router
Create src/services/router.ts. The BudgetAwareStrategy from @reaatech/agent-budget-llm-router-plugin filters the list of candidate models by remaining budget. Use this when you want to pick the best available model given a budget constraint rather than enforcing a hard cap on a single model.
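The filtering idea reduces to something like the sketch below (a stand-in for illustration, not the BudgetAwareStrategy's real interface):

```typescript
// Candidates ordered most-capable first; pick the first one whose estimated
// cost still fits in the remaining budget.
interface Candidate {
  modelId: string;
  estimatedCost: number;
}

function selectAffordable(
  candidates: Candidate[],
  remaining: number
): string | undefined {
  return candidates.find((c) => c.estimatedCost <= remaining)?.modelId;
}
```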
Step 11: Create the Bedrock service
Create src/services/bedrock.ts. This service wraps the AWS SDK's ConverseCommand with a typed invokeModel function that returns the response body and token counts. It throws a custom BedrockServiceError on AWS failures.
Step 12: Create the Helicone service
Create src/services/helicone.ts. This service posts spend entries to Helicone's session log endpoint with retry logic via p-retry. The logUsage call is non-blocking: failures are caught and logged as warnings without failing the request.
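The retry behavior relied on here amounts to exponential backoff; a minimal stand-in (this is not p-retry's actual API, just the shape of what it does for the logger):

```typescript
// Retry an async operation with exponential backoff: 100ms, 200ms, 400ms, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Only sleep if another attempt remains.
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

Wrapping logUsage in this and catching the final error is what keeps observability failures from failing the user's request.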
Step 13: Create the OTel bridge
Create src/services/otel.ts. The SpanListener from @reaatech/agent-budget-otel-bridge consumes OpenTelemetry span attributes and records spend entries against the budget controller. The createSpanProcessor function wraps the listener in a drop-in span processor for the OTel SDK.
```ts
import { SpanListener } from "@reaatech/agent-budget-otel-bridge";
import { getBudgetController } from "./budget.js";

let _listener: SpanListener | undefined;

export function getSpanListener(): SpanListener {
  if (!_listener) {
    _listener = new SpanListener({
      controller: getBudgetController(),
    });
  }
  return _listener;
}

export function processSpan(attributes: Record<string, unknown>): boolean {
  const listener = getSpanListener();
  const result = listener.onSpanEnd(attributes);
  return result ?? false;
}

export function createSpanProcessor(): {
  onEnd: (span: { attributes: Record<string, unknown> }) => void;
  forceFlush: () => Promise<void>;
  shutdown: () => Promise<void>;
} {
  const listener = getSpanListener();
  return {
    onEnd(span: { attributes: Record<string, unknown> }) {
      listener.onSpanEnd(span.attributes);
    },
    forceFlush() {
      return Promise.resolve();
    },
    shutdown() {
      return Promise.resolve();
    },
  };
}

export async function initializeTracer(): Promise<void> {
  try {
    const { NodeTracerProvider } = await import(
      "@opentelemetry/sdk-trace-node"
    );
    const provider = new NodeTracerProvider();
    // Wire up our custom span processor
    const processor = createSpanProcessor();
    // Use setSpanProcessor if available, otherwise direct assignment
    if ("setSpanProcessor" in provider) {
      (provider as { setSpanProcessor: (p: unknown) => void }).setSpanProcessor(processor);
    }
    provider.register();
  } catch {
    // OTel may not be available in all environments
  }
}
```
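What the listener does with a span can be sketched standalone. The attribute names below follow the OpenTelemetry GenAI semantic-convention draft and are an assumption; the real SpanListener may key off different attributes:

```typescript
// Extract token usage from span attributes so it can be recorded as spend.
interface SpanSpend {
  modelId: string;
  inputTokens: number;
  outputTokens: number;
}

function spanToSpend(attrs: Record<string, unknown>): SpanSpend | undefined {
  const modelId = attrs["gen_ai.request.model"];
  const input = attrs["gen_ai.usage.input_tokens"];
  const output = attrs["gen_ai.usage.output_tokens"];
  if (
    typeof modelId !== "string" ||
    typeof input !== "number" ||
    typeof output !== "number"
  ) {
    return undefined; // not an LLM span; nothing to record
  }
  return { modelId, inputTokens: input, outputTokens: output };
}
```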
Step 14: Create the index barrel
Create src/index.ts. This re-exports the key public APIs so consumers can import from a single entry point.
```ts
// Re-export key services for external consumption
export {
  getBudgetController,
  defineBudget,
  getBudgetState,
  listAllBudgets,
} from "./services/budget.js";
export { estimateCost, computeCost } from "./services/pricing.js";
export { getSpendStore } from "./services/spend-tracker.js";
export { getInterceptor, beforeStep, afterStep } from "./services/interceptor.js";
export { getBudgetAwareStrategy, selectModel } from "./services/router.js";
export { getSpanListener, processSpan } from "./services/otel.js";
export { logUsage } from "./services/helicone.js";
export { invokeModel, BedrockServiceError } from "./services/bedrock.js";

// Scaffold version constant
export const SCAFFOLD_VERSION = "0.1.0";
```
Step 15: Configure the instrumentation hook
Update next.config.ts to enable the Next.js instrumentation hook. This flag is required for src/instrumentation.ts to run during server startup.
```ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  experimental: {
    // @ts-expect-error instrumentationHook - Next 16 removed this from type but it's still functional
    instrumentationHook: true,
  },
} satisfies NextConfig;

export default nextConfig;
```
Then create src/instrumentation.ts. This file runs once per worker at startup. It defines the default budget (using DEFAULT_BUDGET_LIMIT from env), configures auto-downgrade rules for Claude models, and initializes the OpenTelemetry tracer.
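The auto-downgrade rules amount to walking a chain of progressively cheaper models until the estimate fits. The sketch below mirrors the Opus to Sonnet to Haiku chain mentioned in the next-steps section; the model IDs and the 5x cost ratio are illustrative assumptions, not the middleware's actual configuration:

```typescript
// Each model maps to its cheaper fallback; absent entries have no fallback.
const DOWNGRADE_CHAIN: Record<string, string> = {
  "anthropic.claude-3-opus-20240229-v1:0": "anthropic.claude-3-sonnet-20240229-v1:0",
  "anthropic.claude-3-sonnet-20240229-v1:0": "anthropic.claude-3-haiku-20240307-v1:0",
};

function pickModel(
  requested: string,
  remainingBudget: number,
  estimatedCost: number
): string {
  let model = requested;
  // Walk down the chain while the estimate exceeds what's left.
  while (estimatedCost > remainingBudget && DOWNGRADE_CHAIN[model]) {
    model = DOWNGRADE_CHAIN[model];
    estimatedCost /= 5; // assume each step is roughly 5x cheaper (illustrative)
  }
  return model;
}
```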
Step 16: Create the health route
Create app/api/health/route.ts. This is a minimal health check endpoint.
```ts
import { NextResponse } from "next/server";

export async function GET(_req: Request): Promise<Response> {
  return NextResponse.json({
    status: "ok",
    timestamp: new Date().toISOString(),
  });
}
```
Step 17: Create the invoke route
Create app/api/invoke/route.ts. This is the main endpoint that wires everything together: it validates the request body, estimates cost, runs the budget pre-flight check via beforeStep, invokes the model (possibly with a downgraded model ID), computes the actual cost, records spend via afterStep, and pushes the entry to Helicone.
```ts
import { type NextRequest, NextResponse } from "next/server";
import { BedrockInvokeBodySchema } from "@/lib/types.js";
import { estimateCost, computeCost } from "@/services/pricing.js";
import { invokeModel, BedrockServiceError } from "@/services/bedrock.js";
import { beforeStep, afterStep } from "@/services/interceptor.js";
import { logUsage, type SpendEntry } from "@/services/helicone.js";
import { BudgetExceededError, BudgetScope } from "@reaatech/agent-budget-types";
import { getBudgetState } from "@/services/budget.js";

function generateRequestId(): string {
  return `req_${Date.now()}_${Math.random().toString(36).slice(2, 9)}`;
}

export async function POST(req: NextRequest): Promise<Response> {
  // … (handler body: validate, estimate, pre-flight check, invoke, record, log) …
}
```
Step 18: Create the budget routes
Create app/api/budget/route.ts. This returns all defined budget scopes with their current spent/remaining totals.
Create app/api/budget/[scopeType]/[scopeKey]/route.ts. This returns the budget state for a specific scope type and key pair, validating the scope type against the BudgetScope enum.
```ts
import { type NextRequest, NextResponse } from "next/server";
import { getBudgetState } from "@/services/budget.js";
import { BudgetScope } from "@reaatech/agent-budget-types";

export async function GET(
  _req: NextRequest,
  { params }: { params: Promise<{ scopeType: string; scopeKey: string }> }
): Promise<Response> {
  const { scopeType, scopeKey } = await params;
  if (!Object.values(BudgetScope).includes(scopeType as BudgetScope)) {
    return NextResponse.json({ error: "Invalid scope type" }, { status: 400 });
  }
  const state = getBudgetState(scopeType, scopeKey);
  if (!state) {
    return NextResponse.json({ error: "Budget not found" }, { status: 404 });
  }
  return NextResponse.json(state, { status: 200 });
}
```
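The scope-type guard boils down to a membership check against the four scope values declared in the types file; a self-contained stand-in (the real route uses the BudgetScope enum from the types package):

```typescript
// Mirror of the route's validation; values match the scopeType enum in types.ts.
const BUDGET_SCOPES = ["task", "user", "session", "org"] as const;
type ScopeType = (typeof BUDGET_SCOPES)[number];

function isValidScopeType(s: string): s is ScopeType {
  return (BUDGET_SCOPES as readonly string[]).includes(s);
}
```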
Step 19: Create the dashboard page
Create app/page.tsx. This server component renders a simple budget dashboard by calling listAllBudgets() at render time. Each budget card shows a progress bar, spent/limit amounts, and a state badge.
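The progress bars reduce each budget to a clamped percentage; a small helper of the kind the cards might use (hypothetical, not part of the scaffold's exports):

```typescript
// Percent of budget spent, rounded and clamped to 0-100 for the bar width.
function percentSpent(spent: number, limit: number): number {
  if (limit <= 0) return 0;
  return Math.min(100, Math.round((spent / limit) * 100));
}
```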
Step 20: Run the tests
Run the Vitest suite with pnpm test. Expected passing tests include:
pricing service > happy path: computeCost returns correct value for known model
budget > happy path: listAllBudgets returns defined budgets
The JSON report is written to vitest-report.json.
Step 21: Start the dev server
With environment variables set and tests passing, start the development server:
```terminal
pnpm dev
```
Expected output: The terminal shows Next.js starting on http://localhost:3001. You can visit http://localhost:3001/api/health to verify the server is running — it returns {"status":"ok","timestamp":"..."}. The dashboard at http://localhost:3001/ displays the budget cards, and you can POST to /api/invoke with a JSON body like:
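A representative body matching BedrockInvokeBodySchema (the model ID, prompt, and scope values are illustrative):

```json
{
  "modelId": "anthropic.claude-3-haiku-20240307-v1:0",
  "prompt": "Summarize our Q3 revenue drivers in two sentences.",
  "maxTokens": 512,
  "scopeType": "user",
  "scopeKey": "user-123"
}
```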
A successful invocation returns the model response plus X-Budget-Remaining and X-Budget-Spent headers showing the cost. If the budget for that scope is exhausted, the endpoint returns HTTP 402.
Next steps
Add per-user budgets: Call defineBudget("user", "user-id", 5.0) when a new user signs up, rather than relying solely on the wildcard * scope.
Wire up auto-downgrade: The instrumentation hook already configures downgrade rules from Claude Opus to Sonnet to Haiku. Add your own rules in src/instrumentation.ts for other model families.
Connect a real dashboard: Replace the server component in app/page.tsx with a React dashboard that polls /api/budget every few seconds to show live spend as users make LLM calls.