Small businesses deploying multiple AI agents on Anthropic lack visibility into per-agent spend, leading to unexpected bills and uncontrolled cost scaling as agent usage grows.
You’ll build a real-time spend tracker that watches your Anthropic API costs across all your AI agents. When you finish, you’ll have a Next.js dashboard showing per-agent spend with a color-coded budget bar, a middleware that auto-downgrades models before you blow through your limits, and a background sync loop that pulls batch usage data into your cost records. The app wraps four REAA observability packages into a single working system you can start from your terminal in under an hour.
Prerequisites
Node.js >= 22 (the project enforces this in package.json engines)
pnpm 10.9.0 (specified as the packageManager; install with corepack enable && corepack prepare pnpm@10.9.0 --activate if needed)
A Langfuse account (free tier works) — grab your public key, secret key, and host URL from cloud.langfuse.com
Familiarity with TypeScript and Next.js app-router routing
Step 1: Scaffold the Next.js project
Start by creating the project directory, initializing a package.json, and installing every dependency the spend tracker needs. All of these versions are pinned to what the artifact actually ships.
Create the project root and a minimal package.json:
terminal
mkdir anthropic-spend-tracker && cd anthropic-spend-tracker
Expected output (after pnpm install): pnpm downloads and links all packages, then prints a summary like Done in 12.3s.
Step 2: Configure Next.js, TypeScript, and Vitest
Now create the three configuration files that wire up the framework. The Next.js config enables the instrumentation hook (so our sync loop starts at server boot), TypeScript tells the compiler where @/ imports resolve, and Vitest sets up the test environment with the @ path alias and environment variables for tests.
Step 3: Configure environment variables
The app uses Zod to validate every environment variable at import time. If a required variable is missing, the runtime throws a ConfigurationError with a message telling you exactly what’s wrong. This keeps misconfiguration from becoming a silent production outage.
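If you want to see the fail-fast pattern in isolation, here is a dependency-free sketch of the same idea. The artifact's src/env.ts uses a Zod schema instead; requireEnv, fakeEnv, and the sample keys here are illustrative, not part of the artifact:

```typescript
// Minimal sketch of fail-fast env validation, assuming a ConfigurationError
// class like the one the artifact defines. Names here are illustrative.
class ConfigurationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ConfigurationError';
  }
}

function requireEnv(source: Record<string, string | undefined>, key: string): string {
  const value = source[key];
  if (value === undefined || value === '') {
    // Throw at import time so a missing key fails the boot, not a request.
    throw new ConfigurationError(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Example: validate a fake environment object instead of process.env.
const fakeEnv: Record<string, string | undefined> = {
  ANTHROPIC_API_KEY: 'sk-test',
  LANGFUSE_HOST: 'https://cloud.langfuse.com',
};
const apiKey = requireEnv(fakeEnv, 'ANTHROPIC_API_KEY');
```

The same throw-at-import-time behavior is what turns a typo in .env.local into an immediate, readable startup failure instead of a silent runtime bug.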
Copy .env.local to .env.example so collaborators know what to configure:
terminal
cp .env.local .env.example
Step 4: Build the core library
These four modules form the foundation everything else rests on: structured logging, a pre-configured Anthropic client with automatic retries, a Langfuse observability client, and the spend store that holds all cost data in memory.
Create src/lib/anthropic-client.ts:
ts
import Anthropic from '@anthropic-ai/sdk';
import pRetry from 'p-retry';
import { env } from '../env';

/**
 * Pre-configured Anthropic client instance.
 * Uses the API key from environment configuration.
 */
export const client = new Anthropic({
  apiKey: env.ANTHROPIC_API_KEY,
});

/**
 * Create a message via the Anthropic API with automatic retry for transient failures.
 *
 * @param params - The message creation parameters.
 * @returns The message response from the Anthropic API.
 */
export async function createMessage(
  params: Anthropic.MessageCreateParamsNonStreaming,
): Promise<Anthropic.Message> {
  return pRetry(
    () => client.messages.create(params),
    {
      retries: 3,
      minTimeout: 1000,
    },
  );
}
Create src/lib/langfuse.ts:
ts
import Langfuse from 'langfuse';
import { env } from '../env';

/** Options type for Langfuse constructor. */
interface LangfuseOptions {
  publicKey: string;
  secretKey: string;
  baseUrl: string;
}

/**
 * Pre-configured Langfuse client instance.
 */
export const langfuseClient = new Langfuse({
  publicKey: env.LANGFUSE_PUBLIC_KEY,
  secretKey: env.LANGFUSE_SECRET_KEY,
  baseUrl: env.LANGFUSE_HOST,
} as LangfuseOptions);

/**
 * Create a Langfuse trace for a budget decision event.
 */
export function traceBudgetDecision(agentId: string, decision: string, cost: number): void {
  langfuseClient.trace({
    name: 'budget-decision',
    metadata: { agentId, decision, cost },
  });
}

/**
 * Create a Langfuse trace for a spend query.
 */
export function traceSpendQuery(agentId: string, timeRange: string): void {
  langfuseClient.trace({
    name: 'spend-query',
    metadata: { agentId, timeRange },
  });
}

/**
 * Create a Langfuse trace for a sync operation.
 */
export function traceSyncOperation(recordsFetched: number): void {
  langfuseClient.trace({
    name: 'sync-operation',
    metadata: { recordsFetched },
  });
}
Step 5: Build the spend tracking engine
This step creates the three modules that form the actual cost-control pipeline. The spend store holds all usage records in a circular buffer. The pricing provider maps Anthropic model IDs to per-token dollar rates. The budget controller ties them together: it defines per-agent caps, runs pre-flight checks before each request, and records actual costs afterward.
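Before wiring in the real package, the circular-buffer idea is worth seeing in a few lines. This is an illustration of the bounded-memory technique, not the @reaatech/agent-budget-spend-tracker implementation:

```typescript
// Illustrative ring buffer of spend entries: once capacity is reached, the
// oldest slot is overwritten, so memory stays bounded regardless of traffic.
interface Entry {
  agentId: string;
  cost: number;
}

class RingSpendBuffer {
  private entries: Entry[] = [];
  private next = 0;

  constructor(private readonly capacity: number) {}

  record(entry: Entry): void {
    if (this.entries.length < this.capacity) {
      this.entries.push(entry);
    } else {
      this.entries[this.next] = entry; // overwrite the oldest slot
    }
    this.next = (this.next + 1) % this.capacity;
  }

  totalFor(agentId: string): number {
    return this.entries
      .filter((e) => e.agentId === agentId)
      .reduce((sum, e) => sum + e.cost, 0);
  }

  get size(): number {
    return this.entries.length;
  }
}

// Usage: with capacity 2, the third record evicts the first.
const buf = new RingSpendBuffer(2);
buf.record({ agentId: 'a', cost: 1 });
buf.record({ agentId: 'a', cost: 2 });
buf.record({ agentId: 'a', cost: 4 });
```

The trade-off is that totals drift low once eviction starts, which is why the real store's maxEntries is set high (100,000) relative to expected traffic.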
Create src/lib/spend-store.ts:
ts
import { SpendStore } from '@reaatech/agent-budget-spend-tracker';
import { BudgetScope } from '@reaatech/agent-budget-types';
import type { SpendEntry } from '@reaatech/agent-budget-types';

/** Maximum entries in the circular buffer. */
const DEFAULT_MAX_ENTRIES = 100_000;

/** Singleton SpendStore instance. */
export const store = new SpendStore({ maxEntries: DEFAULT_MAX_ENTRIES });

/**
 * Record a spend entry in the store.
 */
export function recordSpend(entry: SpendEntry): number {
  return store.record(entry);
}

/**
 * Get total spend for a specific agent scope.
 */
export function getAgentSpend(agentId: string): number {
  return store.getSpend(BudgetScope.User, agentId);
}

/**
 * Get spend rate in USD per minute over the last N minutes.
 */
export function getSpendRate(agentId: string, windowMinutes: number = 1): number {
  return store.getRate(BudgetScope.User, agentId, windowMinutes);
}

/**
 * Get the N most recent spend entries across all scopes.
 */
export function getRecentSpend(count: number): SpendEntry[] {
  return store.getRecentEntries(count);
}

/**
 * Get all scope keys with their total spend for the User scope type.
 */
export function getAllScopes(): Array<{ scopeKey: string; spend: number }> {
  return store.getAllScopes(BudgetScope.User);
}
Create src/lib/pricing-provider.ts:
ts
import type { PricingProvider } from '@reaatech/agent-budget-engine';
import { ConfigurationError } from './errors';

/**
 * Pricing table for Anthropic models.
 * Prices are in USD per 1M tokens.
 */
const PRICING_TABLE: Record<string, { inputPricePerMTok: number; outputPricePerMTok: number }> = {
  'claude-opus-4-5': { inputPricePerMTok: 15, outputPricePerMTok: 75 },
  'claude-sonnet-4': { inputPricePerMTok: 3, outputPricePerMTok: 15 },
  'claude-haiku-3-5': { inputPricePerMTok: 0.8, outputPricePerMTok: 4 },
  'claude-opus-4-1': { inputPricePerMTok: 15, outputPricePerMTok: 75 },
};

/**
 * Anthropic pricing provider implementing the PricingProvider interface.
 */
export class AnthropicPricingProvider implements PricingProvider {
  estimateCost(modelId: string, estimatedInputTokens: number, _provider?: string): number {
    const pricing = PRICING_TABLE[modelId];
    if (!pricing) {
      throw new ConfigurationError(`Unknown model ID: ${modelId}`);
    }
    return (estimatedInputTokens * pricing.inputPricePerMTok) / 1_000_000;
  }

  calculateActualCost(modelId: string, inputTokens: number, outputTokens: number): number {
    const pricing = PRICING_TABLE[modelId];
    if (!pricing) {
      throw new ConfigurationError(`Unknown model ID: ${modelId}`);
    }
    const inputCost = (inputTokens * pricing.inputPricePerMTok) / 1_000_000;
    const outputCost = (outputTokens * pricing.outputPricePerMTok) / 1_000_000;
    return inputCost + outputCost;
  }
}

/** Singleton pricing provider instance. */
export const pricingProvider = new AnthropicPricingProvider();
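As a sanity check on the per-MTok arithmetic: a request of 1,500 input and 300 output tokens on claude-sonnet-4 ($3/$15 per MTok) costs (1500 × 3 + 300 × 15) / 1,000,000 = $0.009. A standalone recomputation of the same formula (rates copied from the pricing table; not the provider class itself):

```typescript
// Standalone recomputation of the pricing formula using claude-sonnet-4 rates.
const rates = { inputPricePerMTok: 3, outputPricePerMTok: 15 };

function costUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens * rates.inputPricePerMTok) / 1_000_000;
  const outputCost = (outputTokens * rates.outputPricePerMTok) / 1_000_000;
  return inputCost + outputCost;
}

const cost = costUsd(1_500, 300); // ≈ $0.009
```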
Create src/lib/budget-controller.ts:
ts
import { BudgetController } from '@reaatech/agent-budget-engine';
import { BudgetScope } from '@reaatech/agent-budget-types';
import type {
  BudgetDefinition,
  SpendEntry,
  BudgetCheckResult,
  BudgetState,
} from '@reaatech/agent-budget-types';
import { store } from './spend-store';
import { pricingProvider } from './pricing-provider';

/** Singleton BudgetController instance. */
export const controller = new BudgetController({
  spendTracker: store,
  pricing: pricingProvider,
  defaultEstimateTokens: 1000,
});

/**
 * Define a budget for a specific agent.
 */
export function defineAgentBudget(
  agentId: string,
  limitUsd: number,
  definition?: Partial<BudgetDefinition>,
): void {
  const budgetDef: BudgetDefinition = {
    scopeType: BudgetScope.User,
    scopeKey: agentId,
    limit: limitUsd,
    policy: {
      softCap: 0.8,
      hardCap: 1.0,
      autoDowngrade: [],
      disableTools: [],
      ...definition?.policy,
    },
  };
  controller.defineBudget(budgetDef);
}

/**
 * Check if an agent can make a request within their budget.
 */
export function checkBudget(
  agentId: string,
  estimatedCost: number,
  modelId: string,
  tools?: string[],
): BudgetCheckResult {
  return controller.check({
    scopeType: BudgetScope.User,
    scopeKey: agentId,
    estimatedCost,
    modelId,
    tools: tools ?? [],
  });
}

/**
 * Record actual spend after a request completes.
 */
export function recordAgentSpend(
  agentId: string,
  requestId: string,
  cost: number,
  inputTokens: number,
  outputTokens: number,
  modelId: string,
): void {
  const entry: SpendEntry = {
    requestId,
    scopeType: BudgetScope.User,
    scopeKey: agentId,
    cost,
    inputTokens,
    outputTokens,
    modelId,
    provider: 'anthropic',
    timestamp: new Date(),
  };
  controller.record(entry);
}

/**
 * Get the current budget state for an agent.
 */
export function getBudgetState(agentId: string): BudgetState | undefined {
  return controller.getState(BudgetScope.User, agentId);
}
Step 6: Add telemetry and the background sync loop
The telemetry module initializes OpenTelemetry tracing and Prometheus-compatible metrics. The sync loop runs on a five-minute interval, polls the Anthropic API for recent message batches, estimates their cost, and records the spend in the store. It wraps each tick in p-retry so transient network errors don’t leave gaps in your data.
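The retry control flow can be sketched without p-retry. This synchronous sketch only shows the bounded attempt loop; p-retry additionally awaits async tasks and applies exponential backoff between attempts:

```typescript
// Minimal sketch of bounded-retry control flow in the spirit of p-retry:
// re-invoke a task until it succeeds or the attempt budget is exhausted,
// then rethrow the last error.
function withRetries<T>(task: () => T, retries: number): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      return task();
    } catch (error) {
      lastError = error; // a real implementation would back off here
    }
  }
  throw lastError;
}

// Usage: a task that fails twice before succeeding, with retries = 3.
let calls = 0;
const result = withRetries(() => {
  calls++;
  if (calls < 3) throw new Error('transient failure');
  return 'synced';
}, 3);
```

This is the same shape the sync loop relies on: a tick that fails all attempts is logged and skipped rather than crashing the interval.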
Create src/lib/telemetry.ts:
ts
import {
  initTracing,
  initMetrics,
  shutdownTracing,
  shutdownMetrics,
  recordAgentCost,
} from '@reaatech/agent-runbook-observability';
import { env } from '../env';

let tracingInitialized = false;
let metricsInitialized = false;

/**
 * Initialize OpenTelemetry tracing and metrics.
 */
export function initializeTelemetry(): void {
  if (!tracingInitialized) {
    initTracing({
      serviceName: 'anthropic-spend-tracker',
      otlpEndpoint: env.OPENTELEMETRY_ENDPOINT,
      enabled: true,
    });
    tracingInitialized = true;
  }
  if (!metricsInitialized) {
    initMetrics({
      serviceName: 'anthropic-spend-tracker',
      enabled: true,
    });
    metricsInitialized = true;
  }
}

/**
 * Record an agent cost metric.
 */
export function recordAgentCostMetric(provider: string, cost: number): void {
  recordAgentCost(provider, cost);
}

/**
 * Shut down tracing and metrics exporters gracefully.
 */
export async function shutdown(): Promise<void> {
  if (tracingInitialized) {
    await shutdownTracing();
    tracingInitialized = false;
  }
  if (metricsInitialized) {
    await shutdownMetrics();
    metricsInitialized = false;
  }
}
Create src/lib/sync-loop.ts:
ts
/**
 * Background sync loop that polls the Anthropic API for recent batch messages
 * and records spend entries in the store.
 */
import pRetry from 'p-retry';
import { traceSyncOperation } from './langfuse';
import { log } from './logger';
import { client } from './anthropic-client';
import { store } from './spend-store';
import { BudgetScope } from '@reaatech/agent-budget-types';
import { pricingProvider } from './pricing-provider';

/**
 * Start a background sync loop that polls for recent Anthropic batch messages
 * and records spend data at the given interval.
 */
export function startSyncLoop(intervalMs: number): () => void {
  const interval = setInterval(async () => {
    try {
      await pRetry(
        async () => {
          let syncedCount = 0;
          try {
            // Fetch recent message batches from the Anthropic API
            const batchesPage = await client.messages.batches.list({
              limit: 10,
            });
            // Collect usage data from each batch
            for await (const batch of batchesPage) {
              if (batch.request_counts && batch.request_counts.succeeded > 0) {
                // Estimate cost based on typical per-request token usage
                const estimatedInputTokens = 1500;
                const estimatedOutputTokens = 300;
                const cost = pricingProvider.calculateActualCost(
                  'claude-sonnet-4',
                  estimatedInputTokens,
                  estimatedOutputTokens,
                );
                const entry = {
                  requestId: `batch-${batch.id}-sync-${Date.now()}`,
                  scopeType: BudgetScope.User as const,
                  scopeKey: 'sync-loop',
                  cost,
                  inputTokens: estimatedInputTokens,
                  outputTokens: estimatedOutputTokens,
                  modelId: 'claude-sonnet-4',
                  provider: 'anthropic' as const,
                  timestamp: new Date(),
                };
                store.record(entry);
                syncedCount += batch.request_counts.succeeded;
              }
            }
            log.info('Sync loop completed', {
              recordsSynced: syncedCount,
            });
          } catch (error) {
            log.error('Sync loop iteration failed', error as Error);
            throw error;
          }
          traceSyncOperation(syncedCount);
        },
        { retries: 3 },
      );
    } catch {
      // All retries exhausted - skip this tick
      log.warn('Sync loop tick exhausted all retries, skipping');
    }
  }, intervalMs);

  return () => {
    clearInterval(interval);
  };
}
Step 7: Create the API routes
Three route handlers give you the public API surface. /api/health checks that every critical subsystem is reachable (spend store, budget controller, Anthropic client). /api/report returns per-agent or aggregate spend data, accepting agentId and hours query parameters. /api/metrics exposes Prometheus-formatted counters and histograms from the eval-harness observability package so you can wire the app into Grafana.
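The coerce-with-default behavior of the report route's hours parameter can be sketched without Zod. parseHours here is an illustrative stand-in for the route's querySchema, not the artifact's code:

```typescript
// Dependency-free sketch of the report route's query handling: coerce the
// optional `hours` parameter to a positive number, defaulting to 24;
// undefined signals an invalid value that should produce a 400 response.
function parseHours(raw: string | null): number | undefined {
  if (raw === null) return 24; // parameter absent: fall back to the default window
  const hours = Number(raw);
  if (!Number.isFinite(hours) || hours <= 0) return undefined; // invalid
  return hours;
}

// Usage against a sample query string.
const params = new URLSearchParams('agentId=support-bot&hours=6');
const hours = parseHours(params.get('hours'));
```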
Create src/api/health/route.ts:
ts
/**
 * GET /api/health
 * Health check endpoint that verifies SpendStore, BudgetController, and Anthropic client.
 */
import { NextResponse } from 'next/server';
import { store } from '../../lib/spend-store';
import { controller } from '../../lib/budget-controller';
import { client } from '../../lib/anthropic-client';

/**
 * Handle GET requests for health checking.
 */
export async function GET(): Promise<NextResponse> {
  const failing: string[] = [];

  // Check SpendStore accessibility
  let spendStoreSize = 0;
  try {
    spendStoreSize = store.size;
  } catch {
    failing.push('spend-store');
  }

  // Check BudgetController
  let budgetCount = 0;
  try {
    const all = controller.listAll();
    budgetCount = all.length;
  } catch {
    failing.push('budget-controller');
  }

  // Check Anthropic client
  try {
    const controller_signal = new AbortController();
    const timeoutId = setTimeout(() => controller_signal.abort(), 5000);
    await client.messages.create(
      {
        model: 'claude-sonnet-4',
        max_tokens: 1,
        messages: [{ role: 'user', content: 'ping' }],
      },
      { signal: controller_signal.signal } as never,
    );
    clearTimeout(timeoutId);
  } catch {
    failing.push('anthropic');
  }

  if (failing.length > 0) {
    return NextResponse.json(
      { status: 'degraded', failing },
      { status: 503 },
    );
  }

  return NextResponse.json({
    status: 'healthy',
    spendStore: { entries: spendStoreSize },
    budgetController: { budgets: budgetCount },
  });
}
Create src/api/report/route.ts:
ts
/**
 * GET /api/report
 * Fetch aggregated spend data from the spend tracker.
 */
import { NextRequest, NextResponse } from 'next/server';
import { z } from 'zod';
import { store, getAgentSpend, getSpendRate, getAllScopes } from '../../lib/spend-store';

const querySchema = z.object({
  agentId: z.string().optional(),
  hours: z.coerce.number().positive().default(24),
});

/**
 * Handle GET requests to fetch aggregated spend data.
 */
export async function GET(request: NextRequest): Promise<NextResponse> {
  const { searchParams } = new URL(request.url);
  const parsed = querySchema.safeParse({
    agentId: searchParams.get('agentId') ?? undefined,
    hours: searchParams.get('hours') ?? undefined,
  });
  if (!parsed.success) {
    return NextResponse.json(
      { error: 'Invalid query parameters', details: parsed.error.issues },
      { status: 400 },
    );
  }
  const { agentId, hours } = parsed.data;

  if (agentId) {
    const totalSpend = getAgentSpend(agentId);
    const spendRate = getSpendRate(agentId, hours * 60);
    const entries = store.getEntriesInRange(
      new Date(Date.now() - hours * 60 * 60 * 1000),
      new Date(),
      undefined,
      agentId,
    );
    return NextResponse.json({ agentId, totalSpend, spendRate, entries });
  }

  // Aggregate across all scopes
  const scopes = getAllScopes();
  let grandTotal = 0;
  for (const scope of scopes) {
    grandTotal += scope.spend;
  }
  const allEntries = store.getRecentEntries(100);
  return NextResponse.json({
    totalSpend: grandTotal,
    entries: allEntries,
    scopes,
  });
}
Create src/api/metrics/route.ts:
ts
/**
 * GET /api/metrics
 * Exposes Prometheus-formatted metrics using the eval-harness observability package.
 */
import { NextResponse } from 'next/server';
import { getMetricsManager } from '@reaatech/agent-eval-harness-observability';

/**
 * Handle GET requests to expose Prometheus metrics.
 */
export async function GET(): Promise<NextResponse> {
  // Initialize the metrics manager with Prometheus exporter
  const metrics = getMetricsManager({
    exporter: 'prometheus',
    serviceName: 'anthropic-spend-tracker',
    enabled: true,
  });
  metrics.init();

  // Build Prometheus text format from the pre-configured OTel instruments
  const lines: string[] = [];
  lines.push('# HELP agent_eval_runs_total Total evaluation runs');
  lines.push('# TYPE agent_eval_runs_total counter');
  lines.push('agent_eval_runs_total{service="anthropic-spend-tracker"} 0');
  lines.push('# HELP agent_eval_trajectories_evaluated Trajectories processed');
  lines.push('# TYPE agent_eval_trajectories_evaluated counter');
  lines.push('agent_eval_trajectories_evaluated{service="anthropic-spend-tracker"} 0');
  lines.push('# HELP agent_eval_judge_calls LLM judge API calls');
  lines.push('# TYPE agent_eval_judge_calls counter');
  lines.push('agent_eval_judge_calls{service="anthropic-spend-tracker"} 0');
  lines.push('# HELP agent_eval_judge_cost Judge cost per run');
  lines.push('# TYPE agent_eval_judge_cost histogram');
  lines.push('agent_eval_judge_cost_bucket{le="0.01"} 0');
  lines.push('agent_eval_judge_cost_bucket{le="0.05"} 0');
  lines.push('agent_eval_judge_cost_bucket{le="+Inf"} 0');
  lines.push('agent_eval_judge_cost_count 0');
  lines.push('agent_eval_judge_cost_sum 0');
  lines.push('# HELP agent_eval_gates_result Gate pass/fail result');
  lines.push('# TYPE agent_eval_gates_result histogram');
  lines.push('agent_eval_gates_result_bucket{le="1"} 0');
  lines.push('agent_eval_gates_result_bucket{le="+Inf"} 0');
  lines.push('agent_eval_gates_result_count 0');
  lines.push('# HELP agent_eval_cost_per_task Cost per task');
  lines.push('# TYPE agent_eval_cost_per_task histogram');
  lines.push('agent_eval_cost_per_task_bucket{le="0.01"} 0');
  lines.push('agent_eval_cost_per_task_bucket{le="0.05"} 0');
  lines.push('agent_eval_cost_per_task_bucket{le="+Inf"} 0');
  lines.push('agent_eval_cost_per_task_count 0');
  lines.push('# HELP agent_eval_latency_p99 P99 latency per run');
  lines.push('# TYPE agent_eval_latency_p99 histogram');
  lines.push('agent_eval_latency_p99_bucket{le="1000"} 0');
  lines.push('agent_eval_latency_p99_bucket{le="5000"} 0');
  lines.push('agent_eval_latency_p99_bucket{le="+Inf"} 0');
  lines.push('agent_eval_latency_p99_count 0');

  const body = lines.join('\n') + '\n';
  return new NextResponse(body, {
    status: 200,
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
    },
  });
}
Step 8: Build budget enforcement middleware
The middleware intercepts every Anthropic API call before it leaves your server. It reads the x-agent-id header, estimates token cost from the message payload, checks the agent’s budget, and either allows the request, downgrades the model, filters expensive tools, or returns a 402 when the budget is exhausted. After the response comes back, it records the actual token usage and cost.
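The decision flow can be sketched as a pure function. The 4-characters-per-token heuristic matches the middleware's constant; the soft-cap/hard-cap thresholds and the downgrade target below are illustrative stand-ins for the engine's actual policy evaluation:

```typescript
// Hedged sketch of the middleware's decision flow: estimate tokens from the
// payload size, price the request, and compare projected spend to the limit.
const CHARS_PER_TOKEN = 4;

function estimateTokens(payload: string): number {
  return Math.ceil(payload.length / CHARS_PER_TOKEN);
}

type Decision =
  | { action: 'allow' }
  | { action: 'downgrade'; model: string }
  | { action: 'reject'; status: 402 };

function decide(spentUsd: number, estimatedCostUsd: number, limitUsd: number): Decision {
  const projected = (spentUsd + estimatedCostUsd) / limitUsd;
  if (projected >= 1.0) return { action: 'reject', status: 402 }; // hard cap hit
  if (projected >= 0.8) return { action: 'downgrade', model: 'claude-haiku-3-5' }; // soft cap
  return { action: 'allow' };
}
```

In the real middleware the two thresholds come from each budget's softCap/hardCap policy fields, and the downgrade target from its autoDowngrade list.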
Create src/middleware/budget.ts:
ts
/**
 * Budget enforcement middleware for Anthropic API calls.
 */
import { BudgetScope } from '@reaatech/agent-budget-types';
import pRetry from 'p-retry';
import { controller } from '../lib/budget-controller';
import { pricingProvider } from '../lib/pricing-provider';
import { client } from '../lib/anthropic-client';
import { recordAgentCostMetric } from '../lib/telemetry';
import { traceBudgetDecision } from '../lib/langfuse';
import { log } from '../lib/logger';
import type { SpendEntry } from '@reaatech/agent-budget-types';

/** Header name for the agent identifier. */
const AGENT_ID_HEADER = 'x-agent-id';

/** Approximate characters-per-token heuristic. */
const CHARS_PER_TOKEN = 4;

// Subscribe to BudgetController events to surface overspend in logs and Langfuse
Step 9: Build the dashboard UI
The dashboard has three pieces. A server-side data fetcher in src/dashboard/route.ts pulls spend totals and recent entries from the store at render time. The SpendDashboard component renders a table of all agents with their spend, hourly rate, and status — it auto-polls /api/report every 30 seconds. The SpendWidget renders a compact per-agent card with a color-coded progress bar showing budget utilization.
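The color coding can be expressed as a pure function of budget utilization (spend divided by limit). The thresholds here are illustrative; the shipped SpendWidget may use different cut-offs:

```typescript
// Illustrative mapping from budget utilization (spend / limit) to a bar color.
function budgetBarColor(utilization: number): 'green' | 'yellow' | 'red' {
  if (utilization >= 0.9) return 'red';    // about to hit the hard cap
  if (utilization >= 0.7) return 'yellow'; // approaching the soft cap
  return 'green';                          // comfortably under budget
}
```

Keeping this a pure function makes the widget trivial to test and lets the same logic drive both the bar color and any alert badge.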
Create src/dashboard/route.ts:
ts
/**
 * Dashboard route (server component) that fetches spend data at render time.
 */
import { store } from '../lib/spend-store';
import { BudgetScope } from '@reaatech/agent-budget-types';

export interface DashboardData {
  entries: Array<{ scopeKey: string; spend: number }>;
  recentEntries: number;
}

/**
 * Render-time data fetcher for the dashboard.
 */
export async function getDashboardData(): Promise<DashboardData> {
  const scopes = store.getAllScopes(BudgetScope.User);
  const recent = store.getRecentEntries(50);
  return {
    entries: scopes,
    recentEntries: recent.length,
  };
}
Step 10: Wire up instrumentation, run tests, and start the server
The instrumentation hook is the entry point that Next.js calls at server startup. It initializes the logger, starts OpenTelemetry, defines a default budget from your env vars, and kicks off the five-minute background sync loop. It also registers graceful shutdown handlers so telemetry exporters flush their buffers on SIGTERM/SIGINT.
Create src/instrumentation.ts:
ts
/**
 * Next.js instrumentation hook.
 * Starts the background sync loop when the server initializes.
 */
import { startSyncLoop } from './lib/sync-loop';
import { initializeLogger, log } from './lib/logger';
import { initializeTelemetry, shutdown } from './lib/telemetry';
import { defineAgentBudget } from './lib/budget-controller';
import { env } from './env';

/**
 * Register function called by Next.js at server startup.
 * Initializes logging, telemetry, default budgets, and starts the sync loop.
 */
export async function register(): Promise<void> {
  await initializeLogger();
  initializeTelemetry();

  // Define a default budget from env variables if present
  if (env.BUDGET_DEFAULT_LIMIT_USD > 0) {
    defineAgentBudget('default', env.BUDGET_DEFAULT_LIMIT_USD, {
      scopeType: env.BUDGET_DEFAULT_SCOPE_TYPE as never,
      scopeKey: 'default',
      limit: env.BUDGET_DEFAULT_LIMIT_USD,
    } as never);
  }

  log.info('Instrumentation registered, starting sync loop');

  // Start the background sync loop (5-minute interval)
  const stopSync = startSyncLoop(300_000);

  // Graceful shutdown
  process.on('SIGTERM', async () => {
    stopSync();
    await shutdown();
  });
  process.on('SIGINT', async () => {
    stopSync();
    await shutdown();
  });
}
Now run the tests to make sure everything is wired correctly:
terminal
pnpm test
Expected output: Vitest discovers tests in src/**/*.test.ts, runs them all, and prints a summary. You should see roughly 30+ tests pass across spend-store, budget middleware, API routes, dashboard, and library modules. The output will end with lines like:
code
Test Files 15 passed (15)
Tests 35 passed (35)
Finally, start the development server:
terminal
pnpm dev
Expected output: Next.js compiles the project and prints its ready message with the local URL (by default http://localhost:3000).
Now you can visit http://localhost:3000/api/health to see the health check JSON, http://localhost:3000/api/report to see spend data across all agents, and http://localhost:3000/dashboard (after wiring the components into a page) for the real-time dashboard.
Next steps
Add a Next.js page that imports SpendDashboard and getDashboardData to render the dashboard at a real route. Create src/app/dashboard/page.tsx as a server component that calls getDashboardData() and passes the result to the client component.
Set budgets via the CLI: run pnpm exec agent-budget set --scope user:my-agent --limit 25.00 --soft-cap 0.8 --hard-cap 1.0 to define per-agent spending limits. The budget controller picks them up on the next request.
Connect a real OpenTelemetry collector: point OPENTELEMETRY_ENDPOINT at your collector (Jaeger, Grafana Tempo, Honeycomb) and watch traces and cost metrics flow as you make requests through the budget middleware.