SMB bookkeepers spend hours pulling data from spreadsheets and writing formulas to produce monthly financial reports. Non-technical staff can't run quick what-if analyses without risking spreadsheet errors.
A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.
This tutorial walks you through building a financial report generation pipeline that accepts natural-language prompts, generates Python analysis code via AWS Bedrock, executes it in a secure E2B sandbox, validates the output, and returns formatted PDF and XLSX reports. You’ll wire together the @reaatech/* package family for classification, cost control, caching, and fault tolerance.
Prerequisites
Node.js >= 22 and pnpm 10.x installed
AWS Account with Bedrock access (Claude Sonnet 4 or compatible model)
All dependencies are exact-pinned (no ^ or ~). The @ path alias in tsconfig.json maps @/* to the project root, so imports like @/src/lib/types resolve correctly. The .env.example file lists the environment variables you’ll need:
Copy it to .env and fill in your real credentials before running the dev server.
Expected output: You see a standard Next.js App Router scaffold with app/, src/, and tests/ directories, a vitest.config.ts configured for Node coverage with 90% thresholds, and all node_modules installed.
Step 2: Define shared types with Zod
All data types live in src/lib/types.ts. You’ll define Zod schemas for the report request, code generation results, execution results, formatted output, and metadata, plus custom error classes.
Create src/lib/types.ts:
ts
import { z } from "zod";export const ReportType = z.enum(["balance_sheet", "income_statement", "cash_flow", "custom"]);export type ReportType = z.infer<typeof ReportType>;export const ReportRequestSchema = z.object({ prompt: z.string().min(1).max(10000), reportType: ReportType.optional(), maxBudgetUsd: z.number().positive().optional(),});
Expected output:ReportRequestSchema accepts { prompt: "Generate a balance sheet for Q1 2026" } and rejects { prompt: "" }. The error classes have distinct .name properties for type-safe catching in the route handler.
Step 3: Implement Bedrock pricing
The @reaatech/agent-budget-engine’s BudgetController needs a PricingProvider to estimate LLM costs. Create src/lib/pricing-provider.ts with a lookup table of Bedrock model IDs:
The estimateCost method approximates output tokens as 30% of input tokens. calculateExactCost uses the actual counts after the call completes. Both throw with a helpful message listing known model IDs if you pass an unrecognized model.
Expected output:estimateCost("anthropic.claude-sonnet-4-v1:0", 1000) returns roughly (1000 * 0.003 + 300 * 0.015) / 1000 = 0.0075. An unknown model throws an error that includes the list of known models.
Step 4: Build the spend tracker
The BudgetController also needs a SpendStore to persist usage. The @reaatech/agent-budget-spend-tracker package provides a SpendStore base class you can extend. Create src/lib/spend-store.ts:
ts
import { SpendStore } from "@reaatech/agent-budget-spend-tracker";import type { SpendEntry } from "@reaatech/agent-budget-types";export class InMemorySpendStore extends SpendStore { listAll(): SpendEntry[] { return this.getRecentEntries(Number.MAX_SAFE_INTEGER); }}
SpendStore already implements record(), getSpend(), and reset(). You only add listAll() as a convenience to see every recorded entry.
Expected output: After recording three entries for the same scope key, getSpend(scopeType, scopeKey) returns the sum of their costs. An unknown scope returns 0.
Step 5: Create the deterministic embedder
The @reaatech/llm-cache cache engine needs an embedder to generate vectors for semantic cache lookups. This embedder works without any external API — it produces a deterministic 1536-element vector from a SHA-256 hash of the input text.
Create src/lib/embedder.ts:
ts
import { createHash } from "node:crypto";export const EMBEDDING_DIMENSIONS = 1536;export class SimpleEmbedder { embed(text: string, _expectedDimensions?: number): Promise<number[]> { void _expectedDimensions; const hash = createHash("sha256").update(text).digest(); const vec: number[] = []; for (let i = 0; i < EMBEDDING_DIMENSIONS; i++) { vec.push(hash[i % hash.length] / 128 - 1); } const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0)); return Promise.resolve(vec.map((v) => v / norm)); } embedBatch(texts: string[], _expectedDimensions?: number): Promise<number[][]> { void _expectedDimensions; return Promise.all(texts.map((t) => this.embed(t))); }}
Expected output:embed("test text") returns an array of exactly 1536 numbers. Running it twice with the same input produces identical vectors (deterministic).
Step 6: Build the output validator and repair
When the E2B sandbox returns JSON, it might have trailing commas, single quotes, or surrounding whitespace. Create src/lib/output-repair.ts to handle these edge cases:
ts
import { ReportDataSchema, type ReportData } from "./types";export interface RepairResult { valid: boolean; data: ReportData | null; errors: string[]; warnings: string[];}function attemptRepair(raw: string): unknown { try { return JSON.parse(raw) as unknown; } catch { let cleaned = raw.trim(); const start = cleaned.indexOf("{"); const end = cleaned.lastIndexOf("}"); if (start !== -1 && end !== -1 && end > start) { cleaned = cleaned.slice(start, end + 1); } cleaned = cleaned.replace(/'/g, '"'); cleaned = cleaned.replace(/,\s*([}\]])/g, "$1"); try { return JSON.parse(cleaned) as unknown; } catch { return null; } }}export function repairAndValidate(raw: unknown): RepairResult { const warnings: string[] = []; if (raw == null) { return { valid: false, data: null, errors: ["Input is null or undefined"], warnings }; } let parsed: unknown; if (typeof raw === "string") { const result = attemptRepair(raw); if (result === null) { return { valid: false, data: null, errors: ["Failed to parse input as JSON after repair attempt"], warnings }; } parsed = result; warnings.push("JSON repair heuristics were applied"); } else { parsed = raw; } const validation = ReportDataSchema.safeParse(parsed); if (validation.success) { return { valid: true, data: validation.data, errors: [], warnings }; } const errors = validation.error.issues.map( (issue) => `${issue.path.join(".")}: ${issue.message}` ); return { valid: false, data: null, errors, warnings };}
The attemptRepair function tries a straight JSON parse first, then applies three heuristics: extracting the outermost {...} block, replacing single quotes with double quotes, and removing trailing commas before closing braces/brackets. repairAndValidate handles null, string, and pre-parsed object inputs, then validates against ReportDataSchema.
Expected output: The string "{'title': 'Report', 'period': 'Q1', 'sections': [],}" (single quotes + trailing comma) is repaired and validates successfully. null returns { valid: false, errors: ["Input is null or undefined"] }.
Step 7: Set up the LLM cache
The @reaatech/llm-cacheCacheEngine requires storage, vector storage, an embedder, and configuration. Create src/lib/cache-factory.ts:
Expected output:createCacheEngine(new SimpleEmbedder()) returns a fully initialized CacheEngine that can get() and set() entries. An uncached key returns { hit: false }.
Step 8: Build the Bedrock service
The BedrockService wraps the AWS SDK’s ConverseCommand to call Claude Sonnet 4 for Python code generation. Create src/services/bedrock-service.ts:
The service sends a prompt to Bedrock’s Converse API, extracts the Python code block from the response (delimited by ```python ... ```), and returns it along with the explanation text and token counts.
Expected output: When the Converse response contains a Python code block like ```python\nprint('hello')\n```, generateCode extracts "print('hello')" as pythonCode and the surrounding text as explanation. An empty or missing content array returns empty strings without crashing.
Step 9: Build the sandbox executor with CircuitBreaker
The SandboxService runs generated Python code in an E2B sandbox, wrapped in a @reaatech/circuit-breaker-agents CircuitBreaker for fault tolerance. Create src/services/sandbox-service.ts:
ts
import { Sandbox } from "@e2b/code-interpreter";import { CircuitBreaker, InMemoryAdapter } from "@reaatech/circuit-breaker-agents";import type { ExecutionResult } from "../lib/types";import { SandboxTimeoutError } from "../lib/types";export class SandboxService { private readonly circuitBreaker: CircuitBreaker; constructor() { this.circuitBreaker = new CircuitBreaker({ name: "e2b-sandbox", failureThreshold: 3, recoveryTimeoutMs: 30000, persistence: new InMemoryAdapter(), }); } async executeCode(code: string): Promise<ExecutionResult> { const run = async (): Promise<ExecutionResult> => { const sbx = await Sandbox.create(); try { const execution = await Promise.race([ sbx.runCode(code), new Promise<never>((_, reject) => setTimeout(() => { const err = new Error("Sandbox execution timed out"); err.name = "TimeoutError"; reject(err); }, 60_000) ), ]); const artifacts: Array<{ name: string; data: string }> = []; if (typeof execution === "object" && "artifacts" in execution) { const raw = (execution as Record<string, unknown>).artifacts; if (Array.isArray(raw)) { for (const a of raw) { const item = a as { name: unknown; data: unknown }; if (typeof item === "object" && "name" in item && "data" in item) { artifacts.push({ name: String(item.name), data: String(item.data) }); } } } } return { stdout: execution.text ?? "", stderr: "", artifacts, success: true, }; } finally { // Cleanup — sandbox auto-terminates when the reference is dropped } }; try { return await this.circuitBreaker.execute(run); } catch (error: unknown) { if (error instanceof SandboxTimeoutError) throw error; if (error instanceof Error && error.name === "TimeoutError") { throw new SandboxTimeoutError(); } throw error; } }}
After 3 consecutive sandbox failures, the circuit breaker opens and rejects all further requests for 30 seconds, giving the E2B infrastructure time to recover. The service wraps execution in a Promise.race with a 60-second timeout. The sandbox uses a try/finally to ensure cleanup.
Expected output:executeCode("print(1+1)") returns { stdout: "2", success: true }. If the sandbox call takes longer than 60 seconds, it throws SandboxTimeoutError.
Step 10: Build the report formatter
The ReportFormatter produces PDF and XLSX output from validated ReportData. Create src/services/report-formatter.ts:
ts
import { PDFDocument, StandardFonts, rgb } from "pdf-lib";import * as XLSX from "xlsx";import type { ReportData } from "../lib/types";export class ReportFormatter { async generatePdf(data: ReportData): Promise<Uint8Array> { const pdfDoc = await PDFDocument.create(); const font = await pdfDoc.embedFont(StandardFonts.Helvetica); const page = pdfDoc.addPage(); const { height } = page.getSize(); let y = height - 50; // Title page.drawText(data.title, { x: 50, y, size: 24, font, color: rgb(0, 0, 0) }); y -= 30; // Period page.drawText(data.period, { x: 50, y, size: 14, font, color: rgb(0.3, 0.3, 0.3) }); y -= 25; // Sections for (const section of data.sections) { if (y < 60) { pdfDoc.addPage(); y = height - 50; } page.drawText(section.heading, { x: 50, y, size: 16, font, color: rgb(0, 0, 0) }); y -= 20; if (section.type === "table") { for (const row of section.rows) { if (y < 40) { pdfDoc.addPage(); y = height - 50; } let x = 50; for (const cell of row) { page.drawText(cell, { x, y, size: 10, font, color: rgb(0, 0, 0) }); x += 120; } y -= 14; } } else { for (const row of section.rows) { if (y < 40) { pdfDoc.addPage(); y = height - 50; } page.drawText(row.join(" "), { x: 50, y, size: 10, font, color: rgb(0, 0, 0) }); y -= 14; } } y -= 10; } return pdfDoc.save(); } generateXlsx(data: ReportData): Promise<Uint8Array> { try { const wb = XLSX.utils.book_new(); for (const section of data.sections) { const ws = XLSX.utils.aoa_to_sheet(section.rows); XLSX.utils.sheet_add_aoa(ws, [[section.heading]], { origin: "A1" }); XLSX.utils.book_append_sheet(wb, ws, section.heading.slice(0, 31)); } const buf = XLSX.write(wb, { type: "buffer", bookType: "xlsx" }) as Buffer; return Promise.resolve(new Uint8Array(buf)); } catch (error: unknown) { return Promise.reject(error instanceof Error ? error : new Error(String(error))); } }}
The PDF formatter renders a title and period header, then iterates through sections — table-typed sections render as column-aligned grids, and text-typed sections render rows as joined strings. Pages are added automatically when content overflows the bottom margin. The XLSX formatter creates one sheet per section (truncating the sheet name to 31 characters, Excel’s limit).
Expected output:generatePdf(minimalData) returns a Uint8Array starting with %PDF. generateXlsx(minimalData) returns one starting with PK (ZIP header). Both handle zero sections gracefully.
Step 11: Wire the pipeline orchestrator
The ReportGenerator is the pipeline orchestrator. It runs 9 steps: classify, cache check, budget check, code generation, sandbox execution, output validation, spend recording, formatting, and cache storage. Create src/services/report-generator.ts:
ts
import { ConfidenceRouter } from "@reaatech/confidence-router";import { BudgetController } from "@reaatech/agent-budget-engine";import { CacheEngine } from "@reaatech/llm-cache";import { BudgetScope } from "@reaatech/agent-budget-types";import type { ReportRequest, FormattedReport, ReportMetadata } from "../lib/types";import { BudgetExceededError, OutputValidationError } from "../lib/types";import { repairAndValidate } from "../lib/output-repair";import type { BedrockService } from "./bedrock-service";import type { SandboxService } from "./sandbox-service";import type { ReportFormatter } from "./report-formatter";
The orchestrator enriches the user’s prompt with schema instructions before sending it to Bedrock, calculates exact cost using calculateExactCost after getting real token counts, converts PDF/XLSX byte arrays to base64 via Buffer.from, and stores the final result in the cache with the model version as a key component.
Expected output: When all dependencies are mocked, generate({ prompt: "Balance sheet for Q1" }) returns a FormattedReport with non-empty pdfBase64 and xlsxBase64. A cache hit returns the stored result without calling Bedrock or the sandbox.
Step 12: Create the API route handler
The route handler at app/api/report/generate/route.ts accepts POST requests, wires the full dependency tree, and maps each error type to the appropriate HTTP status code:
ts
import { NextRequest, NextResponse } from "next/server";import { ConfidenceRouter } from "@reaatech/confidence-router";import { BudgetController } from "@reaatech/agent-budget-engine";import { BudgetScope } from "@reaatech/agent-budget-types";import { ReportRequestSchema } from "@/src/lib/types";import { BudgetExceededError, OutputValidationError, SandboxTimeoutError } from "@/src/lib/types";import { BedrockPricingProvider } from "@/src/lib/pricing-provider";import { InMemorySpendStore } from "@/src/lib/spend-store";import { SimpleEmbedder } from "@/src/lib/embedder";import { createCacheEngine } from "@/src/lib/cache-factory";import { BedrockService }
The route handler maps each error type to the appropriate HTTP status code: 400 for bad requests, 429 for budget exceeded, 422 for validation errors, 504 for sandbox timeouts, and 500 for everything else. The defaultEstimateTokens option on BudgetController sets a fallback token count for pre-flight cost checks.
Expected output:POST /api/report/generate with { "prompt": "Generate a balance sheet" } returns 200 with { pdfBase64, xlsxBase64, metadata }. An empty prompt returns 400 with { error: "Invalid request", details: [...] }.
Step 13: Run the tests
The test suite covers every module with happy-path, error, and boundary cases. Run all tests: