Self-hosted, on-premises code execution environment that lets small business analysts run Python and SQL queries with automatic cost controls and reliability safeguards.
SMB analysts need to run custom data processing scripts and financial models, but cloud sandboxes raise data privacy concerns and bring unpredictable costs. They lack a self-hosted solution that keeps sensitive financial data on-premises while providing robust, governed code execution.
This tutorial builds a self-hosted code sandbox for small business financial analysis. You’ll create a Next.js application that accepts natural-language analysis requests, generates Python or SQL code using a local Ollama LLM, executes that code in an isolated Daytona sandbox, and enforces session-level spending limits with a circuit breaker for reliability. Sensitive financial data never leaves your infrastructure.
Prerequisites
Node.js >= 22 and pnpm >= 10
An Ollama server running locally (default http://127.0.0.1:11434) with a model pulled (e.g. llama3.1)
A Daytona server with API access
Familiarity with TypeScript, Next.js App Router, and the terminal
Step 1: Scaffold the project
The scaffold agent has already created the Next.js 16 project shell with all dependencies installed and config files in place. Here’s what you get:
code
app/
  api/analyze/route.ts                  # API route (placeholder, you will replace)
  page.tsx                              # Home page (placeholder)
  layout.tsx                            # Root layout
  globals.css                           # Global styles
src/
  lib/                                  # Your library modules go here
  index.ts                              # Placeholder entry point
tests/                                  # Your test files go here
  index.test.ts                         # Placeholder test
packages/                               # Package API READMEs (your spec)
  reaatech__agent-budget-engine.md
  reaatech__circuit-breaker-agents.md
  reaatech__llm-cache.md
  ollama.md
  daytonaio__sdk.md
  langfuse.md
  zod.md
  express.md
package.json                            # Exact-pinned deps
tsconfig.json                           # Strict TypeScript config
vitest.config.ts                        # Vitest with coverage thresholds
next.config.ts                          # Next.js config
eslint.config.mjs                       # ESLint flat config
.env.example                            # Environment variable placeholders
The dependencies are already installed and exact-pinned. Run pnpm install only if you add a new package.
Expected output: Your terminal shows the project tree from ls -la. All scaffold files are present.
Step 2: Define the shared types
Start by defining the interfaces that every module in this recipe depends on. Create src/lib/types.ts:
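The contents of this file are not shown here. As a rough sketch only, the shared types might look like the following. The field names are taken from the request and response fields mentioned in later steps (query, language, code, output, cacheHit, cost, tokensUsed, exitCode, stdout, stderr, durationMs); the interface names and the isAnalysisLanguage helper are assumptions made for illustration.

```typescript
// Hypothetical shapes inferred from field names used later in this tutorial.
export type AnalysisLanguage = 'python' | 'sql';

// Body accepted by POST /api/analyze.
export interface AnalysisRequest {
  query: string;
  language: AnalysisLanguage;
}

// Result of one sandbox execution.
export interface SandboxResult {
  exitCode: number;
  stdout: string;
  stderr: string;
  durationMs: number;
}

// Body returned by POST /api/analyze.
export interface AnalysisResponse {
  code: string;
  output: string;
  cacheHit: boolean;
  cost: number;
  tokensUsed: number;
}

// Illustrative runtime guard (not part of the recipe) for untrusted input.
export function isAnalysisLanguage(value: unknown): value is AnalysisLanguage {
  return value === 'python' || value === 'sql';
}
```

Keeping the language as a string-literal union means the compiler can flag any branch that forgets to handle one of the two supported languages.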
Expected output: You can now import logger anywhere in the app and get structured JSON logging at the configured level.
Step 4: Implement the pricing provider
The OllamaPricingProvider estimates token costs from a lookup table of per-model rates, with environment variable overrides. Create src/lib/pricing.ts:
The lookupRate method uses prefix matching so llama3.1:8b resolves to the llama3.1 entry. The env var overrides let operators set flat rates for any model.
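The prefix-matching idea can be sketched roughly as follows. This is a standalone illustration, not the recipe's OllamaPricingProvider: the rate values are placeholders, the OLLAMA_COST_PER_1K_TOKENS variable name is an assumption, and the environment is passed in as a plain object so the snippet has no runtime dependencies.

```typescript
// Placeholder per-1K-token rates in USD. These are NOT real prices.
const RATES: Record<string, number> = {
  'llama3.1': 0.0002,
  'llama3': 0.0001,
};

type Env = Record<string, string | undefined>;

// Resolve a rate for a model tag such as "llama3.1:8b".
function lookupRate(model: string, env: Env = {}): number {
  // Hypothetical override: a flat rate that applies to any model.
  const raw = env.OLLAMA_COST_PER_1K_TOKENS;
  if (raw !== undefined && raw.trim() !== '') {
    const override = Number(raw);
    if (Number.isFinite(override)) return override;
  }
  // Longest matching prefix wins, so "llama3.1:8b" picks "llama3.1" over "llama3".
  let best: string | undefined;
  for (const name of Object.keys(RATES)) {
    if (model.startsWith(name) && (best === undefined || name.length > best.length)) {
      best = name;
    }
  }
  return best === undefined ? 0 : (RATES[best] ?? 0);
}

// Estimated cost for a given number of tokens.
function estimateCost(model: string, tokens: number, env: Env = {}): number {
  return (tokens / 1000) * lookupRate(model, env);
}
```

Choosing the longest prefix avoids a subtle mismatch when one model name is a prefix of another; unknown models fall back to a rate of 0 in this sketch, which a real deployment may prefer to treat as an error.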
Create src/lib/spend-store.ts — a lightweight spend tracker that extends the vendored SpendStore from @reaatech/agent-budget-spend-tracker:
ts
import { SpendStore as VendoredSpendStore } from '@reaatech/agent-budget-spend-tracker';
import { BudgetScope } from '@reaatech/agent-budget-types';

class InMemorySpendStore extends VendoredSpendStore {
  private store: Map<string, number> = new Map();

  constructor() {
    super({ maxEntries: 1_000_000 });
  }

  record(entry: import('@reaatech/agent-budget-types').SpendEntry): number {
    const localKey = `${entry.scopeType}:${entry.scopeKey}`;
    const current = this.store.get(localKey) ?? 0;
    this.store.set(localKey, current + entry.cost);
    return 0;
  }

  getSpend(scopeType: BudgetScope, scopeKey: string): number {
    return this.store.get(`${scopeType}:${scopeKey}`) ?? 0;
  }
}

const spendStore = new InMemorySpendStore();
export default spendStore;
export { InMemorySpendStore };
The store uses a composite key of scopeType:scopeKey so that different scopes (users, sessions) never interfere with each other. In a production deployment you would swap this for a database-backed adapter.
Expected output: After calling record() with cost 5 for user scope “alice”, getSpend(BudgetScope.User, 'alice') returns 5.
Step 6: Wire the budget controller
Now connect the spend store and pricing provider into a BudgetController from @reaatech/agent-budget-engine. Create src/lib/budget.ts:
Expected output: After defineSessionBudget('scope-1', 10), checkBudget('scope-1', 1, 'llama3.1') returns { allowed: true }. After spending to the hard cap, checkBudget returns { allowed: false }.
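To make the expected behavior concrete, here is a dependency-free sketch of the hard-cap check. It is a stand-in for illustration and does not use the BudgetController API from @reaatech/agent-budget-engine. The function names come from this tutorial, but their parameters are assumptions; in particular, the second argument of checkBudget is treated here as an estimated cost.

```typescript
interface BudgetDecision {
  allowed: boolean;
}

const hardCaps = new Map<string, number>(); // scopeKey -> hard cap
const spent = new Map<string, number>();    // scopeKey -> recorded spend

// Register a session budget with a hard cap and zero spend so far.
function defineSessionBudget(scopeKey: string, hardCap: number): void {
  hardCaps.set(scopeKey, hardCap);
  spent.set(scopeKey, 0);
}

// Add an actual cost to the running total for the scope.
function recordSpend(scopeKey: string, cost: number): void {
  spent.set(scopeKey, (spent.get(scopeKey) ?? 0) + cost);
}

// Allow the call only if the estimated cost still fits under the hard cap.
// The model argument is unused in this sketch; a real controller would
// likely use it to price the request.
function checkBudget(scopeKey: string, estimatedCost: number, _model: string): BudgetDecision {
  const cap = hardCaps.get(scopeKey);
  if (cap === undefined) return { allowed: false }; // no budget defined: deny
  const remaining = cap - (spent.get(scopeKey) ?? 0);
  return { allowed: estimatedCost <= remaining };
}
```

Denying requests for scopes with no defined budget is a deliberately conservative choice in this sketch; it keeps an unconfigured session from spending without limit.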
Step 7: Build the semantic cache
The @reaatech/llm-cache package provides exact-match and semantic-similarity caching. You’ll wrap it with a custom OllamaEmbedder that generates embeddings using Ollama’s embed() API. Create src/lib/cache.ts:
The cache first checks for an exact hash match, then falls back to semantic similarity. Queries semantically similar to a previous analysis return the cached result without calling the LLM.
Expected output: cacheGet('same prompt', { model: 'llama3.1' }) after cacheSet returns { hit: true, type: 'exact' }. A semantically similar query returns { hit: true, type: 'semantic' }.
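The two-tier lookup can be illustrated with a small self-contained sketch. It does not use the @reaatech/llm-cache API or Ollama: the embedder below is a toy bag-of-words counter over a fixed vocabulary, the exact tier compares prompt strings directly instead of hashing them, and the 0.8 similarity threshold is an assumed value.

```typescript
type CacheResult =
  | { hit: true; type: 'exact' | 'semantic'; value: string }
  | { hit: false };

interface Entry {
  prompt: string;
  vector: number[];
  value: string;
}

const VOCAB = ['total', 'revenue', 'profit', 'cash', 'flow', 'monthly', 'by', 'month'];
const SIMILARITY_THRESHOLD = 0.8; // assumed cut-off for a semantic hit
const entries: Entry[] = [];

// Toy embedder: word counts over VOCAB. A stand-in for a real embedding model.
function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function cacheSet(prompt: string, value: string): void {
  entries.push({ prompt, vector: embed(prompt), value });
}

function cacheGet(prompt: string): CacheResult {
  // Tier 1: exact match on the prompt text.
  const exact = entries.find((e) => e.prompt === prompt);
  if (exact) return { hit: true, type: 'exact', value: exact.value };

  // Tier 2: nearest stored prompt by cosine similarity.
  const vector = embed(prompt);
  let best: Entry | undefined;
  let bestScore = 0;
  for (const entry of entries) {
    const score = cosine(vector, entry.vector);
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore >= SIMILARITY_THRESHOLD) {
    return { hit: true, type: 'semantic', value: best.value };
  }
  return { hit: false };
}
```

The threshold is the main tuning knob: set too low, unrelated analyses share results; set too high, the semantic tier rarely fires.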
Step 8: Build code repair utilities
LLM output is unpredictable — it might wrap code in markdown fences or leave trailing commas. The repair module handles that. Create src/lib/repair.ts:
ts
export function extractCodeBlock(text: string): string {
  const match = text.match(/```\w*\n([\s\S]*?)```/);
  if (match) {
    return match[1].trim();
  }
  return text;
}

export function repairSyntax(code: string): string {
  let result = code.replace(/```/g, '');
  result = result.replace(/\r\n/g, '\n');
  result = result.replace(/,(\s*[}\]])/g, '$1');
  return result;
}

export function validateGeneratedCode(
  code: string,
  language: 'python' | 'sql',
): { valid: boolean; errors: string[] } {
  const errors: string[] = [];

  const openBraces = (code.match(/\{/g) ?? []).length;
  const closeBraces = (code.match(/\}/g) ?? []).length;
  if (openBraces !== closeBraces) {
    errors.push(`Unmatched braces: ${String(openBraces)} opening, ${String(closeBraces)} closing`);
  }

  const openBrackets = (code.match(/\[/g) ?? []).length;
  const closeBrackets = (code.match(/\]/g) ?? []).length;
  if (openBrackets !== closeBrackets) {
    errors.push(`Unmatched brackets: ${String(openBrackets)} opening, ${String(closeBrackets)} closing`);
  }

  const openParens = (code.match(/\(/g) ?? []).length;
  const closeParens = (code.match(/\)/g) ?? []).length;
  if (openParens !== closeParens) {
    errors.push(`Unmatched parentheses: ${String(openParens)} opening, ${String(closeParens)} closing`);
  }

  const singleQuotes = (code.match(/'/g) ?? []).length;
  if (singleQuotes % 2 !== 0) {
    errors.push('Unmatched single quotes');
  }

  const doubleQuotes = (code.match(/"/g) ?? []).length;
  if (doubleQuotes % 2 !== 0) {
    errors.push('Unmatched double quotes');
  }

  if (language === 'python') {
    if (code.includes('def ') || code.includes('class ')) {
      const emptyBody = /(def|class)\s+\w+[^:]*:\s*(pass|\.\.\.)?\s*($|(?=\n\S))/;
      if (emptyBody.test(code)) {
        errors.push('Function or class with empty body');
      }
    }
  }

  if (language === 'sql') {
    const sqlKeywords = /\b(SELECT|INSERT|UPDATE|DELETE|CREATE|DROP|ALTER|WITH)\b/i;
    if (!sqlKeywords.test(code)) {
      errors.push('No SQL statement detected');
    }
  }

  return { valid: errors.length === 0, errors };
}
When the Daytona sandbox fails repeatedly (controlled by CIRCUIT_BREAKER_FAILURE_THRESHOLD), the circuit opens and subsequent attempts return immediately with a graceful error instead of hanging or crashing the request.
Expected output: A successful execution returns { exitCode: 0, stdout: '42', stderr: '', durationMs: 100 }. After the failure threshold, the circuit opens and returns { exitCode: -1, stderr: 'Circuit breaker open' }.
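The open-circuit behavior described above can be sketched without any library. This is a minimal illustration, not the API of the circuit-breaker package listed in the scaffold: the CircuitBreaker class, the executeWithBreaker wrapper, and the reset timeout parameter are all assumptions, and the failure threshold would come from CIRCUIT_BREAKER_FAILURE_THRESHOLD in the real module.

```typescript
interface SandboxResult {
  exitCode: number;
  stdout: string;
  stderr: string;
  durationMs: number;
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // Open means "reject immediately". After the cool-down a trial call is let through.
  isOpen(): boolean {
    if (this.openedAt === null) return false;
    return this.now() - this.openedAt < this.resetTimeoutMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

// Wrap a sandbox call so repeated failures short-circuit instead of hanging.
async function executeWithBreaker(
  breaker: CircuitBreaker,
  run: () => Promise<SandboxResult>,
): Promise<SandboxResult> {
  if (breaker.isOpen()) {
    return { exitCode: -1, stdout: '', stderr: 'Circuit breaker open', durationMs: 0 };
  }
  try {
    const result = await run();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure();
    const message = err instanceof Error ? err.message : String(err);
    return { exitCode: -1, stdout: '', stderr: message, durationMs: 0 };
  }
}
```

Injecting the clock through the now parameter keeps the state machine testable without real delays. A failed trial call after the cool-down reopens the circuit immediately, because the failure count is only reset on success.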
Step 10: Create the API route handler
Now wire everything together in the POST /api/analyze route handler. Replace app/api/analyze/route.ts:
ts
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import ollama from "ollama";
import { cacheGet, cacheSet } from "../../../src/lib/cache.js";
import { checkBudget, recordSpend } from "../../../src/lib/budget.js";
import pricingProvider from "../../../src/lib/pricing.js";
import { extractCodeBlock, repairSyntax } from "../../../src/lib/repair.js";
import sandbox from "../../../src/lib/sandbox.js";
import logger from "../../../src/lib/logger.js";
import { Langfuse } from "langfuse";

const AnalysisRequestSchema = z.
The handler runs the full pipeline: validate the request, check the cache, verify the budget, call Ollama, repair the generated code, execute it in the Daytona sandbox, record the spend, cache the result, and return the response.
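The ordering of those stages can be sketched with injected dependencies. This is an illustration of the control flow only, not the route handler itself: the PipelineDeps interface, the analyze function, the error strings, and the plain validation check (standing in for the Zod schema) are all hypothetical.

```typescript
type Language = 'python' | 'sql';

interface AnalysisResult {
  code: string;
  output: string;
  cacheHit: boolean;
  cost: number;
  tokensUsed: number;
}

type PipelineOutcome = { ok: true; result: AnalysisResult } | { ok: false; error: string };

// Hypothetical dependency bundle standing in for the modules in src/lib/.
interface PipelineDeps {
  cacheGet(query: string): Promise<{ code: string; output: string } | null>;
  cacheSet(query: string, value: { code: string; output: string }): Promise<void>;
  checkBudget(sessionId: string): { allowed: boolean };
  generate(query: string, language: Language): Promise<{ text: string; tokens: number }>;
  repair(text: string): string;
  execute(code: string, language: Language): Promise<{ stdout: string }>;
  recordSpend(sessionId: string, tokens: number): number; // returns the cost charged
}

function isLanguage(value: unknown): value is Language {
  return value === 'python' || value === 'sql';
}

async function analyze(
  deps: PipelineDeps,
  sessionId: string,
  body: { query?: unknown; language?: unknown },
): Promise<PipelineOutcome> {
  // 1. Validate the request.
  const { query, language } = body;
  if (typeof query !== 'string' || query.trim() === '') return { ok: false, error: 'Invalid query' };
  if (!isLanguage(language)) return { ok: false, error: 'Invalid language' };

  // 2. Check the cache; a hit skips the LLM and the sandbox entirely.
  const cached = await deps.cacheGet(query);
  if (cached) return { ok: true, result: { ...cached, cacheHit: true, cost: 0, tokensUsed: 0 } };

  // 3. Verify the budget before spending anything.
  if (!deps.checkBudget(sessionId).allowed) return { ok: false, error: 'Budget exceeded' };

  // 4-6. Generate code, repair it, execute it.
  const generated = await deps.generate(query, language);
  const code = deps.repair(generated.text);
  const run = await deps.execute(code, language);

  // 7-8. Record the spend, then cache the result.
  const cost = deps.recordSpend(sessionId, generated.tokens);
  await deps.cacheSet(query, { code, output: run.stdout });

  return {
    ok: true,
    result: { code, output: run.stdout, cacheHit: false, cost, tokensUsed: generated.tokens },
  };
}
```

Placing the cache check ahead of the budget check means repeated queries stay free even when a session has exhausted its budget, which matches the order listed above.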
Expected output: curl -X POST http://localhost:3000/api/analyze -H 'Content-Type: application/json' -d '{"query":"print total revenue","language":"python"}' returns a 200 JSON response with code, output, cacheHit, cost, and tokensUsed fields.
Step 11: Build the frontend page
Replace app/page.tsx with a client component that lets users type analysis requests and see the results:
Expected output: Opening http://localhost:3000 shows a textarea labeled “Describe the analysis you want to run…”, a language selector (Python/SQL), and a “Run Analysis” button. Submitting the form calls the API and displays the generated code and execution output.
Step 12: Run the checks
Now verify everything compiles, lints, and passes tests.
terminal
pnpm typecheck
Expected output: tsc --noEmit exits with code 0 and no errors.
terminal
pnpm lint
Expected output: ESLint exits with code 0 and no warnings.
terminal
pnpm test
Expected output: vitest runs all test files and produces vitest-report.json with numFailedTests: 0, numTotalTests >= 40, and all four coverage metrics (lines, branches, functions, statements) at 90% or above.
Next steps
Add a database-backed spend store — Replace InMemorySpendStore with a PostgreSQL or Redis adapter so budget state survives server restarts.
Extend the language support — Add R, Julia, or Bash execution by branching on language in the sandbox executor.
Add prompt templates — Create reusable prompt templates for common financial analyses (P&L, cash flow, inventory turnover) so analysts can pick from a menu instead of writing raw queries.
Add streaming responses — Use Ollama’s stream: true option and Server-Sent Events to show generated code as it arrives.