Small businesses want to let non-engineers use natural language to run data analyses, but fear untrusted code execution, runaway costs, and compliance risks. No off-the-shelf tool offers human-in-the-loop gating with fine-grained spend controls.
A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.
This tutorial walks you through building a human-in-the-loop code sandbox for small-to-medium business data teams. You’ll build a Next.js application where a business analyst types a natural-language data question, Mistral AI generates Python code, a reviewer must explicitly approve the code before it runs, and the approved code executes inside an isolated E2B sandbox. Budget caps, session continuity, and full audit trails are baked in from the start.
By the end, you’ll have a working approval-gated AI code analysis tool that non-engineers can use safely, with spend controls, cost telemetry, and Langfuse observability.
Step 4: Create the Supabase client and schema migration
Create src/services/supabase-service.ts — this module initializes the Supabase client and provides schema migration and audit log helpers:
ts
import { createClient, type SupabaseClient } from "@supabase/supabase-js";import { getConfig } from "../lib/config";let client: SupabaseClient | null = null;export function getSupabase(): SupabaseClient { if (client) return client; const config = getConfig(); client = createClient(config.supabaseUrl, config.supabaseSecretKey, { auth: { persistSession: false }, }); return client;}export async function migrateSchema(supabase: SupabaseClient): Promise<void> { const sql = ` CREATE TABLE IF NOT EXISTS sessions ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), user_id TEXT NOT NULL, status TEXT NOT NULL DEFAULT 'active', metadata JSONB DEFAULT '{}', created_at TIMESTAMPTZ DEFAULT NOW(), last_activity_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE IF NOT EXISTS messages ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), session_id UUID REFERENCES sessions(id), role TEXT NOT NULL, content TEXT NOT NULL, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE IF NOT EXISTS approval_requests ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), code TEXT NOT NULL, user_id TEXT NOT NULL, team_id TEXT NOT NULL, session_id UUID REFERENCES sessions(id), budget_estimate FLOAT NOT NULL, approval_status TEXT DEFAULT 'pending', reviewer_id TEXT, rejection_reason TEXT, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE IF NOT EXISTS cost_spans ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), provider TEXT NOT NULL, model TEXT NOT NULL, input_tokens INT NOT NULL, output_tokens INT NOT NULL, cost_usd FLOAT NOT NULL, tenant TEXT NOT NULL, feature TEXT NOT NULL, timestamp TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE IF NOT EXISTS audit_logs ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), action TEXT NOT NULL, actor_id TEXT NOT NULL, details JSONB DEFAULT '{}', created_at TIMESTAMPTZ DEFAULT NOW() ); `; const { error } = await supabase.rpc("exec_sql", { query: sql }); if (error) throw new Error(`migrateSchema failed: ${error.message}`);}export async function insertAuditLog( supabase: SupabaseClient, action: string, actorId: string, details: Record<string, unknown>,): Promise<void> { const { error } = await supabase.from("audit_logs").insert({ action, actor_id: actorId, details, created_at: new Date().toISOString(), }); if (error) throw new Error(`audit log insert failed: ${error.message}`);}
Expected output:getSupabase() returns a configured SupabaseClient singleton. migrateSchema() creates five tables (sessions, messages, approval_requests, cost_spans, audit_logs).
Step 5: Build the approval store and firewall
The approval gate is the core of this recipe. Create src/services/firewall-service.ts with an ApprovalStore that persists approval requests to Supabase, along with middleware and polling helpers:
ts
import { ApprovalRequiredError, type RequestContext, type Middleware, type MiddlewareResult, redact,} from "@reaatech/tool-use-firewall-core";import type { SupabaseClient } from "@supabase/supabase-js";import type { ApprovalRequest, ApprovalDecision } from "../types";import * as logger from "../lib/logger";export class ApprovalStore { private supabase: SupabaseClient; constructor(supabase: SupabaseClient) { this.supabase = supabase; }
Expected output: The ApprovalStore creates, reads, updates, and lists approval requests in Supabase. waitForApproval() polls the database until a reviewer approves or rejects the request, or the timeout expires.
Step 6: Wire up the Mistral AI code generator
Create src/services/mistral-service.ts — this sends the user’s natural-language query to Mistral and extracts Python code from the response:
ts
import { Mistral } from "@mistralai/mistralai";import { MistralError } from "@mistralai/mistralai/models/errors";import type { ContentChunk } from "@mistralai/mistralai/models/components";import pRetry from "p-retry";import { getConfig } from "../lib/config";import * as logger from "../lib/logger";const systemPrompt = "You are a Python data analysis code generator. Given a natural language query, generate Python code using pandas, matplotlib, and numpy to answer it. Return ONLY the Python code inside a markdown code block. Do NOT include explanatory text.";let client: Mistral | null = null;export function createMistralClient(): Mistral { if (client) return client; const config = getConfig(); client = new Mistral({ apiKey: config.mistralApiKey || "" }); return client;}function extractTextContent(content: string | ContentChunk[] | null | undefined): string { if (!content) return ""; if (typeof content === "string") return content; return content.map((chunk) => { if ("text" in chunk) return (chunk as { text: string }).text; return ""; }).join("");}export async function generateCode( query: string,): Promise<{ code: string; model: string; usage: { promptTokens: number; completionTokens: number } }> { const config = getConfig(); const c = createMistralClient(); const run = async () => { const result = await c.chat.complete({ model: config.mistralModel, messages: [ { role: "user" as const, content: systemPrompt }, { role: "user" as const, content: query }, ], }); const rawContent = result.choices[0]?.message?.content; const content = extractTextContent(rawContent); const model = result.model; const promptTokens = result.usage.promptTokens ?? 0; const completionTokens = result.usage.completionTokens ?? 0; const codeMatch = content.match(/```(?:python)?\s*\n?([\s\S]*?)```/); const code = codeMatch?.[1]?.trim() ?? content.trim(); return { code, model, usage: { promptTokens, completionTokens } }; }; try { return await pRetry(run, { retries: 3 }); } catch (err) { if (err instanceof MistralError) { const statusCode = err.statusCode; const body = err.body; logger.error("Mistral API error", { statusCode, body }); throw new MistralError(`Mistral API error: ${err.message}`, { response: err.rawResponse, request: new Request(err.rawResponse.url), body: err.body, }); } logger.error("Mistral code generation failed", { error: String(err) }); throw err; }}
Expected output:generateCode("Load the sales CSV and plot monthly revenue") returns { code: "import pandas as pd\n...", model: "mistral-large-latest", usage: { promptTokens: 120, completionTokens: 200 } }. The code is extracted from markdown fences with retry on transient failures.
Step 7: Create the code interpreter sandbox
Create src/services/code-interpreter-service.ts — this wraps the E2B sandbox for running approved Python code:
Step 8: Add budget enforcement with agent-budget-engine
The budget system ensures nobody exceeds their cost allocation. Create src/lib/adapters/budget-scope.ts:
ts
export enum BudgetScope { User = "user", Team = "team", Session = "session", Project = "project",}
Create src/lib/adapters/spend-store.ts — a thin in-memory spend tracker that wraps the REAA SpendStore:
ts
import { SpendStore as ExternalSpendStore } from "@reaatech/agent-budget-spend-tracker";import type { SpendEntry as ExternalSpendEntry } from "@reaatech/agent-budget-types";export class SpendStore extends ExternalSpendStore { private _entries: ExternalSpendEntry[] = []; record(entry: ExternalSpendEntry): number { const id = super.record(entry); this._entries.push({ ...entry }); return id; } getSpend(scopeType: string, scopeKey: string): number { return this._entries .filter(e => e.scopeType as string === scopeType && e.scopeKey === scopeKey) .reduce((sum, e) => sum + e.cost, 0); } getTotal(scopeType: string, scopeKey: string): number { return this.getSpend(scopeType, scopeKey); } getSpans(scopeType: string, scopeKey: string): ExternalSpendEntry[] { return this._entries.filter(e => e.scopeType as string === scopeType && e.scopeKey === scopeKey); } reset(scopeType: string, scopeKey: string): void { this._entries = this._entries.filter(e => e.scopeType as string !== scopeType || e.scopeKey !== scopeKey); } health(): boolean { return true; }}
Now create src/services/budget-service.ts — the facade that checks budgets and records spend:
ts
import { BudgetController } from "@reaatech/agent-budget-engine";import { BudgetScope } from "@reaatech/agent-budget-types";import { SpendStore } from "../lib/adapters/spend-store";import * as logger from "../lib/logger";let controller: BudgetController | null = null;export function createBudgetController(): BudgetController { if (controller) return controller; controller = new BudgetController({ spendTracker: new SpendStore() }); controller.on("hard-stop", (event) => { logger.warn("Budget hard stop triggered", { scopeType: event.scopeType, scopeKey: event.scopeKey }); }); controller.on("threshold-breach", (event) => { logger.warn("Budget threshold breached", { threshold: event.threshold }); }); return controller;}export function defineUserBudget( userId: string, limitUsd: number, softCapPct: number = 0.8,): void { const c = createBudgetController(); c.defineBudget({ scopeType: BudgetScope.User, scopeKey: userId, limit: limitUsd, policy: { softCap: softCapPct, hardCap: 1.0, autoDowngrade: [], disableTools: [], }, });}export function checkBudget( userId: string, estimatedCost: number, modelId: string,): { allowed: boolean; suggestedModel?: string; disabledTools?: string[] } { const c = createBudgetController(); const result = c.check({ scopeType: BudgetScope.User, scopeKey: userId, estimatedCost, modelId, tools: [], }); return { allowed: result.allowed, suggestedModel: result.suggestedModel, disabledTools: result.disabledTools, };}export function recordSpend( userId: string, requestId: string, cost: number, inputTokens: number, outputTokens: number, modelId: string,): void { const c = createBudgetController(); c.record({ requestId, scopeType: BudgetScope.User, scopeKey: userId, cost, inputTokens, outputTokens, modelId, provider: "mistral", timestamp: new Date(), });}export function getBudgetState( userId: string,): { spent: number; remaining: number; state: string } { const c = createBudgetController(); const state = c.getState(BudgetScope.User, userId); if (!state) { return { spent: 0, remaining: 0, state: "unknown" }; } return { spent: state.spent, remaining: state.remaining, state: state.state, };}
Expected output:checkBudget("user-1", 0.05, "mistral-large-latest") returns { allowed: true } when the user’s $10 monthly budget hasn’t been exceeded.
Step 9: Add session continuity
Create src/lib/adapters/token-counter.ts for session context budgeting:
ts
import type { Message, TokenCounter } from "@reaatech/session-continuity";export class SimpleTokenCounter implements TokenCounter { readonly model = "simple-character"; readonly tokenizer = "character-count"; count(text: string): number { return Math.ceil(text.length / 4); } countMessages(messages: Message[]): number { let total = 0; for (const m of messages) { total += this.count(typeof m.content === "string" ? m.content : JSON.stringify(m.content)); } return total; }}
Create src/services/session-service.ts — this wraps @reaatech/session-continuity with a Supabase-backed storage adapter:
ts
import { SessionManager, SlidingWindowStrategy, type Session, type Message, type IStorageAdapter,} from "@reaatech/session-continuity";import type { SupabaseClient } from "@supabase/supabase-js";import { SimpleTokenCounter } from "../lib/adapters/token-counter";import { getConfig } from "../lib/config";import type { ExecutionOutput } from "../types";export function createSupabaseStorageAdapter(supabase: SupabaseClient): IStorageAdapter { return { async createSession(session) {
Expected output:openSession("user-1", "team-a") creates a new session in Supabase and returns the session object. appendUserQuery("session-1", "load my CSV") stores the query as a user message.
Step 10: Add cost telemetry and Langfuse observability
Create src/services/telemetry-service.ts — tracks LLM spend per user and team across time windows:
Create src/server/trpc.ts — initialize tRPC with your context type:
ts
import { initTRPC } from "@trpc/server";import type { SupabaseClient } from "@supabase/supabase-js";export interface Context { userId: string; teamId: string; supabase: SupabaseClient;}const t = initTRPC.context<Context>().create({});export const router = t.router;export const publicProcedure = t.procedure;export const protectedProcedure = t.procedure.use(({ ctx, next }) => { if (!ctx.userId || ctx.userId === "anonymous") { throw new Error("UNAUTHORIZED"); } return next({ ctx });});
Create src/server/context.ts — extracts user identity from request headers and injects the Supabase client:
ts
import type { NextRequest } from "next/server";import type { Context } from "./trpc";import { getSupabase } from "../services/supabase-service";export function createTRPCContext(req: NextRequest): Context { return { userId: req.headers.get("x-user-id") ?? "anonymous", teamId: req.headers.get("x-team-id") ?? "default", supabase: getSupabase(), };}
Create src/server/routers/analysis.ts — this is the main router that ties everything together:
ts
import { z } from "zod";import { router, publicProcedure } from "../trpc";import * as budgetService from "../../services/budget-service";import * as mistralService from "../../services/mistral-service";import * as sessionService from "../../services/session-service";import * as firewallService from "../../services/firewall-service";import * as codeInterpreterService from "../../services/code-interpreter-service";import { getSupabase } from "../../services/supabase-service";export const analysisRouter = router({
Create src/server/routers/_app.ts to export the merged router:
ts
import { router } from "../trpc";import { analysisRouter } from "./analysis";export const appRouter = router({ analysis: analysisRouter,});export type AppRouter = typeof appRouter;
Step 13: Create the Next.js API route handlers
Create the tRPC catch-all handler at app/api/trpc/[trpc]/route.ts:
ts
import { type NextRequest } from "next/server";import { fetchRequestHandler } from "@trpc/server/adapters/fetch";import { appRouter } from "../../../../src/server/routers/_app";import { createTRPCContext } from "../../../../src/server/context";const handler = (req: NextRequest) => { return fetchRequestHandler({ endpoint: "/api/trpc", req, router: appRouter, createContext: () => createTRPCContext(req), });};export { handler as GET, handler as POST };
Create the analyze endpoint at app/api/analyze/route.ts:
Expected output:POST /api/analyze with { "query": "sum the sales data" } returns { "requestId": "...", "sessionId": "...", "previewCode": "..." }. POST /api/approve with { "requestId": "...", "approved": true, "reviewerId": "reviewer-1" } returns { "ok": true }.
Step 14: Build the frontend UI
Create the tRPC client utility at src/utils/trpc.tsx:
ts
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";import { httpBatchLink } from "@trpc/client";import { createTRPCReact } from "@trpc/react-query";import type { AppRouter } from "../server/routers/_app";export const trpc = createTRPCReact<AppRouter>();export function TRPCProvider({ children }: { children: React.ReactNode }) { const queryClient = new QueryClient(); const trpcClient = trpc.createClient({ links: [httpBatchLink({ url: "/api/trpc" })], }); return ( <QueryClientProvider client={queryClient}> <trpc.Provider client={trpcClient} queryClient={queryClient}> {children} </trpc.Provider> </QueryClientProvider> );}
Now update app/layout.tsx to provide the tRPC context:
ts
import type { Metadata } from "next";import { TRPCProvider } from "@/src/utils/trpc";export const metadata: Metadata = { title: "Mistral AI Code Sandbox", description: "AI-powered Python code analysis with human approval gates",};export default function RootLayout({ children,}: Readonly<{ children: React.ReactNode;}>) { return ( <html lang="en"> <body style={{ margin: 0, padding: 0 }}> <TRPCProvider>{children}</TRPCProvider> </body> </html> );}
Replace the placeholder app/page.tsx with the sandbox UI — a full React client component that wires query input, budget display, code preview, approval buttons, and execution output:
ts
"use client";import { useEffect, useState } from "react";import { trpc } from "@/src/utils/trpc";import { QueryClient, QueryClientProvider } from "@tanstack/react-query";import { httpBatchLink } from "@trpc/client";function makeQueryClient() { return new QueryClient({ defaultOptions: { queries: { staleTime: 30_000 } }, });}let queryClient: QueryClient | undefined;function getQueryClient() { if (typeof window === "undefined") return
Expected output: When you run pnpm dev and visit http://localhost:3000, you’ll see the sandbox UI with a textarea, budget bar, and — after submitting a query — the generated Python code with Approve and Reject buttons.
Step 15: Run the tests
The project includes unit tests for every service and integration tests for the end-to-end approval workflow. Run them with:
terminal
pnpm test
Expected output: All tests pass with numFailedTests=0 and coverage above 90% for lines, branches, functions, and statements on runtime code.
Tests cover the full approval lifecycle: submit a query, create an approval request, approve it, execute the code in the sandbox, and store results. Rejection paths and budget exhaustion are also tested.
Next steps
Add a real review UI — Create an approval dashboard page listing all pending requests per team, so reviewers can see all queued code at once instead of approving inline.
Persist budget state — The current SpendStore is in-memory; wire it to Supabase so budget state survives server restarts and is shareable across replicas.
Add usage alerts — Integrate the budget engine’s threshold-breach event to send Slack or email notifications when a user approaches their spending cap.
Support file inputs — Extend the UI to let users upload CSVs or data files, which would be passed into the E2B sandbox’s filesystem before code execution.
Add multi-model support — Allow users to choose between Mistral, Claude, or GPT for code generation, with per-model pricing and budget enforcement.