# Ollama Code Sandbox for SMB Financial Analysis
 
> Self-hosted, on-premises code execution environment that lets small-business analysts generate and run Python and SQL analyses from natural-language queries, with automatic cost controls and reliability safeguards.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
 
## Features
 
- **Ollama code generation** — Generate Python/SQL analysis code from natural-language queries using local LLMs
- **Daytona sandbox execution** — Execute generated code in isolated ephemeral sandboxes
- **Budget controller** — Enforce per-user/scope spending limits with soft and hard caps
- **Circuit breaker** — Prevent cascading failures when the sandbox service degrades
- **Semantic cache** — Avoid redundant LLM calls via exact-match and semantic-similarity caching
- **Code repair** — Extract fenced code blocks and fix common syntax issues before execution
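
The extraction half of the code-repair feature can be sketched as follows. This is a minimal illustration, not the project's actual implementation: `extractFencedCode` and `FENCE` are hypothetical names, and the sketch only pulls out the first fenced block without attempting any syntax fixes.

```typescript
// Hypothetical helper: this README does not show the project's real repair code.
// The fence marker is built at runtime so this example nests cleanly in Markdown.
const FENCE = "`".repeat(3);

function extractFencedCode(llmOutput: string): string {
  // First fenced block, with an optional language tag after the opening fence.
  const pattern = new RegExp(FENCE + "[^\\n`]*\\n([\\s\\S]*?)" + FENCE);
  const match = pattern.exec(llmOutput);
  // Fall back to the raw text when the model replied with unfenced code.
  const code = match?.[1] ?? llmOutput;
  return code.trim();
}

// Usage: a typical chat-style reply that wraps the code in prose and a fence.
const reply = ["Here is the code:", FENCE + "python", "print('hello')", FENCE].join("\n");
const repaired = extractFencedCode(reply); // "print('hello')"
```

Falling back to the raw reply keeps the pipeline working when a model skips the fence entirely, at the cost of occasionally passing prose to the sandbox.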
 
## Prerequisites
 
- **Node.js** >= 22
- **pnpm** >= 10
- **Ollama** server running locally (default: `http://127.0.0.1:11434`) with a model pulled (e.g. `llama3.1`)
- **Daytona** server with API access
 
## Environment Variables
 
| Variable | Default | Description |
|---|---|---|
| `OLLAMA_HOST` | `http://127.0.0.1:11434` | Ollama server URL |
| `OLLAMA_MODEL` | `llama3.1` | Model name for code generation |
| `DAYTONA_API_KEY` | — | Daytona sandbox API key |
| `DAYTONA_SERVER_URL` | — | Daytona server URL |
| `BUDGET_DEFAULT_LIMIT` | `10.0` | Default budget in dollars |
| `BUDGET_SOFT_CAP` | `0.8` | Soft-cap threshold as a fraction of the budget limit (0–1) |
| `BUDGET_HARD_CAP` | `1.0` | Hard-cap threshold as a fraction of the budget limit (0–1) |
| `LANGFUSE_PUBLIC_KEY` | — | Langfuse observability public key |
| `LANGFUSE_SECRET_KEY` | — | Langfuse observability secret key |
| `LANGFUSE_HOST` | `https://cloud.langfuse.com` | Langfuse server URL |
| `CACHE_TTL_SECONDS` | `3600` | Cache entry TTL |
| `CACHE_SIMILARITY_THRESHOLD` | `0.85` | Semantic similarity threshold |
| `LOG_LEVEL` | `info` | Pino log level |
| `CIRCUIT_BREAKER_FAILURE_THRESHOLD` | `5` | Consecutive failures before open |
| `CIRCUIT_BREAKER_RECOVERY_TIMEOUT_MS` | `30000` | Milliseconds before half-open retry |
| `OLLAMA_INPUT_TOKEN_RATE` | `0` | Input token cost override ($/token) |
| `OLLAMA_OUTPUT_TOKEN_RATE` | `0` | Output token cost override ($/token) |
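
To make the two circuit-breaker settings concrete, here is a rough sketch of a closed / open / half-open state machine driven by `CIRCUIT_BREAKER_FAILURE_THRESHOLD` and `CIRCUIT_BREAKER_RECOVERY_TIMEOUT_MS`. The class and method names are hypothetical and the real package may behave differently; in particular, re-opening immediately after a failed half-open trial is an assumption.

```typescript
// Hypothetical sketch: names are illustrative, not the project's API.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly recoveryTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold = 5, // CIRCUIT_BREAKER_FAILURE_THRESHOLD
    recoveryTimeoutMs = 30000, // CIRCUIT_BREAKER_RECOVERY_TIMEOUT_MS
    now: () => number = () => Date.now(), // injectable clock, handy in tests
  ) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeoutMs = recoveryTimeoutMs;
    this.now = now;
  }

  // Current state; moves from open to half-open once the recovery timeout elapses.
  currentState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.recoveryTimeoutMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    // Assumption: a failed half-open trial re-opens at once; otherwise wait for the threshold.
    if (this.currentState() === "half-open" || this.consecutiveFailures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  // Wraps a sandbox call: fail fast while open, otherwise run it and record the outcome.
  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (this.currentState() === "open") {
      throw new Error("CircuitOpen: sandbox call skipped");
    }
    try {
      const value = await operation();
      this.recordSuccess();
      return value;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}
```

With the defaults, five consecutive sandbox failures open the breaker, and the next trial call is allowed 30 seconds later.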
 
## Getting Started
 
```bash
pnpm install
cp .env.example .env
pnpm dev
```
 
Open [http://localhost:3000](http://localhost:3000) to access the sandbox UI.
 
## API Reference
 
### `POST /api/analyze`
 
Generate and execute analysis code from a natural-language query.
 
**Request body:**
 
```json
{
  "query": "Calculate total revenue by month for Q1",
  "language": "python",
  "budgetScope": "default",
  "useCase": "general"
}
```
 
| Field | Type | Default | Description |
|---|---|---|---|
| `query` | `string` | — | Natural-language analysis description (required) |
| `language` | `"python" \| "sql"` | `"python"` | Target language |
| `budgetScope` | `string` | `"default"` | Budget scope key |
| `useCase` | `string` | `"general"` | Cache segmentation key |
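
The defaults in the table above can be applied with a small normalization step. The following is a sketch only: `parseAnalyzeRequest`, `AnalyzeRequest`, and the `ValidationError` class are hypothetical names chosen to mirror the field table and the `400` error, not the project's actual code.

```typescript
type AnalysisLanguage = "python" | "sql";

interface AnalyzeRequest {
  query: string;
  language: AnalysisLanguage;
  budgetScope: string;
  useCase: string;
}

// Hypothetical error type; the API reports it as "ValidationError" with status 400.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

function parseAnalyzeRequest(body: unknown): AnalyzeRequest {
  if (typeof body !== "object" || body === null) {
    throw new ValidationError("request body must be a JSON object");
  }
  const raw = body as Record<string, unknown>;

  // `query` is the only required field.
  const query: unknown = raw["query"];
  if (typeof query !== "string" || query.trim() === "") {
    throw new ValidationError("`query` is required and must be a non-empty string");
  }

  // `language` defaults to "python"; any other explicit value is rejected.
  const languageInput: unknown = raw["language"];
  let language: AnalysisLanguage = "python";
  if (languageInput === "python" || languageInput === "sql") {
    language = languageInput;
  } else if (languageInput !== undefined) {
    throw new ValidationError('`language` must be "python" or "sql"');
  }

  const budgetScope: unknown = raw["budgetScope"];
  const useCase: unknown = raw["useCase"];
  return {
    query,
    language,
    budgetScope: typeof budgetScope === "string" ? budgetScope : "default",
    useCase: typeof useCase === "string" ? useCase : "general",
  };
}
```

Silently defaulting a non-string `budgetScope` or `useCase` is a design choice of this sketch; a stricter handler could reject those values instead.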
 
**Success response (200):**
 
```json
{
  "code": "print('hello')",
  "output": "hello\n",
  "cacheHit": false,
  "cost": 0.0005,
  "tokensUsed": { "input": 120, "output": 45 }
}
```
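
This README does not spell out how `cost` is derived. One plausible reading, given the `OLLAMA_INPUT_TOKEN_RATE` and `OLLAMA_OUTPUT_TOKEN_RATE` settings, is a per-token sum over `tokensUsed`. The sketch below assumes exactly that; `estimateCost` and `TokenUsage` are hypothetical names and the rates in the usage lines are made up.

```typescript
interface TokenUsage {
  input: number;
  output: number;
}

// Assumed formula: tokens multiplied by their per-token dollar rates, then summed.
function estimateCost(tokens: TokenUsage, inputRate: number, outputRate: number): number {
  return tokens.input * inputRate + tokens.output * outputRate;
}

// With the default rates of 0, a local model costs nothing.
const freeRun = estimateCost({ input: 120, output: 45 }, 0, 0); // 0

// Illustrative non-zero overrides (hypothetical values, in $/token).
const billedRun = estimateCost({ input: 120, output: 45 }, 0.000002, 0.000006);
```

Under this assumption, a non-zero `cost` in a response implies that at least one of the two rate overrides has been set.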
 
**Error responses:**
 
| Status | Error |
|---|---|
| `400` | `"ValidationError"` — missing or invalid `query` |
| `402` | `"BudgetExceeded"` — spending limit reached |
| `500` | `"InternalServerError"` — unexpected failure |
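
The `402` case depends on how the soft and hard caps are evaluated. As a sketch under stated assumptions, the check below treats both caps as fractions of the budget limit and compares them with the projected spend. `checkBudget`, `BudgetPolicy`, and the three decision labels are hypothetical, and whether the hard cap trips at or only above the limit is a guess.

```typescript
type BudgetDecision = "allow" | "warn" | "block";

interface BudgetPolicy {
  limit: number; // BUDGET_DEFAULT_LIMIT, in dollars
  softCap: number; // BUDGET_SOFT_CAP, fraction of the limit
  hardCap: number; // BUDGET_HARD_CAP, fraction of the limit
}

// Defaults taken from the environment variable table.
const defaultPolicy: BudgetPolicy = { limit: 10.0, softCap: 0.8, hardCap: 1.0 };

function checkBudget(
  spent: number,
  estimatedCost: number,
  policy: BudgetPolicy = defaultPolicy,
): BudgetDecision {
  const projected = spent + estimatedCost;
  // Assumption: the hard cap blocks only when projected spend goes above it.
  if (projected > policy.limit * policy.hardCap) return "block";
  // Assumption: reaching the soft cap still lets the request through, with a warning.
  if (projected >= policy.limit * policy.softCap) return "warn";
  return "allow";
}

// Usage: a handler would map "block" to the 402 "BudgetExceeded" response.
const decision = checkBudget(9.5, 1.0); // "block", since 10.5 exceeds the 10.0 limit
```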
 
## Architecture
 
The analysis pipeline flows through these stages:
 
```
User query

[1] Semantic cache lookup — return cached result if similar query exists

[2] Budget check — verify the request does not exceed the spending cap

[3] Ollama code generation — LLM produces executable code from the query

[4] Code repair — extract fenced block, fix syntax issues

[5] Daytona sandbox execution — run code in ephemeral isolated environment

[6] Spend recording — tally actual cost against the budget

[7] Cache store — persist the result for future requests

Response returned to caller
```
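
The seven stages above can be wired together roughly as follows. This is a hedged sketch, not the project's code: every stage is passed in as a function so the example does not depend on the real Ollama, Daytona, or `@reaatech/*` APIs, and `PipelineStages`, `AnalysisResult`, and `analyze` are hypothetical names.

```typescript
interface AnalysisResult {
  code: string;
  output: string;
  cacheHit: boolean;
  cost: number;
}

// Hypothetical stage contract; real implementations would call Ollama, Daytona, etc.
interface PipelineStages {
  cacheLookup(query: string): Promise<AnalysisResult | undefined>;
  budgetAllows(scope: string): Promise<boolean>;
  generateCode(query: string): Promise<{ text: string; cost: number }>;
  repairCode(text: string): string;
  runInSandbox(code: string): Promise<string>;
  recordSpend(scope: string, cost: number): Promise<void>;
  cacheStore(query: string, result: AnalysisResult): Promise<void>;
}

async function analyze(
  query: string,
  scope: string,
  stages: PipelineStages,
): Promise<AnalysisResult> {
  // [1] Return early on a cache hit; no budget is consumed.
  const cached = await stages.cacheLookup(query);
  if (cached) return { ...cached, cacheHit: true };

  // [2] Refuse the request before any LLM or sandbox work happens.
  if (!(await stages.budgetAllows(scope))) {
    throw new Error("BudgetExceeded");
  }

  const generated = await stages.generateCode(query); // [3]
  const code = stages.repairCode(generated.text); // [4]
  const output = await stages.runInSandbox(code); // [5]
  await stages.recordSpend(scope, generated.cost); // [6]

  const result: AnalysisResult = { code, output, cacheHit: false, cost: generated.cost };
  await stages.cacheStore(query, result); // [7]
  return result;
}
```

Injecting the stages also makes the flow easy to unit test with in-memory stubs, for example a `Map`-backed cache and a sandbox stub that returns a fixed string.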
 
## Testing
 
```bash
pnpm test        # vitest run with coverage
pnpm typecheck   # tsc --noEmit
pnpm lint        # eslint
```
 
## License
 
MIT — see [LICENSE](./LICENSE).