Ollama Reliability Suite for On-Prem SMB Agent Operations

Keep local Ollama agents running 24/7 with automatic circuit breaking, runbook generation, session durability, and repair of broken structured outputs.

Tags: ollama, reliability, circuit-breaker, session-continuity, structured-output, runbook, agents, typescript, nextjs, triggerdev

The problem

Small businesses running local AI agents on Ollama face silent failures when models return garbled JSON, hit rate limits, or crash mid-session, leaving customers hanging. They lack the ops tooling to detect and recover from these failures without a dedicated SRE.

Intro

This tutorial builds a complete reliability suite: a circuit breaker that stops hammering a downed model, session continuity that preserves conversation state across restarts, a structured output repair engine that fixes malformed LLM JSON before it reaches your application logic, and automated runbook generation via Trigger.dev durable workflows when sessions time out.
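To make the circuit breaker idea concrete before the packages come in, here is a minimal sketch of the closed / open / half-open state machine. Every name in it (`SimpleCircuitBreaker`, `BreakerOptions`, `failureThreshold`, `resetTimeoutMs`) is hypothetical and chosen for illustration; it is not the `@reaatech/circuit-breaker-core` API, which may differ.

```typescript
// Illustrative circuit breaker; hypothetical names, not the package's API.
type CircuitState = 'closed' | 'open' | 'half-open';

interface BreakerOptions {
  failureThreshold: number; // consecutive failures before the circuit opens
  resetTimeoutMs: number;   // time to stay open before allowing a probe call
  now?: () => number;       // injectable clock so the logic can be tested
}

class SimpleCircuitBreaker {
  private state: CircuitState = 'closed';
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(options: BreakerOptions) {
    this.failureThreshold = options.failureThreshold;
    this.resetTimeoutMs = options.resetTimeoutMs;
    this.now = options.now ?? (() => Date.now());
  }

  getState(): CircuitState {
    // An open circuit becomes half-open once the reset timeout has elapsed.
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'half-open';
    }
    return this.state;
  }

  canRequest(): boolean {
    return this.getState() !== 'open';
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = 'closed';
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    // A failed probe in half-open, or too many failures in a row, opens the circuit.
    if (this.getState() === 'half-open' || this.consecutiveFailures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }

  // Wraps any async call, for example a request to the local model.
  async execute<T>(call: () => Promise<T>): Promise<T> {
    if (!this.canRequest()) {
      throw new Error('Circuit is open; skipping call');
    }
    try {
      const result = await call();
      this.recordSuccess();
      return result;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }
}
```

In use, the model call would be passed to `execute`, so that after a run of failures the agent fails fast instead of queueing more requests against a model that is down.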

You will build a CLI agent and a Next.js API route that demonstrate all four reliability layers working together. The code runs against a local Ollama instance and works on any Linux or macOS machine with Node.js 22+, pnpm, and a running Ollama server.
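One of those layers, structured output repair, can be previewed without any dependencies. The sketch below pulls a JSON object out of surrounding chatter, drops trailing commas, parses it, and validates the shape. The `Ticket` type and the helper names are hypothetical, and the approach is deliberately simpler than what `jsonrepair` or `@reaatech/structured-repair-core` do; it only illustrates the repair-then-validate idea.

```typescript
// Repair-then-validate sketch; hypothetical names, not a library API.
interface Ticket {
  intent: string;
  priority: number;
}

// Runtime type guard: parsed JSON is untrusted until its shape is checked.
function isTicket(value: unknown): value is Ticket {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.intent === 'string' && typeof record.priority === 'number';
}

// Take the text between the first "{" and the last "}", ignoring any
// prose the model wrapped around the object.
function extractJsonObject(raw: string): string | null {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  return start !== -1 && end > start ? raw.slice(start, end + 1) : null;
}

function repairAndParse(raw: string): Ticket | null {
  const candidate = extractJsonObject(raw);
  if (candidate === null) return null;
  // Drop trailing commas before a closing brace or bracket.
  // Limitation: this regex would also touch ",}" inside a string value.
  const withoutTrailingCommas = candidate.replace(/,\s*([}\]])/g, '$1');
  try {
    const parsed: unknown = JSON.parse(withoutTrailingCommas);
    return isTicket(parsed) ? parsed : null;
  } catch {
    // Still not valid JSON: report failure instead of throwing.
    return null;
  }
}
```

Returning `null` rather than throwing lets the caller decide whether to retry the model, fall back to a default, or count the failure against the circuit breaker.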

Prerequisites

Node.js >= 22 installed
pnpm 10+ (npm install -g pnpm@10)
Ollama running locally (ollama serve)
An Upstash Redis account (free tier works) for rate limiting
A Trigger.dev account for durable workflows
An Anthropic API key for runbook generation with Claude
A Langfuse account for LLM tracing (free tier works)
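Before installing anything, it can help to verify the local prerequisites in code. The following is a small preflight sketch with hypothetical helper names. It assumes Ollama is listening on its default address, `http://localhost:11434`, and uses the `GET /api/tags` endpoint, which lists locally available models. Which environment variables the project expects beyond `ANTHROPIC_API_KEY` is not stated here, so the list of required names is left to the caller.

```typescript
// Preflight sketch; helper names are hypothetical.

// True when a Node.js version string such as "v22.3.0" meets the minimum major.
function meetsMinimumNode(version: string, minimumMajor: number): boolean {
  const match = /^v?(\d+)\./.exec(version);
  return match !== null && Number(match[1]) >= minimumMajor;
}

// Returns the names from `required` that are unset or empty in `env`.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => !env[name]);
}

// Resolves to true when the Ollama server answers on /api/tags.
// Assumption: default base URL; pass another one if yours differs.
async function ollamaIsReachable(baseUrl = 'http://localhost:11434'): Promise<boolean> {
  try {
    const response = await fetch(`${baseUrl}/api/tags`, {
      signal: AbortSignal.timeout(2000), // give up after two seconds
    });
    return response.ok;
  } catch {
    // Connection refused, timeout, or DNS failure all count as unreachable.
    return false;
  }
}
```

A setup script could call `meetsMinimumNode(process.version, 22)`, `missingEnvVars(process.env, ['ANTHROPIC_API_KEY'])`, and `await ollamaIsReachable()` and print one line per failed check.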

Step 1: Scaffold the project and install dependencies

The project uses Next.js 16 (App Router) as its shell, with TypeScript, Vitest for testing, and ESLint for linting. Start by installing the dependencies already listed in package.json:

terminal

pnpm install

This installs the four REAA reliability packages plus third-party dependencies:

Package	Version	Purpose
`@reaatech/circuit-breaker-core`	0.1.0	Circuit breaker state machine
`@reaatech/session-continuity`	0.1.0	Session lifecycle manager
`@reaatech/structured-repair-core`	1.0.0	Structured output repair engine
`@reaatech/agent-runbook-agent`	0.1.0	AI-powered runbook generation
`ollama`	0.6.3	Ollama TypeScript client
`@trigger.dev/sdk`	4.4.6	Durable workflow framework
`zod`	4.4.3	Schema validation
`jsonrepair`	3.14.0	JSON syntax repair
`@upstash/ratelimit`	2.0.8	Rate limiting via Upstash Redis
`@upstash/redis`	1.38.0	Upstash Redis client (used by ratelimit)
`langfuse`	3.38.20	LLM observability and tracing
`commander`	14.0.0	CLI argument parsing
`next`	16.2.6	Next.js framework
`react` / `react-dom`	19.2.4	Next.js peer dependencies

Example artifact

A complete, working implementation of this recipe, downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.

177 kB · 198 tests · 100.0% coverage · vitest passing

SHA-256: 75072bc866ffac4edf35bac82724f844b5ccab5a6e365415b3f4eb997156f90e

import { task, logger } from '@trigger.dev/sdk';
import { createAnalysisAgent } from '@reaatech/agent-runbook-agent';
import { createSessionManager } from './session-manager.js';

export const sessionTimeoutTask = task({
  id: 'session-timeout',
  run: async (payload: { sessionId: string }) => {
    const manager = createSessionManager();
    try {
      // Load the stored conversation for the timed-out session.
      const messages = await manager.getConversationContext(payload.sessionId);

      // Group message text by role; non-string content is serialized as JSON.
      const analysisContext: Record<string, unknown> = {
        systemPrompts: messages
          .filter((m) => m.role === 'system')
          .map((m) =>
            typeof m.content === 'string' ? m.content : JSON.stringify(m.content),
          ),
        userMessages: messages
          .filter((m) => m.role === 'user')
          .map((m) =>
            typeof m.content === 'string' ? m.content : JSON.stringify(m.content),
          ),
        assistantResponses: messages
          .filter((m) => m.role === 'assistant')
          .map((m) =>
            typeof m.content === 'string' ? m.content : JSON.stringify(m.content),
          ),
        errorCount: 0,
        circuitState: 'unknown',
      };

      const agent = createAnalysisAgent({
        provider: 'claude',
        model: 'claude-opus-4-5',
        apiKey: process.env.ANTHROPIC_API_KEY,
        temperature: 0.3,
      });

      // Run the failure-mode analysis and the three runbook sections concurrently.
      const [failureModes, alertsSection, incidentSection, healthSection] =
        await Promise.all([
          agent.identifyFailureModes(analysisContext as never),
          agent.generateRunbookSection('alerts', analysisContext as never),
          agent.generateRunbookSection('incident-response', analysisContext as never),
          agent.generateRunbookSection('health-checks', analysisContext as never),
        ]);

      logger.info('Session timeout analysis complete', {
        sessionId: payload.sessionId,
        failureModeCount: failureModes.length,
      });

      return {
        sessionId: payload.sessionId,
        failureModes,
        runbook: {
          alerts: alertsSection,
          'incident-response': incidentSection,
          'health-checks': healthSection,
        },
      };
    } catch (error) {
      // On failure, log and return a placeholder runbook instead of throwing.
      const message = error instanceof Error ? error.message : String(error);
      logger.error('Session timeout analysis failed', {
        sessionId: payload.sessionId,
        error: message,
      });
      return {
        sessionId: payload.sessionId,
        error: message,
        failureModes: [],
        runbook: {
          alerts: `Error generating runbook: ${message}`,
          'incident-response': `Error generating runbook: ${message}`,
          'health-checks': `Error generating runbook: ${message}`,
        },
      };
    } finally {
      // Always release the session manager, whether analysis succeeded or not.
      await manager.close();
    }
  },
});

// References the task so that importing this module registers it.
export function createTimeoutWorkflow(): void {
  void sessionTimeoutTask;
}
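The task above builds its analysis context inline and repeats the same filter-and-map expression for each role. As a possible refactor, that step can be pulled into a pure function that is easy to unit test without Trigger.dev or Claude. The `ChatMessage` shape below is an assumption made for this sketch, not a type exported by `@reaatech/session-continuity`, and the helper names are hypothetical.

```typescript
// Pure-function sketch of the role-grouping step; ChatMessage is an assumed shape.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: unknown;
}

// Strings pass through; anything else is serialized so the context holds only text.
function contentToText(content: unknown): string {
  return typeof content === 'string' ? content : JSON.stringify(content);
}

// Collects the text of every message with the given role, in original order.
function textsByRole(messages: ChatMessage[], role: ChatMessage['role']): string[] {
  return messages
    .filter((message) => message.role === role)
    .map((message) => contentToText(message.content));
}

// Produces the same keys as the inline object in the task.
function buildAnalysisContext(messages: ChatMessage[]): Record<string, unknown> {
  return {
    systemPrompts: textsByRole(messages, 'system'),
    userMessages: textsByRole(messages, 'user'),
    assistantResponses: textsByRole(messages, 'assistant'),
    errorCount: 0,
    circuitState: 'unknown',
  };
}
```

With this in place, the task body would reduce to `const analysisContext = buildAnalysisContext(messages);`, provided the messages returned by the session manager match the assumed shape.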

Tutorial outline

Intro
Prerequisites
Step 1: Scaffold the project and install dependencies
Step 2: Create the Ollama API client wrapper
Step 3: Configure the circuit breaker
Step 4: Build the structured output repair pipeline
Step 5: Create the token counter and rate limiter
Step 6: Set up observability with Langfuse
Step 7: Implement the in-memory storage adapter
Step 8: Create the session manager
Step 9: Build the agent orchestrator
Step 10: Set up Trigger.dev workflow for runbook generation
Step 11: Build the CLI
Step 12: Add the Next.js API route
Step 13: Run the tests
Next steps