SMBs relying on AI agents experience frequent tool failures, duplicate operations, unhandled errors, and uncontrolled costs, which erode trust and delay critical processes.
In this tutorial you’ll build a production-grade AI agent system on Google Gemini that handles tool failures gracefully, prevents duplicate operations, enforces spending limits, and generates automated incident runbooks. By the end you’ll have a working Next.js application with a reliability layer wrapping every Gemini call — circuit breakers isolate broken tools, idempotency keys block duplicate mutations, a budget engine caps spend with auto-downgrade, and incident workflows fire automatically when things go wrong.
Prerequisites
Node.js >= 22 — the project requires modern ESM support
pnpm 10.x — package manager (set via packageManager in package.json)
Google Cloud project — with Vertex AI or Gemini API enabled
Redis instance — local or cloud (Upstash, Redis Enterprise Cloud, etc.)
Trigger.dev account — free tier works; you need an API key
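Required environment variables (copy the values from your providers): this tutorial references GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, REDIS_URL, and TRIGGER_API_KEY.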
Step 1: Set up the project
Start from an empty directory and create the project configuration files. These define the TypeScript, Next.js, Vitest, and ESLint setup that all subsequent code depends on.
Expected output: The install finishes with no error-level output. You should see the node_modules directory created and pnpm-lock.yaml written.
Step 2: Configure environment variables
Create .env.local with the values for your Google Cloud project, Redis instance, and Trigger.dev account. The config module validates these at runtime using Zod.
Create src/config/env.ts. This validates all environment variables at startup using Zod. If a required variable is missing, the module falls back to empty strings rather than crashing — this allows tests to run without a full environment:
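The fallback rule itself does not depend on Zod. As a rough illustration only (plain TypeScript, not the recipe's env.ts), the sketch below resolves missing values to empty strings and adds a separate check that can still flag them outside of tests. `readEnv` and `missingKeys` are hypothetical helpers, and the variable names are simply the ones this tutorial references.

```typescript
// Hypothetical stand-in that shows only the empty-string fallback rule.
const ENV_KEYS = [
  "GOOGLE_CLOUD_PROJECT",
  "GOOGLE_CLOUD_LOCATION",
  "REDIS_URL",
  "TRIGGER_API_KEY",
] as const;

type EnvKey = (typeof ENV_KEYS)[number];
type Env = Record<EnvKey, string>;

// Reads each key from `source`; a missing value becomes "" instead of throwing.
function readEnv(source: Record<string, string | undefined>): Env {
  const env = {} as Env;
  for (const key of ENV_KEYS) {
    env[key] = source[key] ?? "";
  }
  return env;
}

// Lists the keys that fell back to "", so a production boot can still fail loudly.
function missingKeys(env: Env): EnvKey[] {
  return ENV_KEYS.filter((key) => env[key] === "");
}
```

Calling `readEnv(process.env)` in tests yields a fully populated object, while a production entry point could refuse to start when `missingKeys` returns a non-empty list.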
Create src/config/genai.ts. This creates the Gemini client once and exports it as a singleton:
ts
import { GoogleGenAI } from "@google/genai";
import { env } from "./env.js";

export const genai = new GoogleGenAI({
  enterprise: true,
  project: env.GOOGLE_CLOUD_PROJECT,
  location: env.GOOGLE_CLOUD_LOCATION,
});
Create src/config/redis.ts. This instantiates the Redis client that the idempotency and circuit-breaker layers depend on:
ts
import { Redis } from "ioredis";
import { env } from "./env.js";

export const redis = new Redis(env.REDIS_URL);
Create src/config/trigger.ts. This wraps the Trigger.dev v3 tasks API:
ts
import { tasks } from "@trigger.dev/sdk/v3";

export const triggerClient = {
  trigger: tasks.trigger,
  triggerAndWait: tasks.triggerAndWait,
};

export type TriggerClient = typeof triggerClient;
Create src/index.ts. This is the public entry point for the package:
ts
// Placeholder entry point; the builder replaces this with real services.
export const SCAFFOLD_VERSION = "0.1.0" as const;
Expected output: After running pnpm typecheck, TypeScript reports no errors.
Step 4: Build the circuit breaker infrastructure
The circuit breaker layer isolates tool failures so a broken tool (e.g., a down email service) doesn’t cascade to other operations. Each tool gets its own CircuitBreaker instance backed by a Redis adapter that distributes state across your infrastructure.
The adapter uses a Redis key prefix of "cb" and an 86400-second TTL (24 hours). When a tool fails 5 times within 30 seconds, the circuit opens for 30 seconds before allowing a retry. createToolCircuitBreaker lazily connects to Redis on first use so tests can mock the adapter without needing a real Redis instance.
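The open/closed behaviour described above can be sketched as a small state machine. This is an in-memory illustration under stated assumptions, not the recipe's Redis-backed breaker: `SketchCircuitBreaker`, its option names, and the half-open trial state are hypothetical, and timestamps are passed in explicitly so the logic stays deterministic.

```typescript
// Hypothetical in-memory sketch; the recipe's breaker keeps this state in Redis.
type BreakerState = "closed" | "open" | "half_open";

interface BreakerOptions {
  failureThreshold: number; // failures within the window that open the circuit
  windowMs: number; // rolling window for counting failures
  openMs: number; // how long the circuit stays open before a trial call
}

class SketchCircuitBreaker {
  private readonly opts: BreakerOptions;
  private failureTimes: number[] = [];
  private openedAt: number | null = null;

  constructor(opts: BreakerOptions) {
    this.opts = opts;
  }

  // Derives the state from the time the circuit last opened.
  state(now: number): BreakerState {
    if (this.openedAt === null) return "closed";
    return now - this.openedAt >= this.opts.openMs ? "half_open" : "open";
  }

  canExecute(now: number): boolean {
    return this.state(now) !== "open";
  }

  recordSuccess(): void {
    this.failureTimes = [];
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    if (this.state(now) === "half_open") {
      // The trial call failed, so the circuit re-opens for another full period.
      this.openedAt = now;
      return;
    }
    // Drop failures that have aged out of the window, then count this one.
    this.failureTimes = this.failureTimes.filter((t) => now - t < this.opts.windowMs);
    this.failureTimes.push(now);
    if (this.failureTimes.length >= this.opts.failureThreshold) {
      this.openedAt = now;
      this.failureTimes = [];
    }
  }
}
```

With `{ failureThreshold: 5, windowMs: 30_000, openMs: 30_000 }`, a caller would check `canExecute(Date.now())` before each tool call and report the outcome through `recordSuccess` or `recordFailure`. Failures spread more than 30 seconds apart never accumulate to the threshold.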
Step 5: Build the idempotency middleware
The idempotency layer prevents duplicate operations. Before any mutation runs, you call checkAndSetIdempotencyKey. If the key already exists in Redis, another worker has already claimed the operation, so you return its cached result (once one has been stored) instead of running the mutation again.
checkAndSetIdempotencyKey uses Redis’s SET ... NX command — it atomically sets the key only if it doesn’t already exist. If isNew is false, another process already claimed this operation. getIdempotencyResult reads the cached output stored under result:<key> to return a prior response. setIdempotencyResult writes the result back to Redis so subsequent calls get the cached output.
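The claim-then-cache pattern can be sketched without a Redis connection. The code below is an illustration under assumptions, not the recipe's module: `MemoryStore` is a hypothetical in-memory stand-in whose `setIfAbsent` mimics SET ... NX, `runOnce` is a hypothetical wrapper, and the sketch omits the expiry a real deployment would set on both keys.

```typescript
// Hypothetical in-memory stand-in for Redis, used only to show the control flow.
class MemoryStore {
  private readonly data = new Map<string, string>();

  // Mirrors SET key value NX: writes only when the key is absent.
  setIfAbsent(key: string, value: string): boolean {
    if (this.data.has(key)) return false;
    this.data.set(key, value);
    return true;
  }

  get(key: string): string | null {
    return this.data.get(key) ?? null;
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

type RunOutcome<T> =
  | { status: "fresh" | "cache_hit"; result: T }
  | { status: "in_progress"; result: null };

// Claims the key if nobody has; on a repeat call, returns the stored result
// or reports that the first caller has not finished yet.
function runOnce<T>(store: MemoryStore, key: string, operation: () => T): RunOutcome<T> {
  const isNew = store.setIfAbsent(key, "claimed");
  if (!isNew) {
    const raw = store.get(`result:${key}`);
    return raw === null
      ? { status: "in_progress", result: null }
      : { status: "cache_hit", result: JSON.parse(raw) as T };
  }
  const result = operation();
  store.set(`result:${key}`, JSON.stringify(result));
  return { status: "fresh", result };
}
```

The separate "in_progress" outcome covers the window in which a key is claimed but no result exists yet. If the operation throws, this sketch leaves the key claimed, which is one reason a TTL on the claim matters in practice.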
Step 6: Build the runbook infrastructure
The runbook layer generates incident response workflows automatically when something goes wrong — a circuit breaker opens, a budget threshold is breached, or a tool fails. It uses @reaatech/agent-runbook-incident to produce severity-based escalation policies and notification templates.
reportFailure takes a failure event and returns the generated incident workflows, escalation policy, and a mapped severity. The workflow output is designed to plug into your incident management system (PagerDuty, OpsGenie, etc.) — the workflows object contains structured steps that can be executed programmatically.
Step 7: Build the budget controller
The budget layer enforces spending limits per user session. It checks estimated costs before each Gemini call, can auto-downgrade the model (e.g., from gemini-2.5-pro to gemini-2.5-flash) when spending approaches a soft cap, and fires a hard stop when the budget is exhausted.
defineUserBudget registers a session with a $0.50 limit and an auto-downgrade policy — if the session hits 80% of its budget, the engine suggests switching from gemini-2.5-pro to gemini-2.5-flash. Both threshold-breach and hard-stop events trigger reportFailure so the incident runbook system captures the budget violation.
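The soft-cap and hard-stop logic can be sketched as a pre-call check. This is a minimal illustration, not the recipe's budget engine: `SketchBudget`, `BudgetPolicy`, and `BudgetDecision` are hypothetical names, spend is held in memory, and the decision is assumed to be based on projected spend (recorded spend plus the estimate for the next call).

```typescript
// Hypothetical policy shape; field names are assumptions for illustration.
interface BudgetPolicy {
  limitUsd: number; // hard cap for the session
  softCapRatio: number; // fraction of the limit that triggers a downgrade
  preferredModel: string;
  fallbackModel: string;
}

type BudgetDecision =
  | { action: "allow"; model: string }
  | { action: "downgrade"; model: string }
  | { action: "hard_stop" };

class SketchBudget {
  private readonly policy: BudgetPolicy;
  private spentUsd = 0;

  constructor(policy: BudgetPolicy) {
    this.policy = policy;
  }

  // Decides what to do before a call, given its estimated cost.
  check(estimatedCostUsd: number): BudgetDecision {
    const projected = this.spentUsd + estimatedCostUsd;
    if (projected > this.policy.limitUsd) {
      return { action: "hard_stop" };
    }
    if (projected >= this.policy.limitUsd * this.policy.softCapRatio) {
      return { action: "downgrade", model: this.policy.fallbackModel };
    }
    return { action: "allow", model: this.policy.preferredModel };
  }

  // Records the actual cost once the call has completed.
  record(actualCostUsd: number): void {
    this.spentUsd += actualCostUsd;
  }
}
```

With a $0.50 limit and a 0.8 soft-cap ratio, a session that has spent $0.30 is moved to the fallback model for a call estimated at $0.15, and is stopped once the projected total would exceed $0.50.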
Step 8: Create the reliability agent workflow
This is the heart of the recipe. reliability-agent.ts defines a Trigger.dev task that wires together all the reliability layers: idempotency check, budget check, Gemini call, circuit-breaker-wrapped tool execution, and result caching. The task is also exported as a standalone function (runReliabilityAgent) so it can be tested directly without Trigger.dev.
Create src/workflows/reliability-agent.ts:
ts
import { task } from "@trigger.dev/sdk/v3";
import { type FunctionDeclaration } from "@google/genai";
import { genai } from "../config/genai.js";
import {
  checkAndSetIdempotencyKey,
  getIdempotencyResult,
  setIdempotencyResult,
} from "../infrastructure/idempotency.js";
import {
  checkBudget,
  recordSpend,
  defineUserBudget,
  BudgetScope,
} from "../infrastructure/budget.js";
import {
  createToolCircuitBreaker,
  executeWithBreaker,
  CircuitOpenError,
} from "../infrastructure/circuit-breaker.js";
import { reportFailure }
The flow: idempotency check → budget pre-check → Gemini call → circuit-breaker-wrapped tool calls → spend recording → result caching. If the idempotency key is a hit, the function returns the cached output immediately. If the budget is exceeded, it throws BudgetExceededError before any LLM call is made. Circuit breaker errors are reported to the runbook and then re-thrown so Trigger.dev can retry the full workflow.
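The ordering of those steps, and the two early exits, can be sketched with injected dependencies. This is a simplified illustration rather than the recipe's task: `FlowDeps` and `runFlow` are hypothetical, every dependency is synchronous here (the real Gemini, Redis, and tool calls are asynchronous), and at most one tool call is handled per run.

```typescript
// Hypothetical dependency bundle; each member stands in for one reliability layer.
interface FlowDeps {
  claimKey(key: string): boolean;
  readCached(key: string): string | null;
  writeCached(key: string, value: string): void;
  budgetAllows(sessionId: string): boolean;
  callModel(prompt: string): { text: string; costUsd: number; toolName?: string };
  runTool(name: string): string;
  recordSpend(sessionId: string, costUsd: number): void;
  report(reason: string): void;
}

interface FlowInput {
  prompt: string;
  sessionId: string;
  idempotencyKey?: string;
}

function runFlow(deps: FlowDeps, input: FlowInput): { text: string; status: "cache_hit" | "fresh" } {
  const key = input.idempotencyKey;

  // 1. Idempotency: a repeat key returns the cached output without further work.
  if (key !== undefined && !deps.claimKey(key)) {
    const cached = deps.readCached(key);
    if (cached !== null) return { text: cached, status: "cache_hit" };
    throw new Error("operation already in progress");
  }

  // 2. Budget pre-check: fail before any model call is made.
  if (!deps.budgetAllows(input.sessionId)) throw new Error("budget exceeded");

  // 3. Model call, then 4. the tool call it requested, if any.
  const reply = deps.callModel(input.prompt);
  let text = reply.text;
  if (reply.toolName !== undefined) {
    try {
      text = deps.runTool(reply.toolName);
    } catch (err) {
      // Report to the runbook layer, then re-throw so the task runner can retry.
      deps.report(`tool ${reply.toolName} failed`);
      throw err;
    }
  }

  // 5. Spend recording and 6. result caching.
  deps.recordSpend(input.sessionId, reply.costUsd);
  if (key !== undefined) deps.writeCached(key, text);
  return { text, status: "fresh" };
}
```

Because each layer is passed in, a test can substitute a failing tool or an exhausted budget and assert that later steps never run.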
Step 9: Create the API routes
The API exposes two endpoints: POST /api/agent to trigger a reliability agent run, and GET /api/status as a health check.
Create app/api/agent/route.ts:
ts
import { z } from "zod";
import { triggerClient } from "../../../src/config/trigger.js";

const agentInputSchema = z.object({
  prompt: z.string().min(1),
  sessionId: z.string().min(1),
  idempotencyKey: z.string().optional(),
});

export async function POST(req: Request) {
  const body = (await req.json()) as unknown;
  const parse = agentInputSchema.safeParse(body);
  if (!parse.success) {
    return Response.json({ error: "Invalid input" }, { status: 400 });
  }
  const run = await triggerClient.trigger("reliability-agent", parse.data);
  return Response.json({ runId: run.id }, { status: 202 });
}
The route validates the request body with Zod, then fires the Trigger.dev task asynchronously. It returns a 202 Accepted with the Trigger.dev run ID so the client can poll for completion.
Create app/layout.tsx:
tsx
import type { ReactNode } from "react";

export const metadata = {
  title: "Recipe",
  description: "Tutorialized reference solution from reaatech.com",
};

export default function RootLayout({ children }: { children: ReactNode }) {
  return (
    <html lang="en">
      <body>{children}</body>
    </html>
  );
}
Create app/page.tsx:
tsx
export default function Home() {
  return (
    <main style={{ padding: "2rem", fontFamily: "sans-serif" }}>
      <h1>Gemini Reliability Suite for SMB AI Operations</h1>
      <p>
        A production-ready reference recipe wrapping Google Gemini API calls
        with circuit breakers, budget controls, idempotency, runbook incident
        generation, and Trigger.dev background task execution.
      </p>
      <ul>
        <li>
          <strong>Circuit Breaker</strong> — per-tool Redis-backed failure
          isolation
        </li>
        <li>
          <strong>Budget Engine</strong> — soft/hard caps with auto-downgrade
          from gemini-2.5-pro to gemini-2.5-flash
        </li>
        <li>
          <strong>Idempotency</strong> — Redis-based deduplication for agent
          runs
        </li>
        <li>
          <strong>Runbook</strong> — automated incident workflow generation
        </li>
        <li>
          <strong>Trigger.dev</strong> — background reliability-agent task
        </li>
      </ul>
      <p>
        API endpoints: <code>POST /api/agent</code> |{" "}
        <code>GET /api/status</code>
      </p>
    </main>
  );
}
Step 10: Run the tests
The test suite uses Vitest with v8 coverage. Every infrastructure module and the reliability agent workflow have dedicated test files with mocked external dependencies so they run without real Redis, Gemini, or Trigger.dev connections.
Run the full test suite:
terminal
pnpm test
Expected output: Vitest prints a summary with the coverage report. The suite targets 90% line, branch, function, and statement coverage. The test output should show all tests passing with a final coverage summary like:
code
TEST PASSED test count
Coverage: lines 92.3% | branches 91.1% | functions 95.0% | statements 92.3%
If you see coverage below the threshold, the test run exits with a non-zero code. Fix the uncovered branches by adding test cases for edge conditions (null function names, empty response text, missing cache results).
You can run a specific test file for faster iteration:
terminal
pnpm vitest run tests/workflows/reliability-agent.test.ts
Next steps
Wire up a real Trigger.dev endpoint — register the reliability-agent task in your Trigger.dev project and replace the mock triggerClient with the real client using your TRIGGER_API_KEY. The task ID "reliability-agent" must match your Trigger.dev project configuration.
Add PagerDuty/OpsGenie integration — extend reportFailure in src/infrastructure/runbook.ts to send alerts to your incident management system. The generated workflow objects contain structured severity levels and escalation contacts that map directly to those platforms’ APIs.
Persist budget state to Redis — replace the in-memory SpendStore in src/infrastructure/budget.ts with a Redis-backed implementation so budget limits survive server restarts and work across multiple application instances.
from "../infrastructure/runbook.js";

export interface ReliabilityAgentInput {
  prompt: string;
  sessionId: string;
  idempotencyKey?: string;
}

export interface ReliabilityAgentOutput {
  text: string;
  usage: {
    inputTokens: number;
    outputTokens: number;
    cost: number;
  };
  toolsCalled: string[];
  idempotencyStatus: "cache_hit" | "fresh";
}

const createOrderTool: FunctionDeclaration = {
  name: "create_order",
  description: "Create a new order",
  parametersJsonSchema: {
    type: "object",
    properties: {
      product: { type: "string" },
      quantity: { type: "number" },
    },
    required: ["product", "quantity"],
  },
};

const sendEmailTool: FunctionDeclaration = {
  name: "send_email",
  description: "Send an email notification",
  parametersJsonSchema: {
    type: "object",
    properties: {
      to: { type: "string" },
      subject: { type: "string" },
      body: { type: "string" },
    },
    required: ["to", "subject", "body"],
  },
};

function createOrder(args: Record<string, unknown>): string {
  return `Order created for ${String(args.product)} x ${String(args.quantity)}`;
}
function sendEmail(args: Record<string, unknown>): string {