Multi-Agent Observability for Small Business Support

Intro

You’ll build a Next.js observability dashboard that tracks every AI agent your small business runs — support bots, lead qualifiers, appointment setters — in a single pane of glass. Each agent call is instrumented with OpenTelemetry spans, logged with Pino, and routed to Langfuse for aggregation. By the end you’ll have a working dashboard that shows cost, latency, failure rate, and lets you replay any past conversation to debug what went wrong.

Prerequisites

Node.js >= 22 (check with node --version)
pnpm 10.15.1 (check with pnpm --version; install with npm install -g pnpm@10.15.1)
A Langfuse account (cloud at langfuse.com or self-hosted) — you’ll need a public key, secret key, and host URL
An OpenTelemetry Collector endpoint — this recipe sends traces via OTLP/HTTP; you can use the Langfuse OTel endpoint directly or run a local collector
A Slack webhook URL (optional, for health check alerts)
Familiarity with TypeScript and Next.js App Router — you should know how src/app/ route handlers and pages work

Step 1: Scaffold the project

Create an empty directory, then add the project manifest and TypeScript configuration.

Create package.json:

json

{
  "name": "multi-agent-observability-for-small-business-support",
  "version": "0.1.0",
  "private": true,
  "type": "module",
  "engines": {
    "node": ">=22"
  },
  "packageManager": "pnpm@10.15.1",
  "scripts"

The @reaatech/* packages provide the observability hooks, Langfuse aggregates traces, OpenTelemetry handles distributed tracing, and Pino gives you structured JSON logging.

Create tsconfig.json:

json

{
  "compilerOptions": {
    "target": "ES2022",
    "lib": ["ES2022", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "moduleResolution": "bundler",
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitOverride": true,
    "exactOptionalPropertyTypes": true,

The @/* path alias maps to ./src/*, the same alias the Next.js App Router convention uses.

Create vitest.config.ts:

import { defineConfig } from "vitest/config";
import path from "path";
 
export default defineConfig({
  test: {
    environment: "node",
    include: ["src/**/*.test.ts", "src/**/*.test.tsx", "tests/**/*.test.ts"],
    passWithNoTests: false,
    exclude: ["node_modules", ".next", "dist", "coverage"],
    coverage: {

This tells Vitest to match the @/* import alias, run tests with Node, and enforce 90% coverage thresholds across all metrics. UI components under src/app/dashboard are excluded from coverage since they’re visual.

Create a .gitignore file to keep build artifacts out of version control:

code

node_modules/
.next/
dist/
.env
coverage/
*.log
.tsbuildinfo

Step 2: Install dependencies

From your project root, install everything:

terminal

pnpm install

Expected output: pnpm resolves and installs all packages. You’ll see a “Done” message along with the install time. The pnpm-lock.yaml is created automatically.

Step 3: Configure environment variables

Copy .env.example to .env and fill in your real values. The artifact ships with this template:

code

LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=
OTEL_EXPORTER_OTLP_ENDPOINT=
SLACK_WEBHOOK_URL=
CRON_SECRET=
LOG_LEVEL=info

A complete .env for local development might look like this:

code

LANGFUSE_PUBLIC_KEY=pk-lf-abc123
LANGFUSE_SECRET_KEY=sk-lf-xyz789
LANGFUSE_HOST=https://cloud.langfuse.com
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloud.langfuse.com/api/public/otel
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T00/B00/xxxx
CRON_SECRET=super-secret-cron-token-123
LOG_LEVEL=info

The app validates these at startup (you’ll write that validator in the next step). SLACK_WEBHOOK_URL is optional — if blank, health-check failures are logged but not sent to Slack.

Step 4: Write the observability primitives

Create the src/lib/ directory and add four foundational modules: OpenTelemetry span helpers, a Pino logger, the Langfuse API client, and an environment validator.

Create src/lib/otel.ts:

import { trace, Span } from "@opentelemetry/api";
 
const TRACER_NAME = "multi-agent-obs";
 
export function createSpan(name: string): Span {
  return trace.getTracer(TRACER_NAME).startSpan(name);
}
 
export function endSpan(span: Span, error?: Error): void {
  if (error) {
    span.recordException(error);
    span.setStatus

createSpan starts a named span using the OpenTelemetry API. endSpan records any thrown error as an exception and marks the span as errored (status code 2) or OK (status code 1).

Create src/lib/logger.ts:

import pino from "pino";
 
const level = process.env.LOG_LEVEL ?? "info";
 
export const log = pino(
  process.env.NODE_ENV === "development"
    ? { level, transport: { target: "pino-pretty" } }
    : { level },
);

In development mode (NODE_ENV=development) Pino uses pino-pretty for human-readable output. In production it emits raw JSON — ideal for log aggregation systems.

Create src/lib/langfuse.ts:

import Langfuse from "langfuse";
 
export const langfuse = new Langfuse({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY ?? "",
  secretKey: process.env.LANGFUSE_SECRET_KEY ?? "",
  baseUrl: process.env.LANGFUSE_HOST ?? "",
});

This module creates a Langfuse SDK instance for writing traces and provides two REST helpers — fetchTraces for listing and fetchTrace for retrieving a single trace by ID — using HTTP Basic auth with your Langfuse project keys.

Create src/lib/validate-env.ts:

const REQUIRED_VARS = [
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
  "OTEL_EXPORTER_OTLP_ENDPOINT",
  "CRON_SECRET",
] as const;
 
export function validateEnv(): void {
  for (const varName of REQUIRED_VARS) {
    const val = process.env[varName];
    if (!val || val.trim() === "") {
      throw

This runs at import time inside instrumentation.ts (you’ll write that next) to fail fast if any required variable is missing or empty.

Step 5: Wire up REAA observability and instrumentation

Now add the wrapper modules that bridge the REAA packages to your app, plus the instrumentation.ts entrypoint that Next.js loads on startup.

Create src/lib/runbook-obs.ts:

import {
  initLogger,
  initTracing,
  initMetrics,
  recordAgentCost as reaaRecordAgentCost,
  recordGeneration as reaaRecordGeneration,
} from "@reaatech/agent-runbook-observability";
 
export function initRunbookObs(): void {
  initLogger({ level: process.env.LOG_LEVEL ?? "info", service: "multi-agent-obs" });
  initTracing({
    serviceName: "multi-agent-obs",
    otlpEndpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT

initRunbookObs bootstraps the REAA runbook observability layer — logging, tracing, and metrics — all pointed at your OTLP endpoint.

Create src/lib/eval-harness-obs.ts:

import {
  getLogger,
  getTracingManager,
  getMetricsManager,
  getDashboardManager,
} from "@reaatech/agent-eval-harness-observability";
 
export const evalLogger = getLogger();
export const evalTracing = getTracingManager();
export const evalMetrics = getMetricsManager();
export const evalDashboard = getDashboardManager();
 
export function initEvalObs(): void {

This exposes singletons from the eval harness package: a logger, tracing manager, metrics manager, and dashboard manager. API routes call evalMetrics.recordRun and evalDashboard.getSummary to track response quality and surface it in the dashboard.

Create src/lib/mesh-obs.ts:

import {
  logger,
  createChildLogger,
  initOtel,
  shutdownOtel,
  recordAgentDispatchDuration as meshRecordDispatchDuration,
  recordAgentDispatchError as meshRecordDispatchError,
} from "@reaatech/agent-mesh-observability";
 
export function initMeshObs(): void {
  initOtel();
}
 
export function createRequestLogger(
  requestId: string,

The mesh package tracks inter-agent communication. createRequestLogger produces a child logger with request and session IDs attached, so you can trace a conversation across multiple agent dispatches.

Create src/lib/budget-bridge.ts:

import { SpanListener } from "@reaatech/agent-budget-otel-bridge";
 
export class InMemoryBudgetStore {
  private store: Map<string, number> = new Map();
 
  recordSpend(scopeKey: string, amount: number): void {
    const current = this.store.get(scopeKey) ?? 0;
    this.store.set(scopeKey, current

InMemoryBudgetStore keeps a running tally of spend per agent scope. The SpanListener from the budget bridge hooks into completed spans and records costs automatically.

Create src/lib/replay.ts:

import {
  RecordingEngine,
  ReplayEngine,
  LocalFileStorage,
  DiffEngine,
} from "@reaatech/agent-replay-core";
 
export function createRecordingEngine(): RecordingEngine {
  return new RecordingEngine();
}
 
export function createReplayEngine(): ReplayEngine {
  return new ReplayEngine

recordInteraction starts a recording session, opens an LLM span, captures the human message as an event, and stops the recording — producing a serialized trace you can replay later without consuming LLM tokens.

Create src/instrumentation.ts:

import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { validateEnv } from "./lib/validate-env";
import { initRunbookObs } from "./lib/runbook-obs";
import { initEvalObs } from "./lib/eval-harness-obs";
import { initMeshObs } from "./lib/mesh-obs";
 
validateEnv();
 
const otlpExporter = new OTLPTraceExporter

Next.js automatically loads src/instrumentation.ts on startup. This file: validates all required env vars, configures the OTLP trace exporter, starts the OpenTelemetry SDK, and initializes the three REAA observability layers. On SIGTERM it shuts down cleanly so no in-flight spans are lost.

Step 6: Build the API routes

With the observability layer in place, you can now write the route handlers that agents and the dashboard call.

Create src/app/api/support/chat/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../../lib/otel"

This is the main support agent endpoint. Each request: validates input (message required, sessionId required, 4000-char cap), records the interaction via the replay engine, emits runbook metrics (generation status, agent cost), records eval metrics (run count, P99 latency), pushes the trace to Langfuse, and logs the outcome with Pino. On failure it records a failure metric and returns a 500.

Create src/app/api/metrics/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../lib/otel";
import { recordGeneration, recordAgentCost } from "../../../lib/runbook-obs";
import { evalDashboard } from "../../../lib/eval-harness-obs";
 
export async function GET(): Promise<NextResponse> {
  const span =

This endpoint reads the eval dashboard summary (total runs, cost per task, P99 latency, quality scores, active alerts, trends) and returns it as JSON. Downstream monitoring systems or the dashboard UI can poll this.

Create src/app/api/replay/route.ts:

import { NextResponse } from "next/server";
import { fetchTrace } from "../../../lib/langfuse";
import { log } from "../../../lib/logger";
 
export async function GET(request: Request): Promise<NextResponse> {
  const url = new URL(request.url);
  const traceId = url.searchParams.get("traceId");

Given a traceId query parameter, this fetches the full trace (including observations) from Langfuse. The replay viewer page calls this to load trace data.

Create src/app/api/cron/health/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../../lib/otel";
import { log } from "../../../../lib/logger";
import { langfuse } from "../../../../lib/langfuse"

This is designed to be hit by an external cron scheduler (e.g., Vercel Cron Jobs, GitHub Actions). It authenticates with a Bearer token from CRON_SECRET, runs the agent runbook health check, pushes a health-check trace to Langfuse, records metrics, and on failure posts an alert to Slack.

Step 7: Build the dashboard UI

The dashboard queries Langfuse, computes aggregate metrics, and renders summary cards, a cost-by-agent breakdown, and a recent-traces table. The replay viewer lets you step through individual trace spans.

Create src/lib/dashboard-data.ts:

import { fetchTraces, type LangfuseTraceData } from "./langfuse";

This module fetches the 100 most recent traces from Langfuse and computes: average latency, failure rate as a percentage, P50/P95/P99 latencies (sorted, no interpolation for simplicity), estimated cost (duration × a configurable rate), trace counts grouped by agent name, and the 10 most recent traces for the table view.

Create src/app/dashboard/page.tsx:

This is a React Server Component (no "use client" directive) with dynamic = "force-dynamic" so Next.js re-fetches data on every request. DashboardContent (an async component) calls getDashboardData(), then renders summary cards (total traces, average latency, failure rate, estimated cost), a cost-by-agent breakdown grid, and a recent-traces table. React Suspense wraps it all with skeleton cards while the Langfuse API is loading.

Create src/app/dashboard/replay/[traceId]/page.tsx (this file is long — the complete source is in the downloadable artifact; the excerpt below shows the core structure):

tsx

"use client";
 
import { useEffect, useState, useCallback } from "react";
import { fetchTrace } from "@/lib/langfuse";
import type { LangfuseTraceData } from

This is a client component. It resolves the dynamic [traceId] segment, calls fetchTrace() to pull the full trace from Langfuse, parses observations into timeline steps, and renders a split-pane UI: a step list on the left, event details on the right. A “Run CI/CD Check” button runs a regression diff using the replay engine and displays the result. The full file (388 lines) is available in the downloadable artifact — copy it verbatim into place.

Step 8: Run the tests

The project includes 64 tests across 32 suites covering every module. Run them with:

terminal

pnpm test

Expected output: all 64 tests pass with zero failures. You’ll see output like:

code

✓ src/lib/otel.test.ts (4 tests)
✓ src/lib/logger.test.ts (2 tests)
✓ src/lib/langfuse.test.ts (4 tests)
✓ src/lib/validate-env.test.ts (3 tests)
✓ src/lib/runbook-obs.test.ts (2 tests)
✓ src/lib/eval-harness-obs.test.ts (2 tests)
✓ src/lib/mesh-obs.test.ts (2 tests)
✓ src/lib/budget-bridge.test.ts (2 tests)
✓ src/lib/replay.test.ts (2 tests)
✓ src/lib/dashboard-data.test.ts (4 tests)
✓ src/instrumentation.test.ts (2 tests)
✓ src/app/api/support/chat/route.test.ts (5 tests)
  ... (and more)

Test Files  32 passed (32)
     Tests  64 passed (64)

For coverage (with 90% thresholds on lines, branches, functions, and statements):

terminal

pnpm test:coverage

Step 9: Start the dev server and verify

Launch the development server:

terminal

pnpm dev

Expected output:

code

▲ Next.js 15.2.6
- Local:        http://localhost:3000

Test the support chat endpoint. In another terminal, send a message:

terminal

curl -X POST http://localhost:3000/api/support/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hi, I need help with my billing.", "sessionId": "sess-001"}'

Expected output:

json

{"reply":"Thanks for your message about: \"Hi, I need help with my billing.\".... A support agent will follow up soon.","sessionId":"sess-001"}

Check the metrics endpoint:

terminal

curl http://localhost:3000/api/metrics

Expected output: a JSON object with runs, cost, latency, and quality fields.

Trigger the health check (with authorization):

terminal

curl -H "Authorization: Bearer super-secret-cron-token-123" \
  http://localhost:3000/api/cron/health

Expected output:

json

{"status":"success","timestamp":1715000000000}

View the dashboard at http://localhost:3000/dashboard. After sending a few chat messages, you’ll see summary cards with total traces, average latency, failure rate, and estimated cost, plus a recent-traces table.

Replay a conversation by clicking a trace ID in the table or navigating directly to http://localhost:3000/dashboard/replay/<traceId>.

Next steps

Add more agent types. The src/app/api/support/chat/route.ts pattern works for any agent — copy it for a lead qualifier, appointment setter, or knowledge-base bot, each with its own trace name and cost tracking.
Deploy the health check as a real cron. Point Vercel Cron Jobs or a GitHub Actions scheduled workflow at GET /api/cron/health with the Authorization: Bearer <CRON_SECRET> header so you get Slack alerts when agents go down.
Connect the metrics endpoint to a monitoring system. GET /api/metrics returns JSON — pipe it into Grafana, Datadog, or a simple internal status page to get real-time visibility without opening the dashboard.

Intro

Prerequisites

Node.js >= 22 (check with node --version)
pnpm 10.15.1 (check with pnpm --version; install with npm install -g pnpm@10.15.1)
A Langfuse account (cloud at langfuse.com or self-hosted) — you’ll need a public key, secret key, and host URL
An OpenTelemetry Collector endpoint — this recipe sends traces via OTLP/HTTP; you can use the Langfuse OTel endpoint directly or run a local collector
A Slack webhook URL (optional, for health check alerts)
Familiarity with TypeScript and Next.js App Router — you should know how src/app/ route handlers and pages work

Step 1: Scaffold the project

Create an empty directory, then add the project manifest and TypeScript configuration.

Create package.json:

json

{
  "name": "multi-agent-observability-for-small-business-support",
  "version": "0.1.0",
  "private": true,
  "type": "module",
  "engines": {
    "node": ">=22"
  },
  "packageManager": "pnpm@10.15.1",
  "scripts"

The @reaatech/* packages provide the observability hooks, Langfuse aggregates traces, OpenTelemetry handles distributed tracing, and Pino gives you structured JSON logging.

Create tsconfig.json:

json

{
  "compilerOptions": {
    "target": "ES2022",
    "lib": ["ES2022", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "moduleResolution": "bundler",
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitOverride": true,
    "exactOptionalPropertyTypes": true,

The @/* path alias maps to ./src/*, the same alias the Next.js App Router convention uses.

Create vitest.config.ts:

import { defineConfig } from "vitest/config";
import path from "path";
 
export default defineConfig({
  test: {
    environment: "node",
    include: ["src/**/*.test.ts", "src/**/*.test.tsx", "tests/**/*.test.ts"],
    passWithNoTests: false,
    exclude: ["node_modules", ".next", "dist", "coverage"],
    coverage: {

Create a .gitignore file to keep build artifacts out of version control:

code

node_modules/
.next/
dist/
.env
coverage/
*.log
.tsbuildinfo

Step 2: Install dependencies

From your project root, install everything:

terminal

pnpm install

Expected output: pnpm resolves and installs all packages. You’ll see a “Done” message along with the install time. The pnpm-lock.yaml is created automatically.

Step 3: Configure environment variables

Copy .env.example to .env and fill in your real values. The artifact ships with this template:

code

LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=
OTEL_EXPORTER_OTLP_ENDPOINT=
SLACK_WEBHOOK_URL=
CRON_SECRET=
LOG_LEVEL=info

A complete .env for local development might look like this:

code

LANGFUSE_PUBLIC_KEY=pk-lf-abc123
LANGFUSE_SECRET_KEY=sk-lf-xyz789
LANGFUSE_HOST=https://cloud.langfuse.com
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloud.langfuse.com/api/public/otel
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T00/B00/xxxx
CRON_SECRET=super-secret-cron-token-123
LOG_LEVEL=info

The app validates these at startup (you’ll write that validator in the next step). SLACK_WEBHOOK_URL is optional — if blank, health-check failures are logged but not sent to Slack.

Step 4: Write the observability primitives

Create the src/lib/ directory and add four foundational modules: OpenTelemetry span helpers, a Pino logger, the Langfuse API client, and an environment validator.

Create src/lib/otel.ts:

import { trace, Span } from "@opentelemetry/api";
 
const TRACER_NAME = "multi-agent-obs";
 
export function createSpan(name: string): Span {
  return trace.getTracer(TRACER_NAME).startSpan(name);
}
 
export function endSpan(span: Span, error?: Error): void {
  if (error) {
    span.recordException(error);
    span.setStatus

createSpan starts a named span using the OpenTelemetry API. endSpan records any thrown error as an exception and marks the span as errored (status code 2) or OK (status code 1).

Create src/lib/logger.ts:

import pino from "pino";
 
const level = process.env.LOG_LEVEL ?? "info";
 
export const log = pino(
  process.env.NODE_ENV === "development"
    ? { level, transport: { target: "pino-pretty" } }
    : { level },
);

In development mode (NODE_ENV=development) Pino uses pino-pretty for human-readable output. In production it emits raw JSON — ideal for log aggregation systems.

Create src/lib/langfuse.ts:

import Langfuse from "langfuse";
 
export const langfuse = new Langfuse({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY ?? "",
  secretKey: process.env.LANGFUSE_SECRET_KEY ?? "",
  baseUrl: process.env.LANGFUSE_HOST ?? "",
});

Create src/lib/validate-env.ts:

const REQUIRED_VARS = [
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
  "OTEL_EXPORTER_OTLP_ENDPOINT",
  "CRON_SECRET",
] as const;
 
export function validateEnv(): void {
  for (const varName of REQUIRED_VARS) {
    const val = process.env[varName];
    if (!val || val.trim() === "") {
      throw

This runs at import time inside instrumentation.ts (you’ll write that next) to fail fast if any required variable is missing or empty.

Step 5: Wire up REAA observability and instrumentation

Now add the wrapper modules that bridge the REAA packages to your app, plus the instrumentation.ts entrypoint that Next.js loads on startup.

Create src/lib/runbook-obs.ts:

import {
  initLogger,
  initTracing,
  initMetrics,
  recordAgentCost as reaaRecordAgentCost,
  recordGeneration as reaaRecordGeneration,
} from "@reaatech/agent-runbook-observability";
 
export function initRunbookObs(): void {
  initLogger({ level: process.env.LOG_LEVEL ?? "info", service: "multi-agent-obs" });
  initTracing({
    serviceName: "multi-agent-obs",
    otlpEndpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT

initRunbookObs bootstraps the REAA runbook observability layer — logging, tracing, and metrics — all pointed at your OTLP endpoint.

Create src/lib/eval-harness-obs.ts:

import {
  getLogger,
  getTracingManager,
  getMetricsManager,
  getDashboardManager,
} from "@reaatech/agent-eval-harness-observability";
 
export const evalLogger = getLogger();
export const evalTracing = getTracingManager();
export const evalMetrics = getMetricsManager();
export const evalDashboard = getDashboardManager();
 
export function initEvalObs(): void {

Create src/lib/mesh-obs.ts:

import {
  logger,
  createChildLogger,
  initOtel,
  shutdownOtel,
  recordAgentDispatchDuration as meshRecordDispatchDuration,
  recordAgentDispatchError as meshRecordDispatchError,
} from "@reaatech/agent-mesh-observability";
 
export function initMeshObs(): void {
  initOtel();
}
 
export function createRequestLogger(
  requestId: string,

Create src/lib/budget-bridge.ts:

import { SpanListener } from "@reaatech/agent-budget-otel-bridge";
 
export class InMemoryBudgetStore {
  private store: Map<string, number> = new Map();
 
  recordSpend(scopeKey: string, amount: number): void {
    const current = this.store.get(scopeKey) ?? 0;
    this.store.set(scopeKey, current

InMemoryBudgetStore keeps a running tally of spend per agent scope. The SpanListener from the budget bridge hooks into completed spans and records costs automatically.

Create src/lib/replay.ts:

import {
  RecordingEngine,
  ReplayEngine,
  LocalFileStorage,
  DiffEngine,
} from "@reaatech/agent-replay-core";
 
export function createRecordingEngine(): RecordingEngine {
  return new RecordingEngine();
}
 
export function createReplayEngine(): ReplayEngine {
  return new ReplayEngine

Create src/instrumentation.ts:

import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { validateEnv } from "./lib/validate-env";
import { initRunbookObs } from "./lib/runbook-obs";
import { initEvalObs } from "./lib/eval-harness-obs";
import { initMeshObs } from "./lib/mesh-obs";
 
validateEnv();
 
const otlpExporter = new OTLPTraceExporter

Step 6: Build the API routes

With the observability layer in place, you can now write the route handlers that agents and the dashboard call.

Create src/app/api/support/chat/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../../lib/otel"

Create src/app/api/metrics/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../lib/otel";
import { recordGeneration, recordAgentCost } from "../../../lib/runbook-obs";
import { evalDashboard } from "../../../lib/eval-harness-obs";
 
export async function GET(): Promise<NextResponse> {
  const span =

Create src/app/api/replay/route.ts:

import { NextResponse } from "next/server";
import { fetchTrace } from "../../../lib/langfuse";
import { log } from "../../../lib/logger";
 
export async function GET(request: Request): Promise<NextResponse> {
  const url = new URL(request.url);
  const traceId = url.searchParams.get("traceId");

Given a traceId query parameter, this fetches the full trace (including observations) from Langfuse. The replay viewer page calls this to load trace data.

Create src/app/api/cron/health/route.ts:

import { NextResponse } from "next/server";
import { createSpan, endSpan } from "../../../../lib/otel";
import { log } from "../../../../lib/logger";
import { langfuse } from "../../../../lib/langfuse"

Step 7: Build the dashboard UI

Create src/lib/dashboard-data.ts:

import { fetchTraces, type LangfuseTraceData } from "./langfuse";

Create src/app/dashboard/page.tsx:

Create src/app/dashboard/replay/[traceId]/page.tsx (this file is long — the complete source is in the downloadable artifact; the excerpt below shows the core structure):

tsx

"use client";
 
import { useEffect, useState, useCallback } from "react";
import { fetchTrace } from "@/lib/langfuse";
import type { LangfuseTraceData } from

Step 8: Run the tests

The project includes 64 tests across 32 suites covering every module. Run them with:

terminal

pnpm test

Expected output: all 64 tests pass with zero failures. You’ll see output like:

code

✓ src/lib/otel.test.ts (4 tests)
✓ src/lib/logger.test.ts (2 tests)
✓ src/lib/langfuse.test.ts (4 tests)
✓ src/lib/validate-env.test.ts (3 tests)
✓ src/lib/runbook-obs.test.ts (2 tests)
✓ src/lib/eval-harness-obs.test.ts (2 tests)
✓ src/lib/mesh-obs.test.ts (2 tests)
✓ src/lib/budget-bridge.test.ts (2 tests)
✓ src/lib/replay.test.ts (2 tests)
✓ src/lib/dashboard-data.test.ts (4 tests)
✓ src/instrumentation.test.ts (2 tests)
✓ src/app/api/support/chat/route.test.ts (5 tests)
  ... (and more)

Test Files  32 passed (32)
     Tests  64 passed (64)

For coverage (with 90% thresholds on lines, branches, functions, and statements):

terminal

pnpm test:coverage

Step 9: Start the dev server and verify

Launch the development server:

terminal

pnpm dev

Expected output:

code

▲ Next.js 15.2.6
- Local:        http://localhost:3000

Test the support chat endpoint. In another terminal, send a message:

terminal

curl -X POST http://localhost:3000/api/support/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hi, I need help with my billing.", "sessionId": "sess-001"}'

Expected output:

json

{"reply":"Thanks for your message about: \"Hi, I need help with my billing.\".... A support agent will follow up soon.","sessionId":"sess-001"}

Check the metrics endpoint:

terminal

curl http://localhost:3000/api/metrics

Expected output: a JSON object with runs, cost, latency, and quality fields.

Trigger the health check (with authorization):

terminal

curl -H "Authorization: Bearer super-secret-cron-token-123" \
  http://localhost:3000/api/cron/health

Expected output:

json

{"status":"success","timestamp":1715000000000}

Replay a conversation by clicking a trace ID in the table or navigating directly to http://localhost:3000/dashboard/replay/<traceId>.

Next steps

Add more agent types. The src/app/api/support/chat/route.ts pattern works for any agent — copy it for a lead qualifier, appointment setter, or knowledge-base bot, each with its own trace name and cost tracking.
Deploy the health check as a real cron. Point Vercel Cron Jobs or a GitHub Actions scheduled workflow at GET /api/cron/health with the Authorization: Bearer <CRON_SECRET> header so you get Slack alerts when agents go down.
Connect the metrics endpoint to a monitoring system. GET /api/metrics returns JSON — pipe it into Grafana, Datadog, or a simple internal status page to get real-time visibility without opening the dashboard.

Multi-Agent Observability for Small Business Support

The problem

Built from

Intro

Prerequisites

Step 1: Scaffold the project

Step 2: Install dependencies

Step 3: Configure environment variables

Step 4: Write the observability primitives

Step 5: Wire up REAA observability and instrumentation

Step 6: Build the API routes

Step 7: Build the dashboard UI

Step 8: Run the tests

Step 9: Start the dev server and verify

Next steps

Example artifact

Comments

Intro

Prerequisites

Step 1: Scaffold the project

Step 2: Install dependencies

Step 3: Configure environment variables

Step 4: Write the observability primitives

Step 5: Wire up REAA observability and instrumentation

Step 6: Build the API routes

Step 7: Build the dashboard UI

Step 8: Run the tests

Step 9: Start the dev server and verify

Next steps