Small businesses feeding customer data into Databricks-hosted LLMs risk accidental PII exposure and prompt injection attacks, but lack the security engineering capacity to build custom guardrails for every model endpoint.
This tutorial walks you through building a pluggable security layer that sits between your users and a Databricks model-serving endpoint. Every incoming chat request passes through Presidio PII detection and a configurable sequence of guardrails (PII redaction, prompt injection detection, toxicity filtering, topic boundaries, cost pre-checks) before reaching Databricks. The model’s output is also scanned. All guardrail activity is logged, metered, and traced through Langfuse for audit trails. By the end you’ll have a working Next.js 16 API that you can point at any Databricks model endpoint.
Prerequisites
Node.js >= 22
pnpm 10.x (the packageManager field in package.json specifies the exact version)
A Databricks workspace with at least one model-serving endpoint deployed (needed for the live proxy, but the test suite runs fully mocked)
A Langfuse account for observability (optional for local dev; observability skips gracefully when env vars are absent)
Step 1: Scaffold the Project and Install Dependencies
The project shell is already on disk — package.json, tsconfig.json, vitest.config.ts, next.config.ts, and root config files are provided by the scaffold agent. Your job is to verify the dependencies and install them.
Open package.json and confirm it contains these dependencies:
All versions are exact-pinned (no ^ or ~). Now install everything:
terminal
pnpm install
Expected output: pnpm resolves all packages and writes pnpm-lock.yaml. You should see no resolution errors.
Next, set up your environment variables:
terminal
cp .env.example .env
The .env.example provides a template with all the keys you’ll need:
env
# Env vars used by databricks-security-guardrails-for-smb-data-pipelines.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only — never commit real values.
NODE_ENV=development
DATABRICKS_HOST=
DATABRICKS_TOKEN=
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=
LANGFUSE_HOST=
GUARDRAIL_CHAIN_BUDGET_MAX_LATENCY_MS=
GUARDRAIL_CHAIN_BUDGET_MAX_TOKENS=
For local development you can leave DATABRICKS_HOST, DATABRICKS_TOKEN, and the Langfuse values empty — the test suite mocks all external calls. The guardrail chain budget overrides are optional; the code falls back to sensible defaults (5000 ms latency, 10000 tokens) when they’re unset.
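The fallback behaviour can be pictured with a minimal sketch. The names `readBudget`, `parsePositiveInt`, and `Budget` are hypothetical and are not part of `@reaatech/guardrail-chain-config`; only the two variable names and the 5000 ms / 10000 token defaults come from this tutorial.

```typescript
// Hypothetical helper, not the library's API: shows how an unset or empty
// budget variable can fall back to the defaults quoted in the tutorial.
interface Budget {
  maxLatencyMs: number;
  maxTokens: number;
}

const DEFAULT_BUDGET: Budget = { maxLatencyMs: 5000, maxTokens: 10000 };

// Returns the parsed value, or the fallback when the variable is unset,
// empty, or not a positive integer.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// The environment is passed in explicitly so the function is easy to test;
// in the app you would pass process.env.
function readBudget(env: Record<string, string | undefined>): Budget {
  return {
    maxLatencyMs: parsePositiveInt(
      env['GUARDRAIL_CHAIN_BUDGET_MAX_LATENCY_MS'],
      DEFAULT_BUDGET.maxLatencyMs,
    ),
    maxTokens: parsePositiveInt(
      env['GUARDRAIL_CHAIN_BUDGET_MAX_TOKENS'],
      DEFAULT_BUDGET.maxTokens,
    ),
  };
}
```

Because `.env.example` ships the two budget keys with empty values, an empty string is treated the same as an unset variable here.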
Run a quick type-check to verify the install is sound:
terminal
pnpm typecheck
Expected output: tsc --noEmit exits with code 0 and no error messages.
Step 2: Define the Shared Types
All the TypeScript interfaces that cross module boundaries live in src/types.ts. Create this file:
Three main domain objects: DatabricksRequest / DatabricksResponse model the chat-completion shape that the API sends and receives; GuardrailProfile captures per-endpoint configuration (budget limits and which guardrails are active); and GuardrailOutcome is the result returned by the guardrail service — pass, reject, or sanitize, plus a list of violations for audit logging.
Expected output: A 32-line TypeScript file with no imports and no dependencies beyond the TypeScript standard library.
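As a rough picture of what those shapes might look like, here is a sketch. The field names are assumptions inferred from how later steps use the types (`endpoint`, `messages`, `endpointId`, `enabledGuardrails`, the three verdicts); the actual src/types.ts may differ, and `isBlocked` is a hypothetical helper added only for illustration.

```typescript
// Sketch only: field names are assumptions, not a copy of src/types.ts.
type Verdict = 'pass' | 'reject' | 'sanitize';

interface ChatMessage {
  role: string;
  content: string;
}

// Chat-completion style request; `endpoint` names the serving endpoint.
interface DatabricksRequest {
  endpoint: string;
  messages: ChatMessage[];
}

// Per-endpoint configuration: budget limits plus the active guardrail IDs.
interface GuardrailProfile {
  endpointId: string;
  budget: { maxLatencyMs: number; maxTokens: number };
  enabledGuardrails: string[];
}

// Result of a guardrail pass, with violations kept for audit logging.
interface GuardrailOutcome {
  verdict: Verdict;
  violations: string[];
  sanitizedText?: string;
}

// Hypothetical convenience helper for callers.
function isBlocked(outcome: GuardrailOutcome): boolean {
  return outcome.verdict === 'reject';
}
```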
Step 3: Build the Configuration System
The configuration layer loads guardrail profiles from environment variables and provides a runtime registry for per-endpoint overrides. Create src/config/rules.ts:
The constructor calls loadConfigFromEnv('GUARDRAIL_CHAIN') — this reads GUARDRAIL_CHAIN_BUDGET_MAX_LATENCY_MS, GUARDRAIL_CHAIN_BUDGET_MAX_TOKENS, and GUARDRAIL_CHAIN_CONFIG from the environment. The result is validated via validateConfigSafe from @reaatech/guardrail-chain-config.
If validation succeeds and a budget is present, it uses those values. Otherwise it falls back to 5000 ms latency and 10000 tokens.
The default profile enables five guardrails: PII redaction, prompt injection detection, toxicity filtering, topic boundary enforcement, and cost pre-check.
getProfile() returns a registered profile or a copy of the default with the requested endpointId.
setProfile() merges a partial profile on top of an existing one — you can change just the budget without re-specifying the guardrail list.
listEndpoints() returns all endpoint IDs that have been explicitly registered via setProfile().
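The three methods above can be sketched as a small in-memory registry. This is an illustration under assumptions, not the contents of src/config/rules.ts: the class name `ProfileRegistry` and the guardrail ID strings are hypothetical, and the environment loading and validation are left out.

```typescript
// Illustrative registry; the real EndpointProfileManager also loads and
// validates its defaults from the environment.
interface GuardrailProfile {
  endpointId: string;
  budget: { maxLatencyMs: number; maxTokens: number };
  enabledGuardrails: string[];
}

class ProfileRegistry {
  private readonly profiles = new Map<string, GuardrailProfile>();
  private readonly defaults: GuardrailProfile;

  constructor(defaults: GuardrailProfile) {
    this.defaults = defaults;
  }

  // Registered profile, or a copy of the defaults carrying the requested ID.
  getProfile(endpointId: string): GuardrailProfile {
    const registered = this.profiles.get(endpointId);
    if (registered) return registered;
    return {
      endpointId,
      budget: { ...this.defaults.budget },
      enabledGuardrails: [...this.defaults.enabledGuardrails],
    };
  }

  // Merge a partial profile over whatever getProfile() currently returns.
  setProfile(endpointId: string, partial: Partial<GuardrailProfile>): void {
    const base = this.getProfile(endpointId);
    this.profiles.set(endpointId, { ...base, ...partial, endpointId });
  }

  // Only endpoints that were explicitly registered via setProfile().
  listEndpoints(): string[] {
    return [...this.profiles.keys()];
  }
}

// Hypothetical guardrail IDs for the five defaults named in the text.
const registry = new ProfileRegistry({
  endpointId: 'default',
  budget: { maxLatencyMs: 5000, maxTokens: 10000 },
  enabledGuardrails: [
    'pii-redaction',
    'prompt-injection',
    'toxicity-filter',
    'topic-boundary',
    'cost-precheck',
  ],
});
```

Calling `registry.setProfile('support-bot', { budget: { maxLatencyMs: 2000, maxTokens: 4000 } })` changes only the budget; the guardrail list is inherited from the defaults.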
Now create a barrel export at src/config/index.ts:
ts
export { EndpointProfileManager, profileManager } from './rules.js';
Expected output: Two files — src/config/rules.ts (57 lines) and src/config/index.ts (1 line). Run pnpm typecheck to confirm no type errors.
Step 4: Implement Presidio PII Detection
The Presidio analyzer wraps @presidio-dev/hai-guardrails with three guards: injection detection, PII detection, and secret/credential detection. Create src/guards/presidio.ts:
The constructor configures a GuardrailsEngine with three guards:
injectionGuard — heuristic detection of prompt injection patterns (jailbreak attempts, “ignore previous instructions”, role-reversal), with a confidence threshold of 0.7.
piiGuard — pattern-based detection of email addresses, phone numbers, SSNs, and credit card numbers across all message roles.
secretGuard — pattern + entropy analysis for API keys, tokens, and credentials.
The run() method sends the input as a single user message and checks whether every guard’s result passed. sanitizedText() applies regex-based redaction when PII is detected — it replaces emails, phone numbers, SSNs, and credit card numbers with [REDACTED].
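To make the regex-based redaction concrete, here is a minimal sketch. The patterns are illustrative assumptions (US-style phone numbers and SSNs, 13 to 16 digit card numbers) and are not the expressions used in src/guards/presidio.ts; `redactPii` is a hypothetical name.

```typescript
// Illustrative patterns only; real-world PII detection needs broader coverage.
const PII_PATTERNS: RegExp[] = [
  // Email addresses
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  // Card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
  // Runs before the phone pattern so a card is not partially matched as a phone.
  /\b\d(?:[ -]?\d){12,15}\b/g,
  // US Social Security numbers in the 123-45-6789 form
  /\b\d{3}-\d{2}-\d{4}\b/g,
  // US-style phone numbers, with an optional +1 prefix
  /(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)/g,
];

// Apply each pattern in order, replacing every match with a fixed marker.
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (current, pattern) => current.replace(pattern, '[REDACTED]'),
    text,
  );
}
```

Pattern order matters in a sketch like this: broader numeric patterns should run before narrower ones so that one value is replaced by a single marker.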
Add a barrel at src/guards/index.ts:
ts
export { PresidioAnalyzer, createDefaultAnalyzer } from "./presidio.js";
Expected output: Two files — src/guards/presidio.ts (74 lines) and src/guards/index.ts (1 line).
Step 5: Create the Databricks Proxy Client
The DatabricksClient is a thin HTTP wrapper that forwards requests to Databricks model-serving endpoints. Create src/api/databricks-proxy.ts:
The constructor reads DATABRICKS_HOST and DATABRICKS_TOKEN from process.env. When these are empty (local dev), the client will try to fetch from an empty base URL — your routes and tests handle this gracefully.
forward() strips the endpoint field from the request body and builds the URL as {host}/serving-endpoints/{endpoint}/invocations. The endpoint ID becomes part of the URL path, not part of the JSON payload.
An AbortController with a 30-second timeout (configurable via the second argument) prevents hung requests. When the timeout fires, the AbortError is caught and re-thrown as a typed DatabricksTimeoutError.
Non-2xx responses throw DatabricksApiError carrying the HTTP status and parsed response body (or null if the body isn’t JSON).
healthCheck() does a simple GET and returns a boolean — no complex schema validation.
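The behaviour described above can be sketched as follows. This is a simplified illustration, not the 84-line client: `buildInvocation` and the free-standing `forward` function are hypothetical, the Bearer authorization header and the `encodeURIComponent` call are assumptions, and the host and token are passed in rather than read from process.env.

```typescript
interface DatabricksRequest {
  endpoint: string;
  messages: { role: string; content: string }[];
}

class DatabricksTimeoutError extends Error {
  constructor(timeoutMs: number) {
    super(`Databricks request timed out after ${timeoutMs} ms`);
    this.name = 'DatabricksTimeoutError';
  }
}

class DatabricksApiError extends Error {
  readonly status: number;
  readonly body: unknown;
  constructor(status: number, body: unknown) {
    super(`Databricks responded with HTTP ${status}`);
    this.name = 'DatabricksApiError';
    this.status = status;
    this.body = body;
  }
}

// The endpoint ID moves into the URL path and is dropped from the payload.
function buildInvocation(host: string, request: DatabricksRequest): { url: string; body: string } {
  const { endpoint, ...payload } = request;
  const base = host.replace(/\/+$/, '');
  return {
    url: `${base}/serving-endpoints/${encodeURIComponent(endpoint)}/invocations`,
    body: JSON.stringify(payload),
  };
}

async function forward(
  host: string,
  token: string,
  request: DatabricksRequest,
  timeoutMs = 30_000,
): Promise<unknown> {
  const { url, body } = buildInvocation(host, request);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url, {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body,
      signal: controller.signal,
    });
    if (!response.ok) {
      // Body is null when the upstream error payload is not valid JSON.
      const parsed: unknown = await response.json().catch(() => null);
      throw new DatabricksApiError(response.status, parsed);
    }
    return await response.json();
  } catch (error) {
    // fetch rejects with an error named "AbortError" when the signal fires.
    if ((error as { name?: string } | null)?.name === 'AbortError') {
      throw new DatabricksTimeoutError(timeoutMs);
    }
    throw error;
  } finally {
    clearTimeout(timer);
  }
}
```

Clearing the timer in `finally` keeps a completed request from leaving a pending timeout behind.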
Expected output: src/api/databricks-proxy.ts at 84 lines.
Step 6: Build the Guardrail Service
This is the central orchestrator. It combines the Presidio PII analyzer with the full guardrail chain from @reaatech/guardrail-chain-guardrails. Create src/services/guardrail-service.ts:
ts
import {
  GuardrailChain,
  ChainBuilder,
} from '@reaatech/guardrail-chain';
import {
  PIIRedaction,
  PromptInjection,
  ToxicityFilter,
  TopicBoundary,
  CostPrecheck,
  CachedGuardrail,
  PIIScan,
  RateLimiter,
} from '@reaatech/guardrail-chain-guardrails';
import type { GuardrailProfile, GuardrailOutcome } from '../types.js';
import { profileManager } from '../config/index.js';
import { PresidioAnalyzer } from '../guards/presidio.js';

function buildChain(profile: GuardrailProfile): GuardrailChain {
  const
The GuardrailService orchestrates a two-phase guardrail check:
Input phase — processInput() concatenates all message content and runs it through Presidio first. If Presidio detects PII, the input is redacted. Then the input is passed to the endpoint’s guardrail chain (built from the profile). The chain runs guardrails in priority order: rate limiter, cost precheck (cached for 5 minutes), PII redaction, prompt injection, topic boundary, toxicity filter. Each guardrail is only added if the profile’s enabledGuardrails list includes its ID. If any guardrail fails, the chain returns success: false and the service returns a reject verdict.
Output phase — processOutput() runs the model’s response through a separate output chain that scans for PII leaks (PIIScan) and toxicity. If either blocks, the verdict is reject.
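The control flow of a phase (ordered guardrails, filtered by the profile, stopping at the first failure) can be sketched without the library. Everything below is a simplified stand-in: `runChain`, the `Guardrail` interface, the two toy guardrails, and the ID strings are hypothetical, and the checks are synchronous for brevity where the real guardrails are likely asynchronous.

```typescript
type Verdict = 'pass' | 'reject' | 'sanitize';

interface GuardrailOutcome {
  verdict: Verdict;
  violations: string[];
  sanitizedText?: string;
}

interface GuardrailResult {
  passed: boolean;
  output?: string; // rewritten text, if the guardrail sanitizes
  reason?: string;
}

interface Guardrail {
  id: string;
  check(input: string): GuardrailResult;
}

// Toy guardrails standing in for the library classes.
const promptInjection: Guardrail = {
  id: 'prompt-injection',
  check: (input) =>
    /ignore (all )?previous instructions/i.test(input)
      ? { passed: false, reason: 'injection pattern matched' }
      : { passed: true },
};

const piiRedaction: Guardrail = {
  id: 'pii-redaction',
  check: (input) => ({
    passed: true,
    output: input.replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, '[REDACTED]'),
  }),
};

// Run guardrails in the given order, skipping any the profile has not enabled.
function runChain(input: string, ordered: Guardrail[], enabled: string[]): GuardrailOutcome {
  let text = input;
  const violations: string[] = [];
  for (const guardrail of ordered) {
    if (!enabled.includes(guardrail.id)) continue;
    const result = guardrail.check(text);
    if (!result.passed) {
      // First failure short-circuits the chain with a reject verdict.
      return { verdict: 'reject', violations: [`${guardrail.id}: ${result.reason ?? 'blocked'}`] };
    }
    if (result.output !== undefined && result.output !== text) {
      violations.push(`${guardrail.id}: content rewritten`);
      text = result.output;
    }
  }
  return violations.length > 0
    ? { verdict: 'sanitize', violations, sanitizedText: text }
    : { verdict: 'pass', violations };
}
```

Passing `[piiRedaction, promptInjection]` as the ordered list mirrors the priority order described above, where redaction runs before injection detection.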
The buildChain() function uses ChainBuilder from @reaatech/guardrail-chain. Notice that CostPrecheck is wrapped in CachedGuardrail with a 300-second TTL — since the cost precheck runs a character-based estimation on every request, caching avoids redundant computation for repeated inputs.
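The caching idea can be illustrated with a small TTL wrapper. This is a sketch of the general technique, not the `CachedGuardrail` implementation: `CachedCheck` and `estimateTokens` are hypothetical names, and the four-characters-per-token ratio is an assumed heuristic.

```typescript
// Memoize a computation per input string for a fixed time-to-live.
class CachedCheck<T> {
  private readonly cache = new Map<string, { value: T; expiresAt: number }>();
  private readonly compute: (input: string) => T;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be tested without waiting.
  constructor(compute: (input: string) => T, ttlMs: number, now: () => number = () => Date.now()) {
    this.compute = compute;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  run(input: string): T {
    const time = this.now();
    const hit = this.cache.get(input);
    if (hit && hit.expiresAt > time) return hit.value; // still fresh
    const value = this.compute(input);
    this.cache.set(input, { value, expiresAt: time + this.ttlMs });
    return value;
  }
}

// Assumed character-based estimate: roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// 300 seconds, matching the TTL quoted in the text.
const cachedEstimate = new CachedCheck(estimateTokens, 300_000);
```

A wrapper like this keys on the full input string, so it only helps when identical prompts recur within the TTL window.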
Expected output: src/services/guardrail-service.ts at 162 lines. Run pnpm typecheck to verify.
Step 7: Create the API Route Handlers
The application exposes two endpoints: the main guardrail proxy and a health check.
This route handler implements the full request lifecycle:
Parse the JSON body. If parsing fails, return 400.
Validate that endpoint is present. If missing, return 400.
Run input through GuardrailService.processInput(). If the verdict is reject, return 403 with the violation details and log a warning via the observability logger.
If the verdict is sanitize, the forwarded body uses the sanitized text. If pass, the original messages are forwarded.
Forward the request to Databricks via DatabricksClient.forward(). If Databricks returns an HTTP error, return 502 with upstream details. If the request times out, return 504.
Run the model output through processOutput(). If output guardrails block, return 500.
Otherwise return 200 with the Databricks response body.
A top-level catch block catches any unexpected error, logs it, and returns 500.
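The status codes in the lifecycle above can be summarized in one mapping. This sketch is illustrative rather than the route's actual code: `ChatResult`, `statusFor`, and `checkBody` are hypothetical names, and only the body validation and the status mapping are modelled.

```typescript
// One variant per exit point of the request lifecycle.
type ChatResult =
  | { kind: 'invalid-json' }
  | { kind: 'missing-endpoint' }
  | { kind: 'input-rejected'; violations: string[] }
  | { kind: 'upstream-error'; upstreamStatus: number }
  | { kind: 'upstream-timeout' }
  | { kind: 'output-blocked' }
  | { kind: 'unexpected-error' }
  | { kind: 'ok' };

function statusFor(result: ChatResult): number {
  switch (result.kind) {
    case 'invalid-json':
    case 'missing-endpoint':
      return 400;
    case 'input-rejected':
      return 403;
    case 'upstream-error':
      return 502;
    case 'upstream-timeout':
      return 504;
    case 'output-blocked':
    case 'unexpected-error':
      return 500;
    case 'ok':
      return 200;
  }
}

// Returns an early-exit result for a bad body, or undefined when it is valid.
function checkBody(raw: string): ChatResult | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { kind: 'invalid-json' };
  }
  const endpoint =
    typeof parsed === 'object' && parsed !== null
      ? (parsed as { endpoint?: unknown }).endpoint
      : undefined;
  if (typeof endpoint !== 'string' || endpoint.length === 0) {
    return { kind: 'missing-endpoint' };
  }
  return undefined;
}
```

Modelling the exits as a discriminated union lets the compiler flag any lifecycle outcome that lacks a status code.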
The handler uses getLogger() from @reaatech/guardrail-chain-observability rather than creating its own logger. This means it automatically picks up whatever logger was installed during initObservability() — Langfuse if configured, or the default NoOpLogger.
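The registry pattern behind this can be sketched in a few lines. The code below mirrors the names `getLogger`, `setLogger`, and `NoOpLogger` for readability, but it is a hypothetical reimplementation, not the source of `@reaatech/guardrail-chain-observability`; the `Logger` method set and `MemoryLogger` are assumptions.

```typescript
interface Logger {
  info(message: string, meta?: Record<string, unknown>): void;
  warn(message: string, meta?: Record<string, unknown>): void;
  error(message: string, meta?: Record<string, unknown>): void;
}

// Default logger that silently discards everything.
class NoOpLogger implements Logger {
  info(): void {}
  warn(): void {}
  error(): void {}
}

// Module-level slot shared by every caller of getLogger().
let currentLogger: Logger = new NoOpLogger();

function setLogger(logger: Logger): void {
  currentLogger = logger;
}

function getLogger(): Logger {
  return currentLogger;
}

// Hypothetical in-memory logger, handy in tests.
class MemoryLogger implements Logger {
  readonly lines: string[] = [];
  info(message: string): void {
    this.lines.push(`info: ${message}`);
  }
  warn(message: string): void {
    this.lines.push(`warn: ${message}`);
  }
  error(message: string): void {
    this.lines.push(`error: ${message}`);
  }
}
```

Calling `getLogger()` inside the handler rather than at module load means the handler sees whichever logger is installed at request time.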
Create the health endpoint at app/api/health/route.ts:
ts
import { NextResponse } from 'next/server';
import { DatabricksClient } from '../../../src/api/databricks-proxy.js';

export async function GET(): Promise<NextResponse> {
  const client = new DatabricksClient();
  const databricksHealthy = await client.healthCheck();
  return NextResponse.json({ status: 'ok', databricks: databricksHealthy });
}
Expected output: Two route files. The guardrails chat route is 72 lines; the health route is 8 lines.
Step 8: Wire Up Langfuse Observability
The observability layer implements three interfaces from @reaatech/guardrail-chain-observability — Logger, MetricsCollector, and Tracer — backed by Langfuse.
The metrics collector maps the three metric primitives to Langfuse concepts:
increment — creates a Langfuse score with value 1.
histogram — creates a trace and an immediate generation event, both with the histogram value as metadata.
gauge — creates a trace with the current value.
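The mapping above can be sketched against a minimal stand-in for the client. The `Sink` and `TraceHandle` interfaces below are hypothetical and deliberately smaller than the Langfuse SDK; they only name the three operations the description mentions, so treat the argument shapes as assumptions rather than the SDK's signatures.

```typescript
// Hypothetical subset of a tracing client, just enough for the mapping.
interface TraceHandle {
  generation(args: { name: string; metadata: Record<string, unknown> }): void;
}

interface Sink {
  score(args: { name: string; value: number }): void;
  trace(args: { name: string; metadata?: Record<string, unknown> }): TraceHandle;
}

class SketchMetricsCollector {
  private readonly sink: Sink;

  constructor(sink: Sink) {
    this.sink = sink;
  }

  // Counter: one score with value 1 per increment.
  increment(name: string): void {
    this.sink.score({ name, value: 1 });
  }

  // Histogram: a trace plus an immediate generation, both carrying the value.
  histogram(name: string, value: number): void {
    const trace = this.sink.trace({ name, metadata: { value } });
    trace.generation({ name, metadata: { value } });
  }

  // Gauge: a trace holding the current value.
  gauge(name: string, value: number): void {
    this.sink.trace({ name, metadata: { value } });
  }
}

// Recording sink that lists the calls it received, for illustration.
function createRecordingSink(): { sink: Sink; calls: string[] } {
  const calls: string[] = [];
  const sink: Sink = {
    score: ({ name, value }) => {
      calls.push(`score:${name}=${value}`);
    },
    trace: ({ name }) => {
      calls.push(`trace:${name}`);
      return {
        generation: (args) => {
          calls.push(`generation:${args.name}`);
        },
      };
    },
  };
  return { sink, calls };
}
```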
Create src/observability/langfuse-tracer.ts:
ts
import type { Span, Tracer } from '@reaatech/guardrail-chain-observability';
import type Langfuse from 'langfuse';

class LangfuseSpan implements Span {
  id: string;
  private attributes: Record<string, string | number | boolean> = {};
  private generation: ReturnType<ReturnType<Langfuse['trace']>['generation']>;

  constructor(id: string, generation: ReturnType<ReturnType<Langfuse['trace']>['generation']>) {
    this.id = id;
    this.generation = generation;
  }

  setAttribute(key: string, value: string | number | boolean): void {
    this.attributes[key] = value;
  }

  end(): void {
    const metadata = { ...this.attributes };
    this.generation.end({ metadata });
  }
}

export class LangfuseTracer implements Tracer {
  constructor(private langfuse: Langfuse) {}

  startSpan(name: string, _parent?: Span): Span {
    void _parent;
    const trace = this.langfuse.trace({ name });
    const generation = trace.generation({ name });
    const id = crypto.randomUUID();
    return new LangfuseSpan(id, generation);
  }
}
The tracer creates a Langfuse trace and generation for every span. Attributes set during the span’s lifetime are passed as metadata when end() is called. Each span gets a UUID via crypto.randomUUID().
Create the initialization entry point at src/observability/index.ts:
ts
import Langfuse from 'langfuse';
import { setLogger, setMetrics, setTracer } from '@reaatech/guardrail-chain-observability';
import { LangfuseLogger } from './langfuse-logger.js';
import { LangfuseMetricsCollector } from './langfuse-metrics.js';
import { LangfuseTracer } from './langfuse-tracer.js';

let _langfuse: Langfuse | null = null;

export function initObservability(): void {
  const secretKey = process.env.LANGFUSE_SECRET_KEY;
  const publicKey = process.env.LANGFUSE_PUBLIC_KEY;
  const baseUrl = process.env.LANGFUSE_HOST;

  _langfuse = new Langfuse({ secretKey, publicKey, baseUrl });

  setLogger(new LangfuseLogger(_langfuse));
  setMetrics(new LangfuseMetricsCollector(_langfuse));
  setTracer(new LangfuseTracer(_langfuse));

  process.on('beforeExit', () => {
    _langfuse?.shutdown();
  });
}
This function creates a single Langfuse client, wraps it in all three adapter classes, and registers them via the global setLogger/setMetrics/setTracer functions. The beforeExit handler ensures pending events are flushed before the process terminates.
Expected output: Four files under src/observability/ totaling 96 lines.
Step 9: Set Up Next.js Instrumentation
Next.js provides an instrumentation.ts file that runs once at server startup — this is where you call initObservability() so the Langfuse adapters are active before any requests arrive. Create src/instrumentation.ts:
ts
export async function register(): Promise<void> {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { initObservability } = await import('./observability/index.js');
    initObservability();
  }
}
The dynamic import() is required because register() runs in both the Node.js and Edge runtimes. The observability module depends on Node-only features such as process.on('beforeExit'), which the Edge runtime does not provide (crypto.randomUUID(), by contrast, is available in both). The guard on process.env.NEXT_RUNTIME === 'nodejs' ensures the import only happens in the Node.js context.
Since Next.js 15 the instrumentation file is a stable feature, so register() fires without any configuration flag. The experimental.instrumentationHook option was only required on Next.js 14 and earlier; it is no longer needed on Next.js 16 and should be left out. The scaffold's next.config.ts can therefore stay minimal:
ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {};

export default nextConfig;
Now create a public API barrel at src/index.ts:
ts
export * from './types.js';
export * from './config/index.js';
export * from './guards/index.js';
export * from './api/databricks-proxy.js';
export { GuardrailService, createGuardrailService } from './services/guardrail-service.js';
export { initObservability } from './observability/index.js';
Expected output: src/instrumentation.ts (6 lines) and src/index.ts (6 lines).
Step 10: Run the Tests
The test suite covers every module — unit tests for configuration, Presidio, the guardrail service, the Databricks proxy, the route handlers, and observability, plus an integration test that exercises the full request flow with mocked external dependencies.
Run the full suite:
terminal
pnpm test
Expected output: Vitest runs all tests and produces a test report. You should see output similar to:
The coverage threshold is 90% across all four metrics (lines, branches, functions, statements). The 12 test files exercise every route, service, guard, proxy, config, and observability module. The integration test (tests/integration/full-flow.test.ts) includes 9 cases covering clean input, injection blocks, PII sanitization, output guardrail blocks, Databricks API errors, empty messages, timeout handling, config changes, and Langfuse observability wiring.
Next steps
Add a rate-limiter configuration endpoint — Extend POST /api/guardrails/chat to accept an X-Endpoint-Profile header that dynamically enables or disables the rate-limiter guardrail per request, using profileManager.setProfile() at runtime.
Integrate a real LLM-based guardrail — Replace the heuristic injectionGuard with the LLM-based mode (mode: 'language-model') by wiring a custom LLM provider into the Presidio engine, giving you stronger detection at the cost of additional latency.
Build a dashboard — Create a Next.js page under app/dashboard/ that fetches guardrail metrics from the Langfuse API and renders real-time violation charts, letting your ops team monitor blocked requests without digging into traces.