Small healthcare providers using AI chatbots for patient intake risk exposing protected health information (PHI) to LLM APIs—a HIPAA violation that can result in severe fines and loss of trust.
Small healthcare clinics and SMBs are using AI chatbots for patient intake, but every message sent to the Gemini API can carry protected health information (PHI), and sending PHI to a third-party API without safeguards risks a HIPAA violation. This tutorial walks you through building a guardrail pipeline that detects and redacts PHI before it leaves your server, then re-identifies it in the reply, across multi-turn conversations. You'll wire together PII detection from @presidio-dev/hai-guardrails, deterministic redaction via @reaatech/guardrail-chain, a tool-use firewall from @reaatech/tool-use-firewall-core, session persistence from @reaatech/session-continuity, and token-level cost tracking from @reaatech/llm-cost-telemetry, all inside a single Next.js App Router endpoint at POST /api/chat.
Expected output: pnpm-lock.yaml is created. ls node_modules/@reaatech/ shows five package directories: guardrail-chain, guardrail-chain-guardrails, llm-cost-telemetry, session-continuity, tool-use-firewall-core.
Step 2: Configure environment variables
Create .env.example at the project root. Every process.env value your code reads must have a placeholder here.
env
# Env vars used by google-gemini-security-guardrails-for-smb-healthcare-pii-redaction.
# The builder adds entries here as it wires up each integration.
# Keep placeholders only; never commit real values.
NODE_ENV=development

# Gemini API key (required); get yours at https://aistudio.google.com/apikey
GEMINI_API_KEY=<your-gemini-api-key>

# Session management
SESSION_TTL_SECONDS=3600

# Cost budget
COST_BUDGET_DAILY_USD=10.0

# OpenTelemetry service name (used by @reaatech/llm-cost-telemetry)
OTEL_SERVICE_NAME=gemini-pii-guardrails
Copy to .env.local and set your real API key:
terminal
cp .env.example .env.local
# Edit .env.local and replace <your-gemini-api-key> with your actual key
Expected output: cat .env.local | grep GEMINI_API_KEY shows your key value.
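To show how these variables might be consumed, here is a minimal sketch of a config reader. The helper names (readAppConfig, parsePositiveNumber) and the AppConfig shape are hypothetical and not part of the recipe's source; the variable names and defaults are taken from .env.example above. The function takes the environment as a parameter so it can be exercised without touching process.env.

```typescript
// Hypothetical config reader; only the env var names and defaults come from .env.example.
interface AppConfig {
  geminiApiKey: string;
  sessionTtlSeconds: number;
  costBudgetDailyUsd: number;
  otelServiceName: string;
}

type EnvSource = Record<string, string | undefined>;

// Parse a positive number, falling back to the default for missing or invalid values.
function parsePositiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

function readAppConfig(env: EnvSource): AppConfig {
  const geminiApiKey = env['GEMINI_API_KEY'];
  // Reject both a missing key and the untouched <your-gemini-api-key> placeholder.
  if (!geminiApiKey || geminiApiKey.startsWith('<')) {
    throw new Error('GEMINI_API_KEY is missing or still set to the placeholder');
  }
  return {
    geminiApiKey,
    sessionTtlSeconds: parsePositiveNumber(env['SESSION_TTL_SECONDS'], 3600),
    costBudgetDailyUsd: parsePositiveNumber(env['COST_BUDGET_DAILY_USD'], 10),
    otelServiceName: env['OTEL_SERVICE_NAME'] ?? 'gemini-pii-guardrails',
  };
}
```

In a server module you would call readAppConfig(process.env) once at startup, so a missing key fails fast instead of surfacing on the first chat request.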
Step 3: Write shared types with Zod
Create src/types.ts with all the shared interfaces, Zod schemas, and custom error classes the rest of the codebase depends on.
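The shapes below are a sketch inferred from how the later steps use these types: the PIIEntity fields and the EntityMap value shape match the code in the entity mapper, and the two error class names appear in later steps. To stay dependency-free, a hand-written type guard (isPIIEntity, a hypothetical name) stands in for the Zod schema, and the status field on GeminiApiError is an assumption based on the description of the LLM client.

```typescript
// Sketch of src/types.ts; plain TypeScript stands in for the Zod schemas.
interface PIIEntity {
  type: string;
  originalText: string;
  redactedToken: string;
  startIndex: number;
  endIndex: number;
}

// Token (for example "[PHI-1]") mapped to the original value and its redacted form.
type EntityMap = Map<string, { original: string; redacted: string }>;

class PiiDetectionError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'PiiDetectionError';
  }
}

class GeminiApiError extends Error {
  readonly status: number | undefined; // assumed field, mirrors the API error's .status
  constructor(message: string, status?: number) {
    super(message);
    this.name = 'GeminiApiError';
    this.status = status;
  }
}

// Hypothetical runtime check playing the role a Zod schema would play.
function isPIIEntity(value: unknown): value is PIIEntity {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const start = v['startIndex'];
  const end = v['endIndex'];
  return (
    typeof v['type'] === 'string' &&
    typeof v['originalText'] === 'string' &&
    typeof v['redactedToken'] === 'string' &&
    typeof start === 'number' &&
    typeof end === 'number' &&
    start <= end
  );
}
```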
Expected output: TypeScript compilation succeeds with zero errors.
Step 4: Create the PII detector
The PiiDetector wraps GuardrailsEngine from @presidio-dev/hai-guardrails to scan every user message for personally identifiable information — emails, phone numbers, SSNs, and credit cards.
typescript
// src/services/pii-detector.ts
import { piiGuard, GuardrailsEngine, SelectionType } from '@presidio-dev/hai-guardrails';
import type { PIIEntity } from '../types.js';
import { PiiDetectionError } from '../types.js';

class PiiDetector {
  private engine: GuardrailsEngine;

  constructor() {
    this.engine = new GuardrailsEngine({
      guards: [piiGuard({ selection: SelectionType.All })],
    });
  }

  async detect(text: string): Promise<PIIEntity[]> {
    if (text.length === 0) {
      return [];
    }
    try {
      const result = await this.engine.run([{ role: 'user', content: text }]);
      if (!result.messagesWithGuardResult.length) {
        return [];
      }
      const entities: PIIEntity[] = [];
      for (const entry of result.messagesWithGuardResult) {
        for (const guardMsg of entry.messages) {
          if (!guardMsg.passed) {
            entities.push({
              type: entry.guardId,
              originalText: guardMsg.modifiedMessage?.content ?? text,
              redactedToken: '[REDACTED]',
              startIndex: 0,
              endIndex: 0,
            });
          }
        }
      }
      return entities;
    } catch (error) {
      throw new PiiDetectionError(
        error instanceof Error ? error.message : 'Unknown PII detection error',
      );
    }
  }
}

export const piiDetector = new PiiDetector();
The piiGuard({ selection: SelectionType.All }) detects every PII category. Empty input returns [] immediately — a fast path for benign messages. Engine errors are wrapped in a typed PiiDetectionError so the route handler can distinguish detection failures.
Expected output: TypeScript compilation succeeds. The singleton piiDetector is exported for use by the route handler.
Step 5: Build the guardrail chain for PHI redaction
The GuardrailChainService builds a budget-aware processing pipeline. A PIIRedaction guardrail performs the actual redaction, wrapped in a CachedGuardrail so identical inputs are served from cache within 5 minutes.
setLogger(new ConsoleLogger()) activates structured console logging across the chain.
.withBudget({ maxLatencyMs: 500, maxTokens: 4000 }) caps execution at 500ms and 4000 tokens.
CachedGuardrail with ttlMs: 300000, maxSize: 500 avoids redundant redaction on repeated inputs.
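To make the ttlMs and maxSize settings concrete, here is a small stand-alone cache that behaves the way those two options suggest. This is an illustration only, not CachedGuardrail's actual implementation: the TtlCache class, the redactWithCache helper, and the oldest-inserted eviction policy are all assumptions.

```typescript
// Illustrative TTL + max-size cache; NOT the library's CachedGuardrail.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxSize: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxSize: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxSize = maxSize;
    this.now = now; // injectable clock so expiry can be tested without waiting
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxSize) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}

// Serve identical inputs from the cache; call the real redactor only on a miss.
function redactWithCache(
  cache: TtlCache<string>,
  text: string,
  redactFn: (input: string) => string,
): string {
  const cached = cache.get(text);
  if (cached !== undefined) return cached;
  const fresh = redactFn(text);
  cache.set(text, fresh);
  return fresh;
}
```

With new TtlCache<string>(300000, 500), a repeated message within five minutes skips the redaction call, and the cache never holds more than 500 entries.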
Expected output: TypeScript compilation succeeds.
Step 6: Implement the entity mapper
The EntityMapper is a pure utility with no external dependencies. It manages mappings between original PHI values and deterministic tokens like [PHI-1], handling redaction and reconstruction in both directions.
typescript
// src/services/entity-mapper.ts
import type { PIIEntity, EntityMap } from '../types.js';

export class EntityMapper {
  buildEntityMap(entities: PIIEntity[], existingMap?: EntityMap): EntityMap {
    const map: Map<string, { original: string; redacted: string }> = new Map(existingMap);
    if (entities.length === 0) {
      return map;
    }
    let nextIndex = map.size + 1;
    for (const entity of entities) {
      let found = false;
      for (const [, entry] of map) {
        if (entry.original === entity.originalText) {
          found = true;
          break;
        }
      }
      if (!found) {
        const token = `[PHI-${String(nextIndex)}]`;
        map.set(token, { original: entity.originalText, redacted: token });
        nextIndex++;
      }
    }
    return map;
  }

  applyRedaction(text: string, entities: PIIEntity[]): string {
    if (entities.length === 0) {
      return text;
    }
    const sorted = [...entities].sort((a, b) => {
      const lenDiff = b.originalText.length - a.originalText.length;
      if (lenDiff !== 0) return lenDiff;
      return a.startIndex - b.startIndex;
    });
    const covered = new Set<number>();
    const filtered: PIIEntity[] = [];
    for (const entity of sorted) {
      let isCovered = false;
      for (let i = entity.startIndex; i < entity.endIndex; i++) {
        if (covered.has(i)) {
          isCovered = true;
          break;
        }
      }
      if (!isCovered) {
        for (let i = entity.startIndex; i < entity.endIndex; i++) {
          covered.add(i);
        }
        filtered.push(entity);
      }
    }
    const rightToLeft = [...filtered].sort((a, b) => b.startIndex - a.startIndex);
    let result = text;
    for (const entity of rightToLeft) {
      const before = result.slice(0, entity.startIndex);
      const after = result.slice(entity.endIndex);
      result = before + entity.redactedToken + after;
    }
    return result;
  }

  reconstructOriginal(text: string, entityMap: EntityMap): string {
    if (entityMap.size === 0) {
      return text;
    }
    let result = text;
    for (const [token, entry] of entityMap) {
      if (token.startsWith('[PHI-') && token.endsWith(']')) {
        result = result.split(token).join(entry.original);
      }
    }
    return result;
  }

  scanForUnredactedTerms(text: string, entityMap: EntityMap): string[] {
    const leaked: string[] = [];
    for (const [, entry] of entityMap) {
      if (text.includes(entry.original)) {
        leaked.push(entry.original);
      }
    }
    return leaked;
  }
}

export const entityMapper = new EntityMapper();
buildEntityMap merges new entities with an existing map, skipping duplicates. New entities get sequential [PHI-N] tokens.
applyRedaction sorts by length (longest first) and replaces right-to-left to preserve span indices. Overlapping entities are deduplicated.
reconstructOriginal reverses the process so the user sees the real PHI in the assistant reply.
scanForUnredactedTerms detects any original PHI still appearing in text — used by the firewall to catch leaks.
Expected output: EntityMapper is a pure class with no runtime dependencies. Every method handles empty inputs as a no-op.
Step 7: Build the session manager
The SessionService wraps SessionManager from @reaatech/session-continuity to manage multi-turn conversations using an in-memory storage adapter and a character-approximation token counter — no external database needed for development.
typescript
// src/services/session-service.ts
import {
  SessionManager,
  TokenBudgetExceededError,
  type IStorageAdapter,
  type TokenCounter,
  type Session,
  type Message,
} from '@reaatech/session-continuity';
import type { EntityMap } from '../types.js';

class InMemoryTokenCounter implements TokenCounter {
  readonly model = 'gemini-2.5-flash';
  readonly tokenizer = 'character-approx';

  count(text: string): number {
    return Math.ceil(text.length / 4);
  }

  countMessages(messages: Message
InMemoryTokenCounter approximates tokens as Math.ceil(text.length / 4) — a rough heuristic that avoids needing a real tokenizer.
InMemoryStorageAdapter implements all 11 methods of IStorageAdapter using Map instances.
SessionManager is configured with a 4096-token budget and overflowStrategy: 'compress', using a sliding window to keep the most recent 3500 tokens.
addMessages catches TokenBudgetExceededError and automatically compresses context before retrying.
The EntityMap is serialized into Session.metadata.custom.entityMap as a plain object for persistence.
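Two of the points above can be sketched without the library: turning the EntityMap into a plain object for session metadata and back, and trimming history to the most recent messages that fit a token budget. The function names here are hypothetical, and the trimming loop is a generic sliding window using the same length / 4 approximation; it is not the compression strategy that @reaatech/session-continuity applies internally.

```typescript
type EntityMapSketch = Map<string, { original: string; redacted: string }>;
type PlainEntityMap = Record<string, { original: string; redacted: string }>;

// A Map does not survive JSON serialization, so store it as a plain object.
function entityMapToPlain(map: EntityMapSketch): PlainEntityMap {
  return Object.fromEntries(map);
}

// Rebuild the Map when a session is loaded; missing metadata yields an empty map.
function plainToEntityMap(plain: PlainEntityMap | undefined): EntityMapSketch {
  return new Map(Object.entries(plain ?? {}));
}

interface ChatTurn {
  role: 'user' | 'assistant';
  content: string;
}

// Same rough heuristic as InMemoryTokenCounter: about four characters per token.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk from newest to oldest and keep turns until the budget would be exceeded.
function keepMostRecent(turns: ChatTurn[], maxTokens: number): ChatTurn[] {
  const kept: ChatTurn[] = [];
  let used = 0;
  for (const turn of [...turns].reverse()) {
    const cost = approxTokens(turn.content);
    if (used + cost > maxTokens) break;
    kept.unshift(turn); // restore chronological order
    used += cost;
  }
  return kept;
}
```

Calling keepMostRecent(history, 3500) mirrors the window size described above. Because tokens like [PHI-1] live in the entity map rather than in the trimmed history, re-identification still works after older turns are dropped.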
Expected output: TypeScript compilation succeeds.
Step 8: Wire the Gemini LLM client
The GeminiClient wraps GoogleGenAI from @google/genai to send redacted prompts to Gemini 2.5 Flash and extract token usage metadata.
ai.models.generateContent({ model: 'gemini-2.5-flash', contents: prompt }) sends the redacted prompt as a single generation call.
Token counts come from response.usageMetadata.promptTokenCount and response.usageMetadata.candidatesTokenCount, defaulting to 0.
API errors are caught by their .status and .message properties and wrapped in a typed GeminiApiError.
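The three behaviors above can be sketched around an injected generate function, which keeps the example free of network calls. The names GenerateFn, GenerateResponseLike, generateRedacted, and GeminiApiErrorSketch are hypothetical; the response type is only a structural subset of the fields named above, plus a text field that is assumed to carry the reply. In the real module the injected function would presumably be a thin wrapper such as (req) => ai.models.generateContent(req) on a GoogleGenAI instance.

```typescript
// Structural subset of the response fields used here (assumption: everything else is ignored).
interface GenerateResponseLike {
  text?: string;
  usageMetadata?: { promptTokenCount?: number; candidatesTokenCount?: number };
}

type GenerateFn = (req: { model: string; contents: string }) => Promise<GenerateResponseLike>;

class GeminiApiErrorSketch extends Error {
  readonly status: number | undefined;
  constructor(message: string, status?: number) {
    super(message);
    this.name = 'GeminiApiErrorSketch';
    this.status = status;
  }
}

interface GeminiReply {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

// Pull a numeric .status off an unknown thrown value, if it has one.
function readStatus(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

async function generateRedacted(generate: GenerateFn, prompt: string): Promise<GeminiReply> {
  try {
    const response = await generate({ model: 'gemini-2.5-flash', contents: prompt });
    return {
      text: response.text ?? '',
      // Missing usage metadata defaults to 0 rather than failing the request.
      inputTokens: response.usageMetadata?.promptTokenCount ?? 0,
      outputTokens: response.usageMetadata?.candidatesTokenCount ?? 0,
    };
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown Gemini API error';
    throw new GeminiApiErrorSketch(message, readStatus(error));
  }
}
```

Injecting the call also makes the client easy to mock in the route handler tests, since a fake GenerateFn can return canned token counts or throw a chosen status.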
Expected output: TypeScript compilation succeeds.
Step 9: Implement the tool-use firewall
The ToolUseFirewall inspects tool call arguments for unredacted PHI. If it finds any, it blocks the call and logs the incident through the structured logger from @reaatech/tool-use-firewall-core.
typescript
// src/services/firewall-service.ts
import { Logger, redact } from '@reaatech/tool-use-firewall-core';
import type { EntityMap } from '../types.js';

class ToolUseFirewall {
  private logger: Logger;

  constructor() {
    this.logger = new Logger('ToolUseFirewall');
  }

  async inspectToolCall(
    toolName: string,
    args: Record<string, unknown>,
    entityMap: EntityMap,
  ): Promise<{ allowed: boolean; reason?: string }> {
    await Promise.resolve();
    if (entityMap.size === 0) {
      return { allowed: true };
    }
    const found = this.deepSearchValues(args, entityMap);
    if (found) {
      const safeArgs = redact(args);
      this.logger.warn(`Tool call "${toolName}" contains unredacted PHI`, {
        toolName,
        args: safeArgs,
        matchedEntity: found,
      });
      return { allowed: false, reason: 'Tool call contains unredacted PHI' };
    }
    return { allowed: true };
  }

  private deepSearchValues(obj: unknown, entityMap: EntityMap): string | null {
    if (typeof obj === 'string') {
      for (const [, entry] of entityMap) {
        if (obj.includes(entry.original)) {
          return entry.original;
        }
      }
      return null;
    }
    if (Array.isArray(obj)) {
      for (const item of obj) {
        const found = this.deepSearchValues(item, entityMap);
        if (found) return found;
      }
      return null;
    }
    if (obj && typeof obj === 'object') {
      for (const value of Object.values(obj as Record<string, unknown>)) {
        const found = this.deepSearchValues(value, entityMap);
        if (found) return found;
      }
      return null;
    }
    return null;
  }
}

export const firewall = new ToolUseFirewall();
Logger('ToolUseFirewall') creates a structured JSON logger that writes to stderr, following the MCP-safe convention.
deepSearchValues recursively crawls strings, arrays, and objects to find any value containing raw PHI.
Arguments are sanitized with redact() before logging. An empty entity map is a fast path — all tool calls are allowed.
Expected output: TypeScript compilation succeeds.
Step 10: Set up cost tracking
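The CostTracker records every Gemini API call as a CostSpan, computes dollar cost from the Gemini 2.5 Flash rates used in this recipe ($0.15 per million input tokens, $0.60 per million output tokens; confirm these against current Gemini pricing before relying on the totals), and validates every span with Zod.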
Every CostSpan is validated with CostSpanSchema.parse() before storage.
loadConfig() reads the daily budget from COST_BUDGET_DAILY_USD. The budget check uses config.budget.global.daily with an Infinity fallback.
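The cost arithmetic and the budget fallback can be written out in a few lines. This is a sketch using the per-million-token rates quoted in this step; the function names are hypothetical and it does not reproduce the CostSpan schema or the @reaatech/llm-cost-telemetry API.

```typescript
// Rates quoted in this step, in USD per one million tokens.
const INPUT_USD_PER_MILLION = 0.15;
const OUTPUT_USD_PER_MILLION = 0.6;

// cost = (input / 1e6) * inputRate + (output / 1e6) * outputRate
function computeCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  );
}

// An unset daily budget falls back to Infinity, so the check always passes.
function isWithinDailyBudget(spentUsd: number, dailyBudgetUsd: number | undefined): boolean {
  const limit = dailyBudgetUsd ?? Infinity;
  return spentUsd <= limit;
}

// Worked example: 2,000 input tokens and 500 output tokens.
//   input:  2000 / 1e6 * 0.15 = 0.0003
//   output:  500 / 1e6 * 0.60 = 0.0003
//   total:  about 0.0006 USD
const exampleCostUsd = computeCostUsd(2000, 500);
```

At these rates a $10 daily budget covers roughly 16,000 calls of that size, which is why the per-call figure is tracked as a floating-point dollar amount rather than rounded to cents.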
Expected output: TypeScript compilation succeeds.
Step 11: Wire the API route
The POST /api/chat route handler orchestrates the full pipeline: validate the request, detect PHI, load or create a session, merge entity maps, redact the message, send to Gemini, firewall-check the response, re-identify PHI, track cost, persist session, and respond.
The flow inside the handler: (1) parse and validate JSON body — malformed or missing message returns 400; (2) detect PHI with piiDetector.detect(); (3) load or create a session, restoring the entity map from metadata; (4) merge entity maps with entityMapper.buildEntityMap(); (5) redact PHI with guardrailChainService.redact(); (6) send the sanitized prompt to Gemini; (7) firewall-check the response for leaked PHI — blocked calls get 403; (8) re-identify PHI with entityMapper.reconstructOriginal(); (9) record cost with costTracker.recordCall(); (10) persist session with storeEntityMap() and addMessages(); (11) return JSON with reply, sessionId, redacted, and costUsd.
The handler uses NextRequest/NextResponse exclusively — no bare Request/Response.
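The eleven steps can be sketched as one function with its collaborators injected, which shows the ordering and the two early exits (400 and 403) without depending on Next.js. Everything here is hypothetical scaffolding: ChatDeps, runChatPipeline, and the method names stand in for the services from the earlier steps, the boolean meaning of the redacted field is an assumption, and so is the choice to persist the redacted text rather than the original.

```typescript
type TokenMap = Map<string, { original: string; redacted: string }>;

// Hypothetical dependency bundle; each member stands in for one service above.
interface ChatDeps {
  detect(text: string): Promise<{ originalText: string }[]>;
  loadSession(sessionId?: string): Promise<{ sessionId: string; entityMap: TokenMap }>;
  mergeEntities(found: { originalText: string }[], existing: TokenMap): TokenMap;
  redact(text: string, entityMap: TokenMap): Promise<string>;
  generate(prompt: string): Promise<{ text: string; inputTokens: number; outputTokens: number }>;
  findLeaks(text: string, entityMap: TokenMap): string[];
  reidentify(text: string, entityMap: TokenMap): string;
  recordCost(inputTokens: number, outputTokens: number): number;
  persist(sessionId: string, entityMap: TokenMap, userText: string, replyText: string): Promise<void>;
}

interface ChatResult {
  status: number;
  body: Record<string, unknown>;
}

async function runChatPipeline(deps: ChatDeps, rawBody: unknown): Promise<ChatResult> {
  // (1) Validate the body: a missing or empty message is a 400.
  const parsed = (typeof rawBody === 'object' && rawBody !== null ? rawBody : {}) as {
    message?: unknown;
    sessionId?: unknown;
  };
  const message = parsed.message;
  if (typeof message !== 'string' || message.trim() === '') {
    return { status: 400, body: { error: 'message must be a non-empty string' } };
  }
  const requestedId = typeof parsed.sessionId === 'string' ? parsed.sessionId : undefined;

  // (2)-(4) Detect PHI, load the session, and merge new entities into its map.
  const found = await deps.detect(message);
  const session = await deps.loadSession(requestedId);
  const entityMap = deps.mergeEntities(found, session.entityMap);

  // (5)-(6) Only the redacted prompt is sent to the model.
  const redactedPrompt = await deps.redact(message, entityMap);
  const completion = await deps.generate(redactedPrompt);

  // (7) Block the response if any original PHI value appears in it.
  if (deps.findLeaks(completion.text, entityMap).length > 0) {
    return { status: 403, body: { error: 'response blocked: unredacted PHI detected' } };
  }

  // (8)-(10) Re-identify for the user, record cost, persist the redacted exchange (assumption).
  const reply = deps.reidentify(completion.text, entityMap);
  const costUsd = deps.recordCost(completion.inputTokens, completion.outputTokens);
  await deps.persist(session.sessionId, entityMap, redactedPrompt, completion.text);

  // (11) Shape the JSON payload; `redacted` as a boolean is an assumption.
  return {
    status: 200,
    body: { reply, sessionId: session.sessionId, redacted: found.length > 0, costUsd },
  };
}
```

A real route handler would parse the NextRequest body, call a function like this with the singleton services, and wrap the result in NextResponse.json(result.body, { status: result.status }).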
Each service module gets a test file covering happy path, error path, and boundary cases. All external packages are mocked with vi.mock() — no real HTTP calls in tests. Here are excerpts from two test files to show the pattern.
Entity mapper tests (pure logic, no mocking needed)
typescript
// tests/services/entity-mapper.test.ts
import { describe, it, expect } from 'vitest';
import { entityMapper } from '../../src/services/entity-mapper.js';
import type { PIIEntity, EntityMap } from '../../src/types.js';

describe('entityMapper', () => {
  it('buildEntityMap with empty entities returns empty map', () => {
    const result = entityMapper.buildEntityMap([]);
    expect(result.size).toBe(0);
  });

  it('buildEntityMap skips duplicate entities already in existingMap', () => {
    const existing: EntityMap = new Map([
      ['[PHI-1]', { original: 'john@test.com', redacted: '[PHI-1]' }],
    ]);
    const result = entityMapper.buildEntityMap(
      [{ type: 'EMAIL', originalText: 'john@test.com', redactedToken: '[MASKED]', startIndex: 0, endIndex: 14 }],
      existing,
    );
    expect(result.size).toBe(1);
    expect(result.get('[PHI-1]')?.original).toBe('john@test.com');
  });

  it('applyRedaction replaces phone number with [PHI-1]', () => {
    const entities: PIIEntity[] = [
      { type: 'PHONE', originalText: '555-1234', redactedToken: '[PHI-1]', startIndex: 13, endIndex: 21 },
    ];
    const result = entityMapper.applyRedaction('Call John at 555-1234', entities);
    expect(result).toBe('Call John at [PHI-1]');
  });

  it('reconstructOriginal restores original text from tokens', () => {
    const map: EntityMap = new Map([['[PHI-1]', { original: '555-1234', redacted: '[PHI-1]' }]]);
    const result = entityMapper.reconstructOriginal('Call John at [PHI-1]', map);
    expect(result).toBe('Call John at 555-1234');
  });

  it('overlapping entities sorted by length, longest applied first', () => {
    const entities: PIIEntity[] = [
      { type: 'PERSON', originalText: 'John Doe', redactedToken: '[PHI-1]', startIndex: 3, endIndex: 11 },
      { type: 'LAST_NAME', originalText: 'Doe', redactedToken: '[PHI-2]', startIndex: 8, endIndex: 11 },
    ];
    const result = entityMapper.applyRedaction('Hi John Doe!', entities);
    expect(result).toBe('Hi [PHI-1]!');
    expect(result).not.toContain('[PHI-2]');
  });

  it('scanForUnredactedTerms finds leaked terms', () => {
    const map: EntityMap = new Map([['[PHI-1]', { original: '555-1234', redacted: '[PHI-1]' }]]);
    const leaked = entityMapper.scanForUnredactedTerms('Call John at 555-1234', map);
    expect(leaked).toEqual(['555-1234']);
  });
});
Route handler integration test (all services mocked)
Create the remaining test files for pii-detector, guardrail-chain-service, session-service, firewall-service, llm-client, and cost-tracker. Each should have at least 3 test cases covering happy path, error path, and boundary conditions.
Expected output: After writing all test files, run pnpm vitest run --coverage --reporter=json --outputFile=vitest-report.json and see numFailedTests: 0, numPassedTests: 72 across 20 test suites, with per-file coverage above 90%.
Step 13: Create barrel exports and run quality checks
Create src/index.ts to re-export all singleton services and types as a public API:
typescript
// src/index.ts
export { piiDetector } from './services/pii-detector.js';
export { guardrailChainService } from './services/guardrail-chain-service.js';
export { sessionService } from './services/session-service.js';
export { entityMapper, EntityMapper } from './services/entity-mapper.js';
export { firewall } from './services/firewall-service.js';
export { geminiClient } from './services/llm-client.js';
export { costTracker } from './services/cost-tracker.js';
export * from './types.js';
Now run the full quality gate:
terminal
pnpm typecheck
pnpm lint
pnpm test
Expected output: pnpm typecheck exits 0 with zero TypeScript errors. pnpm lint exits 0 with zero ESLint violations. pnpm test produces a vitest-report.json showing numFailedTests: 0, 72 passing tests across 20 test suites, and coverage at or above 90% lines/branches/functions/statements.
Next steps
Swap to a production storage backend — replace InMemoryStorageAdapter with the PostgreSQL, DynamoDB, or Redis adapter from @reaatech/session-continuity.
Add more guardrails — include ToxicityFilter, PromptInjection, and TopicBoundary guardrails from @reaatech/guardrail-chain-guardrails for defense-in-depth beyond PII redaction.
Expose a dashboard endpoint — create GET /api/chat/costs that calls costTracker.checkBudget() to show real-time spending per session, with alerts when the daily budget is near its limit.
[]): number {
    let total = 0;
    for (const msg of messages) {
      if (typeof msg.content === 'string') {
        total += this.count(msg.content);
      }
    }
    return total;
  }
}

class InMemoryStorageAdapter implements IStorageAdapter {
  private sessions: Map<string, Session> = new Map();
  private messages: Map<string, Message[]> = new Map();