This tutorial walks you through building an AI-powered lead qualification and routing system that ingests Marketo lead-creation webhooks, enriches them with Mistral AI, and routes high-intent prospects to the right sales rep. You’ll wire up seven @reaatech/* packages for webhook normalization, agent orchestration, confidence-based routing, session continuity, LLM cost tracking, budget enforcement, and response caching. By the end you’ll have a tested Next.js API endpoint that turns a raw Marketo webhook into a qualified, enriched, and routed lead.
Prerequisites
Node.js 22+ and pnpm 10 installed
A Mistral AI API key (MISTRAL_API_KEY) — get one at console.mistral.ai
A Marketo instance with API access (client ID, client secret, base URL)
A Langfuse account (optional, for observability) — get keys at langfuse.com
An OpenAI API key for LLM cache embeddings (LLM_CACHE_EMBEDDING_API_KEY)
Basic familiarity with TypeScript, Next.js App Router, and REST APIs
Step 1: Scaffold the project and install dependencies
Start with a fresh Next.js 16 project and pin all dependencies to exact versions:
The register() function runs once during Next.js boot. On Next.js 14 and earlier, the experimental.instrumentationHook: true flag is required, and without it this file is dead code; from Next.js 15 onward instrumentation is stable and the flag is no longer needed. The NEXT_RUNTIME === "nodejs" guard prevents the dynamic imports from failing in the Edge runtime.
Expected output: Next.js starts without errors and the Langfuse client initializes on server boot.
Step 3: Define domain types
Create src/lib/types.ts with the core data structures the pipeline will pass around:
These types model the lead event, the enriched version after Mistral processing, Marketo API credentials, a budget scope constant, and the final routing result that flows back through the webhook response.
Expected output: pnpm typecheck shows no errors on this file.
Step 4: Build the session continuity module
Create src/lib/session.ts — this manages conversation state for leads that need clarification follow-ups. It implements an in-memory storage adapter and a simple token counter on top of @reaatech/session-continuity:
```ts
import {
  SessionManager,
  type Session,
  type Message,
  type IStorageAdapter,
  type TokenCounter,
  type SessionId,
  type MessageId,
  type SessionFilters,
  type MessageQueryOptions,
  type UpdateSessionOptions,
  type HealthStatus,
} from "@reaatech/session-continuity";

class InMemorySessionAdapter implements IStorageAdapter {
  private sessions = new Map<SessionId, Session>();
  private messages = new Map<SessionId, Message[]>();
```
The token counter approximates tokens as ceil(text.length / 4). The session manager compresses conversation context when it exceeds 4,096 tokens using a sliding window that keeps the last 3,500 tokens. Use createLeadSession() to start a session for a lead that needs clarification, then addLeadMessage() and getLeadConversationContext() to build follow-up context.
Expected output: Tests for creating sessions, adding messages, and retrieving conversation context all pass.
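The token estimate and sliding-window compression described in this step can be sketched without the library. The names ChatTurn, approximateTokens, and compressToWindow are hypothetical, and the compression strategy inside @reaatech/session-continuity may differ; this is only a minimal illustration of the ceil(length / 4) heuristic and of keeping the most recent turns within a 3,500-token window once the 4,096-token limit is exceeded.

```ts
// Hypothetical message shape for illustration; not the library's Message type.
interface ChatTurn {
  role: "user" | "assistant" | "system";
  content: string;
}

// Approximate tokens as ceil(characters / 4), the heuristic this step describes.
function approximateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// If the conversation exceeds maxTokens, keep the newest turns whose combined
// estimate fits within windowTokens. Otherwise return the turns unchanged.
function compressToWindow(
  turns: ChatTurn[],
  maxTokens = 4096,
  windowTokens = 3500,
): ChatTurn[] {
  const total = turns.reduce((sum, t) => sum + approximateTokens(t.content), 0);
  if (total <= maxTokens) return turns;

  const kept: ChatTurn[] = [];
  let used = 0;
  // Walk from the most recent turn backwards and stop at the first turn
  // that would push the window over its limit.
  for (const turn of [...turns].reverse()) {
    const cost = approximateTokens(turn.content);
    if (used + cost > windowTokens) break;
    kept.unshift(turn);
    used += cost;
  }
  return kept;
}
```

With five turns of roughly 1,000 tokens each, the total of 5,000 exceeds the limit, so only the newest three turns survive. A character-based estimate is cheap but imprecise, which is one reason to leave some headroom between the window and the hard limit.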
Step 5: Build the LLM cache module
Create src/lib/cache.ts — this wraps @reaatech/llm-cache to deduplicate LLM calls when the same company reappears:
The cache uses OpenAI embeddings for semantic similarity. When Mistral enriches a company, the result is cached for one hour (the analytical TTL). A subsequent lead from the same company bypasses the LLM call entirely.
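To make the one-hour TTL concrete, here is a minimal sketch of the expiry behaviour. It uses exact-match keys on a normalized company name instead of embedding similarity, which the real module delegates to @reaatech/llm-cache. TtlCache and its methods are hypothetical names, and the injectable clock exists only to make expiry easy to test.

```ts
// Hypothetical stand-in for the enrichment cache: exact-match keys, no embeddings.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly clock: () => number;

  // Defaults to a one-hour TTL, matching the analytical TTL described above.
  constructor(ttlMs: number = 60 * 60 * 1000, clock: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.clock = clock;
  }

  // Treat "Acme " and "acme" as the same company.
  private static normalize(company: string): string {
    return company.trim().toLowerCase();
  }

  get(company: string): V | undefined {
    const key = TtlCache.normalize(company);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.clock() >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(company: string, value: V): void {
    this.entries.set(TtlCache.normalize(company), {
      value,
      expiresAt: this.clock() + this.ttlMs,
    });
  }
}
```

A hit on get() is what lets a second lead from the same company skip the Mistral call. A semantic cache goes further by also matching near-duplicate inputs, at the cost of an embedding request per lookup.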
The thresholds work as follows: an intent score of 0.8 or higher routes the lead to a rep, a score from 0.3 up to (but not including) 0.8 requests clarification, and a score below 0.3 falls back with no action.
Expected output: makeRoutingDecision([{ label: "route_to_rep", confidence: 0.92 }]) returns { type: "ROUTE" }. A confidence of 0.55 returns { type: "CLARIFY" }. A confidence of 0.15 returns { type: "FALLBACK" }.
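The threshold logic can be sketched in a few lines. This is not the tutorial's makeRoutingDecision() itself: sketchRoutingDecision, Candidate, and the two constants are hypothetical names, and the sketch assumes the decision is driven by the highest confidence among the candidates.

```ts
// Hypothetical candidate shape mirroring the { label, confidence } objects above.
interface Candidate {
  label: string;
  confidence: number;
}

type RoutingDecision =
  | { type: "ROUTE" }
  | { type: "CLARIFY" }
  | { type: "FALLBACK" };

const ROUTE_THRESHOLD = 0.8; // at or above: route to a rep
const FALLBACK_THRESHOLD = 0.3; // below: fall back with no action

function sketchRoutingDecision(candidates: Candidate[]): RoutingDecision {
  // Assumption: the best candidate decides; an empty list counts as confidence 0.
  const top = candidates.reduce((best, c) => Math.max(best, c.confidence), 0);
  if (top >= ROUTE_THRESHOLD) return { type: "ROUTE" };
  if (top >= FALLBACK_THRESHOLD) return { type: "CLARIFY" };
  return { type: "FALLBACK" };
}
```

Checking the higher threshold first keeps the boundaries unambiguous: exactly 0.8 routes, and exactly 0.3 asks for clarification.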
Step 7: Build the cost and budget module
Create src/lib/cost.ts — this enforces a daily LLM spend budget and generates cost telemetry spans:
```ts
import { BudgetController } from "@reaatech/agent-budget-engine";
import { SpendStore } from "@reaatech/agent-budget-spend-tracker";
import { BudgetScope } from "@reaatech/agent-budget-types";
import {
  generateId,
  now,
  calculateCostFromTokens,
  getWindowStart,
  loadConfig,
} from "@reaatech/llm-cost-telemetry";

class InMemorySpendStore extends SpendStore {
  private store = new Map<string, number>();

  record(entry: { requestId: string; scopeType: string
```
The InMemorySpendStore extends SpendStore from @reaatech/agent-budget-spend-tracker for proper type compatibility with BudgetController. Spans include provider, model, token counts, and a windowStart timestamp for day-based aggregation. The budget controller fires threshold-breach and hard-stop events that log warnings when spend approaches (80%) or hits (100%) the daily limit.
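As a rough illustration of the 80% and 100% thresholds and the day-based window, the sketch below classifies accumulated spend against a daily limit. classifySpend and dayWindowStart are hypothetical helpers, not part of the @reaatech packages, and the use of a UTC day boundary is an assumption; only the threshold-breach and hard-stop names come from the description above.

```ts
type BudgetState = "ok" | "threshold-breach" | "hard-stop";

// Classify spend against the daily limit. softRatio is the warning threshold.
function classifySpend(
  spentUsd: number,
  dailyLimitUsd: number,
  softRatio = 0.8,
): BudgetState {
  // A zero or negative limit leaves no room to spend at all.
  if (dailyLimitUsd <= 0) return "hard-stop";
  const ratio = spentUsd / dailyLimitUsd;
  if (ratio >= 1) return "hard-stop";
  if (ratio >= softRatio) return "threshold-breach";
  return "ok";
}

// Start of the aggregation window for a timestamp.
// Assumption: days are bucketed at UTC midnight.
function dayWindowStart(at: Date): Date {
  return new Date(
    Date.UTC(at.getUTCFullYear(), at.getUTCMonth(), at.getUTCDate()),
  );
}
```

For example, with a 10 USD daily limit, 8 USD of spend is a threshold breach and 10 USD is a hard stop. Bucketing every span by the same window start is what allows spend to be summed per day.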
The OAuth2 token is cached in-memory for 3,600 seconds (Marketo’s TTL). The upsertLead() method uses createOrUpdate with email as the lookup field — Marketo creates a new lead or updates an existing one. updateLeadCustomFields() writes enrichment data back to custom fields on the lead.
Expected output: Tests verify the Authorization: Bearer header is present on every request and that MarketoApiError is thrown when the API returns success: false.
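The in-memory token caching can be sketched independently of the Marketo client. CachedTokenProvider, TokenResponse, and the 60-second safety margin are all assumptions made for illustration; the token fetcher is injected, so the sketch makes no claim about Marketo's actual endpoint or response shape.

```ts
// Hypothetical normalized token response; the real client maps Marketo's
// identity response into whatever shape it needs.
interface TokenResponse {
  accessToken: string;
  expiresInSeconds: number;
}

class CachedTokenProvider {
  private token: string | null = null;
  private expiresAt = 0; // epoch milliseconds
  private readonly fetchToken: () => Promise<TokenResponse>;
  private readonly clock: () => number;
  private readonly safetyMarginMs: number;

  constructor(
    fetchToken: () => Promise<TokenResponse>,
    clock: () => number = Date.now,
    safetyMarginMs: number = 60_000, // assumption: refresh a minute early
  ) {
    this.fetchToken = fetchToken;
    this.clock = clock;
    this.safetyMarginMs = safetyMarginMs;
  }

  // Reuse the cached token until it is close to expiry, then fetch a new one.
  async getToken(): Promise<string> {
    const stillValid =
      this.token !== null &&
      this.clock() < this.expiresAt - this.safetyMarginMs;
    if (stillValid && this.token !== null) return this.token;

    const res = await this.fetchToken();
    this.token = res.accessToken;
    this.expiresAt = this.clock() + res.expiresInSeconds * 1000;
    return res.accessToken;
  }

  // Build the header that every Marketo request is expected to carry.
  async authHeader(): Promise<{ Authorization: string }> {
    return { Authorization: `Bearer ${await this.getToken()}` };
  }
}
```

Refreshing slightly before the stated expiry avoids sending a token that lapses while a request is in flight. If several requests can start at once, you may also want to share a single in-flight refresh promise so they do not each fetch a token.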
Step 9: Build the webhook handler
Create src/api/marketo/webhook.ts — this verifies the HMAC-SHA1 signature from Marketo, validates the payload with Zod, normalizes it through @reaatech/webhook-relay-core, and dispatches to the qualifier agent:
The HMAC comparison verifies the x-marketo-signature header. The webhook payload requires id, email, source, and timestamp — any missing field causes a 400 response with Zod issues.
Expected output: A valid HMAC with well-formed JSON returns 200. A tampered body returns 401. Missing fields return 400.
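The signature check and the three status codes can be sketched with Node's built-in crypto module. This is a simplified stand-in for the handler: it assumes the signature is hex-encoded, replaces the Zod schema with a hand-rolled check that the four required fields are strings, and uses the hypothetical names signBody, verifySignature, and sketchWebhookStatus.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the HMAC-SHA1 of the raw body. Assumption: hex encoding.
function signBody(rawBody: string, secret: string): string {
  return createHmac("sha1", secret).update(rawBody, "utf8").digest("hex");
}

// Constant-time comparison of the received signature with the expected one.
function verifySignature(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

const REQUIRED_FIELDS = ["id", "email", "source", "timestamp"] as const;

// Returns the HTTP status the handler would respond with.
function sketchWebhookStatus(
  rawBody: string,
  signatureHex: string,
  secret: string,
): 200 | 400 | 401 {
  if (!verifySignature(rawBody, signatureHex, secret)) return 401;

  let payload: unknown;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return 400; // signed, but not valid JSON
  }
  if (typeof payload !== "object" || payload === null) return 400;

  const record = payload as Record<string, unknown>;
  const missing = REQUIRED_FIELDS.filter((f) => typeof record[f] !== "string");
  return missing.length === 0 ? 200 : 400;
}
```

Verifying the signature before parsing means unauthenticated input never reaches JSON.parse or the schema. You can also use signBody() to generate a valid signature when exercising the endpoint by hand.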
Step 10: Build the qualifier agent
Create src/agents/qualifier.ts — the core orchestrator that calls Mistral, enriches the lead, and routes it:
```ts
import {
  IncomingRequestSchema,
  type IncomingRequest,
  ContextPacketSchema,
  type ContextPacket,
  ConfidenceDecisionSchema,
  type ConfidenceDecision,
  AgentResponseSchema,
} from "@reaatech/agent-mesh";
import { Mistral } from "@mistralai/mistralai";
import { generateId, now, calculateCostFromTokens } from "@reaatech/llm-cost-telemetry";
import type { MarketoLeadEvent, LeadRoutingResult, EnrichedLead } from "../lib/types.js";
import { createLeadSession, getLeadConversationContext } from "../lib/session.js";
import { getCachedEnrichment, setCachedEnrichment } from "../lib/cache.js";
import { checkBudget, recordSpend } from "../lib/cost.js";
import { makeRoutingDecision } from "../lib/router.js";
import { MarketoClient } from "../api/marketo/client.js";
```
The agent follows this pipeline:
Validate the incoming event against IncomingRequestSchema and build a ContextPacket with session history
Check the LLM cache — if the same company was recently enriched, use the cached result
Check the daily budget — if exhausted, return FALLBACK immediately
Call mistral-large-latest with a system prompt containing turn history and a user prompt with the lead data
Parse the JSON response for intentScore, companyInfo, summary, and industry
Record the spend via recordSpend(), cache the enrichment, and make a routing decision
If ROUTE: write the lead to Marketo and update custom fields with enrichment data
If CLARIFY: create a lead session for future follow-up
If FALLBACK or on error: return without writing to Marketo
Expected output: A lead with intentScore: 0.92 routes to Marketo. A lead with intentScore: 0.55 creates a session. A lead with intentScore: 0.15 falls back silently.
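The parse-then-branch part of the pipeline can be sketched as two small pure functions. The names ParsedEnrichment, parseEnrichmentJson, and nextActionFor are hypothetical, and the sketch assumes that intentScore is a number between 0 and 1, that summary and industry are strings, and that companyInfo can be passed through without inspection. Anything that fails to parse falls back, matching the "on error" branch above.

```ts
interface ParsedEnrichment {
  intentScore: number;
  companyInfo: unknown; // shape not specified here, so it is left opaque
  summary: string;
  industry: string;
}

// Parse the model's JSON reply, returning null for anything malformed.
function parseEnrichmentJson(raw: string): ParsedEnrichment | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const record = data as Record<string, unknown>;
  const score = record["intentScore"];
  const summary = record["summary"];
  const industry = record["industry"];

  if (typeof score !== "number" || !Number.isFinite(score)) return null;
  if (score < 0 || score > 1) return null; // assumption: scores live in [0, 1]
  if (typeof summary !== "string" || typeof industry !== "string") return null;
  if (!("companyInfo" in record)) return null;

  return {
    intentScore: score,
    companyInfo: record["companyInfo"],
    summary,
    industry,
  };
}

type NextAction = "ROUTE" | "CLARIFY" | "FALLBACK";

// Map a parsed enrichment (or a parse failure) to the pipeline's next branch.
function nextActionFor(parsed: ParsedEnrichment | null): NextAction {
  if (parsed === null) return "FALLBACK";
  if (parsed.intentScore >= 0.8) return "ROUTE";
  if (parsed.intentScore >= 0.3) return "CLARIFY";
  return "FALLBACK";
}
```

Keeping these steps free of network calls makes each routing branch easy to unit test without mocking Mistral or Marketo.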
Step 11: Wire the Next.js API route
Create app/api/webhooks/marketo/route.ts — the public HTTP endpoint that Marketo calls:
```ts
import { type NextRequest, NextResponse } from "next/server";
import { handleMarketoWebhook } from "../../../../src/api/marketo/webhook.js";

export async function POST(req: NextRequest): Promise<NextResponse> {
  const rawBody = await req.text();
  const signature = req.headers.get("x-marketo-signature") ?? "";
  const result = await handleMarketoWebhook(rawBody, signature);
  return NextResponse.json(result.body, { status: result.status });
}

export function GET(_req: NextRequest): NextResponse {
  return NextResponse.json({ status: "ok", service: "marketo-webhook" });
}
```
The POST handler reads the raw body as text (required for HMAC verification — you need the exact byte sequence), extracts the signature from the x-marketo-signature header, and delegates to handleMarketoWebhook(). The GET handler provides a simple health check.
Note the use of NextRequest and NextResponse.json() rather than a bare Request or new Response(JSON.stringify(...)). The latter does not set the Content-Type: application/json header unless you add it yourself.
Expected output: curl -X POST http://localhost:3000/api/webhooks/marketo -H 'x-marketo-signature: <valid-hmac>' -d '{"id":"1","email":"test@example.com","source":"webinar","timestamp":"2026-06-15T00:00:00Z"}' returns 200 with the routing result.
Step 12: Create the barrel exports
Replace the placeholder src/index.ts with barrel re-exports:
```ts
export { handleMarketoWebhook } from "./api/marketo/webhook.js";
export { qualifyLead } from "./agents/qualifier.js";
export { createLeadSession, addLeadMessage } from "./lib/session.js";
export { getCachedEnrichment, setCachedEnrichment } from "./lib/cache.js";
export { makeRoutingDecision } from "./lib/router.js";
export { checkBudget, recordSpend, withCostTracking } from "./lib/cost.js";
```
Expected output: pnpm typecheck passes. The barrel exports let consuming code import everything from a single entry point.
Step 13: Run the tests
The test suite covers every module with mocked external dependencies. Create tests/setup.ts for MSW lifecycle:
Expected output: vitest reports numFailedTests: 0, with all four coverage metrics (statements, branches, functions, and lines) at 90% or higher. Tests cover the qualifier agent (15+ test cases including cache hit/miss, budget exhaustion, API errors, and each routing branch), the Marketo client (well-formed request shape, error handling, auth headers), session continuity (CRUD operations, filters, compression), LLM cache (hit/miss, round-trip, use-case segmentation), confidence router (all three thresholds), cost and budget (budget checks, spend recording, CostSpan tracking), and the Next.js route (valid/invalid signatures, malformed bodies, health check). A full integration test runs the entire webhook-to-Marketo pipeline with mocked HTTP.
Next steps
Add a notification system — integrate Slack or email so reps get pinged when a high-intent lead is routed to them
Deploy with telemetry — configure Langfuse traces to monitor Mistral latency, token usage, and routing distributions in production
Persist sessions — replace the in-memory storage adapter with Redis or PostgreSQL so sessions survive restarts
Scale the cache — swap InMemoryAdapter for a vector database (Pinecone or pgvector) to share cached enrichments across instances
Add fallback models — configure the budget engine to suggest a cheaper model (mistral-small-latest) when mistral-large-latest hits the soft cap
Implement retry logic — add exponential backoff to Mistral calls and Marketo API writes for transient network errors
```ts
if (decision.type === "CLARIFY")
  return wrapAgentResponse({
    action: "clarify",
    confidence: intentScore,
    reason: "Needs clarification — intent score " + String(intentScore),
    clarification_question: "Can you provide more details about your requirements?",
    enrichedLead,
  });
```