In this tutorial, you’ll build a Mistral AI Knowledge Agent that connects to PostHog’s product analytics API. Once finished, you can ask plain-English questions about your product data — like “How many signups this week?” — and get instant answers powered by Mistral’s LLM, with contextual follow-ups, semantic caching via Qdrant, and multi-turn conversation memory. This recipe is designed for SMB product teams who want product insights without writing SQL.
The project uses Next.js 16 (App Router), TypeScript, and six REAA packages that provide agent memory, LLM caching, session continuity, structured output repair, and confidence evaluation — all wired together by a central orchestrator.
Now install the runtime dependencies. These include Mistral’s SDK, Qdrant’s client, the fastembed library for vector embeddings, Langfuse for tracing, Zod for schema validation, and all six REAA packages.
Expected output: The package.json now lists all dependencies under "dependencies" and "devDependencies" with exact versions (no ^ or ~). Run pnpm install to generate the lockfile.
Step 2: Configure environment variables
Create your .env file from the example. These environment variables configure Mistral, PostHog, Qdrant, and Langfuse.
terminal
cp .env.example .env
The .env.example should contain these placeholder values (fill in your real keys in .env):
Expected output: A .env file populated with your credentials. The agent reads from process.env at runtime.
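Because the agent reads from process.env at runtime, it can help to validate the variables once at startup instead of failing deep inside a request. The following is a minimal sketch of that idea, not the recipe's own code. The three POSTHOG_* names appear in the PostHog client in Step 4; MISTRAL_API_KEY, QDRANT_URL, LANGFUSE_PUBLIC_KEY, and LANGFUSE_SECRET_KEY are assumed names and may differ from the real .env.example.

```typescript
// Hypothetical startup check. Only the POSTHOG_* names are confirmed by the
// tutorial; every other variable name here is an assumption.
type EnvSource = Record<string, string | undefined>;

interface AgentConfig {
  mistralApiKey: string;
  posthogApiKey: string;
  posthogProjectId: string;
  posthogBaseUrl: string;
  qdrantUrl: string;
  langfuseEnabled: boolean;
}

function loadConfig(env: EnvSource): AgentConfig {
  const required = ["MISTRAL_API_KEY", "POSTHOG_API_KEY", "POSTHOG_PROJECT_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    mistralApiKey: env.MISTRAL_API_KEY as string,
    posthogApiKey: env.POSTHOG_API_KEY as string,
    posthogProjectId: env.POSTHOG_PROJECT_ID as string,
    // Same fallback the PostHog client factory uses.
    posthogBaseUrl: env.POSTHOG_BASE_URL || "https://app.posthog.com",
    // 6333 is Qdrant's default HTTP port for a local instance.
    qdrantUrl: env.QDRANT_URL || "http://localhost:6333",
    // Tracing is optional: enabled only when both keys are present.
    langfuseEnabled: Boolean(env.LANGFUSE_PUBLIC_KEY && env.LANGFUSE_SECRET_KEY),
  };
}
```

Passing the environment in as an argument (for example loadConfig(process.env)) keeps the function easy to test with a plain object.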
Step 3: Define the shared types
Create src/types/index.ts — these interfaces model the request/response shapes, PostHog data, and the agent’s internal context. Notice the Memory type is imported directly from @reaatech/agent-memory-core rather than being hand-rolled.
Expected output: A clean TypeScript module with no type errors. These types are used across every service in the agent.
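As a rough illustration, the request and response shapes could look like the sketch below. The fields sessionId, message, reply, and confidence are taken from the sample request and response in Step 17; everything else, including the isChatRequest helper, is hypothetical and not part of the recipe's actual types file.

```typescript
// Shapes inferred from the sample curl exchange; not the recipe's real module.
type ConfidenceLevel = "high" | "medium" | "low";

interface ChatRequest {
  sessionId?: string; // omitted on the first turn of a conversation
  message: string;
}

interface ChatResponse {
  sessionId: string;
  reply: string;
  confidence: ConfidenceLevel;
}

// Hypothetical runtime guard for untrusted JSON request bodies.
function isChatRequest(value: unknown): value is ChatRequest {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const sessionOk =
    candidate.sessionId === undefined || typeof candidate.sessionId === "string";
  const messageOk =
    typeof candidate.message === "string" && candidate.message.trim().length > 0;
  return sessionOk && messageOk;
}
```

A guard like this lets the API route reject malformed bodies before any service is called.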
Step 4: Create the PostHog API client
Create src/api/posthog-client.ts. This class wraps PostHog’s REST API — it fetches events and funnels using fetchWithTimeout from @reaatech/agent-memory-core (which includes a default 10-second timeout). On HTTP errors, it throws a custom PostHogApiError with the status code and body.
ts
import { fetchWithTimeout } from "@reaatech/agent-memory-core";
import type { PostHogEvent, PostHogFunnel } from "../types/index.js";

// Thrown for any non-2xx PostHog response; carries the status and parsed body.
export class PostHogApiError extends Error {
  constructor(
    message: string,
    public readonly status: number,
    public readonly body: Record<string, unknown>,
  ) {
    super(message);
    this.name = "PostHogApiError";
  }
}

export class PostHogClient {
  constructor(
    private readonly config: {
      apiKey: string;
      projectId: string;
      baseUrl: string;
    },
  ) {}

  async getEvents(
    opts: { since?: string; limit?: number; eventName?: string } = {},
  ): Promise<PostHogEvent[]> {
    // Only set the filters the caller actually provided.
    const params = new URLSearchParams();
    if (opts.since) params.set("since", opts.since);
    if (opts.limit) params.set("limit", String(opts.limit));
    if (opts.eventName) params.set("event", opts.eventName);
    const url = `${this.config.baseUrl}/api/projects/${this.config.projectId}/events?${params.toString()}`;
    return this.request<PostHogEvent[]>(url);
  }

  async getFunnels(
    opts: { funnelId?: number; dateFrom?: string; dateTo?: string } = {},
  ): Promise<PostHogFunnel[]> {
    const params = new URLSearchParams();
    if (opts.funnelId) params.set("funnel_id", String(opts.funnelId));
    if (opts.dateFrom) params.set("date_from", opts.dateFrom);
    if (opts.dateTo) params.set("date_to", opts.dateTo);
    const queryStr = params.toString();
    const url = `${this.config.baseUrl}/api/projects/${this.config.projectId}/funnels${queryStr ? "?" + queryStr : ""}`;
    return this.request<PostHogFunnel[]>(url);
  }

  // Shared GET helper: adds auth headers and unwraps the `results` envelope.
  private async request<T>(url: string): Promise<T> {
    const response = await fetchWithTimeout(url, {
      headers: {
        Authorization: `Bearer ${this.config.apiKey}`,
        "Content-Type": "application/json",
      },
    });
    if (!response.ok) {
      // Fall back to the status text when the error body is not valid JSON.
      let body: Record<string, unknown> = {};
      try {
        body = (await response.json()) as Record<string, unknown>;
      } catch {
        body = { error: response.statusText };
      }
      throw new PostHogApiError(
        `PostHog API returned ${String(response.status)}`,
        response.status,
        body,
      );
    }
    const data = (await response.json()) as { results?: T };
    return data.results as T;
  }
}

// Factory that builds a client from environment variables.
export function createPostHogClient(): PostHogClient {
  return new PostHogClient({
    apiKey: process.env.POSTHOG_API_KEY ?? "",
    projectId: process.env.POSTHOG_PROJECT_ID ?? "",
    baseUrl: process.env.POSTHOG_BASE_URL ?? "https://app.posthog.com",
  });
}
Expected output: A PostHog client that can fetch events and funnels. The factory function createPostHogClient() reads env vars so you don’t need to configure it manually.
Step 5: Set up the embedding service
Create src/services/embedding-service.ts. This wraps fastembed’s FlagEmbedding to convert text into vector embeddings. It uses the BAAI/bge-small-en-v1.5 model (384 dimensions), which is the recommended default.
Expected output: An EmbeddingService that lazily initializes the ONNX embedding model on first use. embedQuery returns a 384-dimensional vector; embedBatch yields batches for bulk processing.
Step 6: Wire up memory storage
Create src/services/memory-service.ts. This service uses the REAA InMemoryMemoryStorage to store conversation memories as vector-annotated Memory objects. It also re-exports cosineSimilarity from @reaatech/agent-memory-core for any manual similarity computations.
Expected output: An in-memory store that treats each user query as an episodic memory with its embedding vector and full metadata. searchSimilar finds related past interactions for contextual follow-ups.
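To make the similarity search concrete, here is a self-contained sketch of ranking stored memories by cosine similarity. It is a stand-in for illustration only: the recipe relies on InMemoryMemoryStorage and cosineSimilarity from @reaatech/agent-memory-core, whose exact signatures are not shown here, so the StoredMemory shape and both function names below are assumptions.

```typescript
// Illustrative stand-in; not the @reaatech/agent-memory-core API.
interface StoredMemory {
  id: string;
  content: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every memory against the query and returns the best `limit` matches.
function searchSimilar(
  memories: StoredMemory[],
  query: number[],
  limit = 3,
): Array<{ memory: StoredMemory; score: number }> {
  return memories
    .map((memory) => ({ memory, score: cosine(memory.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A linear scan like this is fine for a small in-process store; a vector database becomes worthwhile once the number of memories grows.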
Step 7: Build the session service
Create src/services/session-service.ts. This wraps MemoryAdapter from @reaatech/session-continuity-storage-memory to manage conversation state across refreshes. Each session stores a message history that the agent reads during multi-turn conversations.
Expected output: A session service that creates sessions, stores messages, and retrieves conversation history. The ttlMs default of 1 hour keeps sessions from growing unbounded.
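The session behaviour described above can be sketched without any dependencies. The class below is a hypothetical stand-in, not the MemoryAdapter API from @reaatech/session-continuity-storage-memory. It assumes a one-hour default TTL as stated in the text, and it additionally assumes that each new message extends the expiry, which the source does not confirm.

```typescript
// Hypothetical in-memory session store; names and behaviour are assumptions.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface SessionRecord {
  id: string;
  messages: ChatMessage[];
  expiresAt: number;
}

class SimpleSessionStore {
  private sessions: { [id: string]: SessionRecord } = {};
  private counter = 0;

  constructor(
    private ttlMs: number = 60 * 60 * 1000, // 1 hour
    private now: () => number = () => Date.now(), // injectable clock for tests
  ) {}

  // Returns the live session for `id`, or creates a fresh one.
  resolve(id?: string): SessionRecord {
    const existing = id ? this.sessions[id] : undefined;
    if (existing && existing.expiresAt > this.now()) return existing;
    if (id) delete this.sessions[id]; // drop the expired record, if any
    this.counter += 1;
    const created: SessionRecord = {
      id: `session-${this.counter}`,
      messages: [],
      expiresAt: this.now() + this.ttlMs,
    };
    this.sessions[created.id] = created;
    return created;
  }

  append(id: string, message: ChatMessage): void {
    const session = this.sessions[id];
    if (!session || session.expiresAt <= this.now()) {
      throw new Error(`Unknown or expired session: ${id}`);
    }
    session.messages.push(message);
    session.expiresAt = this.now() + this.ttlMs; // assumed sliding expiry
  }

  // Returns a copy so callers cannot mutate stored history.
  history(id: string): ChatMessage[] {
    const session = this.sessions[id];
    if (!session || session.expiresAt <= this.now()) return [];
    return session.messages.slice();
  }
}
```

A production version would use random identifiers (the sample response shows a UUID) rather than a counter.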
Step 8: Create the LLM cache with Qdrant
Create src/services/cache-service.ts. This uses QdrantAdapter from @reaatech/llm-cache-adapters-qdrant to store and retrieve LLM responses by semantic similarity. When the user asks a question similar to one already answered, the cached response is returned directly — saving a Mistral API call.
Expected output: A cache service backed by Qdrant’s vector search. connect() auto-creates the collection and payload indexes. findSimilar returns cached entries above a similarity threshold (0.85 by default).
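The threshold logic can be shown separately from the vector search itself. The sketch below assumes the vector store has already returned candidates with similarity scores, and only decides whether any of them counts as a hit. ScoredCacheEntry and pickCacheHit are hypothetical names, not part of @reaatech/llm-cache-adapters-qdrant.

```typescript
// Hypothetical helper: picks the best cached answer at or above a threshold.
interface ScoredCacheEntry {
  prompt: string;
  response: string;
  score: number; // similarity reported by the vector store, 0..1
}

function pickCacheHit(
  candidates: ScoredCacheEntry[],
  threshold = 0.85,
): ScoredCacheEntry | undefined {
  let best: ScoredCacheEntry | undefined;
  for (const candidate of candidates) {
    const qualifies = candidate.score >= threshold;
    if (qualifies && (best === undefined || candidate.score > best.score)) {
      best = candidate;
    }
  }
  return best; // undefined means a cache miss, so the LLM must be called
}
```

Taking the threshold as a parameter means a caller can apply a stricter cutoff than the default, such as the 0.90 that the orchestrator in Step 13 uses for an immediate return.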
Step 9: Build the Mistral LLM service
Create src/services/llm-service.ts. This wraps Mistral’s SDK using the named Mistral import rather than a default import, because the SDK exposes its client class as a named export. The key method is chat.complete(), not the OpenAI-shaped chat.completions.create().
The chatWithContext method builds a system prompt that includes recent PostHog events and past conversation memories, then sends it to Mistral.
ts
import { Mistral } from "@mistralai/mistralai";
import type { AgentContext, AgentResult } from "../types/index.js";

interface MistralSdkError {
  message?: string;
  statusCode?: number;
  body?: string;
}

export class LLMServiceError extends Error {
  constructor(
    message: string,
    public readonly statusCode?: number,
    public readonly body?: string,
  ) {
    super(message);
    this.name
Expected output: An LLM service that calls Mistral’s chat.complete() with a rich system prompt. chatWithContext injects PostHog events and past memories as context so Mistral can answer based on your actual product data.
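The context-injection step can be illustrated on its own. The sketch below builds a system and user message pair from a question, a list of events, and a list of remembered snippets. The EventSummary shape, the prompt wording, and the maxEvents cap are all assumptions made for illustration; the recipe's real chatWithContext may format things differently.

```typescript
// Hypothetical prompt builder; wording and shapes are illustrative only.
interface EventSummary {
  event: string;
  timestamp: string;
}

interface PromptMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildMessages(
  question: string,
  events: EventSummary[],
  memories: string[],
  maxEvents = 20, // cap the context so the prompt stays bounded
): PromptMessage[] {
  const eventLines = events
    .slice(0, maxEvents)
    .map((e) => `- ${e.timestamp} ${e.event}`);
  const memoryLines = memories.map((m) => `- ${m}`);
  const system = [
    "You are a product analytics assistant. Answer only from the data below.",
    "Recent PostHog events:",
    eventLines.length > 0 ? eventLines.join("\n") : "(none available)",
    "Relevant past conversation:",
    memoryLines.length > 0 ? memoryLines.join("\n") : "(none)",
  ].join("\n");
  return [
    { role: "system", content: system },
    { role: "user", content: question },
  ];
}
```

The resulting array has the role and content shape that chat.complete() accepts as its messages field, alongside a model name of your choice.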
Step 10: Add structured output repair
Create src/services/repair-service.ts. Mistral sometimes returns malformed JSON or markdown-fenced code blocks. The @reaatech/structured-repair-core package fixes these automatically — it extracts valid JSON, strips fences, and parses structured data from noisy output.
Expected output: A repair service that can fix malformed Mistral output. The safeRepair wrapper ensures the agent never crashes on unrepairable output — it logs the failure and returns the raw text.
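To show the kind of cleanup involved, here is a dependency-free sketch that strips a markdown fence, cuts out the outermost JSON span, and falls back to the raw text when parsing fails. It is a simplified illustration of the idea, not the algorithm used by @reaatech/structured-repair-core, and the names stripFences, extractJsonSlice, and RepairResult are hypothetical.

```typescript
// Simplified illustration; the real package is likely more thorough.
type RepairResult = { ok: true; value: unknown } | { ok: false; raw: string };

// Three backticks, written as escapes so this listing can itself sit in a fence.
const FENCE = "\u0060\u0060\u0060";

// Returns the body of the first fenced block, or the trimmed input if none.
function stripFences(text: string): string {
  const pattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");
  const match = text.match(pattern);
  const inner = match && match[1] !== undefined ? match[1] : text;
  return inner.trim();
}

// Cuts from the first "{" or "[" to the last matching closing character.
function extractJsonSlice(text: string): string | undefined {
  const start = text.search(/[{[]/);
  if (start === -1) return undefined;
  const close = text.charAt(start) === "{" ? "}" : "]";
  const end = text.lastIndexOf(close);
  return end > start ? text.slice(start, end + 1) : undefined;
}

// Never throws: unrepairable output comes back as { ok: false, raw }.
function safeRepair(raw: string): RepairResult {
  const candidate = extractJsonSlice(stripFences(raw));
  if (candidate === undefined) return { ok: false, raw };
  try {
    return { ok: true, value: JSON.parse(candidate) };
  } catch {
    return { ok: false, raw };
  }
}
```

Returning a tagged result instead of throwing keeps the decision about what to do with unrepairable output in the caller's hands.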
Step 11: Wire up confidence evaluation
Create src/services/confidence-service.ts. This uses ConfidenceRouter from @reaatech/confidence-router to decide whether the agent’s response is confident enough to route to the user, needs a fallback, or should clarify. A heuristic score based on reply length feeds the router’s decision.
Expected output: A confidence service that scores each response. Long, substantive replies score high (ROUTE), short or empty replies score low (FALLBACK). Clarification mode is disabled — the agent handles ambiguity by re-querying PostHog with more context.
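A length-based heuristic of this kind can be sketched in a few lines. The saturation length of 200 characters and the 0.5 routing threshold below are assumed values chosen for illustration, and lengthScore and decide are hypothetical helpers rather than the ConfidenceRouter API.

```typescript
// Hypothetical heuristic; the numeric thresholds are assumptions.
type RoutingDecision = "ROUTE" | "FALLBACK";

// Maps reply length onto 0..1, saturating at `fullScoreLength` characters.
function lengthScore(reply: string, fullScoreLength = 200): number {
  const length = reply.trim().length;
  return Math.min(1, length / fullScoreLength);
}

// Long, substantive replies are routed; short or empty ones fall back.
function decide(reply: string, routeThreshold = 0.5): RoutingDecision {
  return lengthScore(reply) >= routeThreshold ? "ROUTE" : "FALLBACK";
}
```

Length is a crude proxy for answer quality, so a score like this is best treated as a starting point that can later be combined with other signals.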
Step 12: Build the Langfuse tracer
Create src/lib/langfuse.ts. This initializes Langfuse for observability. If credentials are missing, it returns no-op stubs so the agent works offline without any changes.
Expected output: A singleton Langfuse client that gracefully degrades to no-ops when credentials are absent. The orchestrator uses langfuse.log() to trace every query.
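The graceful-degradation pattern looks roughly like the sketch below. The Tracer interface and the createReal callback are hypothetical; createReal stands in for whatever code constructs the actual Langfuse-backed tracer, which is not reproduced here.

```typescript
// Hypothetical tracer wrapper illustrating the no-op fallback.
interface Tracer {
  enabled: boolean;
  log(name: string, data: Record<string, unknown>): void;
  flush(): void;
}

interface TracerCredentials {
  publicKey?: string;
  secretKey?: string;
}

// Does nothing, so callers never need to check whether tracing is configured.
const noopTracer: Tracer = {
  enabled: false,
  log() {},
  flush() {},
};

function createTracer(
  creds: TracerCredentials,
  createReal: (c: Required<TracerCredentials>) => Tracer,
): Tracer {
  if (!creds.publicKey || !creds.secretKey) return noopTracer;
  return createReal({ publicKey: creds.publicKey, secretKey: creds.secretKey });
}
```

Because both branches return the same interface, the orchestrator can call log() unconditionally.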
Step 13: Write the Knowledge Agent orchestrator
Create src/services/knowledge-agent.ts. This is the central orchestrator that wires all eight services together. Its processMessage method follows a 10-step pipeline:
Resolve or create a session
Embed the user’s message
Check the semantic cache (return immediately on >= 0.90 similarity, which is stricter than the cache service's 0.85 default from Step 8)
Search past memories by vector similarity
Fetch PostHog events and funnels (best-effort — failures produce empty arrays)
Call Mistral with the enriched context
Repair the LLM output (best-effort — falls through on failure)
Evaluate confidence with the router
Persist to memory, cache, and session history
Log everything to Langfuse
ts
import type { EmbeddingService } from "./embedding-service.js";
import type { MemoryService } from "./memory-service.js";
import type { SessionService } from "./session-service.js";
import type { LLMCacheService } from "./cache-service.js";
import type { LLMService } from "./llm-service.js";
import type { RepairService } from "./repair-service.js";
import type { ConfidenceService } from "./confidence-service.js";
import type { PostHogClient } from "../api/posthog-client.js";
import type { ChatResponse, AgentContext } from "../types/index.js";
import
Expected output: A KnowledgeAgent class with a single processMessage method that handles the full pipeline. The factory function createKnowledgeAgent() wires all services and handles lazy initialization.
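The control flow of the pipeline, in particular the cache short-circuit and the two best-effort steps, can be sketched with injected dependencies. This is a reduced illustration under stated assumptions: the PipelineDeps interface is hypothetical, and session handling, memory search, confidence evaluation, and tracing are left out to keep it short.

```typescript
// Reduced sketch of the pipeline; all dependency shapes are hypothetical.
interface PipelineDeps {
  embed(text: string): Promise<number[]>;
  findCached(vector: number[], threshold: number): Promise<string | undefined>;
  fetchEvents(): Promise<string[]>;
  askLlm(question: string, events: string[]): Promise<string>;
  repair(raw: string): string;
  persist(question: string, vector: number[], reply: string): Promise<void>;
}

interface PipelineResult {
  reply: string;
  fromCache: boolean;
}

async function processMessage(
  deps: PipelineDeps,
  message: string,
): Promise<PipelineResult> {
  const vector = await deps.embed(message);

  // Cache short-circuit: a close enough match skips the LLM call entirely.
  const cached = await deps.findCached(vector, 0.9);
  if (cached !== undefined) return { reply: cached, fromCache: true };

  // Best-effort analytics fetch: a failure degrades to an empty context.
  let events: string[] = [];
  try {
    events = await deps.fetchEvents();
  } catch {
    events = [];
  }

  const raw = await deps.askLlm(message, events);

  // Best-effort repair: fall through to the raw text on failure.
  let reply = raw;
  try {
    reply = deps.repair(raw);
  } catch {
    reply = raw;
  }

  await deps.persist(message, vector, reply);
  return { reply, fromCache: false };
}
```

Injecting the dependencies as an interface makes each branch easy to exercise in tests with small fakes.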
Step 14: Create the Next.js API route
Create app/api/chat/route.ts — the HTTP interface for the agent. It follows Next.js 16 App Router conventions: named exports for each HTTP verb (POST, GET, OPTIONS), NextRequest/NextResponse from next/server, and NextResponse.json() for JSON responses.
Expected output: A fully functional API endpoint at POST /api/chat. The agent is created as a lazy singleton so the first request pays the initialization cost (embedding model, Qdrant connection) and subsequent requests reuse it.
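The lazy-singleton behaviour can be captured in a small generic helper. The sketch below is one way to do it and is not taken from the recipe's route file; lazySingleton is a hypothetical name, and the retry-after-failure behaviour is an added assumption.

```typescript
// Returns a getter that runs `factory` once and reuses the resulting promise.
function lazySingleton<T>(factory: () => Promise<T>): () => Promise<T> {
  let instance: Promise<T> | undefined;
  return () => {
    if (instance === undefined) {
      instance = factory().catch((error) => {
        // Assumed behaviour: clear the slot so a later request can retry.
        instance = undefined;
        throw error;
      });
    }
    return instance;
  };
}
```

In the route module, a getter built this way around createKnowledgeAgent() could be awaited inside the POST handler. Caching the promise rather than the resolved value means concurrent first requests share a single initialization.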
Step 15: Update the public exports
Replace the scaffold placeholder in src/index.ts with re-exports of every public symbol. Notice the .js extensions — required by TypeScript’s NodeNext module resolution.
ts
export { KnowledgeAgent, createKnowledgeAgent } from "./services/knowledge-agent.js";
export { PostHogClient, PostHogApiError, createPostHogClient } from "./api/posthog-client.js";
export { EmbeddingService, EmbeddingError } from "./services/embedding-service.js";
export { MemoryService } from "./services/memory-service.js";
export { SessionService } from "./services/session-service.js";
export { LLMCacheService } from "./services/cache-service.js";
export { LLMService, LLMServiceError } from "./services/llm-service.js";
export { RepairService, ChatResponseSchema } from "./services/repair-service.js";
export { ConfidenceService } from "./services/confidence-service.js";
export type {
  ChatRequest,
  ChatResponse,
  PostHogEvent,
  PostHogFunnel,
  AgentContext,
  AgentResult,
  Memory,
} from "./types/index.js";
Expected output: Consumers can import any service, class, or type from the package root.
Step 16: Run the tests
Create a test setup at tests/setup.ts that uses MSW (Mock Service Worker) to intercept every external HTTP call: Mistral, PostHog, and Qdrant. This keeps tests fast and deterministic. Then run the suite with coverage:
terminal
pnpm vitest run --coverage --reporter=json --outputFile=vitest-report.json
Expected output: All 82 tests pass (numFailedTests: 0), with code coverage at 90% or higher across lines, branches, functions, and statements.
Step 17: Try the recipe
Start the dev server and send a query.
terminal
pnpm dev
In another terminal:
terminal
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How many page views happened in the last 7 days?"}'
Expected output: A JSON response like {"sessionId":"<uuid>","reply":"...","confidence":"high"}. The agent embedded your query, searched past memories, fetched PostHog events, asked Mistral, repaired the output, evaluated confidence, cached the result, and stored the conversation — all in a single request.
Send a follow-up with the same session ID to see multi-turn context in action:
terminal
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"sessionId": "<uuid-from-first-response>", "message": "How does that compare to last week?"}'
The agent will retrieve the session history and past memories, then answer in context.
Next steps
Add user authentication: Replace the "anonymous" user ID with real user identities from your auth system (NextAuth, Clerk, etc.)
Deploy Qdrant in production: Switch from local Qdrant to a hosted Qdrant Cloud cluster for persistent cache and memory storage across server restarts
Build a frontend chat UI: Create a chat interface at app/page.tsx that sends messages to /api/chat and displays responses with confidence badges (high, medium, low)
Add PostHog trend/funnel charts: Extend the agent to return structured trend data that a chart component can render inline
Implement cache TTL policies: Use cacheService.invalidateOlderThan(24) in a cron job to refresh stale cache entries daily
Wire up the full Langfuse trace: Replace the simple log wrapper with full Langfuse traces, spans, and generations for detailed observability