OpenAI Knowledge Agent for Confluence SMB Internal Wiki

A natural‑language Q&A bot that indexes Confluence spaces and delivers instant answers to employee questions.

openai knowledge-agent confluence rag nextjs qdrant reaatech

The problem

Small businesses store SOPs, policies, and tribal knowledge in Confluence, but employees waste hours searching across spaces. The built‑in search is keyword‑based and misses the context of real questions.

Intro

This tutorial walks you through building an OpenAI-powered knowledge agent that answers questions about your Confluence wiki content. You’ll create a Next.js API that crawls Confluence spaces, converts pages to Markdown, embeds them into a Qdrant vector store, and responds to natural-language questions using OpenAI’s GPT models — with semantic caching, session continuity, confidence-based routing, and Langfuse observability along the way.
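The semantic caching mentioned above can be pictured as a nearest-neighbour lookup over questions that have already been answered. The sketch below is a dependency-free illustration of that idea, not the API of the llm-cache package: the `SemanticCache` class, the 0.92 threshold, and the toy three-dimensional vectors are all assumptions made for the example.

```typescript
// Hypothetical illustration of semantic caching; not the llm-cache package API.
interface CacheEntry {
  embedding: number[];
  answer: string;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class SemanticCache {
  private entries: CacheEntry[] = [];
  private readonly threshold: number;

  // The default threshold is an assumed value; tune it against real traffic.
  constructor(threshold = 0.92) {
    this.threshold = threshold;
  }

  set(embedding: number[], answer: string): void {
    this.entries.push({ embedding, answer });
  }

  // Return the cached answer whose question embedding is closest to the
  // query, provided the similarity clears the threshold; otherwise null.
  get(embedding: number[]): string | null {
    let best: CacheEntry | null = null;
    let bestScore = -Infinity;
    for (const entry of this.entries) {
      const score = cosineSimilarity(entry.embedding, embedding);
      if (score > bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best !== null && bestScore >= this.threshold ? best.answer : null;
  }
}
```

A linear scan like this is only reasonable for small caches; a production cache would more likely store the question embeddings in the vector store itself.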

It uses the REAA stack (confidence-router, llm-cache, session-continuity, agent-memory-core, agent-handoff, agents-markdown) plus Qdrant for vector search, Zod for config validation, and MSW for testing.
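To make confidence-based routing concrete, here is a minimal sketch of the decision it implies: the top retrieval score decides whether the agent answers, asks a follow-up, or hands off. It does not reproduce the confidence-router package's API; the `Route` names, `routeByConfidence`, and the 0.8 / 0.5 thresholds are assumptions chosen for illustration.

```typescript
// Hypothetical sketch of confidence-based routing; not the confidence-router API.
type Route = "answer" | "clarify" | "handoff";

interface RoutingThresholds {
  answer: number; // at or above: answer directly from the retrieved context
  clarify: number; // at or above (but below `answer`): ask a follow-up question
}

// Assumed defaults, for illustration only.
const DEFAULT_THRESHOLDS: RoutingThresholds = { answer: 0.8, clarify: 0.5 };

function routeByConfidence(
  topScore: number | undefined,
  thresholds: RoutingThresholds = DEFAULT_THRESHOLDS,
): Route {
  if (topScore === undefined) return "handoff"; // nothing was retrieved
  if (topScore >= thresholds.answer) return "answer";
  if (topScore >= thresholds.clarify) return "clarify";
  return "handoff";
}
```

Keeping the thresholds in one object makes it straightforward to tune them per Confluence space later.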

This tutorial is for TypeScript developers familiar with Next.js App Router and basic RAG concepts.

Prerequisites

Node.js 22+ and pnpm 10
An OpenAI API key with access to gpt-5.2 and text-embedding-3-small
A Confluence instance (Cloud or Server) with an API token
A Qdrant instance (local via Docker, or cloud at cloud.qdrant.io)
A Langfuse account (optional — observability is a no-op when credentials are absent)
A Next.js App Router project with the tutorial's dependencies already installed
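The credentials listed above end up in a typed config object; the code later in this tutorial reads `config.confluenceBaseUrl`, `config.confluenceUsername`, `config.confluenceApiToken`, and `config.confluenceSpaceKeys`. The tutorial validates these with Zod. The sketch below is a dependency-free stand-in that shows the same fail-fast idea, and the environment variable names in it are assumptions.

```typescript
// Dependency-free stand-in for a Zod-validated config.
// The environment variable names below are assumptions.
interface AppConfig {
  confluenceBaseUrl: string;
  confluenceUsername: string;
  confluenceApiToken: string;
  confluenceSpaceKeys: string[];
}

type Env = Record<string, string | undefined>;

// Read a required variable, failing fast with a clear message when absent.
function requireVar(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  return {
    // Strip trailing slashes so API paths can be appended safely.
    confluenceBaseUrl: requireVar(env, "CONFLUENCE_BASE_URL").replace(/\/+$/, ""),
    confluenceUsername: requireVar(env, "CONFLUENCE_USERNAME"),
    confluenceApiToken: requireVar(env, "CONFLUENCE_API_TOKEN"),
    // Comma-separated list of space keys, e.g. "HR,ENG".
    confluenceSpaceKeys: requireVar(env, "CONFLUENCE_SPACE_KEYS")
      .split(",")
      .map((key) => key.trim())
      .filter((key) => key.length > 0),
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`, so a missing credential surfaces before the first request rather than in the middle of a crawl.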

Example artifact

A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.

145 kB · 121 tests · 99.7% coverage · vitest passing

SHA-256: 68d42692f0a2ff5bba3489360dabd0e923fc494b9285819e1f25c6392f12905c

// src/lib/confluence-client.ts
import { NodeHtmlMarkdown } from "node-html-markdown";
import { config } from "./config.js";

const auth = Buffer.from(
  `${config.confluenceUsername}:${config.confluenceApiToken}`,
).toString("base64");

export interface ConfluencePage {
  id: string;
  title: string;
  spaceKey: string;
  body: string;
}

export class ConfluenceAuthError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ConfluenceAuthError";
  }
}

const nhm = new NodeHtmlMarkdown();

export function htmlToMarkdown(html: string): string {
  return nhm.translate(html);
}

export async function fetchAllPages(
  spaceKeys: string[],
): Promise<ConfluencePage[]> {
  const pages: ConfluencePage[] = [];

  for (const spaceKey of spaceKeys) {
    let nextUrl: string | null =
      `${config.confluenceBaseUrl}/rest/api/content?spaceKey=${spaceKey}&expand=body.storage&limit=50`;

    while (nextUrl) {
      const response = await fetch(nextUrl, {
        headers: {
          Authorization: `Basic ${auth}`,
          Accept: "application/json",
        },
      });

      if (response.status === 401 || response.status === 403) {
        throw new ConfluenceAuthError(
          `Authentication failed for Confluence at ${config.confluenceBaseUrl}`,
        );
      }

      if (!response.ok) {
        throw new Error(
          `Confluence API error: ${String(response.status)} ${response.statusText}`,
        );
      }

      const data = (await response.json()) as {
        results: Array<{
          id: string;
          title: string;
          space?: { key: string };
          body?: { storage?: { value: string } };
        }>;
        _links?: { next?: string };
      };

      for (const result of data.results) {
        pages.push({
          id: result.id,
          title: result.title,
          spaceKey,
          body: result.body?.storage?.value ?? "",
        });
      }

      nextUrl = data._links?.next
        ? `${config.confluenceBaseUrl}${data._links.next}`
        : null;
    }
  }

  return pages;
}
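The crawler above keeps requesting pages until the response no longer carries `_links.next`. The sketch below isolates that loop behind an injected fetch function so it can be exercised without a network. `collectPaginated`, `PageFetcher`, and the `maxPages` safety cap are hypothetical names introduced for this example and are not part of the tutorial's code.

```typescript
// Hypothetical helper isolating the `_links.next` pagination loop.
interface PagedResponse<T> {
  results: T[];
  _links?: { next?: string | null };
}

// Fetches one page of results for a full URL; injected so tests can stub it.
type PageFetcher<T> = (url: string) => Promise<PagedResponse<T>>;

async function collectPaginated<T>(
  baseUrl: string,
  firstPath: string,
  fetchPage: PageFetcher<T>,
  maxPages = 1000, // assumed cap against a server that never stops paging
): Promise<T[]> {
  const items: T[] = [];
  let nextUrl: string | null = `${baseUrl}${firstPath}`;
  let pagesFetched = 0;

  while (nextUrl !== null && pagesFetched < maxPages) {
    const page: PagedResponse<T> = await fetchPage(nextUrl);
    items.push(...page.results);
    pagesFetched++;
    // `_links.next` is a path relative to the base URL; it is absent on the last page.
    nextUrl = page._links?.next ? `${baseUrl}${page._links.next}` : null;
  }

  return items;
}
```

Injecting the fetcher keeps the loop independent of authentication and error handling, which can then be tested separately.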

// src/lib/session-store.ts
import {
  type IStorageAdapter,
  type TokenCounter,
  type Session,
  type Message,
  type UpdateSessionOptions,
  ConcurrencyError,
} from "@reaatech/session-continuity";

export class InMemoryStorageAdapter implements IStorageAdapter {
  private sessions: Map<string, Session> = new Map();
  private messages: Map<string, Message[]> = new Map();

  createSession(
    session: Omit<Session, "id" | "createdAt" | "lastActivityAt">,
  ): Promise<Session> {
    const id = crypto.randomUUID();
    const now = new Date();
    const newSession: Session = {
      ...session,
      id,
      createdAt: now,
      lastActivityAt: now,
      version: 1,
    };
    this.sessions.set(id, newSession);
    this.messages.set(id, []);
    return Promise.resolve(newSession);
  }

  getSession(id: string): Promise<Session | null> {
    return Promise.resolve(this.sessions.get(id) ?? null);
  }

  updateSession(
    id: string,
    updates: Partial<Session>,
    options?: UpdateSessionOptions,
  ): Promise<Session> {
    const existing = this.sessions.get(id);
    if (!existing) {
      throw new Error(`Session not found: ${id}`);
    }
    if (
      options?.expectedVersion !== undefined &&
      existing.version !== undefined &&
      existing.version !== options.expectedVersion
    ) {
      throw new ConcurrencyError(
        id,
        options.expectedVersion,
        existing.version,
      );
    }
    const updated: Session = {
      ...existing,
      ...updates,
      id: existing.id,
      createdAt: existing.createdAt,
      version: (existing.version ?? 0) + 1,
    };
    this.sessions.set(id, updated);
    return Promise.resolve(updated);
  }

  // ... deleteSession, listSessions, addMessage, getMessages,
  // updateMessage, deleteMessage, deleteAllMessages,
  // getExpiredSessions, health, close
}

export class SimpleTokenCounter implements TokenCounter {
  readonly model = "simple";
  readonly tokenizer = "character-count";

  count(text: string): number {
    return Math.ceil(text.length / 4);
  }

  countMessages(messages: Message[]): number {
    let total = 0;
    for (const msg of messages) {
      if (typeof msg.content === "string") {
        total += this.count(msg.content);
      } else if (Array.isArray(msg.content)) {
        for (const part of msg.content) {
          if (part.type === "text") {
            total += this.count(part.text);
          }
        }
      }
    }
    return total;
  }
}
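`SimpleTokenCounter` estimates one token per four characters. One plausible use for such an estimate is trimming conversation history to a token budget before it is sent to the model. The sketch below shows that idea with a local `ChatMessage` type and a `trimToBudget` helper; both are hypothetical stand-ins and not part of `@reaatech/session-continuity`.

```typescript
// Local stand-in type; the real Message type comes from @reaatech/session-continuity.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Same heuristic as SimpleTokenCounter: roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages whose combined estimate fits the budget.
// Walks backwards from the newest message and stops at the first overflow.
function trimToBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]); // preserve chronological order
    used += cost;
  }
  return kept;
}
```

A character-based estimate can be noticeably off for code or non-English text, so leaving some headroom below the model's real context limit is a sensible precaution.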

// src/jobs/ingest.ts
import { fetchAllPages, htmlToMarkdown } from "../lib/confluence-client.js";
import { validatePageContent } from "../lib/markdown-validator.js";
import { chunkDocument } from "../lib/chunker.js";
import { generateEmbedding } from "../lib/openai-client.js";
import {
  ensureCollection,
  upsertChunks,
  createQdrantClient,
} from "../lib/vector-store.js";
import { config } from "../lib/config.js";

export async function runIngestion(): Promise<{
  pagesProcessed: number;
  chunksStored: number;
  errors: string[];
}> {
  const qdrant = createQdrantClient();
  const errors: string[] = [];
  let pagesProcessed = 0;
  let chunksStored = 0;

  try {
    // 1536 is the vector size of text-embedding-3-small.
    await ensureCollection(qdrant, "confluence_pages", 1536);
    const pages = await fetchAllPages(config.confluenceSpaceKeys);

    for (const page of pages) {
      try {
        const markdown = htmlToMarkdown(page.body);
        const validation = validatePageContent(markdown, page.id);
        if (!validation.valid) {
          errors.push(`Page ${page.id} skipped: invalid markdown content`);
          continue;
        }

        const chunks = chunkDocument(markdown, 512);
        if (chunks.length === 0) {
          continue;
        }

        // Track the chunks that were embedded successfully so that chunks and
        // embeddings stay index-aligned even when one embedding call fails.
        const embeddedChunks: typeof chunks = [];
        const embeddings: number[][] = [];
        for (const chunk of chunks) {
          try {
            const embedding = await generateEmbedding(chunk);
            embeddedChunks.push(chunk);
            embeddings.push(embedding);
          } catch (error) {
            const message =
              error instanceof Error ? error.message : "Embedding failed";
            errors.push(`Page ${page.id} chunk embedding failed: ${message}`);
          }
        }

        if (embeddings.length > 0) {
          await upsertChunks(
            qdrant,
            "confluence_pages",
            embeddedChunks,
            embeddings,
            page.id,
          );
          chunksStored += embeddings.length;
        }
        pagesProcessed++;
      } catch (error) {
        // A failure on one page is recorded and does not stop the run.
        const message = error instanceof Error ? error.message : "Unknown error";
        errors.push(`Page ${page.id} failed: ${message}`);
      }
    }
  } catch (error) {
    const message = error instanceof Error ? error.message : "Unknown error";
    errors.push(`Ingestion pipeline failed: ${message}`);
  }

  return { pagesProcessed, chunksStored, errors };
}
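The ingestion job calls `chunkDocument(markdown, 512)`, whose implementation is not shown in this section. One common approach is to pack whole paragraphs into chunks until an approximate token limit is reached. The sketch below illustrates that approach under stated assumptions: `chunkByParagraph` is a hypothetical name, tokens are approximated as four characters each, and the tutorial's actual chunker may behave differently.

```typescript
// Hypothetical paragraph-packing chunker; the tutorial's chunkDocument may differ.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4); // assumed ~4 characters per token
}

function chunkByParagraph(markdown: string, maxTokens: number): string[] {
  const maxChars = maxTokens * 4;
  const paragraphs = markdown
    .split(/\n\s*\n/) // blank lines separate paragraphs
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    // Hard-split any paragraph that exceeds the limit on its own.
    if (approxTokens(paragraph) > maxTokens) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      for (let i = 0; i < paragraph.length; i += maxChars) {
        chunks.push(paragraph.slice(i, i + maxChars));
      }
      continue;
    }

    // Otherwise append to the current chunk, or start a new one if it would overflow.
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (approxTokens(candidate) > maxTokens) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on paragraph boundaries keeps headings and their following text together more often than a fixed-width split, at the cost of uneven chunk sizes.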

// tests/setup.ts
import { beforeAll, afterEach, afterAll } from "vitest";
import { setupServer } from "msw/node";
import { http, HttpResponse } from "msw";

export const handlers = [
  http.post("https://api.openai.com/v1/responses", () => {
    return HttpResponse.json({
      id: "resp_test",
      model: "gpt-5.2",
      output: [
        {
          type: "message",
          role: "assistant",
          content: [{ type: "output_text", text: "mocked answer" }],
        },
      ],
      output_text: "mocked answer",
      usage: { input_tokens: 10, output_tokens: 5, total_tokens: 15 },
    });
  }),

  http.post("https://api.openai.com/v1/embeddings", () => {
    return HttpResponse.json({
      data: [{ embedding: Array(1536).fill(0.1) }],
      model: "text-embedding-3-small",
      usage: { prompt_tokens: 5, total_tokens: 5 },
    });
  }),

  http.get(/\/rest\/api\/content/, ({ request }) => {
    const url = new URL(request.url);
    const spaceKey = url.searchParams.get("spaceKey");
    return HttpResponse.json({
      results: [
        {
          id: "page1",
          title: `Test Page ${spaceKey ?? ""}`,
          body: { storage: { value: "<p>Hello World</p>" } },
          _links: { self: "/rest/api/content/page1" },
        },
      ],
      _links: { next: null },
      size: 1,
    });
  }),

  http.all(/\/collections\//, ({ request }) => {
    const url = new URL(request.url);
    const method = request.method;
    const path = url.pathname;

    if (method === "GET") {
      if (path === "/collections" || path.endsWith("/collections")) {
        return HttpResponse.json({
          result: { collections: [{ name: "confluence_pages" }] },
        });
      }
    }

    if (method === "PUT") {
      if (path.includes("/points")) {
        return HttpResponse.json({
          result: { operation_id: 0, status: "completed" },
        });
      }
      return HttpResponse.json({ result: true });
    }

    if (method === "POST" && path.includes("/search")) {
      return HttpResponse.json({
        result: [
          {
            id: "point1",
            score: 0.95,
            payload: { text: "mocked chunk", pageId: "page1" },
          },
        ],
      });
    }

    return HttpResponse.json({ result: null });
  }),

  http.post("https://cloud.langfuse.com/api/public/traces", () => {
    return HttpResponse.json({ id: "trace_test" });
  }),
];

export const server = setupServer(...handlers);

beforeAll(() => {
  server.listen({ onUnhandledRequest: "error" });
});

afterEach(() => {
  server.resetHandlers();
});

afterAll(() => {
  server.close();
});
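The mocked Qdrant search handler above returns hits with a `score` and a `payload` holding `text` and `pageId`. As an illustration of how hits in that shape might be turned into prompt context, the sketch below filters, ranks, and numbers them. `SearchHit` mirrors only the fields present in the mock, while `buildContext` and its `minScore` default of 0.5 are hypothetical and not taken from the tutorial's code.

```typescript
// Shape mirrors the mocked Qdrant search result in tests/setup.ts.
interface SearchHit {
  id: string | number;
  score: number;
  payload?: { text?: string; pageId?: string };
}

// Hypothetical helper: turn search hits into a numbered context block.
// Hits below minScore (an assumed cutoff) or without text are dropped.
function buildContext(hits: SearchHit[], minScore = 0.5): string {
  return hits
    .filter((hit) => hit.score >= minScore && Boolean(hit.payload?.text))
    .sort((a, b) => b.score - a.score) // highest-scoring chunk first
    .map((hit, index) => {
      const source = hit.payload?.pageId ?? "unknown";
      return `[${index + 1}] (page ${source}) ${hit.payload?.text ?? ""}`;
    })
    .join("\n");
}
```

Numbering the chunks and carrying the page ID through gives the model something to cite, which in turn makes answers easier to verify against the wiki.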

Step 1: Configure environment variables

Step 2: Create the typed configuration with Zod

Step 3: Build the OpenAI client

Step 4: Build the Confluence crawler

Step 5: Validate and chunk markdown content

Step 6: Build the Qdrant vector store adapter

Step 7: Wire up LLM response caching

Step 8: Implement session storage and management

Step 9: Add agent memory, confidence routing, handoff, and fallback logic

Step 10: Wire up Langfuse observability

Step 11: Build the Confluence ingestion pipeline

Step 12: Create the chat API route and health check

Step 13: Create the barrel export

Step 14: Set up MSW test infrastructure

Step 15: Run the tests

Next steps