OpenAI Knowledge Agent for SMB Employee Onboarding
A persistent AI memory system that ingests onboarding docs, learns company norms, and answers new hires' questions in natural language, powered by OpenAI.
SMBs rely on a handful of people to onboard new hires, but that knowledge is trapped in scattered PDFs, Slack messages, and tribal memory. New employees waste weeks hunting for answers while senior staff are pulled from revenue work.
You’ll build an AI onboarding assistant that ingests company documents, stores them as searchable vector embeddings, and answers new-hire questions with cited sources using OpenAI. By the end, you’ll have a working Next.js application with document upload, semantic search, and a chat interface — all backed by REAA’s agent-memory stack for persistent knowledge storage.
Prerequisites
Node.js >= 22
pnpm >= 10
PostgreSQL with the pgvector extension installed
An OpenAI API key (for chat completions and embeddings)
Familiarity with TypeScript and the Next.js App Router
Step 1: Scaffold the project and install dependencies
Create an empty directory and set up the package.json with all required dependencies. The project uses Next.js 16, the REAA agent-memory packages, the OpenAI SDK, fastembed for local embeddings, and pgvector for vector storage.
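The exact dependency list lives in package.json; as a rough sketch (the REAA package names are taken from the imports used later in this guide, and version ranges are left to pnpm), the install looks like:
terminal
pnpm init
pnpm add next react react-dom openai fastembed pg pgvector \
  @reaatech/agent-memory-core @reaatech/agent-memory-storage \
  @reaatech/agent-memory-embedding @reaatech/agent-memory-retrieval
pnpm add -D typescript @types/node @types/react @types/react-dom \
  eslint vitest @vitest/coverage-v8 msw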
Expected output: pnpm downloads all packages and creates node_modules/ and pnpm-lock.yaml. No errors.
Step 2: Configure TypeScript, Next.js, ESLint, and Vitest
Set up the TypeScript compiler, Next.js config, ESLint with strict type-checked rules, and Vitest for testing. The project targets ES2022 with NodeNext module resolution.
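The recipe ships its own config files; a minimal tsconfig.json consistent with the ES2022/NodeNext setup described above might look like this (the real project's options may differ):
json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "lib": ["ES2022", "DOM"],
    "jsx": "preserve",
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "plugins": [{ "name": "next" }]
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx"],
  "exclude": ["node_modules"]
}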
Expected output: tsc --noEmit exits with code 0 and no errors.
Step 3: Set up environment variables
The application reads its API key, database connection, and embedding configuration from environment variables. Create a template file with all required keys.
Create .env.example:
env
# OpenAI API key for chat completions and embeddings
OPENAI_API_KEY=<your-openai-api-key>

# PostgreSQL connection URL (alternative to individual connection vars)
DATABASE_URL=postgres://postgres:***@localhost:5432/agent_memory

# Database connection (individual vars)
DB_HOST=localhost
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=<your-db-password>
DB_NAME=agent_memory

# Embedding provider: "openai" or "fastembed"
EMBEDDING_PROVIDER=openai

# Embedding model name (used when EMBEDDING_PROVIDER=openai)
# For openai: "text-embedding-3-small" (default, 1536 dims) or "text-embedding-3-large"
# For fastembed: ignored (uses BAAI/bge-small-en-v1.5, 384 dims)
EMBEDDING_MODEL=text-embedding-3-small
Copy it to .env.local and fill in your values:
terminal
cp .env.example .env.local
Open .env.local and replace <your-openai-api-key> with your real OpenAI API key, and <your-db-password> with your PostgreSQL password.
Before moving on, make sure your PostgreSQL database has the pgvector extension enabled:
terminal
psql -U postgres -d agent_memory -c "CREATE EXTENSION IF NOT EXISTS vector;"
Expected output: CREATE EXTENSION if the extension was newly created, or a notice that it already exists.
Step 4: Build the AgentMemory singleton
This module creates and manages a singleton AgentMemory instance backed by PostgreSQL with pgvector. It reads connection parameters from environment variables and sets up the extraction layer to process facts and preferences.
getMemory() returns the same instance on every call. createAgentMemory() reads env vars for database host, port, credentials, and the OpenAI embedding model, then wires them into the AgentMemory constructor.
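Create src/lib/memory.ts. The sketch below shows the singleton pattern; the option names passed to the AgentMemory constructor are illustrative assumptions, not the documented @reaatech/agent-memory-core API, so check the package's types for the real shape:
ts
// src/lib/memory.ts (sketch): one AgentMemory instance per server process.
// NOTE: the constructor options below are assumptions for illustration.
import { AgentMemory } from "@reaatech/agent-memory-core";

let memoryInstance: AgentMemory | null = null;

export function createAgentMemory(): AgentMemory {
  return new AgentMemory({
    // PostgreSQL + pgvector connection, read from the env vars in Step 3.
    storage: {
      host: process.env["DB_HOST"] ?? "localhost",
      port: Number(process.env["DB_PORT"] ?? 5432),
      user: process.env["DB_USER"] ?? "postgres",
      password: process.env["DB_PASSWORD"] ?? "",
      database: process.env["DB_NAME"] ?? "agent_memory",
    },
    // Extraction layer that turns raw text into facts and preferences.
    extraction: {
      embeddingModel: process.env["EMBEDDING_MODEL"] ?? "text-embedding-3-small",
    },
  });
}

export function getMemory(): AgentMemory {
  if (!memoryInstance) {
    memoryInstance = createAgentMemory();
  }
  return memoryInstance;
}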
Step 5: Build the embedding providers
The embedding module supports two providers: OpenAI’s text-embedding-3-small (1536 dimensions) and fastembed’s BAAI/bge-small-en-v1.5 (384 dimensions, runs locally). Both are wrapped in a CachedEmbeddingProvider with an in-memory cache to avoid redundant API calls.
getEmbeddingProvider() reads EMBEDDING_PROVIDER from the environment and instantiates either OpenAI or fastembed, then wraps it with a 1,000-entry in-memory cache that expires entries after 60 seconds. The FastembedProvider lazily initializes the model on first use.
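Create src/lib/embedding.ts. The sketch below simplifies two things: it shows only the OpenAI path (the fastembed provider is analogous but lazy-loads the local model), and it swaps REAA's CachedEmbeddingProvider for a hand-rolled Map cache. The embed()/getModelInfo() shape matches how the provider is consumed in Step 6, but the real module implements the EmbeddingProvider interface from @reaatech/agent-memory-embedding rather than the local one declared here:
ts
// src/lib/embedding.ts (simplified sketch): OpenAI embeddings with a small TTL cache.
import OpenAI from "openai";

export interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
  getModelInfo(): { name: string; dimensions: number };
}

const CACHE_TTL_MS = 60_000; // entries expire after 60 seconds
const CACHE_MAX_ENTRIES = 1000; // cap on in-memory cache size

class OpenAIEmbeddingProvider implements EmbeddingProvider {
  private client = new OpenAI({ apiKey: process.env["OPENAI_API_KEY"] });
  private model = process.env["EMBEDDING_MODEL"] ?? "text-embedding-3-small";
  private cache = new Map<string, { vector: number[]; expiresAt: number }>();

  getModelInfo(): { name: string; dimensions: number } {
    return {
      name: this.model,
      dimensions: this.model === "text-embedding-3-large" ? 3072 : 1536,
    };
  }

  async embed(text: string): Promise<number[]> {
    const cached = this.cache.get(text);
    if (cached && cached.expiresAt > Date.now()) {
      return cached.vector;
    }
    const response = await this.client.embeddings.create({
      model: this.model,
      input: text,
    });
    const vector = response.data[0]?.embedding ?? [];
    if (this.cache.size >= CACHE_MAX_ENTRIES) {
      this.cache.clear(); // crude eviction; REAA's cache wrapper is smarter
    }
    this.cache.set(text, { vector, expiresAt: Date.now() + CACHE_TTL_MS });
    return vector;
  }
}

let providerInstance: EmbeddingProvider | null = null;

export function getEmbeddingProvider(): EmbeddingProvider {
  if (!providerInstance) {
    // EMBEDDING_PROVIDER=fastembed would instantiate the local provider here.
    providerInstance = new OpenAIEmbeddingProvider();
  }
  return providerInstance;
}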
Step 6: Build the document ingestion pipeline
The ingestion module takes a file buffer, splits it into overlapping chunks at paragraph or sentence boundaries, embeds each chunk, and stores the resulting Memory objects in PostgreSQL. It accepts .md, .txt, and .markdown files.
Create src/lib/ingestion.ts:
ts
import { randomUUID } from "node:crypto";
import {
  MemoryType,
  MemorySource,
  MemoryImportance,
  MemoryLifecycle,
  type Memory,
} from "@reaatech/agent-memory-core";
import { getMemory } from "./memory.js";
import { getEmbeddingProvider } from "./embedding.js";

const CHUNK_SIZE = 500;
const CHUNK_OVERLAP = 50;
const SUPPORTED_EXTENSIONS = new Set([".md", ".txt", ".markdown"]);

function splitTextIntoChunks(text: string): string[] {
  if (text.length === 0) {
    return [];
  }
  const chunks: string[] = [];
  let startIndex = 0;
  while (startIndex < text.length) {
    let endIndex = startIndex + CHUNK_SIZE;
    if (endIndex >= text.length) {
      chunks.push(text.slice(startIndex));
      break;
    }
    // Try to break at a paragraph boundary
    const searchWindow = text.slice(startIndex, endIndex + CHUNK_OVERLAP);
    const paragraphBreak = searchWindow.lastIndexOf("\n\n");
    if (
      paragraphBreak >= CHUNK_SIZE - 100 &&
      paragraphBreak <= CHUNK_SIZE + CHUNK_OVERLAP
    ) {
      endIndex = startIndex + paragraphBreak;
    } else {
      // Try to break at a sentence boundary
      const sentenceBreak = searchWindow.lastIndexOf(". ");
      if (
        sentenceBreak >= CHUNK_SIZE - 100 &&
        sentenceBreak <= CHUNK_SIZE + CHUNK_OVERLAP
      ) {
        endIndex = startIndex + sentenceBreak + 1;
      }
    }
    chunks.push(text.slice(startIndex, endIndex));
    startIndex = endIndex - CHUNK_OVERLAP;
  }
  return chunks;
}

export async function processUpload(
  fileBuffer: Buffer,
  fileName: string,
): Promise<number> {
  // Validate file extension
  const ext = fileName.toLowerCase().slice(fileName.lastIndexOf("."));
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    throw new Error(
      `Unsupported file extension "${ext}". Supported: ${Array.from(SUPPORTED_EXTENSIONS).join(", ")}`,
    );
  }
  // Validate non-empty
  if (fileBuffer.length === 0) {
    return 0;
  }
  const text = fileBuffer.toString("utf-8");
  const chunks = splitTextIntoChunks(text);
  const tenantId = "default";
  const ownerId = "default";
  const memory = getMemory();
  const storage = memory.getStorage();
  const embedder = getEmbeddingProvider();
  const modelInfo = embedder.getModelInfo();
  for (const chunkText of chunks) {
    // Embed the chunk content to get a real vector
    const vector = await embedder.embed(chunkText);
    const memoryObj: Memory = {
      id: randomUUID(),
      tenantId,
      ownerId,
      content: chunkText,
      type: MemoryType.FACT,
      category: "onboarding",
      source: MemorySource.USER_STATEMENT,
      importance: MemoryImportance.MEDIUM,
      confidence: 1.0,
      tags: ["onboarding", fileName],
      lifecycle: MemoryLifecycle.ACTIVE,
      createdAt: new Date(),
      updatedAt: new Date(),
      lastAccessedAt: new Date(),
      embeddings: {
        vector,
        model: modelInfo.name,
        dimensions: modelInfo.dimensions,
      },
      version: 1,
      history: [],
    };
    await storage.create(memoryObj);
  }
  return chunks.length;
}
Each chunk is 500 characters with a 50-character overlap, ensuring context continuity across chunk boundaries. The chunker prefers paragraph breaks (\n\n) and falls back to sentence breaks (. ). Each Memory object stores the embedded vector, source filename as a tag, and lifecycle metadata.
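As a quick sanity check, processUpload can be called from a one-off script (the file path and script location here are hypothetical):
ts
// ingest-handbook.mts (hypothetical script): ingest one Markdown file and report the chunk count.
import { readFile } from "node:fs/promises";
import { processUpload } from "./src/lib/ingestion.js";

const buffer = await readFile("docs/employee-handbook.md");
const chunkCount = await processUpload(buffer, "employee-handbook.md");
console.log(`Stored ${chunkCount} chunks`);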
Step 7: Build the memory retrieval pipeline
The retriever module wraps REAA’s MemoryRetriever to search stored memories using semantic similarity and recency ranking. A ContextInjector formats retrieved memories into a structured context string for LLM prompts.
Create src/lib/retriever.ts:
ts
import {
  MemoryRetriever,
  ContextInjector,
  RetrievalStrategy,
} from "@reaatech/agent-memory-retrieval";
import type { Memory } from "@reaatech/agent-memory-core";
import type { MemoryStorage } from "@reaatech/agent-memory-storage";
import type { EmbeddingProvider } from "@reaatech/agent-memory-embedding";
import { getMemory } from "./memory.js";
import { getEmbeddingProvider } from "./embedding.js";

let retrieverInstance: MemoryRetriever | null = null;

export function createRetriever(
  storage?: MemoryStorage,
  embedder?: EmbeddingProvider,
): MemoryRetriever {
  const resolvedStorage: MemoryStorage = storage ?? getMemory().getStorage();
  const resolvedEmbedder = embedder ?? getEmbeddingProvider();
  const retriever = new MemoryRetriever(resolvedStorage, resolvedEmbedder, {
    defaultLimit: 5,
    useCrossEncoder: false,
    diversityFactor: 0.3,
    strategies: [RetrievalStrategy.SEMANTIC, RetrievalStrategy.RECENCY],
  });
  return retriever;
}

export function getRetriever(): MemoryRetriever {
  if (!retrieverInstance) {
    retrieverInstance = createRetriever();
  }
  return retrieverInstance;
}

export async function searchForQuestion(
  question: string,
  topK: number = 5,
): Promise<Memory[]> {
  if (!question || question.trim().length === 0) {
    return [];
  }
  const retriever = getRetriever();
  const memories = await retriever.retrieve(question, { limit: topK });
  return memories;
}

export async function formatAsContext(memories: Memory[]): Promise<string> {
  if (memories.length === 0) {
    return "No relevant memories found.";
  }
  const injector = new ContextInjector();
  const context = await injector.injectMemoriesIntoContext([], memories);
  return context;
}
The retriever combines semantic similarity (vector search via pgvector) with recency ranking and returns up to 5 results by default. formatAsContext uses REAA’s ContextInjector to format memories as tagged entries with confidence scores and dates.
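A typical call sequence combining the two helpers (the question text is just an example):
ts
// Retrieve the top-5 memories for a question and build an LLM-ready context block.
import { searchForQuestion, formatAsContext } from "./src/lib/retriever.js";

const memories = await searchForQuestion("How do I request PTO?");
const context = await formatAsContext(memories);
console.log(context); // tagged entries with confidence scores and dates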
Step 8: Build the chat module
The chat module wraps the OpenAI chat completions API with a system prompt that instructs the model to answer only from provided context and cite sources. It includes a non-streaming askQuestion function and a streaming askQuestionStream generator.
Create src/lib/chat.ts:
ts
import OpenAI from "openai";

export class ChatError extends Error {
  statusCode: number;

  constructor(message: string, statusCode: number = 500) {
    super(message);
    this.name = "ChatError";
    this.statusCode = statusCode;
  }
}

const SYSTEM_PROMPT =
  "You are an onboarding assistant for new employees. Answer questions using ONLY the provided context below. If the context does not contain relevant information, politely say you don't have that information yet. Cite sources by referencing [Source: filename] when available.";

const MAX_CONTEXT_CHARS = 8000;

function getClient(): OpenAI {
  const apiKey = process.env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new ChatError("OPENAI_API_KEY is not set", 500);
  }
  return new OpenAI({ apiKey });
}

function truncateContext(context: string): string {
  if (context.length > MAX_CONTEXT_CHARS) {
    return context.slice(0, MAX_CONTEXT_CHARS) + "...[truncated]";
  }
  return context;
}

export async function askQuestion(
  question: string,
  context: string,
): Promise<string> {
  if (!question || question.trim().length === 0) {
    throw new ChatError("Question cannot be empty", 400);
  }
  const client = getClient();
  const truncatedContext = truncateContext(context);
  try {
    const completion = await client.chat.completions.create({
      model: "gpt-4o-mini",
      temperature: 0.2,
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        {
          role: "user",
          content: `Context:\n${truncatedContext}\n\nQuestion: ${question}`,
        },
      ],
    });
    const choice = completion.choices[0];
    if (!choice) {
      return "";
    }
    return choice.message.content ?? "";
  } catch (err: unknown) {
    if (err instanceof OpenAI.APIError) {
      const statusCode: number = (err.status as number | undefined) ?? 500;
      throw new ChatError(
        `OpenAI API error (${String(err.status)}): ${err.message}`,
        statusCode,
      );
    }
    throw err;
  }
}

export async function* askQuestionStream(
  question: string,
  context: string,
): AsyncGenerator<string> {
  if (!question || question.trim().length === 0) {
    throw new ChatError("Question cannot be empty", 400);
  }
  const client = getClient();
  const truncatedContext = truncateContext(context);
  const stream = await client.chat.completions.create({
    model: "gpt-4o-mini",
    temperature: 0.2,
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      {
        role: "user",
        content: `Context:\n${truncatedContext}\n\nQuestion: ${question}`,
      },
    ],
    stream: true,
  });
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta;
    if (delta?.content) {
      yield delta.content;
    }
  }
}
The ChatError class carries an HTTP status code so route handlers can map errors to appropriate responses. The temperature is set to 0.2 for consistent, factual answers, and context is truncated at 8,000 characters to stay within token limits.
Step 9: Create the API routes
Two route handlers power the backend. The ingest route accepts file uploads and pushes them through the document ingestion pipeline. The chat route accepts a question, retrieves relevant memories, formats them as context, and calls OpenAI.
Both routes validate input and return structured JSON error objects. The ingest route validates file type and limits file size to 10 MB before calling processUpload. The chat route creates a pipeline: search for relevant memories, format them as context, ask OpenAI, and return the answer with source snippets.
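A sketch of the chat route, wired to the modules from Steps 7 and 8, could look like the following; the route path, response shape, and import paths are assumptions, so adjust them to your layout (a tsconfig path alias works too). The ingest route follows the same pattern: parse request.formData(), check the extension and the 10 MB limit, call processUpload, and return the chunk count.
ts
// app/api/chat/route.ts (sketch): retrieve context, ask OpenAI, return answer + sources.
import { NextResponse } from "next/server";
import { searchForQuestion, formatAsContext } from "../../../src/lib/retriever.js";
import { askQuestion, ChatError } from "../../../src/lib/chat.js";

export async function POST(request: Request): Promise<NextResponse> {
  try {
    const body = (await request.json()) as { question?: string };
    const question = body.question?.trim();
    if (!question) {
      return NextResponse.json({ error: "Question is required" }, { status: 400 });
    }
    const memories = await searchForQuestion(question, 5);
    const context = await formatAsContext(memories);
    const answer = await askQuestion(question, context);
    // Short source snippets let the UI show which chunks backed the answer.
    const sources = memories.map((m) => ({
      snippet: m.content.slice(0, 120),
      tags: m.tags,
    }));
    return NextResponse.json({ answer, sources });
  } catch (err: unknown) {
    if (err instanceof ChatError) {
      return NextResponse.json({ error: err.message }, { status: err.statusCode });
    }
    return NextResponse.json({ error: "Internal server error" }, { status: 500 });
  }
}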
Step 10: Add server instrumentation
The instrumentation hook runs at startup in the Node.js runtime. On Next.js 15 and later, src/instrumentation.ts is picked up automatically; only older releases need the experimental instrumentationHook flag in next.config.ts before the register() function will fire.
Update next.config.ts:
ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Next.js 15+ loads src/instrumentation.ts by default.
  // On older versions, enable the flag below instead:
  // experimental: {
  //   instrumentationHook: true,
  // },
};

export default nextConfig;
Now create the observability stub and instrumentation module. The instrumentation initializes observability and eagerly creates the AgentMemory singleton so the database connection is ready before the first request arrives.
Create src/observability.ts:
ts
export function initObservability(): void {
  // Stub: designed for OpenTelemetry or console logging later
  console.log("[observability] Agent memory observability initialized");
}
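Then create src/instrumentation.ts. A minimal version consistent with the behavior described below (guard on the Node.js runtime, dynamic imports, eager singleton creation) is:
ts
// src/instrumentation.ts (sketch): runs once when the server starts.
export async function register(): Promise<void> {
  // Only run in the Node.js runtime; the Edge runtime cannot load pg/pgvector.
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { initObservability } = await import("./observability.js");
    const { getMemory } = await import("./lib/memory.js");

    initObservability();
    // Eagerly create the AgentMemory singleton so the Postgres connection
    // pool is warm before the first request arrives.
    getMemory();
  }
}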
The dynamic import() calls ensure that Node-only dependencies are only loaded in the Node.js runtime, avoiding errors in the Edge runtime. The getMemory() call at startup warms up the database connection pool so the first user request doesn’t incur a cold-start penalty.
Step 11: Build the chat UI
The frontend is a single-page React client component with a chat window, a message input, and a file upload area for onboarding documents. It communicates with both API routes.
Create app/layout.tsx:
tsx
import type { Metadata } from "next";

export const metadata: Metadata = {
  title: "Onboarding Assistant",
  description: "AI-powered onboarding assistant for new employees",
};

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <html lang="en">
      <body>{children}</body>
    </html>
  );
}
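The recipe's full page component adds styling and streaming; a bare-bones sketch of app/page.tsx, with fetch payloads matching the API sketch from Step 9 (both are assumptions), looks like:
tsx
"use client";

// app/page.tsx (sketch): file upload plus a simple question/answer list.
import { useState } from "react";

interface Source {
  snippet: string;
  tags: string[];
}

interface Message {
  role: "user" | "assistant";
  text: string;
  sources?: Source[];
}

export default function Home() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [question, setQuestion] = useState("");

  async function uploadFile(file: File) {
    const formData = new FormData();
    formData.append("file", file);
    await fetch("/api/ingest", { method: "POST", body: formData });
  }

  async function ask() {
    const q = question.trim();
    if (!q) return;
    setMessages((prev) => [...prev, { role: "user", text: q }]);
    setQuestion("");
    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question: q }),
    });
    const data = (await res.json()) as { answer: string; sources?: Source[] };
    setMessages((prev) => [...prev, { role: "assistant", text: data.answer, sources: data.sources }]);
  }

  return (
    <main>
      <h1>Onboarding Assistant</h1>
      <input
        type="file"
        accept=".md,.txt,.markdown"
        onChange={(e) => {
          const file = e.target.files?.[0];
          if (file) void uploadFile(file);
        }}
      />
      <ul>
        {messages.map((m, i) => (
          <li key={i}>
            <strong>{m.role}:</strong> {m.text}
            {m.sources?.map((s, j) => (
              <span key={j}> [{s.tags.join(", ")}]</span>
            ))}
          </li>
        ))}
      </ul>
      <input
        value={question}
        onChange={(e) => setQuestion(e.target.value)}
        placeholder="Ask a question about onboarding"
      />
      <button onClick={() => void ask()}>Send</button>
    </main>
  );
}

Start the dev server with pnpm dev (or pnpm next dev if no script is defined).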
Expected output: Next.js starts on http://localhost:3000. Open it in your browser — you’ll see the Onboarding Assistant chat interface with a file upload widget and a text input. Upload a Markdown file with onboarding content, then ask a question. The assistant responds with answers and green source tags showing which document chunks were used.
Step 12: Run the tests
The test suite uses Vitest with MSW to mock OpenAI API calls. A setup file configures MSW handlers that return fixed responses for chat completions and embeddings, so tests run without hitting the real API.
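A sketch of that setup file is below; the handler URLs follow OpenAI's public endpoints, while the fixture payloads and the test script name are assumptions:
ts
// tests/setup.ts (sketch): intercept OpenAI HTTP calls so tests never hit the network.
import { afterAll, afterEach, beforeAll } from "vitest";
import { http, HttpResponse } from "msw";
import { setupServer } from "msw/node";

const handlers = [
  http.post("https://api.openai.com/v1/chat/completions", () =>
    HttpResponse.json({
      choices: [
        { message: { role: "assistant", content: "Mocked answer [Source: handbook.md]" } },
      ],
    }),
  ),
  http.post("https://api.openai.com/v1/embeddings", () =>
    HttpResponse.json({
      data: [{ embedding: Array.from({ length: 1536 }, () => 0.1) }],
    }),
  ),
];

export const server = setupServer(...handlers);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

Run the suite with coverage enabled, for example pnpm vitest run --coverage.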
Expected output: All 47 tests across 7 test suites pass, and the coverage report shows at least 90% coverage across lines, branches, functions, and statements.
Next steps
Add multi-tenancy support by making tenantId dynamic per organization, isolating onboarding documents between different SMBs.
Extend file format support to .pdf and .docx by adding extraction libraries and adjusting the chunker to handle structured documents.
Deploy the agent to production with connection pooling, rate limiting on the chat endpoint, and persistent conversation history for multi-turn dialogue.