Small retailers waste hours searching through scattered PDFs and spreadsheets to answer customer questions about inventory, causing delays and lost sales.
By the end of this tutorial you’ll have a fully local RAG (retrieval-augmented generation) knowledge base running on your own machine. Staff can ask natural-language questions like “Do we have Widget X in stock?” and get answers grounded in your indexed product documents. The stack uses Next.js for the UI, Ollama for the LLM, fastembed for local embeddings, and LanceDB as a serverless vector store.
When asked to proceed, type y. The command creates the project structure in the current directory.
Expected output: The command prints its progress and ends with Ready — your files are in place.
Step 2: Install dependencies
The project needs the Ollama client, the LanceDB vector store, fastembed for local embeddings, the @reaatech hybrid-rag packages, and a few supporting utilities.
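The exact dependency list lives in the generated package.json; if you are installing by hand, a representative set of packages — inferred from the imports used later in this recipe, so double-check it against the generated file — looks like this:
terminal
pnpm add ollama @lancedb/lancedb fastembed zod commander \
  @reaatech/hybrid-rag @reaatech/hybrid-rag-pipeline \
  @reaatech/hybrid-rag-embedding @reaatech/hybrid-rag-ingestion
pnpm add -D tsx vitest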
OLLAMA_HOST tells the Ollama client where to connect. LANCEDB_PATH is the directory where LanceDB stores its database. DEFAULT_MODEL is the model used for answer generation.
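For reference, a .env.local with illustrative values might look like the block below. The Ollama host matches the client's default; the LanceDB path and model name are placeholders — use whatever directory you prefer and whichever model you pulled with ollama pull.
env
OLLAMA_HOST=http://127.0.0.1:11434
LANCEDB_PATH=./data/lancedb
DEFAULT_MODEL=llama3.1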
Step 5: Configure Next.js for native modules
LanceDB and fastembed include native binaries that Next.js must not bundle. Open next.config.ts and add the serverExternalPackages setting.
ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  serverExternalPackages: ["@lancedb/lancedb", "fastembed", "@anush008/tokenizers"],
};

export default nextConfig;
The serverExternalPackages array tells Next.js not to bundle these packages for the server runtime — they are loaded from node_modules with native Node.js require, so their prebuilt binaries keep working.
Step 6: Create shared types
Create src/types.ts to hold the request/response schemas and re-export the types from @reaatech/hybrid-rag.
ChatRequestSchema is the Zod schema the API route uses to validate incoming requests. IngestionOptionsSchema validates what the CLI passes to the ingestion service.
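A minimal sketch of what the file might contain follows; the query field matches the curl request used later in this recipe, while the optional fields and the re-exported type names are assumptions to check against the generated code.
ts
import { z } from "zod";

// Request body the chat API route accepts.
export const ChatRequestSchema = z.object({
  query: z.string().min(1),
  topK: z.number().int().positive().optional(),
  retrievalMode: z.enum(["hybrid", "vector", "text"]).optional(),
});
export type ChatRequest = z.infer<typeof ChatRequestSchema>;

// Options the CLI passes to the ingestion service.
export const IngestionOptionsSchema = z.object({
  dir: z.string(),
  dbPath: z.string().optional(),
});
export type IngestionOptions = z.infer<typeof IngestionOptionsSchema>;

// Re-export the shared pipeline types.
export type { RetrievalResult, Chunk, ChunkingConfig } from "@reaatech/hybrid-rag";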
Step 7: Build the LanceDB adapter
Create src/lib/lancedb-adapter.ts. This is the core of the retrieval layer — it wraps the LanceDB SDK and implements the interface the pipeline expects.
ts
import { connect, type Connection, Table } from "@lancedb/lancedb";
import type { RetrievalResult } from "@reaatech/hybrid-rag";

export interface StoredChunk {
  id: string;
  documentId: string;
  content: string;
  embedding: number[];
  metadata: Record<string, unknown>;
}

export class LanceDBStore {
  private db: Connection | null = null;
  private table: Table | null = null;
initialize() creates the database directory and the chunks table if they don’t exist. hybridSearch() runs both vector and full-text search in parallel, then fuses the ranked results with a weighted score. The vectorWeight parameter (default 0.7) controls how much weight goes to the vector score versus the BM25 score.
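The fusion itself is conceptually simple. Here is an illustrative sketch — not the adapter's actual code, and it assumes both scores have already been normalized to [0, 1]:
ts
// Weighted fusion of a vector-similarity score and a BM25 score.
function fuseScores(
  vectorScore: number,
  textScore: number,
  vectorWeight = 0.7
): number {
  // 70% of the final score comes from vector similarity, 30% from BM25 by default.
  return vectorWeight * vectorScore + (1 - vectorWeight) * textScore;
}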
Step 8: Build the fastembed embedder
Create src/lib/embedding-provider.ts. This wraps the fastembed package to provide the embedder interface.
ts
import type { FlagEmbedding as FlagEmbeddingClass } from "fastembed";
import type { RetrievalResult } from "@reaatech/hybrid-rag";

export class FastembedEmbedder {
  private model: FlagEmbeddingClass | null = null;
  private readonly modelName: string;
  private _initialized = false;

  constructor(modelName = "BAAI/bge-small-en-v1.5") {
    this.modelName = modelName;
  }

  get dimension(): number {
    if (this.modelName.includes("small")) return 384;
    if (this.modelName.includes("base")) return 768;
    return 384;
  }

  async initialize(): Promise<void> {
    if (this._initialized) return;
    try {
      const fastembed = await import("fastembed");
      this.model = await fastembed.FlagEmbedding.init({
        model: fastembed.EmbeddingModel.BGESmallENV15,
      });
      this._initialized = true;
    } catch (error) {
      throw new Error(
        `Failed to initialize fastembed model ${this.modelName}: ${
          error instanceof Error ? error.message : String(error)
        }`
      );
    }
  }

  async embed(text: string): Promise<number[]> {
    if (!this._initialized || !this.model) {
      throw new Error("Embedder not initialized. Call initialize() first.");
    }
    return this.model.queryEmbed(text);
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    if (!this._initialized || !this.model) {
      throw new Error("Embedder not initialized. Call initialize() first.");
    }
    const allVectors: number[][] = [];
    const generator = this.model.passageEmbed(texts);
    for await (const batch of generator) {
      allVectors.push(...batch);
    }
    return allVectors;
  }
}

export interface StoredChunk {
  id: string;
  documentId: string;
  content: string;
  embedding: number[];
  metadata: Record<string, unknown>;
}

export { type RetrievalResult };
initialize() loads BAAI/bge-small-en-v1.5, a compact 384-dimension embedding model. embed() is for single queries; embedBatch() handles bulk indexing through an async generator.
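A quick usage sketch — the import path assumes you are calling it from a script under src/, and the sample passages are made up:
ts
import { FastembedEmbedder } from "./lib/embedding-provider.js";

const embedder = new FastembedEmbedder();
await embedder.initialize(); // downloads the model on first run

// Single query embedding — 384 numbers for bge-small-en-v1.5.
const queryVector = await embedder.embed("Do we have Widget X in stock?");

// Bulk passage embeddings for indexing.
const passageVectors = await embedder.embedBatch([
  "Widget X — 14 units in the back room",
  "Widget Y — discontinued as of March",
]);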
Step 9: Build the retrieval service
Create src/services/retrieval.ts. This orchestrates embedding the query and calling the store’s search methods.
ts
import type { RetrievalResult } from "@reaatech/hybrid-rag";
import { type QueryOptions, RAGPipeline } from "@reaatech/hybrid-rag-pipeline";
import { type EmbeddingService } from "@reaatech/hybrid-rag-embedding";
import { type EmbedderPort, type StorePort } from "./ingestion.js";

export type RetrievalMode = "hybrid" | "vector" | "text";

export class RetrievalService {
  private readonly embedder: EmbedderPort;
  private readonly store: StorePort;
  private readonly defaultTopK: number;

  constructor(embedder: EmbedderPort, store: StorePort, defaultTopK = 10) {
    this.embedder = embedder;
    this.store = store;
    this.defaultTopK = defaultTopK;
  }

  async retrieve(
    query: string,
    topK = this.defaultTopK,
    filter?: Record<string, unknown>,
    retrievalMode: RetrievalMode = "hybrid"
  ): Promise<RetrievalResult[]> {
    if (retrievalMode === "vector") {
      const vector = await this.embedder.embed(query);
      return this.store.vectorSearch(vector, topK, filter);
    }
    if (retrievalMode === "text") {
      return this.store.fullTextSearch(query, topK);
    }
    const vector = await this.embedder.embed(query);
    return this.store.hybridSearch(vector, query, topK, 0.7);
  }
}

export { RAGPipeline, type QueryOptions, type EmbeddingService };
By default it runs hybrid mode, which combines vector similarity and full-text search with a 0.7 weight on the vector results. Switch to "vector" or "text" by passing the retrievalMode argument.
Step 10: Build the Ollama service
Create src/services/llm.ts. This wraps the ollama npm package and handles the generate call.
ts
import { Ollama } from "ollama";

export interface OllamaConfig {
  host?: string;
  model: string;
}

export class OllamaService {
  private readonly client: Ollama;
  private readonly model: string;

  constructor(config: OllamaConfig) {
    this.model = config.model;
    this.client = new Ollama({
      host: config.host ?? "http://127.0.0.1:11434",
    });
  }

  async generateAnswer(query: string, context: string): Promise<string> {
    const systemPrompt =
      "You are a helpful retail inventory assistant. Use the following context to answer the user's question. If the context does not contain the answer, say so.\n\nContext:\n" +
      context;
    try {
      const response = await this.client.chat({
        model: this.model,
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: query },
        ],
      });
      return response.message.content;
    } catch (error) {
      if (error instanceof Error) {
        if (
          error.message.includes("connection refused") ||
          error.message.includes("ECONNREFUSED")
        ) {
          throw new Error("Ollama unreachable at http://127.0.0.1:11434");
        }
        if (error.message.includes("model not found")) {
          throw new Error(`Model not found: ${this.model}`);
        }
      }
      throw error;
    }
  }

  async *streamAnswer(
    query: string,
    context: string
  ): AsyncGenerator<string> {
    const systemPrompt =
      "You are a helpful retail inventory assistant. Use the following context to answer the user's question.\n\nContext:\n" +
      context;
    const stream = await this.client.chat({
      model: this.model,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: query },
      ],
      stream: true,
    });
    for await (const part of stream) {
      yield part.message.content;
    }
  }
}
generateAnswer() builds a system prompt that injects the retrieved context before the user’s query. streamAnswer() is an async generator if you want to stream tokens back to the client later.
Step 11: Build the ingestion service
Create src/services/ingestion.ts. This handles loading, validating, chunking, and embedding documents before writing them to the store.
ts
import {
  DocumentLoader,
  TextPreprocessor,
  DocumentValidator,
  chunkDocument,
  UnsupportedFormatError,
  FileSizeExceededError,
  DocumentParseError,
} from "@reaatech/hybrid-rag-ingestion";
import { type ChunkingConfig, type Chunk } from "@reaatech/hybrid-rag";

export interface EmbedderPort {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}

export interface StorePort {
ingestFile() loads one document, validates it, chunks it with recursive strategy, embeds each chunk with fastembed, and writes the results to LanceDB. ingestDirectory() globs for PDF, Markdown, plain text, and HTML files and processes them one by one, skipping files that fail to parse.
Step 12: Create the Chat API route
Create app/api/chat/route.ts. This is the endpoint the UI calls — it runs retrieval against LanceDB and then calls Ollama to generate an answer.
Services are lazily initialized on the first request and cached in module-level variables, so subsequent requests reuse the same instances. The retrieved context is truncated to 3,000 characters before it is sent to the model, to avoid exceeding the model’s context window.
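A minimal sketch of how the route might be wired is below. The import paths, the store constructor argument, the fallback model name, and the exact fields on RetrievalResult are assumptions here — compare them against the generated file.
ts
import { NextResponse } from "next/server";
import { ChatRequestSchema } from "../../../src/types";
import { LanceDBStore } from "../../../src/lib/lancedb-adapter";
import { FastembedEmbedder } from "../../../src/lib/embedding-provider";
import { RetrievalService } from "../../../src/services/retrieval";
import { OllamaService } from "../../../src/services/llm";

// Module-level cache: built on the first request, reused afterwards.
let services: { retrieval: RetrievalService; llm: OllamaService } | null = null;

async function getServices() {
  if (services) return services;
  const embedder = new FastembedEmbedder();
  await embedder.initialize();
  const store = new LanceDBStore(process.env.LANCEDB_PATH ?? "./data/lancedb");
  await store.initialize();
  services = {
    retrieval: new RetrievalService(embedder, store),
    llm: new OllamaService({
      host: process.env.OLLAMA_HOST,
      model: process.env.DEFAULT_MODEL ?? "llama3.1",
    }),
  };
  return services;
}

export async function POST(request: Request): Promise<NextResponse> {
  const { query } = ChatRequestSchema.parse(await request.json());
  const { retrieval, llm } = await getServices();

  const sources = await retrieval.retrieve(query);
  // Concatenate the retrieved chunks and cap the context at 3000 characters.
  const context = sources
    .map((s) => s.content)
    .join("\n\n")
    .slice(0, 3000);
  const answer = await llm.generateAnswer(query, context);

  return NextResponse.json({ answer, sources });
}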
Step 13: Create the Chat UI
Create app/page.tsx — a client component with a simple chat interface. It sends a POST to /api/chat and renders the answer with its source snippets.
tsx
"use client";import type { ReactElement, FormEvent } from "react";import { useState, useRef, useEffect } from "react";interface Message { role: "user" | "assistant"; content: string; sources?: { content: string; documentId: string; score: number }[];}export default function Home(): ReactElement { const [query, setQuery] = useState(""); const [messages, setMessages]
Update app/layout.tsx to set the page metadata.
tsx
import type { ReactNode } from "react";
import type { ReactElement } from "react";

export const metadata = {
  title: "Ollama RAG Knowledge Base",
  description:
    "Self-hosted knowledge base for retail inventory using natural language, fully local with Ollama.",
};

export default function RootLayout({
  children,
}: {
  children: ReactNode;
}): ReactElement {
  return (
    <html lang="en">
      <body>{children}</body>
    </html>
  );
}
Step 14: Create the CLI for indexing
Create src/cli/index.ts. This command-line tool handles batch indexing of PDF, Markdown, and text documents into LanceDB.
ts
#!/usr/bin/env node
import { Command } from "commander";
import { spawn } from "child_process";
import { fileURLToPath } from "url";
import { LanceDBStore } from "../lib/lancedb-adapter.js";
import { FastembedEmbedder } from "../lib/embedding-provider.js";

interface IndexOptions {
  dir: string;
  dbPath: string;
  model: string;
  ollamaHost: string;
}

interface StatsOptions {
  dbPath: string;
Run the CLI with node --import=tsx src/cli/index.ts. The index command initializes LanceDB, loads the embedder, and then delegates to the hybrid-rag ingest CLI. The stats command prints the database path.
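For example (the --dir flag matches the indexing command used later in this recipe; stats is shown without flags, which assumes the default database path):
terminal
node --import=tsx src/cli/index.ts index --dir ./docs
node --import=tsx src/cli/index.ts stats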
Step 15: Run the tests
The test suite covers the API route, retrieval service, and LanceDB adapter. Run it with:
terminal
pnpm test
Expected output: Vitest prints a summary for each test file — for example, Test Files 8 passed and Tests 20+ passed. The JSON coverage report is written to vitest-report.json.
Step 16: Start the dev server and test the API
Start the Next.js dev server:
terminal
pnpm dev
Expected output: The terminal prints Ready and Local: http://localhost:3000.
In another terminal, test the chat endpoint with a curl request:
terminal
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "Do we have laptops in stock?"}'
Expected output: A JSON response with an answer field and a sources array. The exact answer depends on what documents you’ve indexed.
Add documents to a ./docs directory and index them with node --import=tsx src/cli/index.ts index --dir ./docs to populate the knowledge base before asking questions.
Try different retrieval modes by passing retrievalMode: "text" in the API route to use only BM25 full-text search.
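For example, in the route handler the retrieve call from Step 9 takes the mode as its fourth argument (the other arguments shown are just the defaults):
ts
// BM25-only retrieval — skips the embedding step entirely.
const sources = await retrieval.retrieve(query, 10, undefined, "text");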
Expand the chat UI with streaming support using the streamAnswer method in src/services/llm.ts.
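One way to do that, sketched below, is to wrap streamAnswer() in a web ReadableStream and return it from a route handler. The helper name and import path are illustrative, not part of the generated project.
ts
import { OllamaService } from "../../../src/services/llm";

// Turn the async generator from streamAnswer() into a streaming HTTP response.
export function streamToResponse(
  llm: OllamaService,
  query: string,
  context: string
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const token of llm.streamAnswer(query, context)) {
          controller.enqueue(encoder.encode(token));
        }
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}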