# Ollama RAG Knowledge Base for Retail Inventory
A self-hosted knowledge base that lets retail staff query product inventory documentation in natural language, with all inference running locally through Ollama.
## Architecture
- **fastembed** — Local embedding model (BAAI/bge-small-en-v1.5) for vectorizing text
- **LanceDB** — Serverless vector database for storing embeddings and text
- **Ollama** — Local LLM inference (llama3.2) for answer generation
- **Hybrid Retrieval** — Combines vector similarity and full-text search with weighted fusion (sketched below)
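
Conceptually, the weighted fusion step merges the vector-search and full-text (BM25) result lists into a single ranking. The sketch below is illustrative only; the `Hit` shape, the 0.7 weight, and the min-max normalization are assumptions, not the project's actual code.

```ts
// Illustrative weighted fusion of vector and BM25 results (assumed shapes).
interface Hit {
  id: string;
  score: number;
}

function fuseResults(vectorHits: Hit[], textHits: Hit[], vectorWeight = 0.7): Hit[] {
  // Min-max normalize each list so the two score scales are comparable.
  const normalize = (hits: Hit[]) => {
    const scores = hits.map((h) => h.score);
    const min = Math.min(...scores);
    const max = Math.max(...scores);
    return new Map(
      hits.map((h): [string, number] => [h.id, max === min ? 1 : (h.score - min) / (max - min)])
    );
  };
  const v = normalize(vectorHits);
  const t = normalize(textHits);

  // Union of candidates from both searches, scored by the weighted sum.
  const ids = new Set([...v.keys(), ...t.keys()]);
  return [...ids]
    .map((id) => ({ id, score: vectorWeight * (v.get(id) ?? 0) + (1 - vectorWeight) * (t.get(id) ?? 0) }))
    .sort((a, b) => b.score - a.score);
}
```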
## Features
- Fully self-hosted — no external API calls; the only service dependency is a local Ollama instance
- No API costs — all computation runs locally
- CLI for batch indexing of PDF, Markdown, HTML, and text documents
- Chat UI for natural language queries against inventory docs
- Hybrid retrieval (vector + BM25) for accurate results
## Setup
1. Install Ollama: <https://ollama.com/download>
2. Pull required models:
```bash
ollama pull llama3.2
ollama pull nomic-embed-text
```
3. Clone and install dependencies:
```bash
pnpm install
```
4. Configure `.env` (copy from `.env.example`):
```
OLLAMA_HOST=http://localhost:11434
LANCEDB_PATH=./lancedb
DEFAULT_MODEL=llama3.2
```
5. Index documents:
```bash
pnpm cli index --dir ./docs
```
6. Start the dev server:
```bash
pnpm dev
```
## CLI Commands
```bash
# Index documents
pnpm cli index --dir ./docs --db-path ./lancedb --model BAAI/bge-small-en-v1.5
# Show index stats
pnpm cli stats --db-path ./lancedb
```
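
Internally, indexing boils down to chunking each document, embedding the chunks with fastembed, and writing rows to a LanceDB table. The following is a minimal sketch of that flow, assuming the `fastembed` and `@lancedb/lancedb` Node APIs; the table name, batch size, and chunk shape are placeholders and may differ from the real CLI.

```ts
// Minimal indexing sketch (assumed APIs and names, not the actual CLI code).
import { EmbeddingModel, FlagEmbedding } from "fastembed";
import * as lancedb from "@lancedb/lancedb";

interface Chunk {
  text: string;
  documentId: string;
}

async function indexChunks(chunks: Chunk[], dbPath: string): Promise<void> {
  // BAAI/bge-small-en-v1.5, matching the CLI's --model flag.
  const embedder = await FlagEmbedding.init({ model: EmbeddingModel.BGESmallENV15 });

  const rows: Record<string, unknown>[] = [];
  let i = 0;
  // embed() yields batches of vectors; 32 is an arbitrary batch size here.
  for await (const batch of embedder.embed(chunks.map((c) => c.text), 32)) {
    for (const vector of batch) {
      rows.push({ vector: Array.from(vector), content: chunks[i].text, documentId: chunks[i].documentId });
      i++;
    }
  }

  // "chunks" is an assumed table name.
  const db = await lancedb.connect(dbPath);
  await db.createTable("chunks", rows, { mode: "overwrite" });
}
```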
## API
### POST /api/chat
Send a natural language query against the indexed knowledge base.
```bash
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "Do we have widget X in stock?"}'
```
Response:
```json
{
  "answer": "Yes, widget X is in stock at the downtown location.",
  "sources": [
    {
      "content": "Widget X inventory: 50 units at downtown, 20 at westside...",
      "documentId": "inv-2024-01",
      "score": 0.92
    }
  ]
}
```
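
For orientation, a route handler for this endpoint could look roughly like the sketch below. It is a simplified assumption of how the pieces fit together: the `retrieve` helper is hypothetical, and the model is queried through Ollama's standard `/api/chat` HTTP endpoint. The actual implementation may differ.

```ts
// app/api/chat/route.ts: simplified sketch, not the actual implementation.
// `retrieve` is a hypothetical helper wrapping the hybrid LanceDB search.
import { retrieve } from "@/lib/retrieve";

export async function POST(req: Request) {
  const { query } = await req.json();

  // Hybrid retrieval: top-k chunks with content, documentId, and score.
  const sources = await retrieve(query);

  // Ask the local Ollama server to answer from the retrieved context only.
  const res = await fetch(`${process.env.OLLAMA_HOST}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: process.env.DEFAULT_MODEL ?? "llama3.2",
      stream: false,
      messages: [
        { role: "system", content: "Answer using only the provided inventory context." },
        {
          role: "user",
          content: `Context:\n${sources.map((s) => s.content).join("\n\n")}\n\nQuestion: ${query}`,
        },
      ],
    }),
  });

  const data = await res.json();
  return Response.json({ answer: data.message.content, sources });
}
```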
### GET /api/chat
Health check endpoint.
```json
{"status": "ok", "message": "Ollama RAG chat endpoint"}
```
## Tech Stack
- **Runtime**: Node.js 22+, Next.js 16 (App Router)
- **Vector DB**: LanceDB 0.27.2
- **Embeddings**: fastembed 2.1.0
- **LLM**: Ollama 0.6.3
- **Testing**: Vitest 4.1 with MSW for HTTP mocking
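
As an illustration of the testing setup, MSW handlers can stub the Ollama HTTP API so tests never hit a real model. The sketch below is an assumption of how such a test might look, not an excerpt from the actual suite.

```ts
// chat.test.ts: illustrative Vitest + MSW sketch (assumed, not the real tests).
import { afterAll, afterEach, beforeAll, expect, it } from "vitest";
import { http, HttpResponse } from "msw";
import { setupServer } from "msw/node";

// Intercept calls to the local Ollama server and return a canned completion.
const server = setupServer(
  http.post("http://localhost:11434/api/chat", () =>
    HttpResponse.json({ message: { role: "assistant", content: "Widget X is in stock." }, done: true })
  )
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

it("returns the mocked Ollama answer", async () => {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama3.2", messages: [], stream: false }),
  });
  const data = await res.json();
  expect(data.message.content).toContain("Widget X");
});
```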