# Ollama RAG Knowledge Base for Retail Inventory
 
A self-hosted knowledge base that lets retail staff query product inventory docs in natural language, running fully locally with Ollama.
 
## Architecture
 
- **fastembed** — Local embedding model (BAAI/bge-small-en-v1.5) for vectorizing text
- **LanceDB** — Serverless vector database for storing embeddings and text
- **Ollama** — Local LLM inference (llama3.2) for answer generation
- **Hybrid Retrieval** — Combines vector similarity and full-text search with weighted fusion
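
The weighted-fusion step can be sketched as a pure score-combination function. The field names, the `0.7/0.3` weights, and the assumption that both score lists are normalized to `[0, 1]` are illustrative, not the project's actual implementation:

```typescript
// Hypothetical sketch of weighted fusion between vector-similarity and
// full-text (BM25) hits. Weights and record shape are assumptions.
interface Hit {
  documentId: string;
  score: number; // higher is better, assumed normalized to [0, 1]
}

function fuseScores(
  vectorHits: Hit[],
  textHits: Hit[],
  vectorWeight = 0.7,
): Hit[] {
  const textWeight = 1 - vectorWeight;
  const fused = new Map<string, number>();
  // Accumulate weighted contributions per document from each retriever.
  for (const h of vectorHits) {
    fused.set(h.documentId, (fused.get(h.documentId) ?? 0) + vectorWeight * h.score);
  }
  for (const h of textHits) {
    fused.set(h.documentId, (fused.get(h.documentId) ?? 0) + textWeight * h.score);
  }
  // Re-rank by fused score, best first.
  return [...fused.entries()]
    .map(([documentId, score]) => ({ documentId, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document that appears in both result lists accumulates weight from each, so agreement between the two retrievers pushes it up the ranking.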
 
## Features
 
- Fully self-hosted — the only service called is the local Ollama server
- No API costs — all computation runs locally
- CLI for batch indexing of PDF, Markdown, HTML, and text documents
- Chat UI for natural language queries against inventory docs
- Hybrid retrieval (vector + BM25) for accurate results
 
## Setup
 
1. Install Ollama: <https://ollama.com/download>
2. Pull required models:
   ```bash
   ollama pull llama3.2
   ollama pull nomic-embed-text
   ```
3. Clone and install dependencies:
   ```bash
   pnpm install
   ```
4. Configure `.env` (copy from `.env.example`):
   ```ini
   OLLAMA_HOST=http://localhost:11434
   LANCEDB_PATH=./lancedb
   DEFAULT_MODEL=llama3.2
   ```
5. Index documents:
   ```bash
   pnpm cli index --dir ./docs
   ```
6. Start the dev server:
   ```bash
   pnpm dev
   ```
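
The `.env` values from step 4 might be read with fallbacks along these lines (the `loadConfig` helper is an illustrative sketch, not the project's actual code; the defaults mirror `.env.example`):

```typescript
// Illustrative config loader: reads the documented .env variables,
// falling back to the defaults shown in the setup steps above.
interface AppConfig {
  ollamaHost: string;
  lancedbPath: string;
  defaultModel: string;
}

function loadConfig(
  env: Record<string, string | undefined> = process.env,
): AppConfig {
  return {
    ollamaHost: env.OLLAMA_HOST ?? "http://localhost:11434",
    lancedbPath: env.LANCEDB_PATH ?? "./lancedb",
    defaultModel: env.DEFAULT_MODEL ?? "llama3.2",
  };
}
```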
 
## CLI Commands
 
```bash
# Index documents
pnpm cli index --dir ./docs --db-path ./lancedb --model BAAI/bge-small-en-v1.5
 
# Show index stats
pnpm cli stats --db-path ./lancedb
```
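
Batch indexing typically splits each document into overlapping chunks before embedding. A minimal sketch of such a chunker follows; the chunk size, overlap, and character-based splitting are assumptions, not the CLI's actual parameters:

```typescript
// Hypothetical fixed-size chunker with overlap, as a batch indexer
// might apply before embedding. Sizes are illustrative defaults.
function chunkText(text: string, chunkSize = 512, overlap = 64): string[] {
  if (chunkSize <= overlap) throw new Error("chunkSize must exceed overlap");
  const chunks: string[] = [];
  // Step forward by (chunkSize - overlap) so consecutive chunks share
  // `overlap` characters, preserving context across boundaries.
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.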
 
## API
 
### POST /api/chat
 
Send a natural language query against the indexed knowledge base.
 
```bash
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "Do we have widget X in stock?"}'
```
 
Response:
```json
{
  "answer": "Yes, widget X is in stock at the downtown location.",
  "sources": [
    {
      "content": "Widget X inventory: 50 units at downtown, 20 at westside...",
      "documentId": "inv-2024-01",
      "score": 0.92
    }
  ]
}
```
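
For TypeScript clients, the response shape above can be modeled with an interface plus a small runtime guard. The types are derived from the example JSON; the guard itself is an illustrative sketch, not part of the project's public API:

```typescript
// Types derived from the example /api/chat response shown above.
interface Source {
  content: string;
  documentId: string;
  score: number;
}

interface ChatResponse {
  answer: string;
  sources: Source[];
}

// Illustrative runtime type guard for validating a parsed response body.
function isChatResponse(value: unknown): value is ChatResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.answer === "string" &&
    Array.isArray(v.sources) &&
    v.sources.every((s) => {
      if (typeof s !== "object" || s === null) return false;
      const src = s as Record<string, unknown>;
      return (
        typeof src.content === "string" &&
        typeof src.documentId === "string" &&
        typeof src.score === "number"
      );
    })
  );
}
```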
 
### GET /api/chat
 
Health check endpoint.
 
```json
{"status": "ok", "message": "Ollama RAG chat endpoint"}
```
 
## Tech Stack
 
- **Runtime**: Node.js 22+, Next.js 16 (App Router)
- **Vector DB**: LanceDB 0.27.2
- **Embeddings**: fastembed 2.1.0
- **LLM**: Ollama 0.6.3
- **Testing**: Vitest 4.1 with MSW for HTTP mocking