# Google Gemini RAG Product Search for Square Online SMB Stores
 
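Small Square Online merchants lose sales because customers can't find products using the default keyword search. This project adds retrieval-augmented (RAG) product search: shoppers ask in natural language, and Gemini answers from the merchant's own catalog.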
 
## Architecture
 
```
Square Catalog API → fastembed embeddings → Qdrant vector store
                                                ↓
                                    semantic cache (llm-cache)
                                                ↓
                              cost-aware model routing (llm-router-engine)
                                                ↓
                                            Gemini
                                                ↓
                                     natural-language answer
```
 
Products are fetched from the Square Catalog API, embedded via fastembed, and stored in a Qdrant vector collection. Incoming search queries are embedded and matched against the vector index. Results pass through llm-cache for semantic deduplication, then llm-router-engine selects the optimal Gemini model based on query complexity and cost constraints.
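
The flow described above can be sketched with in-memory stand-ins. This is a rough illustration, not the project's implementation: `topK`, `SemanticCache`, and `pickModelTier` are hypothetical names, the cache is keyed on the query vector as a simplifying assumption, and the word-count routing rule and the `0.95` threshold are placeholders for whatever `llm-cache` and `llm-router-engine` actually do.

```typescript
// Illustrative only: in-memory stand-ins for Qdrant, llm-cache, and llm-router-engine.
type Vector = number[];

interface IndexedProduct {
  id: string;
  name: string;
  vector: Vector;
}

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank products by similarity to the query vector and keep the best k.
function topK(query: Vector, products: IndexedProduct[], k: number) {
  return products
    .map((p) => ({ id: p.id, name: p.name, score: cosine(query, p.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Semantic cache: reuse an answer when an earlier query vector is close enough.
class SemanticCache {
  private entries: { vector: Vector; answer: string }[] = [];
  private threshold: number;

  constructor(threshold = 0.95) {
    this.threshold = threshold; // hypothetical similarity cutoff
  }

  get(query: Vector): string | undefined {
    return this.entries.find((e) => cosine(e.vector, query) >= this.threshold)?.answer;
  }

  set(query: Vector, answer: string): void {
    this.entries.push({ vector: query, answer });
  }
}

// Hypothetical routing rule: short queries go to a cheaper model tier.
function pickModelTier(queryText: string, maxWordsForCheap = 6): "cheap" | "capable" {
  const words = queryText.trim().split(/\s+/).filter(Boolean).length;
  return words <= maxWordsForCheap ? "cheap" : "capable";
}
```

On a cache miss, the top-k products and the chosen tier would be passed to Gemini, and the generated answer stored with `set` for later near-duplicate queries.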
 
## Setup
 
1. **Qdrant** — `docker run -p 6333:6333 qdrant/qdrant`
2. **Square** — create a Square app at [developer.squareup.com](https://developer.squareup.com), generate an OAuth token, and grant `ITEMS_READ` scope
3. **GCP Vertex AI** — enable the Vertex AI API and create a service account with the `aiplatform.user` role; download the JSON key
4. **Environment variables** — copy `.env.example` to `.env` and set:
   - `SQUARE_ACCESS_TOKEN`
   - `SQUARE_LOCATION_ID`
   - `QDRANT_URL` (default `http://localhost:6333`)
   - `GOOGLE_CLOUD_PROJECT` (your GCP project ID)
   - `GOOGLE_CLOUD_LOCATION` (Vertex AI location, default `us-central1`)
   - `GOOGLE_GENAI_USE_ENTERPRISE` (enable Vertex AI, default `true`)
 
```bash
pnpm install
pnpm dev
```
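
As a sketch of how the variables above might be validated at startup, the helper below fills in the documented defaults and fails fast on anything missing. `loadConfig` and `AppConfig` are hypothetical names, and treating every variable without a documented default as required is an assumption.

```typescript
// Hypothetical config shape; field names are illustrative.
interface AppConfig {
  squareAccessToken: string;
  squareLocationId: string;
  qdrantUrl: string;
  googleCloudProject: string;
  googleCloudLocation: string;
  useEnterprise: boolean;
}

// Reads settings from an env-like object, applying the defaults listed in Setup.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    squareAccessToken: required("SQUARE_ACCESS_TOKEN"),
    squareLocationId: required("SQUARE_LOCATION_ID"),
    qdrantUrl: env["QDRANT_URL"] ?? "http://localhost:6333",
    googleCloudProject: required("GOOGLE_CLOUD_PROJECT"),
    googleCloudLocation: env["GOOGLE_CLOUD_LOCATION"] ?? "us-central1",
    // Assumption: the flag is the literal string "true" or "false".
    useEnterprise: (env["GOOGLE_GENAI_USE_ENTERPRISE"] ?? "true").toLowerCase() === "true",
  };
}
```

In a Node process this would typically be called once as `loadConfig(process.env)`, so a misconfigured deployment fails before serving any request.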
 
## API Reference
 
### `POST /api/search`
 
Search for products using natural language.
 
**Request:**
```json
{
  "query": "red sneakers size 10",
  "sessionId": "abc123"
}
```
 
**Response:**
```json
{
  "results": [{ "id": "prod_1", "name": "Red Running Sneakers", "price": 89.99, "score": 0.92 }],
  "answer": "We have Red Running Sneakers in size 10 for $89.99.",
  "sessionId": "abc123"
}
```
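
A client might model these payloads with types inferred from the examples above and check responses at runtime. This is a sketch only: the interface names and the `isSearchResponse` guard are hypothetical, and the field types are read off the sample JSON rather than taken from the project's own schemas.

```typescript
// Shapes inferred from the example request and response; not the project's schemas.
interface SearchRequest {
  query: string;
  sessionId: string;
}

interface SearchResult {
  id: string;
  name: string;
  price: number;
  score: number;
}

interface SearchResponse {
  results: SearchResult[];
  answer: string;
  sessionId: string;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isSearchResult(v: unknown): v is SearchResult {
  return (
    isRecord(v) &&
    typeof v["id"] === "string" &&
    typeof v["name"] === "string" &&
    typeof v["price"] === "number" &&
    typeof v["score"] === "number"
  );
}

// Runtime check that a parsed JSON body matches the documented response shape.
function isSearchResponse(v: unknown): v is SearchResponse {
  if (!isRecord(v)) return false;
  const results = v["results"];
  return (
    Array.isArray(results) &&
    results.every(isSearchResult) &&
    typeof v["answer"] === "string" &&
    typeof v["sessionId"] === "string"
  );
}
```

The guard can be applied to the parsed body of a `POST /api/search` call before the results are rendered, so a malformed response is caught at the boundary.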
 
### `POST /api/index`
 
Triggers a re-index of all Square Catalog products into Qdrant.
No request body required.
 
**Response:**
```json
{ "status": "completed" }
```
 
### `GET /api/index`
 
Returns the index status (total vectors, last indexed timestamp). The example response shows only the vector count, `pointsCount`.
 
**Response:**
```json
{ "pointsCount": 150 }
```
 
### `POST /api/session`
 
Create or resume a conversation session.
 
**Request:**
```json
{
  "sessionId": "abc123"
}
```
 
**Response:**
```json
{
  "sessionId": "abc123",
  "created_at": "2026-06-15T10:00:00Z"
}
```
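
The create-or-resume behaviour can be illustrated with a small in-memory store. This is a hypothetical sketch: the project lists `@reaatech/session-continuity` for session handling, and its API is not shown here, so `InMemorySessionStore` and `createOrResume` are invented names for illustration.

```typescript
// Mirrors the fields in the example response above.
interface Session {
  sessionId: string;
  created_at: string;
}

// Hypothetical in-memory store; a real deployment would need persistent storage.
class InMemorySessionStore {
  private sessions = new Map<string, Session>();

  // Returns the existing session if the ID is known, otherwise creates one.
  createOrResume(sessionId: string, now: Date = new Date()): Session {
    const existing = this.sessions.get(sessionId);
    if (existing) {
      return existing; // resume: created_at is left unchanged
    }
    const session: Session = { sessionId, created_at: now.toISOString() };
    this.sessions.set(sessionId, session);
    return session;
  }
}
```

Calling `createOrResume` twice with the same ID returns the same `created_at`, which is what distinguishes resuming a session from creating a new one.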
 
### `GET /api/session`
 
Retrieves conversation context (messages) for a given session.
Query param: `sessionId` (required). Returns `{ messages: Message[] }`.
 
**Response:**
```json
{
  "messages": [
    { "role": "user", "content": "red sneakers", "timestamp": "2026-06-15T10:00:00Z" },
    { "role": "assistant", "content": "We have Red Running Sneakers...", "timestamp": "2026-06-15T10:00:01Z" }
  ]
}
```
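
Because `sessionId` travels as a query parameter here, a client needs to URL-encode it. The helper below is a minimal sketch; `buildSessionUrl` is a hypothetical name, and the `Message` type is inferred from the example response (the set of roles is an assumption).

```typescript
// Inferred from the example response; other roles may exist.
interface Message {
  role: "user" | "assistant";
  content: string;
  timestamp: string;
}

// Builds the GET /api/session URL with an encoded sessionId query parameter.
function buildSessionUrl(baseUrl: string, sessionId: string): string {
  if (sessionId.trim() === "") {
    throw new Error("sessionId is required");
  }
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/api/session?sessionId=${encodeURIComponent(sessionId)}`;
}

// Picks the most recent assistant reply from a message list, if any.
function lastAssistantMessage(messages: Message[]): Message | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (message && message.role === "assistant") {
      return message;
    }
  }
  return undefined;
}
```

The resulting URL can be passed to any HTTP client, and the parsed `messages` array to `lastAssistantMessage` when only the latest answer is needed.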
 
### `DELETE /api/session`
 
Delete a session.
 
**Request:**
```json
{
  "sessionId": "abc123"
}
```
 
**Response:**
```json
{ "ok": true }
```
 
## Tech Stack
 
| Package | Role |
|---|---|
| `next` | App Router, API routes, server-side rendering |
| `square` | Square Catalog API client |
| `fastembed` | Text embeddings for products and search queries |
| `@qdrant/js-client-rest` | Vector-store read/write |
| `llm-cache` | Semantic result cache (deduplication) |
| `llm-router-engine` | Cost-aware model selection |
| `@google/genai 2.8.0` | Gemini LLM inference |
| `@reaatech/llm-router-core 1.0.0` | Shared types and Zod schemas for model routing |
| `@reaatech/llm-cache-adapters-qdrant 0.1.0` | Qdrant vector adapter for semantic cache |
| `@reaatech/session-continuity 0.1.0` | Session lifecycle manager for multi-turn conversations |
| `langfuse 3.38.20` | LLM observability and tracing |
| `zod 4.4.3` | Schema validation and type inference |
| `vitest` | Unit + integration testing |
 
## Testing
 
Run the test suite with vitest:
 
```bash
pnpm test
```
 
This runs all unit and integration tests with coverage reporting (threshold: 90% for lines, branches, functions, and statements).
 
Test files live in `tests/`, mirroring the layout of `src/`. Each service module has its own test file covering happy-path, error-path, and boundary cases. External API calls (Square, Gemini, Qdrant, Langfuse) are mocked via MSW (Mock Service Worker).
 
## License
 
MIT — see [LICENSE](./LICENSE).