# Google Gemini RAG Product Search for Square Online SMB Stores
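Small Square Online merchants lose sales when customers can't find products with the default keyword search. This project adds semantic product search with natural-language answers from Gemini, using retrieval-augmented generation (RAG) over the merchant's Square catalog.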
## Architecture
```
Square Catalog API → fastembed embeddings → Qdrant vector store
↓
semantic cache (llm-cache)
↓
cost-aware model routing (llm-router-engine)
↓
Gemini
↓
natural-language answer
```
Products are fetched from the Square Catalog API, embedded with `fastembed`, and stored in a Qdrant vector collection. Each incoming search query is embedded the same way and matched against the vector index. The results pass through `llm-cache` for semantic deduplication, then `llm-router-engine` selects a Gemini model based on query complexity and cost constraints, and Gemini produces the natural-language answer.
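The flow above can be sketched with in-memory stand-ins. This is an illustration only: every name below (`cosine`, `topK`, `cacheLookup`, `pickTier`) is hypothetical and does not come from `fastembed`, Qdrant, `llm-cache`, or `llm-router-engine`, and using word count as a proxy for query complexity is an assumption made for the example.

```typescript
// Hypothetical in-memory sketch of: vector search -> semantic cache -> model routing.
type Vec = number[];

interface IndexedProduct {
  id: string;
  name: string;
  vector: Vec;
}

interface CacheEntry {
  vector: Vec;
  answer: string;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: Vec, b: Vec): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Vector-search stand-in: score every product and keep the k best matches.
function topK(query: Vec, index: IndexedProduct[], k: number) {
  return index
    .map((p) => ({ id: p.id, name: p.name, score: cosine(query, p.vector) }))
    .sort((l, r) => r.score - l.score)
    .slice(0, k);
}

// Semantic-cache stand-in: reuse an answer when an earlier query is close enough.
function cacheLookup(cache: CacheEntry[], query: Vec, threshold: number): string | undefined {
  for (const entry of cache) {
    if (cosine(query, entry.vector) >= threshold) return entry.answer;
  }
  return undefined;
}

// Routing stand-in: short queries go to a cheaper tier, longer ones to a larger tier.
function pickTier(queryText: string, maxWordsForSmall: number): "small" | "large" {
  const words = queryText.trim().split(/\s+/).filter((w) => w.length > 0);
  return words.length <= maxWordsForSmall ? "small" : "large";
}
```

In this sketch a cache hit would skip the model call entirely, and `pickTier` would only run on a miss; the real libraries may order or combine these steps differently.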
## Setup
1. **Qdrant** — `docker run -p 6333:6333 qdrant/qdrant`
2. **Square** — create a Square app at [developer.squareup.com](https://developer.squareup.com), generate an OAuth token, and grant `ITEMS_READ` scope
3. **GCP Vertex AI** — enable the Vertex AI API and create a service account with the `aiplatform.user` role; download the JSON key
4. **Environment variables** — copy `.env.example` to `.env` and set:
   - `SQUARE_ACCESS_TOKEN`
   - `SQUARE_LOCATION_ID`
   - `QDRANT_URL` (default `http://localhost:6333`)
   - `GOOGLE_CLOUD_PROJECT` (your GCP project ID)
   - `GOOGLE_CLOUD_LOCATION` (Vertex AI location, default `us-central1`)
   - `GOOGLE_GENAI_USE_ENTERPRISE` (enable Vertex AI, default `true`)
```bash
pnpm install
pnpm dev
```
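The environment variables from step 4 could be validated once at startup. The following is a minimal sketch, not the project's actual loader: `AppConfig` and `loadConfig` are hypothetical names, treating `GOOGLE_CLOUD_PROJECT` as required is an assumption (no default is listed above), and so is parsing `GOOGLE_GENAI_USE_ENTERPRISE` as the literal string `true`.

```typescript
// Hypothetical config shape; field names are illustrative, not from the project.
interface AppConfig {
  squareAccessToken: string;
  squareLocationId: string;
  qdrantUrl: string;
  googleCloudProject: string;
  googleCloudLocation: string;
  useEnterprise: boolean;
}

// Reads the documented variables from an env-like object and applies the documented defaults.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  // Fail fast with a clear message when a variable without a default is missing.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    squareAccessToken: required("SQUARE_ACCESS_TOKEN"),
    squareLocationId: required("SQUARE_LOCATION_ID"),
    qdrantUrl: env["QDRANT_URL"] ?? "http://localhost:6333",
    // Assumption: no default is documented, so this is treated as required.
    googleCloudProject: required("GOOGLE_CLOUD_PROJECT"),
    googleCloudLocation: env["GOOGLE_CLOUD_LOCATION"] ?? "us-central1",
    // Assumption: only the string "true" (any casing) enables Vertex AI.
    useEnterprise: (env["GOOGLE_GENAI_USE_ENTERPRISE"] ?? "true").toLowerCase() === "true",
  };
}
```

Calling `loadConfig(process.env)` when the server boots would surface a missing token immediately rather than on the first search request.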
## API Reference
### `POST /api/search`
Search for products using natural language.
**Request:**
```json
{
"query": "red sneakers size 10",
"sessionId": "abc123"
}
```
**Response:**
```json
{
"results": [{ "id": "prod_1", "name": "Red Running Sneakers", "price": 89.99, "score": 0.92 }],
"answer": "We have Red Running Sneakers in size 10 for $89.99.",
"sessionId": "abc123"
}
```
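A client might type the search payloads and check the response shape before using it. This is a sketch based only on the example request and response above: the interface names, `buildSearchRequest`, and `isSearchResponse` are hypothetical, and the endpoint may return fields not shown here.

```typescript
// Shapes inferred from the example payloads above; not an official schema.
interface SearchResult {
  id: string;
  name: string;
  price: number;
  score: number;
}

interface SearchResponse {
  results: SearchResult[];
  answer: string;
  sessionId: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Runtime check that an unknown JSON value matches the example response shape.
function isSearchResponse(value: unknown): value is SearchResponse {
  if (!isRecord(value)) return false;
  if (typeof value["answer"] !== "string") return false;
  if (typeof value["sessionId"] !== "string") return false;
  const results = value["results"];
  if (!Array.isArray(results)) return false;
  return results.every(
    (r: unknown) =>
      isRecord(r) &&
      typeof r["id"] === "string" &&
      typeof r["name"] === "string" &&
      typeof r["price"] === "number" &&
      typeof r["score"] === "number",
  );
}

// Builds the options for a POST to /api/search with a JSON body.
function buildSearchRequest(query: string, sessionId: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, sessionId }),
  };
}
```

The request options can be passed to `fetch("/api/search", buildSearchRequest(query, sessionId))`, and the parsed JSON run through `isSearchResponse` before rendering results.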
### `POST /api/index`
Trigger a re-index of all Square Catalog products into Qdrant.
No request body required.
**Response:**
```json
{ "status": "completed" }
```
### `GET /api/index`
Return index status (total vectors, last indexed timestamp).
**Response:**
```json
{ "pointsCount": 150 }
```
### `POST /api/session`
Create or resume a conversation session.
**Request:**
```json
{
"sessionId": "abc123"
}
```
**Response:**
```json
{
"sessionId": "abc123",
"created_at": "2026-06-15T10:00:00Z"
}
```
### `GET /api/session`
Retrieve the conversation context (messages) for a given session.
Query parameter: `sessionId` (required). Returns `{ messages: Message[] }`.
**Response:**
```json
{
"messages": [
{ "role": "user", "content": "red sneakers", "timestamp": "2026-06-15T10:00:00Z" },
{ "role": "assistant", "content": "We have Red Running Sneakers...", "timestamp": "2026-06-15T10:00:01Z" }
]
}
```
### `DELETE /api/session`
Delete a session.
**Request:**
```json
{
"sessionId": "abc123"
}
```
**Response:**
```json
{ "ok": true }
```
## Tech Stack
| Package | Role |
|---|---|
| `next` | App Router, API routes, server-side rendering |
| `square` | Square Catalog API client |
| `fastembed` | Product-query text embeddings |
| `@qdrant/js-client-rest` | Vector-store read/write |
| `llm-cache` | Semantic result cache (deduplication) |
| `llm-router-engine` | Cost-aware model selection |
| `@google/genai 2.8.0` | Gemini LLM inference |
| `@reaatech/llm-router-core 1.0.0` | Shared types and Zod schemas for model routing |
| `@reaatech/llm-cache-adapters-qdrant 0.1.0` | Qdrant vector adapter for semantic cache |
| `@reaatech/session-continuity 0.1.0` | Session lifecycle manager for multi-turn conversations |
| `langfuse 3.38.20` | LLM observability and tracing |
| `zod 4.4.3` | Schema validation and type inference |
| `vitest` | Unit + integration testing |
## Testing
Run the test suite with vitest:
```bash
pnpm test
```
This runs all unit and integration tests with coverage reporting (threshold: 90% on lines, branches, functions, statements).
Test files live in `tests/`, mirroring the layout of `src/`. Each service module has its own test file covering happy-path, error-path, and boundary cases. External API calls (Square, Gemini, Qdrant, Langfuse) are mocked with MSW (Mock Service Worker).
## License
MIT — see [LICENSE](./LICENSE).