# Anthropic RAG Product Desc Generator for BigCommerce SMB Sellers
> Generate SEO-optimized product descriptions at scale using your existing BigCommerce catalog and Anthropic's Claude.
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
## Architecture
This pipeline processes each product through 9 steps:
1. **Fetch** — Retrieve product data from BigCommerce catalog API
2. **Embed** — Generate a vector embedding of the product's current description via Voyage AI
3. **Retrieve** — Find semantically similar products from the pgvector (Neon Postgres) database
4. **Plan** — Build a context window using `@reaatech/context-window-planner`, prioritizing system prompts, product info, RAG chunks, and generation buffer within a 128K token budget
5. **Cache Check** — Look up the constructed prompt in the LLM cache (exact + semantic matching)
6. **Generate** — On a cache miss, call Claude (claude-sonnet-4-6) with the packed prompt
7. **Repair** — Pass the raw Claude output through `@reaatech/structured-repair-core` to extract valid JSON, coerce types, and remove hallucinated fields
8. **Store** — Record the turn in the session store and update the BigCommerce product description
9. **Index** — Upsert the new embedding into pgvector for future similarity searches
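
Step 4 lends itself to a small illustration. The sketch below shows one way a priority-greedy packer could work: sort candidate items by priority, then admit each one that still fits in the remaining budget. The `ContextItem` shape, the `packContext` function, and every priority and token count are hypothetical values chosen for illustration; this is not the `@reaatech/context-window-planner` API, and skipping an oversized item rather than stopping at it is a design choice of this sketch.

```typescript
// Hypothetical shapes for illustration only.
interface ContextItem {
  id: string;
  priority: number; // higher values are packed first
  tokens: number; // estimated token cost of the item
}

interface PackResult {
  included: ContextItem[];
  dropped: ContextItem[];
  usedTokens: number;
}

// Priority-greedy packing: visit items from highest to lowest priority
// and include each one only if it still fits within the budget.
function packContext(items: ContextItem[], budget: number): PackResult {
  const sorted = items.slice().sort((a, b) => b.priority - a.priority);
  const included: ContextItem[] = [];
  const dropped: ContextItem[] = [];
  let usedTokens = 0;
  for (const item of sorted) {
    if (usedTokens + item.tokens <= budget) {
      included.push(item);
      usedTokens += item.tokens;
    } else {
      dropped.push(item);
    }
  }
  return { included, dropped, usedTokens };
}

// Example with made-up sizes against a 128K budget.
const packed = packContext(
  [
    { id: "system-prompt", priority: 100, tokens: 1500 },
    { id: "generation-buffer", priority: 90, tokens: 4000 },
    { id: "product-info", priority: 80, tokens: 2000 },
    { id: "rag-chunk-1", priority: 50, tokens: 60000 },
    { id: "rag-chunk-2", priority: 40, tokens: 70000 },
    { id: "rag-chunk-3", priority: 30, tokens: 50000 },
  ],
  128000,
);
```

With these made-up numbers, `rag-chunk-2` is dropped because it would push the total past the budget, while the smaller, lower-priority `rag-chunk-3` still fits, for 117,500 tokens used.
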
## Prerequisites
- **Anthropic API key** — for Claude (claude-sonnet-4-6)
- **Voyage AI API key** — for generating embeddings (voyage-3-large)
- **Neon Postgres database** — with pgvector extension for vector storage and similarity search
- **BigCommerce store** — store hash and API access token (V3 API)
- **Langfuse account** (optional) — for observability and tracing
## Quick Start
```bash
cp .env.example .env
# Fill in your API keys and credentials in .env
pnpm install
pnpm dev
```
Test the endpoint:
```bash
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"productIds": ["123"]}'
```
## API Reference
### POST /api/generate
Generate product descriptions for one or more products.
**Request body:**
```json
{
  "productIds": ["123", "456"],
  "maxResults": 5,
  "style": "seo",
  "tone": "professional"
}
```
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `productIds` | `string[]` | Yes | — | 1–20 product IDs |
| `maxResults` | `number` | No | `productIds.length` | Max products to process (1–10) |
| `style` | `"seo" \| "persuasive" \| "descriptive"` | No | `"seo"` | Writing style |
| `tone` | `"professional" \| "friendly" \| "luxury"` | No | `"professional"` | Tone of voice |
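
The rules in the table above can be sketched as a dependency-free validator. The project lists `zod` for request validation, so the real implementation likely looks different; `parseGenerateRequest`, `ParseResult`, and the error strings here are hypothetical. The table does not say how the `productIds.length` default interacts with the 1–10 range when more than 10 IDs are sent, so this sketch assumes the range (and an integer check, also an assumption) applies only to explicitly supplied values.

```typescript
type Style = "seo" | "persuasive" | "descriptive";
type Tone = "professional" | "friendly" | "luxury";

interface GenerateRequest {
  productIds: string[];
  maxResults: number;
  style: Style;
  tone: Tone;
}

// Either a fully defaulted request or a human-readable validation error.
interface ParseResult {
  value?: GenerateRequest;
  error?: string;
}

const STYLES: string[] = ["seo", "persuasive", "descriptive"];
const TONES: string[] = ["professional", "friendly", "luxury"];

function parseGenerateRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { error: "body must be a JSON object" };
  }
  const raw = body as Record<string, unknown>;

  // productIds: required, 1-20 string IDs.
  const ids = raw["productIds"];
  if (
    !Array.isArray(ids) ||
    ids.length < 1 ||
    ids.length > 20 ||
    !ids.every((id) => typeof id === "string")
  ) {
    return { error: "productIds must be an array of 1-20 strings" };
  }
  const productIds = ids as string[];

  // maxResults: optional, defaults to productIds.length.
  // Assumption: the range and integer checks apply only to explicit values.
  let maxResults = productIds.length;
  const rawMax = raw["maxResults"];
  if (rawMax !== undefined) {
    if (
      typeof rawMax !== "number" ||
      Math.floor(rawMax) !== rawMax ||
      rawMax < 1 ||
      rawMax > 10
    ) {
      return { error: "maxResults must be an integer from 1 to 10" };
    }
    maxResults = rawMax;
  }

  // style and tone: optional enums with the documented defaults.
  const style = raw["style"] === undefined ? "seo" : raw["style"];
  if (typeof style !== "string" || STYLES.indexOf(style) === -1) {
    return { error: "style must be one of: " + STYLES.join(", ") };
  }
  const tone = raw["tone"] === undefined ? "professional" : raw["tone"];
  if (typeof tone !== "string" || TONES.indexOf(tone) === -1) {
    return { error: "tone must be one of: " + TONES.join(", ") };
  }

  return {
    value: { productIds, maxResults, style: style as Style, tone: tone as Tone },
  };
}
```

Calling `parseGenerateRequest({ productIds: ["123"] })` fills in `maxResults: 1`, `style: "seo"`, and `tone: "professional"`.
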
**Response:** Array of `GenerateDescriptionResponse` objects, one per product:
```json
[
  {
    "productId": "123",
    "status": "success",
    "generatedDescription": "...",
    "seoTitle": "...",
    "metaDescription": "..."
  },
  {
    "productId": "456",
    "status": "error",
    "errorMessage": "Failed to fetch product from BigCommerce: 404"
  }
]
```
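
Because each entry carries its own `status`, a client may want to separate successes from failures before acting on them. The following is a minimal sketch of that, with the type modelled as a discriminated union inferred from the example above; the exact field optionality is an assumption, and `partitionResults` is a hypothetical helper, not part of the project.

```typescript
// Shapes inferred from the example response; optionality is an assumption.
interface SuccessResult {
  productId: string;
  status: "success";
  generatedDescription: string;
  seoTitle: string;
  metaDescription: string;
}

interface ErrorResult {
  productId: string;
  status: "error";
  errorMessage: string;
}

type GenerateDescriptionResponse = SuccessResult | ErrorResult;

// Split a mixed response array by its status discriminant.
function partitionResults(results: GenerateDescriptionResponse[]): {
  succeeded: SuccessResult[];
  failed: ErrorResult[];
} {
  const succeeded: SuccessResult[] = [];
  const failed: ErrorResult[] = [];
  for (const result of results) {
    if (result.status === "success") {
      succeeded.push(result);
    } else {
      failed.push(result);
    }
  }
  return { succeeded, failed };
}

const sample: GenerateDescriptionResponse[] = [
  {
    productId: "123",
    status: "success",
    generatedDescription: "...",
    seoTitle: "...",
    metaDescription: "...",
  },
  {
    productId: "456",
    status: "error",
    errorMessage: "Failed to fetch product from BigCommerce: 404",
  },
];

const { succeeded, failed } = partitionResults(sample);
// IDs that a caller might choose to retry.
const retryIds = failed.map((f) => f.productId);
```
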
### GET /api/generate?sessionId=abc
Retrieve the message history for a generation session.
**Response:** Array of `Message` objects with `role` and `content` fields.
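
A client could build the request URL and check the response shape along these lines. This is a sketch: `historyUrl` and `isMessageArray` are hypothetical helpers, and `role` is typed as a plain string because the set of allowed role values is not documented here.

```typescript
interface Message {
  role: string;
  content: string;
}

// Build the history URL, escaping the session ID for use in a query string.
function historyUrl(baseUrl: string, sessionId: string): string {
  return baseUrl + "/api/generate?sessionId=" + encodeURIComponent(sessionId);
}

// Runtime check that parsed JSON is an array of { role, content } objects.
function isMessageArray(data: unknown): data is Message[] {
  return (
    Array.isArray(data) &&
    data.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as Record<string, unknown>)["role"] === "string" &&
        typeof (item as Record<string, unknown>)["content"] === "string",
    )
  );
}

const url = historyUrl("http://localhost:3000", "abc");
```
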
### PUT /api/generate
Update a specific product description after user review. Records the edit as a turn in the session history.
**Request body:**
```json
{
  "sessionId": "abc-123",
  "productId": "456",
  "description": "Updated product description text"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `sessionId` | `string` | Yes | The generation session ID |
| `productId` | `string` | Yes | The product ID being updated |
| `description` | `string` | Yes | The user-edited description |
**Response 200:**
```json
{
  "status": "ok",
  "productId": "456"
}
```
**Error responses:**
```json
{
  "error": "Missing required fields: sessionId, productId, description"
}
```
Returns HTTP 400 when required fields are missing and 500 for internal errors.
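
The required-field check could look roughly like the sketch below. `handleUpdate` and `HandlerResult` are hypothetical names, and two behaviours are assumptions: empty strings are treated as missing, and the error message lists only the fields that are actually absent (the example above would then correspond to an empty body).

```typescript
const REQUIRED_FIELDS: string[] = ["sessionId", "productId", "description"];

interface HandlerResult {
  status: number;
  body: { [key: string]: string };
}

// Validate a PUT body and produce the status code and JSON body to send back.
function handleUpdate(body: unknown): HandlerResult {
  const raw: Record<string, unknown> =
    typeof body === "object" && body !== null
      ? (body as Record<string, unknown>)
      : {};

  // Assumption: a field counts as missing unless it is a non-empty string.
  const missing = REQUIRED_FIELDS.filter((field) => {
    const value = raw[field];
    return typeof value !== "string" || value.length === 0;
  });
  if (missing.length > 0) {
    return {
      status: 400,
      body: { error: "Missing required fields: " + missing.join(", ") },
    };
  }

  // A real handler would record the edit in the session store here
  // and map any failure to a 500 response.
  return {
    status: 200,
    body: { status: "ok", productId: raw["productId"] as string },
  };
}

const rejected = handleUpdate({});
const accepted = handleUpdate({
  sessionId: "abc-123",
  productId: "456",
  description: "Updated product description text",
});
```
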
## Packages
### REAA Packages (4)
| Package | Role |
|---------|------|
| `@reaatech/context-window-planner` | Pack system prompt, product info, RAG chunks, and generation buffer into a 128K context window using a priority-greedy strategy |
| `@reaatech/structured-repair-core` | Repair malformed LLM JSON output through six graduated strategies (strip-fences, extract-json, fix-json-syntax, coerce-types, fuzzy-match-keys, remove-extra-fields) |
| `@reaatech/session-continuity` | Manage generation sessions with token budget enforcement, sliding-window compression, and an in-memory storage adapter |
| `@reaatech/llm-cache` | Dual-mode cache (exact SHA-256 + semantic cosine similarity) with use-case segmentation and model-aware fingerprinting |
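
To make the repair strategies more concrete, here is a rough sketch of four of the six (strip-fences, extract-json, coerce-types, remove-extra-fields) written from scratch. It does not use or mirror the `@reaatech/structured-repair-core` API; the function names are hypothetical, and the assumption that the model's JSON uses the keys `generatedDescription`, `seoTitle`, and `metaDescription` is taken from the response example, not from the package.

```typescript
interface DescriptionFields {
  generatedDescription: string;
  seoTitle: string;
  metaDescription: string;
}

const ALLOWED_KEYS: (keyof DescriptionFields)[] = [
  "generatedDescription",
  "seoTitle",
  "metaDescription",
];

// Three backticks, written as escapes so the marker never appears literally.
const FENCE = "\u0060\u0060\u0060";

// strip-fences: keep only the body of the first fenced block, if any.
function stripFences(text: string): string {
  const match = text.match(new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE));
  return match ? match[1] : text;
}

// extract-json: take the span from the first "{" to the last "}".
function extractJson(text: string): string | null {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  return start !== -1 && end > start ? text.slice(start, end + 1) : null;
}

function repairDescription(raw: string): DescriptionFields | null {
  const candidate = extractJson(stripFences(raw));
  if (candidate === null) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(candidate);
  } catch (_err) {
    return null; // fix-json-syntax is out of scope for this sketch
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }
  const source = parsed as Record<string, unknown>;

  // remove-extra-fields: copy only allowed keys, so anything else is dropped.
  const result: DescriptionFields = {
    generatedDescription: "",
    seoTitle: "",
    metaDescription: "",
  };
  for (const key of ALLOWED_KEYS) {
    const value = source[key];
    if (typeof value === "string") {
      result[key] = value;
    } else if (typeof value === "number" || typeof value === "boolean") {
      result[key] = String(value); // coerce-types: primitives become strings
    } else {
      return null; // missing or non-primitive required field
    }
  }
  return result;
}

// Example: fenced output with a numeric title and an unexpected extra key.
const rawOutput = [
  "Here is the JSON you asked for:",
  FENCE + "json",
  '{"generatedDescription": "Soft cotton tee.", "seoTitle": 42, "metaDescription": "A tee.", "confidence": 0.9}',
  FENCE,
].join("\n");
const repaired = repairDescription(rawOutput);
```

For this input, `repaired` keeps the three expected keys, turns the numeric `seoTitle` into the string `"42"`, and drops `confidence`.
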
### Third-Party Packages (6)
| Package | Role |
|---------|------|
| `@anthropic-ai/sdk` | Claude API client for generating product descriptions |
| `voyageai` | Embedding generation via Voyage AI (voyage-3-large, 1024 dimensions) |
| `@neondatabase/serverless` | PostgreSQL connection pool for Neon with pgvector support |
| `pgvector` | TypeScript helpers for the pgvector PostgreSQL extension (`toSql`, `registerTypes`) |
| `zod` | Schema declaration and validation for request bodies and structured output |
| `langfuse` | Optional observability — trace generation pipelines |
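
The similarity search that pgvector performs in the database can be illustrated in memory. The sketch below ranks stored embeddings by cosine similarity to a query vector and keeps the top `k`. It assumes cosine similarity is the metric used for retrieval (pgvector's `<=>` operator returns cosine distance, which is 1 minus this value); the `IndexedProduct` shape and function names are hypothetical, and the 3-dimensional vectors stand in for the 1024-dimensional embeddings.

```typescript
interface IndexedProduct {
  productId: string;
  embedding: number[];
}

interface ScoredProduct {
  productId: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every product against the query and keep the k most similar.
function topKSimilar(
  query: number[],
  products: IndexedProduct[],
  k: number,
): ScoredProduct[] {
  return products
    .map((p) => ({
      productId: p.productId,
      score: cosineSimilarity(query, p.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy example with 3-dimensional vectors.
const neighbours = topKSimilar(
  [1, 0, 0],
  [
    { productId: "A", embedding: [1, 0, 0] },
    { productId: "B", embedding: [0, 1, 0] },
    { productId: "C", embedding: [1, 1, 0] },
  ],
  2,
);
```

Here product `A` points in the same direction as the query and ranks first, followed by `C`; the orthogonal `B` is left out. A database index does this without scanning every row, which is the reason to push the search into pgvector.
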
## License
MIT — see [LICENSE](./LICENSE).