# xAI Grok Invoice Extraction for QuickBooks SMBs
 
> Extract structured invoice data from PDFs using xAI Grok, with automatic repair of malformed LLM output and cost monitoring.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
 
## Architecture
 
```
POST /api/invoices (multipart PDF + Idempotency-Key)
  → Idempotency check (@reaatech/idempotency-middleware)
  → PDF parsing (unstructured-client)
  → Budget pre-check (@reaatech/agent-budget-engine)
  → LLM extraction (xAI Grok via openai SDK + @instructor-ai/instructor)
  → JSON repair (@reaatech/structured-repair-core + jsonrepair)
  → Zod validation (zod 4.x)
  → Cost recording (@reaatech/llm-cost-telemetry + @reaatech/llm-cost-telemetry-calculator)
  → QuickBooks push (node-quickbooks)
  → Response cached under the idempotency key
```
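
The first and last steps of the pipeline bracket everything else: a request that repeats an `Idempotency-Key` should get the stored result instead of triggering a second extraction and a second QuickBooks push. The following is a minimal sketch of that idea using an in-memory `Map`. It does not use `@reaatech/idempotency-middleware`, whose API is not shown in this README; `IdempotencyCache` and `run` are hypothetical names chosen for illustration.

```typescript
// Hypothetical in-memory idempotency store (illustration only, not the real middleware).
type Handler<T> = () => Promise<T>;

class IdempotencyCache<T> {
  // One promise per key, so concurrent retries share a single in-flight run.
  private readonly entries = new Map<string, Promise<T>>();

  run(key: string, handler: Handler<T>): Promise<T> {
    const existing = this.entries.get(key);
    if (existing !== undefined) {
      return existing; // repeat request: reuse the stored (or in-flight) result
    }
    const pending = handler().catch((err: unknown) => {
      // Forget failed attempts so the client may retry with the same key.
      this.entries.delete(key);
      throw err;
    });
    this.entries.set(key, pending);
    return pending;
  }
}
```

Storing the promise rather than the resolved value means two requests that arrive at the same time with the same key still run the handler only once. A production store would also need expiry and persistence, which this sketch omits.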
 
## Packages
 
### REAA (load-bearing)
| Package | Role |
|---------|------|
| `@reaatech/structured-repair-core` | Repairs malformed LLM JSON output using six graduated strategies |
| `@reaatech/llm-cost-telemetry` | Core types, spans, and utilities for cost tracking |
| `@reaatech/llm-cost-telemetry-calculator` | Provider-agnostic LLM cost calculation |
| `@reaatech/agent-budget-engine` | Budget enforcement with pre-flight checks and state machine |
| `@reaatech/agent-budget-spend-tracker` | Circular-buffer in-memory spend tracking |
| `@reaatech/idempotency-middleware` | Idempotency key-based request deduplication |
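
To make the budget rows above concrete, here is a rough sketch of a circular-buffer spend tracker feeding a pre-flight check. It is not the API of `@reaatech/agent-budget-engine` or `@reaatech/agent-budget-spend-tracker`; `SpendRingBuffer`, `preflightCheck`, and `BudgetLimits` are hypothetical names, and treating the two limits as a tenant-wide ceiling plus a per-invoice cap is an assumption based on the environment variable names in the Setup section.

```typescript
// Fixed-capacity ring buffer of recent spend entries in USD (hypothetical sketch).
class SpendRingBuffer {
  private readonly slots: number[];
  private next = 0;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.slots = new Array<number>(capacity).fill(0);
  }

  record(costUsd: number): void {
    this.slots[this.next] = costUsd; // overwrites the oldest entry once full
    this.next = (this.next + 1) % this.slots.length;
  }

  total(): number {
    let sum = 0;
    for (const cost of this.slots) sum += cost; // unused slots hold 0
    return sum;
  }
}

interface BudgetLimits {
  tenantBudgetUsd: number;   // assumed meaning of DEFAULT_TENANT_BUDGET_USD
  maxInvoiceCostUsd: number; // assumed meaning of MAX_INVOICE_COST_USD
}

type BudgetDecision = { allowed: true } | { allowed: false; reason: string };

// Decide before calling the LLM whether the estimated cost fits both limits.
function preflightCheck(
  recentSpendUsd: number,
  estimatedCostUsd: number,
  limits: BudgetLimits,
): BudgetDecision {
  if (estimatedCostUsd > limits.maxInvoiceCostUsd) {
    return { allowed: false, reason: "estimate exceeds per-invoice cap" };
  }
  if (recentSpendUsd + estimatedCostUsd > limits.tenantBudgetUsd) {
    return { allowed: false, reason: "tenant budget would be exceeded" };
  }
  return { allowed: true };
}
```

A ring buffer bounds memory at the cost of forgetting older entries, so `total()` reflects only the most recent `capacity` records rather than lifetime spend.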
 
### Third-party
| Package | Use |
|---------|-----|
| `openai` | OpenAI SDK configured for xAI Grok (Chat Completions API) |
| `@instructor-ai/instructor` | Structured extraction via tool calling |
| `unstructured-client` | PDF → markdown via Unstructured API |
| `sharp` | PDF page metadata and image conversion |
| `node-quickbooks` | QuickBooks Online API (create invoices, customers) |
| `zod` | Runtime schema validation |
| `jsonrepair` | JSON syntax repair (pre-processing step) |
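
The repair step exists because LLM output often arrives wrapped in a Markdown fence, padded with commentary, or carrying trailing commas. The sketch below shows the general "try the cheapest fix first" shape with four naive transforms. It is an illustration only: these are not the six strategies of `@reaatech/structured-repair-core`, nor the algorithm used by `jsonrepair`, and `repairAndParse` is a hypothetical name.

```typescript
type ParseResult = { ok: true; value: unknown } | { ok: false };

function tryParse(text: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false };
  }
}

const FENCE = "`".repeat(3); // Markdown code-fence marker

// Remove a leading fence line and a trailing fence, if present.
function stripFence(s: string): string {
  let out = s.trim();
  if (out.startsWith(FENCE)) out = out.slice(out.indexOf("\n") + 1);
  if (out.endsWith(FENCE)) out = out.slice(0, -FENCE.length);
  return out.trim();
}

// Keep only the outermost {...} span, dropping chatty prefixes and suffixes.
function extractObject(s: string): string {
  const start = s.indexOf("{");
  const end = s.lastIndexOf("}");
  return start >= 0 && end > start ? s.slice(start, end + 1) : s;
}

// Naive: also rewrites a ",}" or ",]" sequence that appears inside a string value.
function dropTrailingCommas(s: string): string {
  return s.replace(/,\s*([}\]])/g, "$1");
}

// Transforms are cumulative and ordered from least to most invasive.
const transforms: Array<(s: string) => string> = [
  (s) => s.trim(),
  stripFence,
  extractObject,
  dropTrailingCommas,
];

function repairAndParse(raw: string): ParseResult {
  let candidate = raw;
  for (const transform of transforms) {
    candidate = transform(candidate);
    const result = tryParse(candidate);
    if (result.ok) return result; // stop at the first transform that yields valid JSON
  }
  return { ok: false };
}
```

A successful parse only guarantees valid JSON syntax. The result still has to pass schema validation (Zod in this project) before it is trusted.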
 
## Setup
 
```bash
pnpm install
```
 
Copy `.env.example` to `.env` and fill in the required credentials:
 
```env
XAI_API_KEY=<your-xai-api-key>
UNSTRUCTURED_API_KEY=<your-unstructured-api-key>
QUICKBOOKS_CONSUMER_KEY=<your-quickbooks-consumer-key>
QUICKBOOKS_CONSUMER_SECRET=<your-quickbooks-consumer-secret>
QUICKBOOKS_OAUTH_TOKEN=<your-oauth-token>
QUICKBOOKS_REALM_ID=<your-realm-id>
QUICKBOOKS_REFRESH_TOKEN=<your-refresh-token>
QUICKBOOKS_SANDBOX=true
DEFAULT_TENANT_BUDGET_USD=10.00
MAX_INVOICE_COST_USD=0.50
```
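
Failing fast on a missing or malformed variable is usually easier to debug than a failed API call later in the pipeline. As a sketch of how these variables might be validated at startup, assuming all of them are required and that the two `*_USD` values are non-negative decimals (`loadConfig` and `AppConfig` are hypothetical names, not part of this repository):

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  xaiApiKey: string;
  unstructuredApiKey: string;
  quickbooks: {
    consumerKey: string;
    consumerSecret: string;
    oauthToken: string;
    realmId: string;
    refreshToken: string;
    sandbox: boolean;
  };
  defaultTenantBudgetUsd: number;
  maxInvoiceCostUsd: number;
}

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function requireUsd(env: Env, name: string): number {
  const amount = Number(requireVar(env, name));
  if (!Number.isFinite(amount) || amount < 0) {
    throw new Error(`${name} must be a non-negative number`);
  }
  return amount;
}

function loadConfig(env: Env): AppConfig {
  return {
    xaiApiKey: requireVar(env, "XAI_API_KEY"),
    unstructuredApiKey: requireVar(env, "UNSTRUCTURED_API_KEY"),
    quickbooks: {
      consumerKey: requireVar(env, "QUICKBOOKS_CONSUMER_KEY"),
      consumerSecret: requireVar(env, "QUICKBOOKS_CONSUMER_SECRET"),
      oauthToken: requireVar(env, "QUICKBOOKS_OAUTH_TOKEN"),
      realmId: requireVar(env, "QUICKBOOKS_REALM_ID"),
      refreshToken: requireVar(env, "QUICKBOOKS_REFRESH_TOKEN"),
      // Assumption: only the literal string "true" enables sandbox mode.
      sandbox: requireVar(env, "QUICKBOOKS_SANDBOX") === "true",
    },
    defaultTenantBudgetUsd: requireUsd(env, "DEFAULT_TENANT_BUDGET_USD"),
    maxInvoiceCostUsd: requireUsd(env, "MAX_INVOICE_COST_USD"),
  };
}
```

In a Node.js process this would be called once at startup as `loadConfig(process.env)`.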
 
## Usage
 
```bash
pnpm dev    # start Next.js dev server
pnpm test   # run vitest with coverage
```
 
### API
 
```bash
curl -X POST http://localhost:3000/api/invoices \
  -H "Idempotency-Key: unique-req-001" \
  -F "file=@invoice.pdf"
```
 
Example response (the `invoice` object is abbreviated):
```json
{
  "success": true,
  "invoice": { "vendorName": "Acme Corp", "customerName": "Client Co", ... },
  "cost": { "inputTokens": 120, "outputTokens": 80, "costUsd": 0.0023 },
  "processingTimeMs": 3420
}
```
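
A client consuming this endpoint may want to check the response shape before using it. The type guard below is a minimal sketch based only on the fields visible in the example above. The `invoice` object is left as an open record because its full field list is not shown here, and `InvoiceResponse` and `isInvoiceResponse` are hypothetical names.

```typescript
interface InvoiceCost {
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

interface InvoiceResponse {
  success: boolean;
  invoice: Record<string, unknown>; // full invoice schema is not shown in this README
  cost: InvoiceCost;
  processingTimeMs: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Narrow an unknown JSON payload to InvoiceResponse by checking each visible field.
function isInvoiceResponse(value: unknown): value is InvoiceResponse {
  if (!isRecord(value)) return false;
  const cost = value["cost"];
  return (
    typeof value["success"] === "boolean" &&
    isRecord(value["invoice"]) &&
    isRecord(cost) &&
    typeof cost["inputTokens"] === "number" &&
    typeof cost["outputTokens"] === "number" &&
    typeof cost["costUsd"] === "number" &&
    typeof value["processingTimeMs"] === "number"
  );
}
```

The server already validates extracted invoices with Zod, so a guard like this is only a lightweight client-side check on the response envelope.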
 
## Development
 
```bash
pnpm typecheck   # tsc --noEmit
pnpm lint        # eslint flat config
pnpm test        # vitest run --coverage
```
 
## License
 
MIT — see [LICENSE](./LICENSE).