# Google Gemini Bank Statement Extraction for SMB Accounting
 
Upload scanned bank statements and receipts, automatically extract line-item transactions with Gemini, and output categorized accounting entries ready for QuickBooks or Xero.
 
## Architecture
 
1. **Upload** — POST /api/extract accepts multipart form data (PDF or image)
2. **Extract** — unpdf renders PDF pages as text; sharp pre-processes images
3. **Pipeline** — @reaatech/media-pipeline-mcp-core orchestrates extraction steps
4. **Gemini** — @google/genai calls Gemini 2.5 Flash with structured extraction prompts
5. **Repair** — @reaatech/structured-repair-core fixes JSON formatting errors
6. **Cache** — @reaatech/llm-cache avoids reprocessing identical documents
7. **Telemetry** — @reaatech/llm-cost-telemetry tracks token usage and cost per tenant
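
As a rough illustration of the exact-match caching in step 6, a cache key can be derived by hashing the uploaded bytes. This is a minimal sketch, not the `@reaatech/llm-cache` API: the `documentCacheKey` helper is hypothetical, and scoping the key by tenant is an assumption rather than something this README confirms.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not part of @reaatech/llm-cache): derives an
// exact-match cache key from the raw upload. Byte-identical files for the
// same tenant map to the same key; any changed byte produces a new key.
export function documentCacheKey(tenantId: string, fileBytes: Uint8Array): string {
  return createHash("sha256")
    .update(tenantId)
    .update("\0") // separator so the tenant ID cannot blend into the file bytes
    .update(fileBytes)
    .digest("hex");
}
```

Because the key depends only on the bytes, a re-scan of the same paper statement would miss the cache even though the content is the same.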
 
## API
 
`POST /api/extract` — Upload a bank statement (PDF or image) and receive extracted transactions.
 
```bash
curl -X POST http://localhost:3000/api/extract \
  -F "file=@statement.pdf" \
  -F "tenantId=acme-corp"
```
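
The same request can be sent from TypeScript with the built-in `fetch` and `FormData` (Node 18+ or a browser). The sketch below mirrors the curl command; the function names `buildExtractForm` and `extractStatement` are illustrative and do not come from this repository.

```typescript
// Builds the same multipart body as the curl command: a "file" part and a
// "tenantId" field.
export function buildExtractForm(file: Blob, filename: string, tenantId: string): FormData {
  const form = new FormData();
  form.append("file", file, filename);
  form.append("tenantId", tenantId);
  return form;
}

// Posts the form to /api/extract and returns the parsed JSON body.
export async function extractStatement(baseUrl: string, form: FormData): Promise<unknown> {
  // Content-Type is left unset on purpose: fetch adds the multipart boundary.
  const res = await fetch(new URL("/api/extract", baseUrl), {
    method: "POST",
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Extraction failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

With the dev server running, `await extractStatement("http://localhost:3000", form)` would issue the equivalent of the curl call.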
 
Response:
```json
{
  "transactions": [
    {
      "id": "tx-1",
      "date": "2024-01-15",
      "description": "Office supplies",
      "debit": 50.0,
      "credit": null,
      "balance": 1000.0,
      "memo": "Paid via check",
      "category": "office"
    }
  ],
  "totalDebits": 50.0,
  "totalCredits": 0,
  "pageCount": 1,
  "costUsd": 0.00015,
  "cached": false
}
```
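
For consumers of this endpoint, the response can be typed and sanity-checked along the following lines. The field types are inferred from the single sample above, so treat them as assumptions: in particular, `debit` is assumed to be nullable in the same way `credit` is, and `totalDebits`/`totalCredits` are assumed to be plain sums of the line items.

```typescript
// Types inferred from the sample response; nullability is an assumption.
interface Transaction {
  id: string;
  date: string; // ISO 8601 calendar date, e.g. "2024-01-15"
  description: string;
  debit: number | null;
  credit: number | null;
  balance: number;
  memo: string;
  category: string;
}

interface ExtractResponse {
  transactions: Transaction[];
  totalDebits: number;
  totalCredits: number;
  pageCount: number;
  costUsd: number;
  cached: boolean;
}

// Sums amounts in integer cents so floating-point drift cannot cause
// a false mismatch; null amounts count as zero.
function sumCents(values: Array<number | null>): number {
  return values.reduce<number>((acc, v) => acc + Math.round((v ?? 0) * 100), 0);
}

// Returns true when the reported totals equal the sums of the line items.
export function totalsMatch(res: ExtractResponse): boolean {
  const debits = sumCents(res.transactions.map((t) => t.debit));
  const credits = sumCents(res.transactions.map((t) => t.credit));
  return (
    debits === Math.round(res.totalDebits * 100) &&
    credits === Math.round(res.totalCredits * 100)
  );
}
```

A failed check is a useful signal that the model dropped or duplicated a row during extraction.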
 
## Packages
 
| Package | Role |
|---------|------|
| @reaatech/media-pipeline-mcp-core | Pipeline orchestration engine |
| @reaatech/media-pipeline-mcp-doc-extraction | Document extraction operations |
| @reaatech/structured-repair-core | Zod-schema-driven JSON repair |
| @reaatech/llm-cache | Exact-match cache for LLM responses |
| @reaatech/llm-cost-telemetry | Cost tracking and telemetry types |
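
To give a feel for what "JSON repair" means in practice, here is a dependency-free sketch of two common fixes for model output: discarding text around the JSON payload and removing trailing commas. It is not the `@reaatech/structured-repair-core` API and does no schema validation; `repairJson` is a hypothetical name used only for illustration.

```typescript
// Illustrative only: a naive repair pass for JSON returned by an LLM.
export function repairJson(raw: string): unknown {
  // Keep only the outermost object or array, which discards any prose or
  // code-fence markers the model placed around the payload.
  const start = raw.search(/[\[{]/);
  const end = Math.max(raw.lastIndexOf("}"), raw.lastIndexOf("]"));
  if (start === -1 || end < start) {
    throw new SyntaxError("No JSON object or array found");
  }
  let text = raw.slice(start, end + 1);

  // Remove trailing commas before a closing brace or bracket.
  // Caveat: this regex is not string-aware, so a string value that itself
  // contains ",}" or ",]" would be altered.
  text = text.replace(/,\s*([}\]])/g, "$1");

  // JSON.parse still throws if the payload is broken in other ways.
  return JSON.parse(text);
}
```

A schema-driven repair step would go further, for example by validating the parsed value and coercing or rejecting fields that do not match the expected shape.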
 
## Running locally
 
```bash
pnpm install
pnpm dev
```
 
## Environment variables
 
See `.env.example`. `GEMINI_API_KEY` is required.
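
One way to surface a missing key early is to fail at startup instead of on the first Gemini call. The helper below is a minimal sketch; `requireEnv` is a hypothetical name, and this README does not say whether the project performs such a check.

```typescript
// Hypothetical startup check: returns the variable's value or throws with a
// message naming the missing variable. The env parameter defaults to
// process.env and can be overridden in tests.
export function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Calling `requireEnv("GEMINI_API_KEY")` once during server startup would turn a misconfiguration into an immediate, readable error.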
 
## License
 
MIT