# Cohere Knowledge Agent for Plaid SMB Financial Insights
 
> Answer questions about your small business finances by connecting your Plaid-linked bank accounts to a natural language Q&A agent powered by Cohere.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
 
## Problem
 
Small business owners waste hours manually combing through bank statements and spreadsheets to answer basic questions about cash flow, vendor spending, and recurring costs. This recipe connects Plaid transaction data to a Cohere-powered Q&A agent so SMB owners can ask questions in plain English and get instant, sourced answers.
 
## Architecture
 
```
                ┌──────────────┐
                │  Plaid API   │
                └──────┬───────┘
                       │ transactions
                       ▼
                ┌──────────────┐
                │ plaid-ingest │
                │ (fastembed)  │
                └──────┬───────┘
                       │ chunks + embeddings
                       ▼
                ┌──────────────┐
                │    Qdrant    │
                │  vector DB   │
                └──────┬───────┘
                       │ retrieval
                       ▼
 ┌────────────────────────────────┐
 │  @reaatech/confidence-router   │
 │  ─ decide: ROUTE/CLARIFY       │
 └──────────┬─────────────────────┘
            │ query
            ▼
 ┌────────────────────────────────┐
 │  @reaatech/hybrid-rag          │
 │  ─ retrieve + answer           │
 └──────────┬─────────────────────┘
            │
            ▼
 ┌────────────────────────────────┐
 │  @reaatech/structured-repair   │
 │  ─ fix malformed Cohere JSON   │
 └──────────┬─────────────────────┘
            │ answer
            ▼
 ┌────────────────────────────────┐
 │  Cohere (command-a-03-2025)    │
 │  natural language Q&A          │
 └────────────────────────────────┘
```
 
Transaction data flows from Plaid into **plaid-ingest**, which chunks the transactions, generates embeddings via **fastembed**, and upserts them into the **Qdrant** vector store. When a user asks a question, **@reaatech/confidence-router** decides whether to answer directly (ROUTE) or ask a clarifying question (CLARIFY). **@reaatech/hybrid-rag** retrieves relevant chunks and builds a prompt for **Cohere**. The raw Cohere response passes through **@reaatech/structured-repair-core** to fix any malformed JSON before it reaches the user.
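
The request path described above can be sketched as a small composition of stages. This is a minimal illustration rather than the recipe's actual code: the `Stages` interface, the `decide` and `answerQuery` helpers, and the 0.6 confidence threshold are hypothetical, and the real `@reaatech/*` packages expose their own APIs (which would normally be asynchronous).

```typescript
// Hypothetical stage signatures; the real @reaatech/* packages define their own APIs.
type Decision = "ROUTE" | "CLARIFY";

interface Stages {
  classify: (query: string) => number;   // confidence in [0, 1] that the query is answerable
  retrieve: (query: string) => string[]; // relevant transaction chunks from the vector store
  generate: (prompt: string) => string;  // raw model output (may be malformed JSON)
  repair: (raw: string) => string;       // cleaned-up output
}

interface PipelineResult {
  decision: Decision;
  answer: string;
  sources: string[];
}

// Assumed rule: route when confidence meets the threshold, otherwise ask for clarification.
function decide(confidence: number, threshold = 0.6): Decision {
  return confidence >= threshold ? "ROUTE" : "CLARIFY";
}

function answerQuery(query: string, stages: Stages): PipelineResult {
  const decision = decide(stages.classify(query));
  if (decision === "CLARIFY") {
    // Ambiguous queries skip retrieval and generation entirely.
    return {
      decision,
      answer: "Could you be more specific about what you want to know?",
      sources: [],
    };
  }
  const sources = stages.retrieve(query);
  const prompt = ["Context:", ...sources, `Question: ${query}`].join("\n");
  const answer = stages.repair(stages.generate(prompt));
  return { decision, answer, sources };
}
```

Passing the stages in as plain functions keeps the control flow easy to test with stubs, independent of Plaid, Qdrant, or Cohere credentials.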
 
## Quickstart
 
```bash
cp .env.example .env
# Fill in: PLAID_CLIENT_ID, PLAID_SECRET, PLAID_ACCESS_TOKEN,
#          COHERE_API_KEY, QDRANT_URL
pnpm install
pnpm dev
```
 
The dev server opens at [http://localhost:3000](http://localhost:3000).
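
Before starting the dev server, it can help to confirm that every variable named in the Quickstart is set. The helper below is a dependency-free sketch; `missingEnvVars` is a hypothetical name, and the recipe itself may validate configuration differently (the tech stack lists `zod` for config validation).

```typescript
// Variable names taken from the Quickstart comment above.
const REQUIRED_ENV = [
  "PLAID_CLIENT_ID",
  "PLAID_SECRET",
  "PLAID_ACCESS_TOKEN",
  "COHERE_API_KEY",
  "QDRANT_URL",
] as const;

type RequiredEnvName = (typeof REQUIRED_ENV)[number];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): RequiredEnvName[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `missingEnvVars(process.env)` at startup, failing fast with the list of missing names if the result is non-empty.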
 
### Ingest transactions
 
```bash
curl -X POST http://localhost:3000/api/ingest
```
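
The ingest endpoint is described as chunking transactions before embedding them. One simple way to do that is one text chunk per transaction, sketched below. The field names follow the Plaid transaction object, but the `TransactionChunk` shape, the `toChunk` helper, and the one-chunk-per-transaction strategy are assumptions for illustration; the actual chunking in this recipe may differ.

```typescript
// Subset of fields found on a Plaid transaction object.
interface PlaidTransactionLike {
  transaction_id: string;
  date: string; // ISO date, e.g. "2025-01-15"
  name: string;
  amount: number; // Plaid convention: positive = money leaving the account
  iso_currency_code: string | null;
}

// Hypothetical chunk shape: text to embed plus metadata stored alongside the vector.
interface TransactionChunk {
  id: string;
  text: string;
  metadata: { date: string; amount: number };
}

// Renders a single transaction as a short sentence suitable for embedding.
function toChunk(tx: PlaidTransactionLike): TransactionChunk {
  const direction = tx.amount >= 0 ? "outflow" : "inflow";
  const currency = tx.iso_currency_code ?? "unknown currency";
  const text = `${tx.date}: ${tx.name}, ${direction} of ${Math.abs(tx.amount).toFixed(2)} ${currency}`;
  return { id: tx.transaction_id, text, metadata: { date: tx.date, amount: tx.amount } };
}

function toChunks(txs: PlaidTransactionLike[]): TransactionChunk[] {
  return txs.map(toChunk);
}
```

Keeping the date and signed amount in `metadata` would let a vector store filter by period or direction without re-parsing the chunk text.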
 
### Ask a question
 
```bash
curl -X POST http://localhost:3000/api/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "what is my cash flow this month?"}'
```
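
The same request can be issued from TypeScript. The sketch below only builds the URL and request options so it stays independent of any particular HTTP client; `buildAskRequest` and its types are hypothetical helpers, and the body fields follow the `/api/ask` entry in the API reference.

```typescript
// Request body accepted by /api/ask, per the API reference.
interface AskBody {
  query: string;
  sessionId?: string;
  userId?: string;
}

interface AskRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch-style options for a question; rejects empty queries early.
function buildAskRequest(baseUrl: string, body: AskBody): AskRequest {
  if (body.query.trim() === "") {
    throw new Error("query must not be empty");
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/ask`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body), // undefined optional fields are omitted
    },
  };
}
```

With the dev server running, the result can be passed straight to `fetch`, for example `const req = buildAskRequest("http://localhost:3000", { query: "what is my cash flow this month?" }); const res = await fetch(req.url, req.init);`.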
 
## API Reference
 
| Endpoint | Method | Description |
|---|---|---|
| `/api/ask` | POST | Answer a financial question. Body: `{ query, sessionId?, userId? }`. Response: `{ answer, sources, sessionId, decision }`. |
| `/api/ingest` | POST | Trigger Plaid transaction ingestion (chunk, embed, upsert to Qdrant). |
| `/api/health` | GET | Health check — returns `{ status: "ok" }`. |
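
A client may want to check the `/api/ask` response shape before using it. The following is a minimal sketch under stated assumptions: `parseAskResponse` is a hypothetical helper, `decision` is assumed to take the `ROUTE`/`CLARIFY` values shown in the architecture diagram, and because the element type of `sources` is not documented here it is left as `unknown`.

```typescript
type Decision = "ROUTE" | "CLARIFY";

// Response fields listed for /api/ask in the table above.
interface AskResponse {
  answer: string;
  sources: unknown[];
  sessionId: string;
  decision: Decision;
}

// Validates an already-parsed JSON value and returns it as a typed AskResponse.
function parseAskResponse(json: unknown): AskResponse {
  if (typeof json !== "object" || json === null) {
    throw new Error("response must be a JSON object");
  }
  const { answer, sources, sessionId, decision } = json as Record<string, unknown>;
  if (typeof answer !== "string") throw new Error("answer must be a string");
  if (!Array.isArray(sources)) throw new Error("sources must be an array");
  if (typeof sessionId !== "string") throw new Error("sessionId must be a string");
  if (decision !== "ROUTE" && decision !== "CLARIFY") {
    throw new Error("decision must be ROUTE or CLARIFY");
  }
  return { answer, sources, sessionId, decision };
}
```

Returning the `sessionId` from one response in the next request body is presumably how multi-turn conversations are continued, though that behavior should be confirmed against the route implementation.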
 
## Tech Stack
 
| Package | Role |
|---|---|
| `next` | App Router, API routes, server-side rendering |
| `react` | UI component library |
| `react-dom` | DOM renderer for React |
| `plaid` | Plaid API client — fetch transactions, accounts, auth data |
| `cohere-ai` | Cohere SDK — `command-a-03-2025` model for Q&A |
| `@qdrant/js-client-rest` | Qdrant vector store client — upsert & search |
| `fastembed` | Local embedding model (`BGEBaseEN`) — generate vector embeddings |
| `zod` | Runtime schema validation for documents, requests, config |
| `langfuse` | LLM observability — trace prompts, costs, latency |
| `xlsx` | Spreadsheet export for financial reports |
| `pdf-parse` | PDF bank statement ingestion |
| `@reaatech/hybrid-rag` | Hybrid retrieval pipeline — dense + sparse search, prompt building |
| `@reaatech/confidence-router` | Query intent classifier — route or clarify |
| `@reaatech/session-continuity` | Multi-turn conversation state management |
| `@reaatech/structured-repair-core` | Malformed LLM JSON repair |
| `typescript` | Type safety across the entire codebase |
 
## Testing
 
```bash
pnpm typecheck    # TypeScript compiler check
pnpm lint         # ESLint
pnpm test         # Vitest with coverage
```
 
## Project layout
 
```
app/                  Next.js App Router pages + API routes
src/                  services, lib, adapters
tests/                vitest suite (mirrors src/)
packages/             API references for every dependency (read these first)
DEV_PLAN.md           build plan for this recipe
```
 
## License
 
MIT — see [LICENSE](./LICENSE).