# Mistral AI Lead Intake for Clio Legal Client Onboarding
 
> Capture new client leads via web chat and documents, detect duplicates with hybrid-RAG, and automatically create contacts and matters in Clio through its REST API.
 
## What it does
 
A Mistral-powered intake pipeline for legal leads, in four stages:
 
1. **Conversational intake** — a chat interface powered by `@mistralai/mistralai` that asks structured questions and extracts lead fields (name, email, case type, description).
2. **Hybrid-RAG deduplication** — incoming leads are compared against previously indexed leads using hybrid vector + BM25 retrieval to flag potential duplicates.
3. **Document ingestion** — upload PDFs or images; text is extracted via `unpdf` or `tesseract.js` OCR, chunked, and analyzed for lead fields.
4. **Clio sync** — validated leads are pushed to Clio as contacts and matters through the Clio REST API, authorized with an OAuth2 bearer token.
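
The duplicate check in step 2 merges two rankings of previously indexed leads: one from vector similarity and one from BM25. One common way to combine such rankings is reciprocal rank fusion (RRF), and the sketch below illustrates that idea only. Whether the `@reaatech/*` packages use RRF, a weighted sum, or another method is not stated in this README; `fuseRankings`, `FusedLead`, and the constant `k = 60` are illustrative assumptions, not part of the project.

```typescript
// Hypothetical sketch: merging a vector ranking and a BM25 ranking with
// reciprocal rank fusion (RRF). This is not the project's DedupService.

interface FusedLead {
  leadId: string;
  score: number;
}

/**
 * Each input is a list of lead IDs ordered best-first by one retriever.
 * A lead's fused score is the sum of 1 / (k + rank) over every ranking
 * it appears in, where rank starts at 1.
 */
function fuseRankings(rankings: string[][], k = 60): FusedLead[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((leadId, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(leadId, (scores.get(leadId) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([leadId, score]) => ({ leadId, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: "lead-7" is near the top of both lists, so it ranks first overall.
const vectorHits = ["lead-7", "lead-2", "lead-9"];
const bm25Hits = ["lead-4", "lead-7", "lead-2"];
const fused = fuseRankings([vectorHits, bm25Hits]);
```

A caller could then flag the incoming lead as a potential duplicate when the top fused score passes a threshold. That threshold would need tuning against real intake data.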
 
## Architecture
 
```
┌──────────────┐     ┌──────────────┐     ┌────────────────┐
│  Next.js UI  │────▶│  API Routes  │────▶│  Service Layer │
│  (page.tsx)  │     │  /chat       │     │  MistralChat   │
│              │     │  /upload     │     │  DedupService  │
│              │     │  /clio/*     │     │  Ingestion     │
│              │     │  /dedup      │     │  ClioService   │
└──────────────┘     └──────────────┘     └───────┬────────┘
                                                  │
                                     ┌────────────┴────────────┐
                                     │    @reaatech/* Stack    │
                                     │  (hybrid-rag, embedding,│
                                     │   ingestion, retrieval) │
                                     └────────────┬────────────┘
                                                  │
                                     ┌────────────┴────────────┐
                                     │      Langfuse (tracing) │
                                     └─────────────────────────┘
```
 
## Prerequisites
 
- **Node.js** >= 22
- **pnpm** (see `packageManager` in `package.json`)
- API keys for: Mistral AI, OpenAI, Langfuse, Qdrant, Clio (OAuth2)
 
## Quick start
 
```bash
cp .env.example .env
```
 
Populate the following in `.env`:
 
| Variable | Description |
|---|---|
| `MISTRAL_API_KEY` | Mistral AI API key |
| `OPENAI_API_KEY` | OpenAI API key (for embeddings) |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key |
| `LANGFUSE_HOST` | Langfuse host URL |
| `QDRANT_URL` | Qdrant vector database URL |
| `QDRANT_COLLECTION_NAME` | Qdrant collection name (default: `leads`) |
| `CLIO_CLIENT_ID` | Clio OAuth2 client ID |
| `CLIO_CLIENT_SECRET` | Clio OAuth2 client secret |
| `CLIO_REDIRECT_URI` | Clio OAuth2 redirect URI (`http://localhost:3000/api/clio/callback`) |
 
```bash
pnpm install
pnpm dev
```
 
Open [http://localhost:3000](http://localhost:3000).
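
A missing variable from the table above tends to surface later as an opaque API error, so a startup check can help. The following is a minimal sketch of one; the variable names and the `leads` default come from the table, while `findMissingEnvVars`, `qdrantCollectionName`, and `EnvSource` are hypothetical helpers that do not exist in the project.

```typescript
// Hypothetical startup check. Variable names are taken from the table
// above; the helper names and behavior are illustrative assumptions.

const REQUIRED_ENV_VARS = [
  "MISTRAL_API_KEY",
  "OPENAI_API_KEY",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
  "QDRANT_URL",
  "CLIO_CLIENT_ID",
  "CLIO_CLIENT_SECRET",
  "CLIO_REDIRECT_URI",
] as const;

type EnvSource = Record<string, string | undefined>;

/** Returns the names of required variables that are unset or blank. */
function findMissingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

/** Reads the collection name, falling back to the documented default. */
function qdrantCollectionName(env: EnvSource): string {
  const value = env["QDRANT_COLLECTION_NAME"];
  return value !== undefined && value.trim() !== "" ? value : "leads";
}
```

In a Node.js process this could be called as `findMissingEnvVars(process.env)` before the server starts, failing fast when the returned list is not empty.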
 
## API routes
 
| Method | Path | Purpose |
|---|---|---|
| `POST` | `/api/chat` | Send a chat message; returns assistant response, lead data, and dedup result |
| `GET` | `/api/chat` | Health check |
| `POST` | `/api/upload` | Upload a document (PDF/image); returns extracted text, chunks, lead fields, and dedup result |
| `POST` | `/api/dedup` | Standalone duplicate check; returns `{ isDuplicate, matchedLeadId, similarityScore, matchedChunks }` |
| `POST` | `/api/clio/auth` | Generate Clio OAuth2 authorization URL; returns `{ authUrl }` |
| `GET` | `/api/clio/callback` | Clio OAuth2 callback; exchanges code for token; returns `{ token }` |
| `POST` | `/api/clio/push-lead` | Push a validated lead to Clio as a contact + matter; requires `Authorization: Bearer <token>` |
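
To show how a client might use these routes, here is a hedged sketch that builds a `push-lead` request and validates a `/api/dedup` response. The paths, the `POST` method, the `Authorization: Bearer <token>` header, and the four response keys come from the table above. The JSON body shape, the `LeadPayload` field names, and the value types in `DedupResult` are assumptions, so check them against the route handlers before relying on them.

```typescript
// Hypothetical client helpers. Paths, methods, and the Bearer header come
// from the table above; the body shape and field names are assumptions.

interface LeadPayload {
  name: string;
  email: string;
  caseType: string;
  description: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

/** Builds (but does not send) the request for POST /api/clio/push-lead. */
function buildPushLeadRequest(
  baseUrl: string,
  token: string,
  lead: LeadPayload,
): PreparedRequest {
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: baseUrl.replace(/\/+$/, "") + "/api/clio/push-lead",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(lead),
    },
  };
}

/** Documented keys of the /api/dedup response; value types are assumed. */
interface DedupResult {
  isDuplicate: boolean;
  matchedLeadId: string | null;
  similarityScore: number;
  matchedChunks: unknown[];
}

/** Runtime check that a parsed JSON value has the documented dedup keys. */
function isDedupResult(value: unknown): value is DedupResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["isDuplicate"] === "boolean" &&
    (typeof v["matchedLeadId"] === "string" || v["matchedLeadId"] === null) &&
    typeof v["similarityScore"] === "number" &&
    Array.isArray(v["matchedChunks"])
  );
}
```

The prepared request can be sent with `fetch(request.url, request.init)`. Keeping construction separate from sending makes the header and body logic easy to unit test without a running server.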
 
## Testing
 
```bash
pnpm test
```
 
Runs the Vitest suite with coverage reporting.
 
## License
 
MIT — see [LICENSE](./LICENSE).