# Cohere RAG Legal Research for SMB Law Firms
 
Instant case law Q&A with citations, powered by Cohere embeddings and retrieval.
 
## Problem
 
Small law firms often cannot afford the paralegal hours that manual legal research requires, so relevant precedents get missed and risk increases. This system indexes firm documents and public case law into Qdrant using Cohere Embed, then answers questions through retrieval-augmented generation with Cohere Command A (`command-a-03-2025`).
 
## Architecture
 
```
POST /api/ingest → Ingestion pipeline → Qdrant vector store

POST /api/chat   → Route query → Simple QA or Complex Research
                        ↓
               MemoryRetriever → Qdrant search → Context injection
                        ↓
               Cohere Command A → Answer with citations
```
 
The system uses Next.js App Router for API routes and REAA agent packages for routing, retrieval, and budget control.
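
The context-injection step in the diagram can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `RetrievedChunk`, `buildPrompt`, and the prompt wording are hypothetical names chosen for illustration, and the REAA packages may assemble context differently.

```typescript
// Hypothetical shape of a search hit; the real retriever's types may differ.
interface RetrievedChunk {
  text: string;
  source: string; // e.g. a case name or document title
  score: number; // similarity score from the vector search
}

// Build a prompt that numbers each retrieved passage so the model can cite it.
function buildPrompt(query: string, chunks: RetrievedChunk[], maxChunks = 5): string {
  const context = [...chunks]
    .sort((a, b) => b.score - a.score) // most similar first
    .slice(0, maxChunks)
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join("\n");

  return [
    "Answer the question using only the numbered passages below.",
    "Cite passages by number, e.g. [1].",
    "",
    "Passages:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}
```

Numbering the passages gives the model a stable handle to cite, which the caller can map back to the `source` field when building the `citations` array.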
 
## Tech Stack
 
- **Framework:** Next.js 15 (App Router)
- **LLM:** Cohere Command A (`command-a-03-2025`)
- **Embeddings:** Cohere (`embed-english-v3.0`, 1024 dim)
- **Vector Store:** Qdrant (REST)
- **Orchestration:** `@reaatech/agent-memory-retrieval`, `@reaatech/agent-memory-embedding`, `@reaatech/agent-handoff-routing`, `@reaatech/agent-budget-engine`
- **Chunking:** LlamaIndex (`SentenceSplitter`)
- **Validation:** Zod 4
 
## Prerequisites
 
- Node.js >= 22
- pnpm 10
- A running Qdrant instance (see below)
- A Cohere API key
 
### Qdrant (Docker)
 
```bash
docker run -p 6333:6333 qdrant/qdrant
```
 
## Setup
 
```bash
pnpm install
cp .env.example .env
# Edit .env with your keys
pnpm dev
```
 
## Environment Variables
 
| Variable | Description |
|---|---|
| `COHERE_API_KEY` | Cohere API key (from https://dashboard.cohere.com/api-keys) |
| `QDRANT_URL` | Qdrant HTTP URL (default: http://localhost:6333) |
| `QDRANT_API_KEY` | Optional Qdrant API key |
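
One way to read these variables with the documented default is sketched below. `AppConfig` and `loadConfig` are hypothetical names, and the project may well validate its environment with a Zod schema instead; this dependency-free version only illustrates the required/optional split in the table.

```typescript
interface AppConfig {
  cohereApiKey: string;
  qdrantUrl: string;
  qdrantApiKey: string | undefined;
}

// Read the three variables from an env-like object and apply the documented default.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const cohereApiKey = env.COHERE_API_KEY?.trim();
  if (!cohereApiKey) {
    throw new Error("COHERE_API_KEY is required");
  }
  return {
    cohereApiKey,
    // Fall back to the local Docker instance when QDRANT_URL is unset or blank.
    qdrantUrl: env.QDRANT_URL?.trim() || "http://localhost:6333",
    // Optional: only needed for a secured Qdrant deployment.
    qdrantApiKey: env.QDRANT_API_KEY?.trim() || undefined,
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.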
 
## API Endpoints
 
### POST /api/chat
 
Send a legal query and receive an answer with citations.
 
**Request:**
```json
{
  "query": "What is the statute of limitations for breach of contract?",
  "conversationId": "conv-1"
}
```
 
**Response (200):**
```json
{
  "answer": "Based on the retrieved case law...",
  "citations": ["Smith v. Jones (2020)", "Contract Law § 123"]
}
```
 
**Error responses:**
- `400` — Invalid request body or missing fields
- `402` — Budget exceeded (payment required)
- `500` — Internal server error (Cohere API, Qdrant, etc.)
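
The `400` case can be illustrated with a small request check. This is a hand-rolled sketch rather than the project's Zod schema: `parseChatRequest` and `ParseResult` are hypothetical names, and treating both `query` and `conversationId` as required is an assumption based on the example request above.

```typescript
interface ChatRequest {
  query: string;
  conversationId: string;
}

type ParseResult =
  | { ok: true; value: ChatRequest }
  | { ok: false; status: 400; error: string };

// Check an untrusted JSON body against the request shape shown above.
function parseChatRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "Body must be a JSON object" };
  }
  const { query, conversationId } = body as Record<string, unknown>;
  if (typeof query !== "string" || query.trim() === "") {
    return { ok: false, status: 400, error: "`query` must be a non-empty string" };
  }
  // Assumption: conversationId is required because budgets are tracked per conversation.
  if (typeof conversationId !== "string" || conversationId === "") {
    return { ok: false, status: 400, error: "`conversationId` must be a non-empty string" };
  }
  return { ok: true, value: { query: query.trim(), conversationId } };
}
```

A route handler could return `error` with the `status` code directly when `ok` is `false`, and continue to budget checks and retrieval otherwise.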
 
### POST /api/ingest
 
Ingest a legal document for indexing.
 
**Request:**
```json
{
  "document": "The parties agree that the statute of limitations..."
}
```
 
**Response (201):**
```json
{
  "chunks": [
    { "id": "uuid-1", "text": "The parties agree...", "metadata": { "chunkIndex": 0, "source": "legal-document" } }
  ]
}
```
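
To show how a document might turn into the `chunks` array above, here is a naive sentence-packing sketch. It is a stand-in for LlamaIndex's `SentenceSplitter`, not a reproduction of it: `chunkDocument`, the `maxChars` limit, and the index-based ids are assumptions made for illustration, and the regex will split on abbreviations such as "v." that a real legal-text splitter should handle.

```typescript
interface Chunk {
  id: string;
  text: string;
  metadata: { chunkIndex: number; source: string };
}

// Pack whole sentences into chunks of at most `maxChars` characters.
// A single sentence longer than `maxChars` becomes its own oversized chunk
// rather than being cut mid-sentence.
function chunkDocument(document: string, source = "legal-document", maxChars = 500): Chunk[] {
  // Each match is a run of text up to and including its terminator and trailing whitespace.
  const sentences = document.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const texts: string[] = [];
  let current = "";

  for (const sentence of sentences) {
    if (current !== "" && current.length + sentence.length > maxChars) {
      texts.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim() !== "") texts.push(current.trim());

  return texts.map((text, chunkIndex) => ({
    id: `${source}-${chunkIndex}`, // placeholder; the response example shows UUID-style ids
    text,
    metadata: { chunkIndex, source },
  }));
}
```

Each chunk's `text` would then be embedded and stored in Qdrant together with its `metadata`.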
 
## Cost Estimation
 
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| `command-a-03-2025` | $0.015 | $0.06 |
| `embed-english-v3.0` | $0.0001 | $0 |
 
The default per-conversation budget is $0.50. Before each API call, the system checks the call's estimated cost against the remaining budget and rejects any request that would exceed the limit with a `402` response.
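
The arithmetic behind the pre-check can be sketched as follows. The rates are copied as given from the table above; `estimateCost` and `ConversationBudget` are hypothetical names, and the actual `@reaatech/agent-budget-engine` API is not shown in this README and may look quite different.

```typescript
// USD per 1M tokens, copied from the table above.
const RATES_PER_MILLION = {
  "command-a-03-2025": { input: 0.015, output: 0.06 },
  "embed-english-v3.0": { input: 0.0001, output: 0 },
} as const;

type Model = keyof typeof RATES_PER_MILLION;

// Estimated cost in USD of a single call.
function estimateCost(model: Model, inputTokens: number, outputTokens = 0): number {
  const rate = RATES_PER_MILLION[model];
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}

// Hypothetical per-conversation tracker using the documented $0.50 default.
class ConversationBudget {
  private spent = 0;
  private readonly limitUsd: number;

  constructor(limitUsd = 0.5) {
    this.limitUsd = limitUsd;
  }

  // True if a call with this estimated cost still fits in the budget.
  canAfford(estimateUsd: number): boolean {
    return this.spent + estimateUsd <= this.limitUsd;
  }

  // Record the actual cost once a call has completed.
  record(costUsd: number): void {
    this.spent += costUsd;
  }

  get remainingUsd(): number {
    return Math.max(0, this.limitUsd - this.spent);
  }
}
```

A handler would call `canAfford(estimateCost(...))` before contacting Cohere, respond with `402` if it returns `false`, and call `record` with the actual cost afterwards.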
 
## Query Routing
 
The system automatically classifies each query and hands it to one of two agents:
 
- **Simple QA** (`simple-qa` agent): Short, factual queries like "What is a tort?" — single retrieval + answer
- **Complex Research** (`complex-research` agent): Multi-step queries like "Compare recent data privacy rulings across California and EU jurisdictions" — decomposes into sub-questions, retrieves for each, synthesizes final answer
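
The README does not describe how the classification is made, so the following is only a heuristic stand-in for whatever `@reaatech/agent-handoff-routing` actually does. `routeQuery`, the cue-word list, and the word-count threshold are all assumptions chosen so that the two example queries above land on the expected agents.

```typescript
type AgentId = "simple-qa" | "complex-research";

// Words that hint at comparison or multi-part research (illustrative list).
const RESEARCH_CUES: string[] = ["compare", "versus", "across", "differences", "analyze", "survey"];

// Long queries, or queries containing a research cue, go to complex-research;
// everything else goes to simple-qa.
function routeQuery(query: string, maxSimpleWords = 12): AgentId {
  const words = query.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  const hasCue = words.some((w) => RESEARCH_CUES.includes(w));
  return hasCue || words.length > maxSimpleWords ? "complex-research" : "simple-qa";
}
```

A production router would more likely use a model-based or embedding-based classifier; a keyword heuristic like this is mainly useful as a cheap fallback or as a fixture in tests.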
 
## Development
 
```bash
pnpm dev          # Start Next.js dev server
pnpm typecheck    # Run TypeScript type checking
pnpm lint         # Run ESLint
pnpm test         # Run Vitest with coverage
```
 
## Test Architecture
 
Tests use:
- **MSW** (Mock Service Worker) for HTTP mocking of Cohere and Qdrant endpoints
- **vi.stubEnv** for environment variable mocking
- Direct module mocking for REAA packages
- In-memory storage for Qdrant adapter tests
 
Run tests with: `pnpm test`
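
The in-memory storage mentioned above could look something like the sketch below. It is not the project's test double: `InMemoryVectorStore`, `Point`, and `SearchHit` are hypothetical names, and the store only mimics the upsert-then-search-by-cosine-similarity behavior that a Qdrant adapter test would need.

```typescript
interface Point {
  id: string;
  vector: number[];
  payload: Record<string, unknown>;
}

interface SearchHit {
  id: string;
  score: number;
  payload: Record<string, unknown>;
}

// Cosine similarity of two equal-length vectors; 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class InMemoryVectorStore {
  private readonly points = new Map<string, Point>();

  // Insert new points or overwrite existing ones with the same id.
  upsert(points: Point[]): void {
    for (const p of points) this.points.set(p.id, p);
  }

  // Return the `limit` most similar points, best match first.
  search(vector: number[], limit = 5): SearchHit[] {
    return [...this.points.values()]
      .map((p) => ({ id: p.id, score: cosineSimilarity(vector, p.vector), payload: p.payload }))
      .sort((x, y) => y.score - x.score)
      .slice(0, limit);
  }
}
```

Because the store is a brute-force scan over a `Map`, results are deterministic, which keeps retrieval assertions stable without a running Qdrant instance.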
 
## License
 
MIT — see LICENSE