# Cohere Lead Intake Agent for HubSpot SMB Sales
 
> An AI intake system that processes documents, emails, and chat messages to capture leads and auto-populate HubSpot CRM with accurate, categorized data.
 
## Setup
 
Copy `.env.example` to `.env` and configure the following environment variables:
 
| Variable | Description |
|---|---|
| `COHERE_API_KEY` | Cohere API key for NLP extraction and classification |
| `HUBSPOT_ACCESS_TOKEN` | HubSpot private app access token |
| `SLACK_TOKEN` | Slack bot token for manual review notifications |
| `SLACK_VERIFICATION_CHANNEL` | Slack channel for low-confidence lead reviews (default: `#lead-review`) |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key for observability |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `LANGFUSE_HOST` | Langfuse host URL (default: `https://cloud.langfuse.com`) |
| `BUDGET_DEFAULT_MONTHLY_LIMIT` | Default monthly budget cap per org in USD (default: `50`) |
 
```bash
pnpm install
pnpm dev             # starts Next.js on http://localhost:3000
pnpm test            # vitest run with coverage
```
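
The defaults listed in the table above could be applied at startup along the following lines. This is a minimal sketch, not code from this repository: `IntakeConfig`, `loadConfig`, `requireVar`, and `optionalVar` are hypothetical names, and treating every variable without a documented default as required is an assumption.

```typescript
// Hypothetical config loader; names and validation rules are illustrative only.
interface IntakeConfig {
  cohereApiKey: string;
  hubspotAccessToken: string;
  slackToken: string;
  slackVerificationChannel: string;
  langfuseSecretKey: string;
  langfusePublicKey: string;
  langfuseHost: string;
  budgetDefaultMonthlyLimit: number; // USD per org per month
}

type Env = Record<string, string | undefined>;

// Assumption: variables without a documented default are required.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Unset or empty variables fall back to the default from the README table.
function optionalVar(env: Env, name: string, fallback: string): string {
  const value = env[name];
  return value === undefined || value.trim() === "" ? fallback : value;
}

function loadConfig(env: Env): IntakeConfig {
  const limit = Number(optionalVar(env, "BUDGET_DEFAULT_MONTHLY_LIMIT", "50"));
  if (!Number.isFinite(limit) || limit < 0) {
    throw new Error("BUDGET_DEFAULT_MONTHLY_LIMIT must be a non-negative number");
  }
  return {
    cohereApiKey: requireVar(env, "COHERE_API_KEY"),
    hubspotAccessToken: requireVar(env, "HUBSPOT_ACCESS_TOKEN"),
    slackToken: requireVar(env, "SLACK_TOKEN"),
    slackVerificationChannel: optionalVar(env, "SLACK_VERIFICATION_CHANNEL", "#lead-review"),
    langfuseSecretKey: requireVar(env, "LANGFUSE_SECRET_KEY"),
    langfusePublicKey: requireVar(env, "LANGFUSE_PUBLIC_KEY"),
    langfuseHost: optionalVar(env, "LANGFUSE_HOST", "https://cloud.langfuse.com"),
    budgetDefaultMonthlyLimit: limit,
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a missing key fails at startup rather than on the first request.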
 
## Usage
 
### Email webhook
 
```bash
curl -X POST http://localhost:3000/api/leads \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","subject":"Demo request","body":"I want to see a demo of your product"}'
```
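
The same request can be assembled from TypeScript. The sketch below only builds the URL and request options for the `curl` call above; `EmailLeadPayload` and `buildEmailLeadRequest` are hypothetical names, and the response shape is not documented here, so it is left untyped.

```typescript
// Payload fields mirror the JSON body in the curl example above.
interface EmailLeadPayload {
  email: string;
  subject: string;
  body: string;
}

interface LeadRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the request without sending it.
function buildEmailLeadRequest(baseUrl: string, payload: EmailLeadPayload): LeadRequest {
  // Strip trailing slashes so "http://host/" and "http://host" behave the same.
  const url = baseUrl.replace(/\/+$/, "") + "/api/leads";
  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Example: const { url, init } = buildEmailLeadRequest("http://localhost:3000", {...});
// The result can then be passed to fetch(url, init).
```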
 
### File upload (PDF business card, brochure, etc.)
 
```bash
curl -X POST http://localhost:3000/api/leads \
  -F "file=@business_card.pdf"
```
 
### Check lead status
 
```bash
curl http://localhost:3000/api/leads/lead-123
```
 
## Architecture
 
```
Input → Extraction → Classification → Confidence Routing → HubSpot (or Slack review)
```
 
1. **Extraction** — Documents (PDF, images) are OCR'd via `tesseract.js` or parsed via `pdf-parse`. Text is sent to Cohere (`command-a-03-2025`) which extracts structured fields (name, email, company, phone, industry, intent).
2. **Classification** — Extracted text is classified by `@reaatech/agent-mesh-classifier` into one of four intent categories.
3. **Confidence Routing** — `@reaatech/confidence-router` evaluates the classification score: scores >= 0.8 are routed to HubSpot; scores < 0.8 trigger a Slack notification for manual review.
4. **HubSpot Push** — Validated leads are pushed to HubSpot CRM via `@hubspot/api-client`. Existing contacts are updated; new contacts are created with company association.
5. **Budget Enforcement** — `@reaatech/agent-budget-engine` caps monthly Cohere API spend per organization to control costs.
6. **Observability** — Langfuse traces every pipeline step, tracks extraction accuracy, and captures errors.
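
The threshold rule in step 3 can be illustrated in a few lines. This sketch does not use the `@reaatech/confidence-router` API, which is not shown here; `routeByConfidence`, `Route`, and `CONFIDENCE_THRESHOLD` are hypothetical names, and the assumption that scores lie in the range 0 to 1 is ours.

```typescript
// Destinations named in the pipeline diagram above.
type Route = "hubspot" | "slack_review";

// Threshold taken from step 3: >= 0.8 goes to HubSpot, below goes to Slack review.
const CONFIDENCE_THRESHOLD = 0.8;

// Hypothetical stand-in for the routing decision, not the package's real interface.
function routeByConfidence(score: number): Route {
  // Assumption: classification scores are probabilities in [0, 1].
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`Confidence score out of range: ${score}`);
  }
  return score >= CONFIDENCE_THRESHOLD ? "hubspot" : "slack_review";
}
```

The comparison is inclusive at the boundary, so a score of exactly 0.8 is pushed to HubSpot rather than sent for manual review, matching the rule as stated.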
 
## Classifiers
 
- `book_demo` — Requests to schedule a product demo
- `pricing_inquiry` — Questions about pricing and plans
- `support_request` — Technical support and help requests
- `general_contact` — General inquiries and messages
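
In TypeScript, the four categories above could be modeled as a string-literal union with a runtime guard, which is useful when validating classifier output before routing. The names `INTENT_CATEGORIES`, `IntentCategory`, and `isIntentCategory` are hypothetical; this is a sketch, not the type exported by `@reaatech/agent-mesh-classifier`.

```typescript
// The four intent categories listed above, as a readonly tuple.
const INTENT_CATEGORIES = [
  "book_demo",
  "pricing_inquiry",
  "support_request",
  "general_contact",
] as const;

// Union type derived from the tuple: "book_demo" | "pricing_inquiry" | ...
type IntentCategory = (typeof INTENT_CATEGORIES)[number];

// Runtime type guard for untrusted values such as raw model output.
function isIntentCategory(value: unknown): value is IntentCategory {
  return typeof value === "string" && INTENT_CATEGORIES.some((c) => c === value);
}
```

Deriving the type from the array keeps the compile-time union and the runtime check in sync if a category is added later.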
 
## Model
 
All NLP extraction and classification uses Cohere **`command-a-03-2025`**.
 
## Project layout
 
```
app/                  Next.js App Router pages + API routes
src/                  services, lib, adapters
tests/                vitest suite (mirrors src/)
packages/             API references for every dependency (read these first)
DEV_PLAN.md           build plan for this recipe
```
 
## License
 
MIT — see [LICENSE](./LICENSE).