# Cohere Lead Intake Agent for HubSpot SMB Sales
> An AI intake system that processes documents, emails, and chat messages to capture leads and auto-populate HubSpot CRM with accurate, categorized data.
## Setup
Copy `.env.example` to `.env` and configure the following environment variables:
| Variable | Description |
|---|---|
| `COHERE_API_KEY` | Cohere API key for NLP extraction and classification |
| `HUBSPOT_ACCESS_TOKEN` | HubSpot private app access token |
| `SLACK_TOKEN` | Slack bot token for manual review notifications |
| `SLACK_VERIFICATION_CHANNEL` | Slack channel for low-confidence lead reviews (default: `#lead-review`) |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key for observability |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
| `LANGFUSE_HOST` | Langfuse host URL (default: `https://cloud.langfuse.com`) |
| `BUDGET_DEFAULT_MONTHLY_LIMIT` | Default monthly budget cap per org in USD (default: `50`) |
```bash
pnpm install
pnpm dev # starts Next.js on http://localhost:3000
pnpm test # vitest run with coverage
```
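As a rough illustration of how the variables above might be read at startup, the sketch below validates them and applies the documented defaults. `AppConfig`, `Env`, `required`, and `loadConfig` are hypothetical names, not part of this project, and treating every variable without a listed default as required is an assumption.

```typescript
// Hypothetical config shape; variable names and defaults come from the Setup table.
interface AppConfig {
  cohereApiKey: string;
  hubspotAccessToken: string;
  slackToken: string;
  slackVerificationChannel: string;
  langfuseSecretKey: string;
  langfusePublicKey: string;
  langfuseHost: string;
  budgetDefaultMonthlyLimit: number;
}

type Env = Record<string, string | undefined>;

// Return true when a variable is unset or blank.
function isBlank(value: string | undefined): value is undefined {
  return value === undefined || value.trim() === "";
}

// Read a variable that has no documented default (assumed to be required).
function required(env: Env, name: string): string {
  const value = env[name];
  if (isBlank(value)) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value as string;
}

// Illustrative loader: applies the defaults listed in the table.
function loadConfig(env: Env): AppConfig {
  const rawLimit = env["BUDGET_DEFAULT_MONTHLY_LIMIT"];
  const limit = isBlank(rawLimit) ? 50 : Number(rawLimit);
  if (!isFinite(limit) || limit < 0) {
    throw new Error("BUDGET_DEFAULT_MONTHLY_LIMIT must be a non-negative number");
  }
  const channel = env["SLACK_VERIFICATION_CHANNEL"];
  const host = env["LANGFUSE_HOST"];
  return {
    cohereApiKey: required(env, "COHERE_API_KEY"),
    hubspotAccessToken: required(env, "HUBSPOT_ACCESS_TOKEN"),
    slackToken: required(env, "SLACK_TOKEN"),
    slackVerificationChannel: isBlank(channel) ? "#lead-review" : (channel as string),
    langfuseSecretKey: required(env, "LANGFUSE_SECRET_KEY"),
    langfusePublicKey: required(env, "LANGFUSE_PUBLIC_KEY"),
    langfuseHost: isBlank(host) ? "https://cloud.langfuse.com" : (host as string),
    budgetDefaultMonthlyLimit: limit,
  };
}
```

In a Node process this would typically be called once as `loadConfig(process.env)`, so a missing key fails at startup rather than on the first request.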
## Usage
### Email webhook
```bash
curl -X POST http://localhost:3000/api/leads \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","subject":"Demo request","body":"I want to see a demo of your product"}'
```
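For callers or handlers written in TypeScript, the JSON body from the curl example could be typed and checked roughly as follows. `EmailLeadPayload` and `isEmailLeadPayload` are hypothetical names; the route may accept more fields than the three shown, and the `@` check is only a cheap sanity test, not real email validation.

```typescript
// Shape of the JSON body used in the curl example above.
interface EmailLeadPayload {
  email: string;
  subject: string;
  body: string;
}

// Type guard: true only when all three fields are strings and the
// email contains an "@" that is not the first character.
function isEmailLeadPayload(value: unknown): value is EmailLeadPayload {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const { email, subject, body } = value as Record<string, unknown>;
  return (
    typeof email === "string" &&
    email.indexOf("@") > 0 &&
    typeof subject === "string" &&
    typeof body === "string"
  );
}
```

A guard like this lets a handler reject malformed requests before any text is sent to the model.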
### File upload (PDF business card, brochure, etc.)
```bash
curl -X POST http://localhost:3000/api/leads \
  -F "file=@business_card.pdf"
```
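The multipart upload can also be built programmatically. This is a minimal sketch, assuming a runtime with global `FormData`, `Blob`, and `fetch` (a browser or Node 18+). `buildUploadForm` and `uploadLead` are hypothetical helpers; only the `file` field name and the `/api/leads` path come from the curl example.

```typescript
// Wrap a file in a multipart form under the "file" field, as in the curl example.
function buildUploadForm(file: Blob, filename: string): FormData {
  const form = new FormData();
  form.append("file", file, filename);
  return form;
}

// POST the form to the leads endpoint. No Content-Type header is set here:
// fetch derives the multipart boundary from the FormData body itself.
function uploadLead(baseUrl: string, form: FormData): Promise<Response> {
  return fetch(`${baseUrl}/api/leads`, { method: "POST", body: form });
}
```

For example, `uploadLead("http://localhost:3000", buildUploadForm(pdfBlob, "business_card.pdf"))` mirrors the curl command, where `pdfBlob` is a `Blob` holding the PDF bytes.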
### Check lead status
```bash
curl http://localhost:3000/api/leads/lead-123
```
## Architecture
```
Input → Extraction → Classification → Confidence Routing → HubSpot (or Slack review)
```
1. **Extraction** — Documents (PDF, images) are OCR'd via `tesseract.js` or parsed via `pdf-parse`. Text is sent to Cohere (`command-a-03-2025`) which extracts structured fields (name, email, company, phone, industry, intent).
2. **Classification** — Extracted text is classified by `@reaatech/agent-mesh-classifier` into one of four intent categories.
3. **Confidence Routing** — `@reaatech/confidence-router` evaluates the classification score: scores >= 0.8 are routed to HubSpot; scores < 0.8 trigger a Slack notification for manual review.
4. **HubSpot Push** — Validated leads are pushed to HubSpot CRM via `@hubspot/api-client`. Existing contacts are updated; new contacts are created with company association.
5. **Budget Enforcement** — `@reaatech/agent-budget-engine` caps monthly Cohere API spend per organization to control costs.
6. **Observability** — Langfuse traces every pipeline step, tracks extraction accuracy, and captures errors.
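Steps 3 and 5 come down to two small decisions, sketched below. The 0.8 threshold and the default limit of 50 USD come from this README. `routeByConfidence` and `withinMonthlyBudget` are hypothetical helpers and do not reflect the actual APIs of `@reaatech/confidence-router` or `@reaatech/agent-budget-engine`. Sending non-finite scores to manual review is also an assumption.

```typescript
type Route = "hubspot" | "slack_review";

// Threshold stated in step 3 of the pipeline description.
const CONFIDENCE_THRESHOLD = 0.8;

// Scores at or above the threshold go to HubSpot; everything else,
// including NaN or infinite scores (assumption), goes to Slack review.
function routeByConfidence(score: number, threshold: number = CONFIDENCE_THRESHOLD): Route {
  return isFinite(score) && score >= threshold ? "hubspot" : "slack_review";
}

// Allow a Cohere call only if it keeps the organization's monthly
// spend at or under the cap (default cap of 50 USD from the Setup table).
function withinMonthlyBudget(spentUsd: number, nextCallUsd: number, limitUsd: number = 50): boolean {
  return spentUsd + nextCallUsd <= limitUsd;
}
```

Keeping both checks as pure functions makes the boundary cases (a score of exactly 0.8, a call that lands exactly on the cap) easy to cover in the vitest suite.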
## Classifiers
- `book_demo` — Requests to schedule a product demo
- `pricing_inquiry` — Questions about pricing and plans
- `support_request` — Technical support and help requests
- `general_contact` — General inquiries and messages
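In TypeScript, the four categories above can be captured as a string union so the compiler rejects unknown labels. This is a sketch only: `INTENTS`, `isIntent`, and `toIntent` are hypothetical names, and falling back to `general_contact` for unrecognized labels is an assumption rather than documented behavior.

```typescript
// The four intent categories listed above.
const INTENTS = ["book_demo", "pricing_inquiry", "support_request", "general_contact"] as const;

type Intent = (typeof INTENTS)[number];

// Type guard: true when the label is exactly one of the four categories.
function isIntent(label: string): label is Intent {
  return (INTENTS as readonly string[]).indexOf(label) !== -1;
}

// Normalize a raw classifier label (trim, lowercase) and fall back to
// "general_contact" when it is not recognized (assumed fallback).
function toIntent(label: string): Intent {
  const normalized = label.trim().toLowerCase();
  return isIntent(normalized) ? normalized : "general_contact";
}
```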
## Model
All NLP extraction and classification use Cohere **`command-a-03-2025`**.
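How the project prompts the model is not shown in this README. The sketch below assumes the model is asked to reply with a JSON object holding the six fields named in the Extraction step, then parses that reply defensively. `buildChatRequest` and `parseExtraction` are hypothetical helpers, the prompt wording is invented for illustration, and the request shape is an assumed chat-style body (a `model` plus a list of `messages`) that should be checked against the Cohere API reference before use.

```typescript
const MODEL = "command-a-03-2025";

// Fields named in the Extraction step; null marks a field the model could not find.
interface ExtractedLead {
  name: string | null;
  email: string | null;
  company: string | null;
  phone: string | null;
  industry: string | null;
  intent: string | null;
}

// Build an assumed chat-style request body; the prompt text is illustrative.
function buildChatRequest(text: string) {
  const prompt =
    "Extract name, email, company, phone, industry, and intent from the text below. " +
    "Reply with a JSON object only and use null for missing fields.\n\n" +
    text;
  return { model: MODEL, messages: [{ role: "user", content: prompt }] };
}

// Parse the model's reply, keeping only non-empty string values.
function parseExtraction(reply: string): ExtractedLead {
  const data: unknown = JSON.parse(reply);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Expected a JSON object in the model reply");
  }
  const record = data as Record<string, unknown>;
  const pick = (key: string): string | null => {
    const value = record[key];
    return typeof value === "string" && value.trim() !== "" ? value.trim() : null;
  };
  return {
    name: pick("name"),
    email: pick("email"),
    company: pick("company"),
    phone: pick("phone"),
    industry: pick("industry"),
    intent: pick("intent"),
  };
}
```

Treating blank or non-string values as `null` keeps partially extracted leads usable while making the gaps explicit to later steps.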
## Project layout
```
app/         Next.js App Router pages + API routes
src/         services, lib, adapters
tests/       vitest suite (mirrors src/)
packages/    API references for every dependency (read these first)
DEV_PLAN.md  build plan for this recipe
```
## License
MIT — see [LICENSE](./LICENSE).