# Bid Takeoff Agent for Small GCs
> Turn plan sets and spec docs into a bill of materials and sub RFPs in minutes, not days.
A tutorialized reference solution from [reaatech.com](https://reaatech.com), demonstrating how to build production-grade AI systems with the `@reaatech/*` package family.
## Problem
A general contractor (GC) estimator spends 2-3 days manually measuring plans, counting fixtures, and typing up scopes for subs. A single missed window or miscalculated yardage can blow the bid. This recipe ingests PDF plans and specs, extracts quantities, and generates a structured bill of materials along with draft RFPs for subcontractors.
## Architecture
PDF → `unpdf` extract → `@reaatech/hybrid-rag-ingestion` chunk → `@reaatech/hybrid-rag-pipeline` index → Vercel AI SDK `generateObject` → `BillOfMaterials` → `.xlsx` export / `SubcontractorRfp`
## Prerequisites
- Node.js >= 22
- pnpm 10.x
- Qdrant instance (Docker: `docker run -p 6333:6333 qdrant/qdrant`)
- OpenAI API key
- (Optional) Cohere API key for reranking
- (Optional) Langfuse credentials for observability; the app degrades gracefully without them
## Env vars
| Var | Required | Default |
|-----|----------|---------|
| QDRANT_URL | yes | http://localhost:6333 |
| OPENAI_API_KEY | yes | — |
| OPENAI_MODEL | no | gpt-5.2 |
| LANGFUSE_PUBLIC_KEY | no | — |
| LANGFUSE_SECRET_KEY | no | — |
| LANGFUSE_HOST | no | https://cloud.langfuse.com |
| COHERE_API_KEY | no | — |
| DEFAULT_COLLECTION_NAME | no | bid-documents |
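
The table above can be read as a small config loader. The sketch below is illustrative only: `AppConfig`, `loadConfig`, and the `langfuseEnabled` flag are hypothetical names that do not come from the repo, and treating Langfuse as enabled only when both keys are set is an assumption about what "degrades gracefully" means.

```typescript
// Hypothetical config shape mirroring the env var table; not taken from the repo.
interface AppConfig {
  qdrantUrl: string;
  openaiApiKey: string;
  openaiModel: string;
  langfuseHost: string;
  langfuseEnabled: boolean;
  cohereApiKey: string | undefined;
  defaultCollectionName: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  // OPENAI_API_KEY is required and has no default, so fail fast when it is missing.
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    qdrantUrl: env["QDRANT_URL"] ?? "http://localhost:6333",
    openaiApiKey,
    openaiModel: env["OPENAI_MODEL"] ?? "gpt-5.2",
    langfuseHost: env["LANGFUSE_HOST"] ?? "https://cloud.langfuse.com",
    // Assumption: tracing is switched on only when both Langfuse keys are present.
    langfuseEnabled: Boolean(env["LANGFUSE_PUBLIC_KEY"] && env["LANGFUSE_SECRET_KEY"]),
    cohereApiKey: env["COHERE_API_KEY"],
    defaultCollectionName: env["DEFAULT_COLLECTION_NAME"] ?? "bid-documents",
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.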
## Getting started
```bash
pnpm install
cp .env.example .env # fill in your keys
pnpm typecheck
pnpm test
pnpm dev
```
## API endpoints
### POST /api/takeoff
Upload a PDF plan set for takeoff analysis. Accepts multipart/form-data with fields: `file` (the PDF), `fileName`, `projectName`.
Returns: `{ documents: [...], bom: {...}, rfps: [...], errors: [...] }`
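
A client call to this endpoint might look like the sketch below. It uses only the standard `FormData`, `Blob`, and `fetch` globals available in Node.js 22 and browsers. The helper names `buildTakeoffForm` and `submitTakeoff` are hypothetical, and the base URL is whatever host the app is served from (for example `http://localhost:3000` in local development, which is an assumption).

```typescript
// Hypothetical helper: builds the multipart body with the three documented fields.
function buildTakeoffForm(pdf: Blob, fileName: string, projectName: string): FormData {
  const form = new FormData();
  form.append("file", pdf, fileName);
  form.append("fileName", fileName);
  form.append("projectName", projectName);
  return form;
}

// Hypothetical helper: posts the form and returns the parsed JSON body.
// The response is left as `unknown` because the README does not specify
// the inner shapes of `documents`, `bom`, `rfps`, and `errors`.
async function submitTakeoff(baseUrl: string, form: FormData): Promise<unknown> {
  // Do not set Content-Type manually; fetch adds the multipart boundary itself.
  const res = await fetch(`${baseUrl}/api/takeoff`, { method: "POST", body: form });
  if (!res.ok) {
    throw new Error(`takeoff request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```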
### POST /api/bom
Extract quantities from document text. Accepts JSON `{ projectName, documentText }`.
Returns: `{ bom: {...}, xlsxBase64: "..." }`
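
To save the spreadsheet, a caller has to decode `xlsxBase64` back into bytes. The sketch below assumes the field holds standard (not URL-safe) base64 of the workbook bytes, which the README does not state explicitly. The function names are hypothetical. The sanity check relies on the fact that `.xlsx` files are ZIP archives and therefore start with the bytes `PK`.

```typescript
// Decode a standard base64 string into raw bytes using the global atob.
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Cheap sanity check: a ZIP container (and so an .xlsx file) begins with "PK" (0x50 0x4B).
function looksLikeXlsx(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && bytes[0] === 0x50 && bytes[1] === 0x4b;
}
```

The resulting `Uint8Array` can be written to disk or wrapped in a `Blob` for a browser download.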
### POST /api/rfp
Generate a subcontractor RFP. Accepts JSON `{ trade, scope, docIds }`.
Returns: `{ trade, scope, requirements, docReferences, responseDeadline }`
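
Callers may want to validate the response before using it. The type guard below is a sketch under assumptions: the README lists the field names but not their types, so treating `requirements` and `docReferences` as string arrays and `responseDeadline` as a string is a guess, and `RfpResponse` and `isRfpResponse` are hypothetical names.

```typescript
// Assumed response shape; only the field names come from the endpoint description.
interface RfpResponse {
  trade: string;
  scope: string;
  requirements: string[];
  docReferences: string[];
  responseDeadline: string;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Narrow an unknown JSON value to RfpResponse by checking every field.
function isRfpResponse(value: unknown): value is RfpResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["trade"] === "string" &&
    typeof v["scope"] === "string" &&
    isStringArray(v["requirements"]) &&
    isStringArray(v["docReferences"]) &&
    typeof v["responseDeadline"] === "string"
  );
}
```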
### GET /api/health
Health check.
Returns: `{ status: "ok", timestamp: "..." }`
## Packages
### REAA packages
- `@reaatech/hybrid-rag-pipeline` — RAGPipeline orchestrator for hybrid vector + BM25 retrieval
- `@reaatech/hybrid-rag-ingestion` — Document loading, preprocessing, chunking strategies
- `@reaatech/agents-markdown` — Core domain types and Zod schemas for agent metadata
- `@reaatech/agent-eval-harness-golden` — Golden trajectory comparison and regression detection
- `@reaatech/llm-router-core` — Domain types and validation schemas for LLM routing
- `@reaatech/context-window-planner` — Token budget planning with priority-greedy packing
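
For readers unfamiliar with the term, "priority-greedy packing" can be illustrated with a generic sketch: sort candidate context items by priority, then add each one that still fits the token budget. This is not the `@reaatech/context-window-planner` API; `ContextItem` and `packByPriority` are hypothetical names, and token counts are assumed to be precomputed.

```typescript
// Hypothetical item shape: a piece of context with a known token cost and a priority.
interface ContextItem {
  id: string;
  tokens: number;
  priority: number; // higher means more important
}

// Greedy packing: walk items from highest to lowest priority and keep each one
// that fits in the remaining budget. Items that do not fit are skipped, not truncated.
function packByPriority(items: ContextItem[], budget: number): ContextItem[] {
  const sorted = [...items].sort((a, b) => b.priority - a.priority);
  const packed: ContextItem[] = [];
  let remaining = budget;
  for (const item of sorted) {
    if (item.tokens <= remaining) {
      packed.push(item);
      remaining -= item.tokens;
    }
  }
  return packed;
}
```

Greedy packing is fast and predictable but not optimal: a large high-priority item can crowd out several smaller ones that together would have carried more information.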
### Third-party packages
- `unpdf@1.6.2` — PDF text extraction
- `zod@4.4.3` — Runtime validation
- `langfuse@3.38.20` — LLM observability and tracing
- `ai@6.0.193` — Vercel AI SDK (provider-agnostic)
- `@ai-sdk/openai@3.0.67` — OpenAI provider for AI SDK
- `xlsx@0.18.5` — Spreadsheet generation
## How it works
1. PDF plan sets are uploaded via the takeoff API endpoint
2. `unpdf` extracts text from each page
3. `@reaatech/hybrid-rag-ingestion` preprocesses the text and chunks documents using a recursive strategy
4. `@reaatech/hybrid-rag-pipeline` indexes chunks into Qdrant for hybrid retrieval
5. Vercel AI SDK's `generateObject` extracts structured quantities from document context
6. Quantities are assembled into a `BillOfMaterials` and exported to `.xlsx` via `xlsx`
7. Subcontractor RFPs are drafted per trade using relevant document chunks
8. Extraction quality can be evaluated with `@reaatech/agent-eval-harness-golden`
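
The recursive chunking mentioned in step 3 can be sketched in isolation. The version below is a generic illustration of the technique, not the implementation inside `@reaatech/hybrid-rag-ingestion`: it splits on paragraph breaks first, falls back to line breaks and then spaces for pieces that are still too long, and finally cuts fixed-width slices. The separator list, the character-based size limit, and the name `recursiveChunk` are all assumptions.

```typescript
// Separators tried in order, from coarsest to finest.
const SEPARATORS = ["\n\n", "\n", " "];

function recursiveChunk(text: string, maxChars: number, level = 0): string[] {
  if (text.length <= maxChars) {
    return text.length > 0 ? [text] : [];
  }
  const sep = SEPARATORS[level];
  if (sep === undefined) {
    // No separators left: fall back to fixed-width slices.
    const slices: string[] = [];
    for (let i = 0; i < text.length; i += maxChars) {
      slices.push(text.slice(i, i + maxChars));
    }
    return slices;
  }
  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(sep)) {
    // Merge neighbouring pieces while the merged chunk still fits.
    const candidate = current ? current + sep + piece : piece;
    if (candidate.length <= maxChars) {
      current = candidate;
      continue;
    }
    if (current) chunks.push(current);
    if (piece.length <= maxChars) {
      current = piece;
    } else {
      // The piece alone is too long: split it with the next, finer separator.
      chunks.push(...recursiveChunk(piece, maxChars, level + 1));
      current = "";
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on structural boundaries first tends to keep a schedule or spec paragraph in one chunk, which matters when a quantity and its unit must be retrieved together.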
## Testing
```bash
pnpm test
```
Runs Vitest with V8 coverage. Tests live in `tests/`, mirroring the layout of `src/`. External services are mocked with MSW.
## Evaluation
Use `@reaatech/agent-eval-harness-golden` to create golden trajectories from successful takeoff runs, then compare future runs against them to detect regressions in extraction quality.
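
The idea of regression detection can be shown with a simplified sketch that compares only final quantities against a golden run. This is not the `@reaatech/agent-eval-harness-golden` API, which works on trajectories; `LineItem`, `Regression`, `findRegressions`, and the 2% default tolerance are hypothetical.

```typescript
// Hypothetical bill-of-materials line; the real BillOfMaterials shape is not shown in the README.
interface LineItem {
  item: string;
  quantity: number;
  unit: string;
}

interface Regression {
  item: string;
  expected: number;
  actual: number | null; // null when the item is missing from the new run
}

// Flag golden items that are missing from the run or whose quantity drifted
// by more than `tolerance` (relative to the golden quantity).
function findRegressions(golden: LineItem[], run: LineItem[], tolerance = 0.02): Regression[] {
  const actualByItem = new Map<string, number>();
  for (const line of run) {
    actualByItem.set(line.item, line.quantity);
  }
  const regressions: Regression[] = [];
  for (const g of golden) {
    const actual = actualByItem.get(g.item);
    if (actual === undefined) {
      regressions.push({ item: g.item, expected: g.quantity, actual: null });
      continue;
    }
    // Guard against division by zero for zero-quantity golden lines.
    const drift = Math.abs(actual - g.quantity) / Math.max(Math.abs(g.quantity), 1);
    if (drift > tolerance) {
      regressions.push({ item: g.item, expected: g.quantity, actual });
    }
  }
  return regressions;
}
```

A check like this could run in CI after a prompt or model change, failing the build when the returned list is non-empty.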
## Project layout
```
app/ Next.js App Router pages + API routes
src/lib/ Types, constants, utilities
src/services/ Business logic (ingestion, RAG, AI, BOM, RFP, evaluation)
tests/ Vitest suite (mirrors src/)
packages/ API references for every dependency
```
## License
MIT — see LICENSE