Today we shipped 8 new step-by-step tutorials for small-business AI, along with 21 foundational building blocks across 4 repos. If you need to monitor AI reliability, extract invoice data, safeguard customer communications, or analyze business data, pick one and try it this afternoon.
New tutorials
Vercel AI Gateway Reliability Suite for SMB AI Operations
Small teams running LLM features on Vercel get a self-serve dashboard that monitors, replays, and self-heals their AI workflows. It pings live endpoints with health probes, records every LLM interaction for deterministic replay, and triggers incident diagnostics when anomalies are detected, so a weekend spike in errors doesn’t turn into lost revenue and Monday firefighting.
Read the tutorial → · Download the code (zip)
Under the hood: Vercel AI Gateway, Next.js, 146 tests, 98.8% coverage. Built with @reaatech/agent-runbook-* and @reaatech/agent-replay.
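To make "deterministic replay" concrete, here is a minimal sketch of the record-then-replay idea. The `Interaction` shape and the `ReplayLog` class are hypothetical names for illustration and are not the @reaatech/agent-replay API; the sketch assumes a request is identified by its model and prompt alone.

```typescript
// Hypothetical shapes for illustration; not the @reaatech/agent-replay API.
interface Interaction {
  model: string;
  prompt: string;
  response: string;
}

class ReplayLog {
  private readonly entries = new Map<string, Interaction>();

  // Build a stable lookup key from the request fields.
  private keyFor(model: string, prompt: string): string {
    return JSON.stringify([model, prompt]);
  }

  // Store a live interaction so it can be replayed later.
  record(interaction: Interaction): void {
    this.entries.set(this.keyFor(interaction.model, interaction.prompt), interaction);
  }

  // Return the recorded response, or undefined if this request was never seen.
  replay(model: string, prompt: string): string | undefined {
    return this.entries.get(this.keyFor(model, prompt))?.response;
  }
}
```

Replaying from the log instead of calling the model again is what lets an incident be reproduced with the exact output that was seen in production.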
LangChain Observability for SMB AI Workflow Monitoring
Plug tracing and cost observability into any LangChain pipeline without a separate SaaS. An Express sidecar instruments your chain steps, exports OpenTelemetry traces to Langfuse, and attributes spend per model and per chain, so you can see where latency and budget go.
Read the tutorial → · Download the code (zip)
Under the hood: LangChain, Express, 78 tests, 98.9% coverage. Powered by agent-budget-otel-bridge and agent-eval-harness-observability.
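Attributing spend per model and per chain boils down to a small aggregation over call records. The sketch below assumes each call reports its token counts and that you supply your own price table; `CallRecord`, `ModelPrice`, and `attributeSpend` are hypothetical names, not exports of agent-budget-otel-bridge.

```typescript
// Hypothetical record and price shapes; real prices come from your provider.
interface CallRecord {
  chain: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Prices in USD per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface SpendReport {
  byModel: Map<string, number>;
  byChain: Map<string, number>;
}

function attributeSpend(calls: CallRecord[], prices: Record<string, ModelPrice>): SpendReport {
  const byModel = new Map<string, number>();
  const byChain = new Map<string, number>();
  for (const call of calls) {
    const price = prices[call.model];
    if (price === undefined) {
      throw new Error(`No price configured for model ${call.model}`);
    }
    // Cost of this call in USD.
    const cost =
      (call.inputTokens * price.inputPerMTok + call.outputTokens * price.outputPerMTok) / 1_000_000;
    // Add the same cost to both views so the two totals always agree.
    byModel.set(call.model, (byModel.get(call.model) ?? 0) + cost);
    byChain.set(call.chain, (byChain.get(call.chain) ?? 0) + cost);
  }
  return { byModel, byChain };
}
```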
Anthropic Eval Harness for Agent Quality Assurance
Catch quality drift before it reaches customers. This harness runs regression tests against golden datasets using Claude as a judge, enforces pass/fail gates, and creates an incident if anything breaks. Cost and latency trends land in a Langfuse dashboard.
Read the tutorial → · Download the code (zip)
Under the hood: Anthropic, Next.js, 94 tests, 99.0% coverage. Combines agent-eval-harness-suite, agent-eval-harness-gate, and agent-runbook-incident.
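A pass/fail gate over judged results can be sketched in a few lines. This assumes the judge returns a score between 0 and 1 for each golden case; the thresholds and the names `CaseResult`, `GateConfig`, and `evaluateGate` are illustrative, not the agent-eval-harness-gate API.

```typescript
// Hypothetical shapes; the judge score is assumed to lie in [0, 1].
interface CaseResult {
  id: string;
  score: number;
}

interface GateConfig {
  minScore: number;    // a case passes when its score is at least this
  minPassRate: number; // the run passes when this share of cases pass
}

interface GateOutcome {
  passed: boolean;
  passRate: number;
  failures: string[];
}

function evaluateGate(results: CaseResult[], config: GateConfig): GateOutcome {
  const failures = results.filter((r) => r.score < config.minScore).map((r) => r.id);
  const passRate =
    results.length === 0 ? 0 : (results.length - failures.length) / results.length;
  // An empty run is treated as a failure rather than a vacuous pass.
  const passed = results.length > 0 && passRate >= config.minPassRate;
  return { passed, passRate, failures };
}
```

The `failures` list is what you would attach to the incident so whoever picks it up sees which golden cases regressed.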
AWS Bedrock Lead Intake for Small Business Growth
Phone calls and web forms become structured, qualified leads automatically routed to your CRM. Voice calls are transcribed with Deepgram, classified by intent, and enriched with extracted contact details, all backed by Bedrock models. Document attachments get OCR’d and ingested; a circuit breaker keeps the pipeline reliable.
Read the tutorial → · Download the code (zip)
Under the hood: AWS Bedrock, Express, 136 tests, 97.2% coverage. Uses agent-mesh-classifier, agent-memory-extraction, and HubSpot handoff.
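If you have not met the circuit breaker pattern before, here is a minimal sketch of it. It is synchronous for brevity and takes an injectable clock so it can be tested; the class and its parameters are hypothetical and do not mirror circuit-breaker-core or the tutorial's code.

```typescript
type BreakerState = "closed" | "open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  call<T>(fn: () => T, fallback: () => T): T {
    let probing = false;
    if (this.state === "open") {
      // While cooling down, skip the dependency entirely.
      if (this.now() - this.openedAt < this.cooldownMs) return fallback();
      probing = true; // cooldown elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the breaker.
      if (probing || this.failures >= this.threshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      return fallback();
    }
  }
}
```

In a lead pipeline the fallback might queue the lead for later processing instead of dropping it while the upstream service recovers.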
xAI Grok PII Detection for SMB Customer Communication
A proxy that sits between your app and the Grok API, scanning messages in both directions for PII and offensive content. When risk is detected, a circuit breaker returns a canned safe response instead. It stops sensitive data leaks and brand damage before they happen.
Read the tutorial → · Download the code (zip)
Under the hood: xAI Grok, Express, 66 tests, 97.9% coverage. Integrated with agent-mesh-classifier, circuit-breaker-core, and agent-handoff-validation.
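The two-direction scan can be sketched as a wrapper around the model call. The two regular expressions below are deliberately simple stand-ins (email addresses and US SSN format) and are an assumption of this sketch; the tutorial relies on a classifier rather than these patterns, and `guardedCall` is a hypothetical name. The model call is injected and synchronous to keep the example short.

```typescript
// Illustrative patterns only; real PII detection needs far broader coverage.
const PII_PATTERNS: RegExp[] = [
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/, // email address
  /\b\d{3}-\d{2}-\d{4}\b/, // US SSN format
];

const SAFE_RESPONSE = "Sorry, I can't help with that request.";

function containsPii(text: string): boolean {
  return PII_PATTERNS.some((pattern) => pattern.test(text));
}

function guardedCall(userMessage: string, callModel: (message: string) => string): string {
  // Inbound check: never forward risky input to the model.
  if (containsPii(userMessage)) return SAFE_RESPONSE;
  const reply = callModel(userMessage);
  // Outbound check: never return a risky reply to the user.
  return containsPii(reply) ? SAFE_RESPONSE : reply;
}
```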
Mistral AI Invoice Extraction for SMB Accounting
Upload invoices as PDFs or images; the pipeline parses them with LlamaParse, extracts vendor, totals, and line items via Mistral Large, and routes low-confidence results to a human review queue. Every LLM call stays under a configurable monthly budget cap.
Read the tutorial → · Download the code (zip)
Under the hood: Mistral, Express, 125 tests, 99.7% coverage. Built with agent-memory-extraction, agent-handoff-protocol, and agent-budget-spend-tracker.
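Routing low-confidence results to review is easy to picture in code. The sketch assumes the extraction step reports a confidence between 0 and 1 for each field; the 0.85 threshold and the names `InvoiceExtraction` and `routeExtraction` are hypothetical, not taken from the tutorial.

```typescript
// Hypothetical shapes; confidence is assumed to lie in [0, 1].
interface ExtractedField<T> {
  value: T;
  confidence: number;
}

interface LineItem {
  description: string;
  amount: number;
}

interface InvoiceExtraction {
  vendor: ExtractedField<string>;
  total: ExtractedField<number>;
  lineItems: ExtractedField<LineItem[]>;
}

interface RoutingDecision {
  queue: "auto" | "review";
  lowConfidenceFields: string[];
}

function routeExtraction(invoice: InvoiceExtraction, threshold = 0.85): RoutingDecision {
  const fields: [string, number][] = [
    ["vendor", invoice.vendor.confidence],
    ["total", invoice.total.confidence],
    ["lineItems", invoice.lineItems.confidence],
  ];
  // Any single weak field is enough to send the invoice to a human.
  const lowConfidenceFields = fields
    .filter(([, confidence]) => confidence < threshold)
    .map(([name]) => name);
  return { queue: lowConfidenceFields.length > 0 ? "review" : "auto", lowConfidenceFields };
}
```

Returning the list of weak fields lets the review screen highlight exactly what the reviewer needs to check.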
Cohere RAG Legal Research for SMB Law Firms
Index firm documents and public case law into Qdrant with Cohere embeddings, then ask natural-language legal questions. Multi-step queries (like comparing rulings across jurisdictions) are broken into sub-questions, answered with citations, and synthesized. Per-query spending is capped at $0.50.
Read the tutorial → · Download the code (zip)
Under the hood: Cohere, Next.js, 96 tests, 93.5% coverage. Uses agent-memory-embedding, agent-memory-retrieval, and agent-budget-engine.
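One way to enforce a per-query cap like the $0.50 above is to check an estimated cost before each sub-question runs. The sketch below is an assumption about how such a cap could work, not the agent-budget-engine API; `QueryBudget` and `planSubQuestions` are hypothetical names, and amounts are integer micro-dollars to avoid floating-point drift.

```typescript
const PER_QUERY_CAP_MICRO_USD = 500_000; // $0.50 expressed in micro-dollars

class QueryBudget {
  private spent = 0;
  private readonly cap: number;

  constructor(capMicroUsd: number) {
    this.cap = capMicroUsd;
  }

  // Reserve the cost if it fits under the cap; otherwise refuse and change nothing.
  tryCharge(costMicroUsd: number): boolean {
    if (this.spent + costMicroUsd > this.cap) return false;
    this.spent += costMicroUsd;
    return true;
  }

  get remaining(): number {
    return this.cap - this.spent;
  }
}

interface SubQuestionPlan {
  run: string[];
  skipped: string[];
}

// Decide which sub-questions can run before the query hits its cap.
function planSubQuestions(
  subQuestions: string[],
  estimateCost: (question: string) => number,
  budget: QueryBudget,
): SubQuestionPlan {
  const plan: SubQuestionPlan = { run: [], skipped: [] };
  for (const question of subQuestions) {
    if (budget.tryCharge(estimateCost(question))) plan.run.push(question);
    else plan.skipped.push(question);
  }
  return plan;
}
```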
Anthropic Code Sandbox for SMB Data Analysis
Upload a CSV, ask a business question in plain English, and get a budget-controlled analysis back. Claude generates Python code that runs in an E2B sandbox; a circuit breaker prevents runaway costs, a quality judge checks the output, and past analyses are stored for reuse.
Read the tutorial → · Download the code (zip)
Under the hood: Anthropic, Next.js + Express, 96 tests, 99.5% coverage. Wires together agent-budget-engine, circuit-breaker-core, and agent-eval-harness-judge.
Building blocks shipped
Confidence Router
A framework for routing based on classification confidence. The confidence-router package decides whether to route, ask for clarification, or fall back; confidence-router-classifiers provides keyword, embedding, and LLM classifiers; and confidence-router-evaluation tunes thresholds against your data. Languages and core types round out the family.
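The route, clarify, or fall back decision can be sketched as two thresholds over ranked candidates. The threshold values and the `decide` function below are illustrative assumptions, not the confidence-router API.

```typescript
// Hypothetical shapes; confidence is assumed to lie in [0, 1].
interface Candidate {
  label: string;
  confidence: number;
}

type Decision =
  | { action: "route"; label: string }
  | { action: "clarify"; options: string[] }
  | { action: "fallback" };

function decide(candidates: Candidate[], routeAt = 0.8, clarifyAt = 0.5): Decision {
  const ranked = candidates.slice().sort((a, b) => b.confidence - a.confidence);
  const top = ranked[0];
  // Nothing plausible: hand off to the fallback path.
  if (top === undefined || top.confidence < clarifyAt) return { action: "fallback" };
  // Confident enough: route straight to the top label.
  if (top.confidence >= routeAt) return { action: "route", label: top.label };
  // In between: ask the user to choose among the plausible labels.
  const options = ranked.filter((c) => c.confidence >= clarifyAt).map((c) => c.label);
  return { action: "clarify", options };
}
```

Tuning then amounts to picking `routeAt` and `clarifyAt` against labeled examples from your own traffic.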
Context Window Planner
Manage token budgets with a builder class and pluggable packing strategies. The CLI accepts JSON on stdin and outputs a packing plan, so it slots into shell pipelines. Both the library and the CLI depend on js-tiktoken for accurate token counting.
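As an example of what a packing strategy might look like, here is a greedy highest-priority-first sketch. The names are hypothetical and not the planner's API, and the default counter is a crude word count standing in for a real tokenizer such as js-tiktoken so the example has no dependencies.

```typescript
// Hypothetical shapes for illustration.
interface ContextItem {
  id: string;
  text: string;
  priority: number; // higher means more important
}

interface PackingPlan {
  included: string[];
  dropped: string[];
  usedTokens: number;
}

// Crude stand-in for a real tokenizer: counts whitespace-separated words.
const approxTokens = (text: string): number => text.split(/\s+/).filter(Boolean).length;

function packByPriority(
  items: ContextItem[],
  budget: number,
  countTokens: (text: string) => number = approxTokens,
): PackingPlan {
  const plan: PackingPlan = { included: [], dropped: [], usedTokens: 0 };
  const ordered = items.slice().sort((a, b) => b.priority - a.priority);
  for (const item of ordered) {
    const cost = countTokens(item.text);
    if (plan.usedTokens + cost <= budget) {
      plan.included.push(item.id);
      plan.usedTokens += cost;
    } else {
      // Too big for what is left; smaller, lower-priority items may still fit.
      plan.dropped.push(item.id);
    }
  }
  return plan;
}
```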
Guardrail Chain
Orchestrate sequences of input and output guardrails. guardrail-chain provides the orchestrator, with budgeting, circuit breaking, and retries built in; guardrail-chain-guardrails ships thirteen pre-built guards covering PII redaction, prompt injection, and moderation. Observability and config-loading utilities are included.
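The core idea of a guardrail chain is that each guard can pass text through, rewrite it, or stop the chain. The sketch below shows that control flow only; the `Guard` type, the two sample guards, and `runChain` are hypothetical and leave out the budgeting, circuit breaking, and retry that the real orchestrator provides.

```typescript
// A guard either passes (possibly rewritten) text along or blocks with a reason.
type GuardResult = { ok: true; text: string } | { ok: false; reason: string };
type Guard = (text: string) => GuardResult;

// Sample guard: replace email addresses with a placeholder.
const redactEmails: Guard = (text) => ({
  ok: true,
  text: text.replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]"),
});

// Sample guard: block one well-known prompt-injection phrasing.
const blockInjection: Guard = (text) =>
  /ignore (all )?previous instructions/i.test(text)
    ? { ok: false, reason: "possible prompt injection" }
    : { ok: true, text };

function runChain(guards: Guard[], input: string): GuardResult {
  let current = input;
  for (const guard of guards) {
    const result = guard(current);
    if (result.ok === false) return result; // first block wins; later guards never run
    current = result.text; // feed the rewritten text to the next guard
  }
  return { ok: true, text: current };
}
```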
Hybrid RAG (Qdrant)
Everything you need to build a retrieval pipeline with vector search, BM25, and cross-encoder reranking. hybrid-rag-retrieval fuses Qdrant and in-process keyword results; hybrid-rag-embedding handles OpenAI, Vertex, and local models with batching and cost tracking; hybrid-rag-pipeline ties ingestion, retrieval, and reranking into one class. Also included: an MCP server, CLI, evaluation suite, and full observability.
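This post does not say how hybrid-rag-retrieval fuses vector and keyword results, so treat the following as one common option rather than the package's method: reciprocal rank fusion (RRF), which scores each document by summing 1 / (k + rank) over every list it appears in. The function name is hypothetical, and k = 60 is a conventional default.

```typescript
// Reciprocal rank fusion over any number of ranked ID lists (best hit first).
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Rank is 1-based, so the top hit in a list contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Because RRF only looks at ranks, it sidesteps the problem that vector similarity scores and BM25 scores live on different scales.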
Browse the full catalog at reaatech.com/products.
