Today we shipped 8 new step-by-step tutorials for small-business AI, plus the building blocks underneath them. If you need to monitor LLM apps on Vercel, extract data from invoices, or safely run code for business analytics, you can try one of these this afternoon.
New tutorials
Vercel AI Gateway Reliability Suite for SMB AI Operations
A self-serve dashboard that monitors, replays, and self-heals AI workflows running through Vercel AI Gateway. It’s for small teams that need their LLM apps to stay up without 24/7 ops staff. Health probes ping your endpoints on a schedule, every interaction is recorded for deterministic replay, and incidents auto-trigger when anomalies appear.
Read the tutorial → Download the code (zip)
Under the hood: Vercel AI Gateway, Next.js, 146 tests, 98.77% coverage. Built with agent-runbook-health-checks, agent-replay-core, and other REAA ops packages.
LangChain Observability for SMB AI Workflow Monitoring
Plug-and-play tracing and cost observability for LangChain pipelines. It shows you where latency piles up, which chain step costs the most, and why a prompt is bleeding tokens — all with an Express sidecar you can add to an existing project in hours. Traces export to Langfuse so you can see real-time metrics.
Read the tutorial → Download the code (zip)
Under the hood: LangChain, Express, 78 tests, 98.89% coverage. Uses agent-budget-otel-bridge, agent-eval-harness-observability, and observability packages.
Anthropic Eval Harness for Agent Quality Assurance
Continuous regression testing and safety scoring for Anthropic-powered agents. Before you push a new model version to customers, this harness runs golden datasets and LLM-as-a-judge scoring to catch toxic phrasing, hallucinations, or missed tool calls. Failures trigger incidents automatically.
Read the tutorial → Download the code (zip)
Under the hood: Anthropic, Next.js, 94 tests, 98.96% coverage. Relies on agent-eval-harness-suite, agent-eval-harness-gate, and agent-runbook-incident.
AWS Bedrock Lead Intake for Small Business Growth
Phone calls and web forms become structured, qualified leads — automatically routed to your CRM. Voice calls are transcribed with Deepgram, classified for intent, and key fields extracted with Bedrock. Document attachments get OCR’d and ingested through the same pipeline. Built to handle load with circuit breakers and budget tracking.
Read the tutorial → Download the code (zip)
Under the hood: AWS Bedrock, Express, 136 tests, 97.19% coverage. Uses agent-mesh-classifier, agent-memory-extraction, and agent-handoff.
xAI Grok PII Detection for SMB Customer Communication
A reverse proxy that scans Grok-powered messages for PII and offensive content before they reach end users. Both incoming prompts and outgoing completions are checked, and risky content triggers a canned safe reply instead. Fails safely when downstream services are unavailable.
Read the tutorial → Download the code (zip)
Under the hood: xAI Grok, Express, 66 tests, 97.90% coverage. Built with agent-mesh-classifier, hai-guardrails, circuit-breaker-core.
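The pre/post scanning pattern the proxy uses can be sketched in a few lines. Everything here is illustrative: the regexes, the `SAFE_REPLY` text, and the function names are assumptions, not the tutorial's actual rules or API.

```typescript
// Minimal sketch of a scan-both-directions proxy. Patterns and the
// canned reply are invented for illustration.
const SAFE_REPLY = "Sorry, I can't help with that request.";

const PII_PATTERNS: RegExp[] = [
  /\b[\w.+-]+@[\w-]+\.[\w.]+\b/,     // email addresses
  /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/, // US-style phone numbers
  /\b\d{3}-\d{2}-\d{4}\b/,           // SSN-shaped strings
];

function containsPII(text: string): boolean {
  return PII_PATTERNS.some((re) => re.test(text));
}

// Check the prompt before it reaches the model and the completion
// before it reaches the user; substitute a canned reply on either hit,
// and fail safe if the downstream service throws.
async function guardedCompletion(
  prompt: string,
  callModel: (p: string) => Promise<string>,
): Promise<string> {
  if (containsPII(prompt)) return SAFE_REPLY;
  try {
    const completion = await callModel(prompt);
    return containsPII(completion) ? SAFE_REPLY : completion;
  } catch {
    return SAFE_REPLY; // downstream unavailable: degrade, don't crash
  }
}
```

The same wrapper shape works for offensive-content classifiers; swap the regex check for a model-backed one.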
Mistral AI Invoice Extraction for SMB Accounting
Upload invoices as PDFs or images, and this pipeline extracts vendor, line items, and totals using Mistral Large. Low-confidence extractions are routed to a pending review queue for a human to confirm, and every API call is tracked against a configurable monthly budget cap.
Read the tutorial → Download the code (zip)
Under the hood: Mistral, Express, 125 tests, 99.71% coverage. Uses agent-memory-extraction, agent-handoff-protocol, agent-budget-spend-tracker.
Cohere RAG Legal Research for SMB Law Firms
Index firm documents and public case law into Qdrant, then ask natural-language legal questions and get cited answers. Complex multi-step queries are decomposed into sub-questions, each with its own budget check. Per-conversation spending is capped at $0.50.
Read the tutorial → Download the code (zip)
Under the hood: Cohere, Next.js, 96 tests, 93.53% coverage. Relies on agent-memory-embedding, agent-memory-retrieval, agent-handoff-routing, agent-budget-engine.
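A per-conversation cap with per-sub-question checks can be sketched like this. The class and function names are hypothetical; only the $0.50 cap and the decompose-then-check flow come from the description above.

```typescript
// Hypothetical budget tracker: each decomposed sub-question must fit
// under the remaining conversation budget before it runs.
class ConversationBudget {
  private spent = 0;
  constructor(private readonly capUsd = 0.5) {}

  // Records the spend and returns true only if the call fits the cap.
  tryCharge(costUsd: number): boolean {
    if (this.spent + costUsd > this.capUsd) return false;
    this.spent += costUsd;
    return true;
  }

  remaining(): number {
    return this.capUsd - this.spent;
  }
}

// Answer sub-questions in order, stopping as soon as the cap is hit.
function runSubQuestions(
  subQuestions: string[],
  costPerCallUsd: number,
  budget: ConversationBudget,
): string[] {
  const answered: string[] = [];
  for (const q of subQuestions) {
    if (!budget.tryCharge(costPerCallUsd)) break;
    answered.push(q);
  }
  return answered;
}
```

Checking before each sub-question (rather than once per conversation) is what keeps a runaway decomposition from blowing past the cap mid-query.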
Anthropic Code Sandbox for SMB Data Analysis
Upload CSVs or connect a database, then ask business questions in plain English. Claude generates Python code that executes in an E2B sandbox, with per-query cost limits and circuit breakers to prevent runaway spend. Past analyses are stored for context reuse, and output is quality-gated before you see it.
Read the tutorial → Download the code (zip)
Under the hood: Anthropic, Next.js + Express, 96 tests, 99.48% coverage. Uses agent-budget-engine, circuit-breaker-core, agent-eval-harness-judge, agent-memory.
Building blocks shipped
Confidence Router
The Confidence Router ecosystem provides a decision engine for routing based on classification confidence thresholds. The core library, @reaatech/confidence-router-core, defines the type system and DecisionEngine. @reaatech/confidence-router-classifiers offers a registry of keyword, embedding, and LLM-based classifiers, while @reaatech/confidence-router-evaluation optimizes thresholds against labeled datasets. The @reaatech/confidence-router-languages package handles localized clarification prompts for 47 languages. The main @reaatech/confidence-router ties it all together. All five packages are at version 0.1.0.
Browse the Confidence Router catalog →
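The core routing idea is two thresholds over a classifier's confidence score. This is a sketch of that decision logic only; the real DecisionEngine types in @reaatech/confidence-router-core may differ.

```typescript
// Illustrative three-way routing on classification confidence.
type Decision = "accept" | "clarify" | "fallback";

interface Thresholds {
  accept: number;  // at or above this, act on the classification
  clarify: number; // between clarify and accept, ask the user to confirm
}

function decide(confidence: number, t: Thresholds): Decision {
  if (confidence >= t.accept) return "accept";
  if (confidence >= t.clarify) return "clarify";
  return "fallback"; // too uncertain: route to a default handler or human
}
```

The evaluation package's job, in these terms, is picking the `accept` and `clarify` values that maximize some metric over a labeled dataset instead of hand-tuning them.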
Context Window Planner
Two new packages for managing LLM context windows: @reaatech/context-window-planner offers a builder class with packing strategies (priority, summarize, drop) based on token counts from js-tiktoken. Its companion @reaatech/context-window-planner-cli provides a command-line interface that reads JSON from stdin and outputs a packing plan.
Browse the Context Window Planner catalog →
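The priority and drop strategies reduce to a greedy fit under a token budget. A sketch, with a chars/4 heuristic standing in for real js-tiktoken counts:

```typescript
// Each context segment carries a priority; higher means keep first.
interface Segment {
  text: string;
  priority: number;
}

// Crude token estimate for the sketch; the package uses js-tiktoken.
const approxTokens = (text: string) => Math.ceil(text.length / 4);

// "priority" packing: keep the highest-priority segments that fit,
// dropping whatever overflows the budget.
function pack(segments: Segment[], maxTokens: number): Segment[] {
  const kept: Segment[] = [];
  let used = 0;
  for (const seg of [...segments].sort((a, b) => b.priority - a.priority)) {
    const cost = approxTokens(seg.text);
    if (used + cost > maxTokens) continue;
    kept.push(seg);
    used += cost;
  }
  return kept;
}
```

A "summarize" strategy would replace the overflow segments with a compressed version rather than dropping them outright.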
Guardrail Chain
Four new packages form a guardrail orchestration framework. @reaatech/guardrail-chain sequences input/output guardrails with budget management, circuit breaking, and retry logic. It depends on @reaatech/guardrail-chain-observability, which provides pluggable logging, metrics, and tracing interfaces. @reaatech/guardrail-chain-config loads and validates configurations from YAML or environment variables. @reaatech/guardrail-chain-guardrails ships 13 pre-built guardrail classes — PII redaction, prompt injection detection, toxicity filtering, etc.
Browse the Guardrail Chain catalog →
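Sequencing with short-circuiting is the heart of the chain. The interfaces below are invented for illustration and are not the published package API:

```typescript
// A guardrail either passes (optionally rewriting the text, e.g. PII
// redaction) or fails, which short-circuits the whole chain.
interface GuardrailResult {
  passed: boolean;
  transformed?: string;
}

type Guardrail = (text: string) => GuardrailResult;

function runChain(text: string, chain: Guardrail[]): GuardrailResult {
  let current = text;
  for (const guard of chain) {
    const result = guard(current);
    if (!result.passed) return { passed: false };
    current = result.transformed ?? current;
  }
  return { passed: true, transformed: current };
}

// Two toy guardrails: redact email-shaped strings, block a bad word.
const redactEmails: Guardrail = (t) => ({
  passed: true,
  transformed: t.replace(/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[REDACTED]"),
});
const blockBadWords: Guardrail = (t) => ({ passed: !/badword/i.test(t) });
```

Ordering matters: a redacting guardrail placed before a classifier means the classifier never sees raw PII, which is usually what you want.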
Hybrid RAG (Qdrant)
Ten packages landed for the Hybrid RAG system. The core @reaatech/hybrid-rag provides TypeScript interfaces and Zod schemas. @reaatech/hybrid-rag-embedding generates vectors via OpenAI, Vertex AI, or local models with batching and cost tracking. @reaatech/hybrid-rag-ingestion chunks documents with four strategies. @reaatech/hybrid-rag-qdrant wraps the Qdrant client for collection management and filtered search. @reaatech/hybrid-rag-retrieval fuses vector and BM25 results with cross-encoder reranking. @reaatech/hybrid-rag-pipeline orchestrates end-to-end ingestion and retrieval. @reaatech/hybrid-rag-observability adds structured logging and OpenTelemetry tracing. @reaatech/hybrid-rag-evaluation benchmarks retrieval and generation quality. @reaatech/hybrid-rag-mcp-server exposes 40+ MCP tools for the full lifecycle. Finally, @reaatech/hybrid-rag-cli provides a command-line interface for all of the above.
Browse the Hybrid RAG catalog →
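Reciprocal rank fusion (RRF) is a standard way to fuse vector and BM25 result lists; whether hybrid-rag-retrieval uses exactly this formula is an assumption, but it shows the shape of the problem:

```typescript
// RRF: each result list contributes 1 / (k + rank) per document, and
// documents are re-ranked by their summed score across lists.
function rrfFuse(
  rankings: string[][], // each inner array: doc IDs, best first
  k = 60,               // damping constant; 60 is the conventional default
): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

A cross-encoder reranker, as described above, would then rescore only the top few fused results, since it is too expensive to run over the full candidate set.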
Browse the full catalog at reaatech.com/products.
Comments
Sign in with GitHub to comment and vote.
