reaatech

Vercel AI Gateway Agent Eval Harness for SMB Support Bots

An automated regression testing pipeline that evaluates SMB support agents against golden datasets, using Vercel AI Gateway as the LLM backbone and exporting observability to Langfuse.
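To make "evaluating against a golden dataset" concrete, here is a minimal sketch of the scoring step only. Every name in it (`GoldenCase`, `AgentOutput`, `scoreCase`, `passRate`) and all sample data are hypothetical and not taken from the published artifact. It assumes agent outputs have already been collected; how the agent is called through Vercel AI Gateway and how results are exported to Langfuse are deliberately left out.

```typescript
// Hypothetical shapes for illustration; not the artifact's actual types.
interface GoldenCase {
  id: string;
  input: string;
  mustMention: string[];   // phrases a correct answer is assumed to contain
  expectedTools: string[]; // tool names the agent is expected to call
}

interface AgentOutput {
  answer: string;
  toolsCalled: string[];
}

interface CaseResult {
  id: string;
  answerOk: boolean;
  toolsOk: boolean;
}

// Score one recorded output against its golden case.
function scoreCase(golden: GoldenCase, output: AgentOutput): CaseResult {
  const answer = output.answer.toLowerCase();
  const answerOk = golden.mustMention.every((p) => answer.includes(p.toLowerCase()));
  // Tool use passes only when the set of called tools equals the expected set.
  const called = new Set(output.toolsCalled);
  const expected = new Set(golden.expectedTools);
  const toolsOk = called.size === expected.size && [...expected].every((t) => called.has(t));
  return { id: golden.id, answerOk, toolsOk };
}

// Fraction of cases where both the answer and the tool use passed.
function passRate(results: CaseResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.answerOk && r.toolsOk).length / results.length;
}

// Made-up sample data, for illustration only.
const goldenSet: GoldenCase[] = [
  { id: "refund-1", input: "How do I get a refund?", mustMention: ["30 days"], expectedTools: ["lookup_order"] },
  { id: "hours-1", input: "When are you open?", mustMention: ["9am"], expectedTools: [] },
];

const recorded = new Map<string, AgentOutput>([
  ["refund-1", { answer: "Refunds are available within 30 days of purchase.", toolsCalled: ["lookup_order"] }],
  ["hours-1", { answer: "We are open 9am to 5pm on weekdays.", toolsCalled: ["lookup_order"] }],
]);

// A missing output is scored as an empty answer with no tool calls.
const results = goldenSet.map((g) =>
  scoreCase(g, recorded.get(g.id) ?? { answer: "", toolsCalled: [] }),
);

console.log(results, passRate(results));
```

In this sample the second case fails on tool use because the agent called a tool when none was expected, so the answer text alone would not have revealed the problem. Phrase matching is a deliberately crude stand-in for answer-quality scoring; a real harness might use a grader model or a rubric instead.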

The problem

Small businesses deploying AI support bots lack a systematic way to catch regressions before they reach customers. Ad-hoc manual testing and single-metric checks miss subtle degradations in answer quality and tool-use accuracy, and they miss gradual cost creep.
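One way to avoid the single-metric trap is to gate a release on several metrics at once, each compared against a stored baseline. The sketch below is an assumption about how such a gate could look, not the artifact's implementation: `RunMetrics`, `Tolerances`, `findRegressions`, and all numbers are hypothetical.

```typescript
// Hypothetical aggregate metrics for one evaluation run.
interface RunMetrics {
  answerQuality: number;  // 0..1, higher is better
  toolAccuracy: number;   // 0..1, higher is better
  costPerCaseUsd: number; // average cost per golden case, lower is better
}

// Hypothetical tolerances; real values would be tuned per bot.
interface Tolerances {
  maxQualityDrop: number;       // absolute drop allowed in answerQuality
  maxToolAccuracyDrop: number;  // absolute drop allowed in toolAccuracy
  maxCostIncreaseRatio: number; // 0.10 allows a 10% cost increase
}

// Return the names of all metrics that regressed beyond tolerance.
function findRegressions(baseline: RunMetrics, candidate: RunMetrics, tol: Tolerances): string[] {
  const failures: string[] = [];
  if (baseline.answerQuality - candidate.answerQuality > tol.maxQualityDrop) {
    failures.push("answerQuality");
  }
  if (baseline.toolAccuracy - candidate.toolAccuracy > tol.maxToolAccuracyDrop) {
    failures.push("toolAccuracy");
  }
  if (candidate.costPerCaseUsd > baseline.costPerCaseUsd * (1 + tol.maxCostIncreaseRatio)) {
    failures.push("costPerCaseUsd");
  }
  return failures;
}

// Illustrative numbers only; not measurements from the artifact.
const baseline: RunMetrics = { answerQuality: 0.9, toolAccuracy: 0.95, costPerCaseUsd: 0.01 };
const candidate: RunMetrics = { answerQuality: 0.89, toolAccuracy: 0.95, costPerCaseUsd: 0.013 };
const tolerances: Tolerances = { maxQualityDrop: 0.02, maxToolAccuracyDrop: 0.02, maxCostIncreaseRatio: 0.1 };

const regressions = findRegressions(baseline, candidate, tolerances);
console.log(regressions.length === 0 ? "no regressions" : `regressed: ${regressions.join(", ")}`);
```

With these sample values the quality drop is within tolerance, so a quality-only check would pass, while the 30% cost increase trips the gate.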

Example artifact

A complete, working implementation of this recipe, downloadable as a zip or browsable file by file. It is generated by our build pipeline and tested before publishing; the test count and coverage are listed below.

175 kB · 87 tests · 98.4% coverage · vitest passing

SHA-256: 84dad4328903bab3994569ec06164d53b0139c56cdf2e8711988e92bab9d34ce
