Weekly recap, May 4, 2026 – May 10, 2026

51 new open-source repos and 57 npm packages for building, securing, and observing AI agents — all free and open source.

RecapBot · 7 min read

This week we published 51 new open-source repositories and 57 npm packages for production AI agent systems. The work spans MCP infrastructure, agent orchestration, observability, security, and deployment runtimes — all free and open source.

New repos

mcp-gateway

Production MCP gateway with authentication, rate limiting, schema enforcement, tool allowlists, audit trail, fan-out routing, and response caching. It lets you secure and scale connections to upstream MCP servers with composable middleware.

Browse the code · Product page

guardrail-chain

Composable, budget-aware input/output guardrail pipeline for LLM applications. Fluent chain builder for PII redaction, prompt injection detection, and content safety while enforcing latency and token budgets.

Browse the code · Product page

agent-eval-harness

Enterprise-grade evaluation harness for AI agents: trajectory scoring, tool-use validation, cost tracking, latency budgets, golden trajectories, LLM-as-judge, CI regression gates, and MCP integration.

Browse the code · Product page

llm-cost-telemetry

Multi-tenant LLM cost telemetry with provider SDK wrappers (OpenAI, Anthropic, Google) and observability export to Prometheus, OTLP, and Phoenix.

Browse the code · Product page

confidence-router

Decision engine for route/clarify/fallback patterns using confidence-gated intent routing with configurable thresholds and pluggable classifiers.

Browse the code · Product page
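
As a rough illustration of the route/clarify/fallback pattern described above, here is a minimal sketch. The type names, the `decide` function, and the default thresholds are assumptions made for this example; they are not taken from the confidence-router API.

```typescript
// Hypothetical types for illustration; not the confidence-router API.
type Classification = { intent: string; confidence: number };

type Decision =
  | { action: "route"; intent: string }
  | { action: "clarify"; intent: string }
  | { action: "fallback" };

interface Thresholds {
  route: number;   // at or above this: dispatch directly
  clarify: number; // at or above this (but below route): ask the user to confirm
}

// Maps a classifier result to one of three actions using two thresholds.
// The default values are arbitrary and only serve the example.
function decide(
  c: Classification,
  t: Thresholds = { route: 0.8, clarify: 0.5 },
): Decision {
  if (c.confidence >= t.route) return { action: "route", intent: c.intent };
  if (c.confidence >= t.clarify) return { action: "clarify", intent: c.intent };
  return { action: "fallback" };
}
```

With these defaults, a classification at 0.9 is routed, one at 0.6 triggers a clarifying question, and one at 0.2 falls back.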

mcp-contract-kit

Conformance test suite for MCP servers with Zod schemas, validators, reporters, and CLI. Automate spec compliance and security validation in CI.

Browse the code · Product page

hybrid-rag-qdrant

Production hybrid RAG with vector + BM25 + reranker, benchmarked chunking strategies, and evaluation frameworks. Pairs with rag-eval-pack.

Browse the code · Product page

classifier-evals

Enterprise classifier evaluation suite with confusion matrices, LLM-as-judge, regression gates, and Phoenix/Langfuse exporters.

Browse the code · Product page

prompt-version-control

Git-like versioning for prompts with eval-gated promotion. API server, SDK, CLI, and MCP server to manage prompt lifecycles.

Browse the code · Product page

prompt-injection-bench

Reproducible benchmark and test corpus for prompt-injection defenses. Swappable defense adapters, parallelized benchmarks, and statistical scoring.

Browse the code · Product page

idempotency-middleware

Framework-agnostic idempotency cache for HTTP APIs. Pluggable storage (in-memory, Redis, DynamoDB, Firestore) with distributed locking and Express/Koa handlers.

Browse the code · Product page
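
To make the idea of an idempotency cache concrete, the sketch below shows an in-memory version of the general technique. The `IdempotencyCache` class and its `run` method are hypothetical names for this example and do not reflect the idempotency-middleware API; distributed locking and external storage are left out.

```typescript
// Hypothetical in-memory sketch; not the idempotency-middleware API.
class IdempotencyCache<T> {
  private readonly entries = new Map<string, { value: Promise<T>; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Runs `handler` at most once per key within the TTL. Concurrent and
  // repeated calls with the same key share the first call's promise.
  run(key: string, handler: () => Promise<T>): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;

    const value = handler();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Drop failed attempts so a retry can execute the handler again.
    value.catch(() => {
      if (this.entries.get(key)?.value === value) this.entries.delete(key);
    });
    return value;
  }
}
```

In an HTTP handler, the key would typically come from a client-supplied idempotency header, so a retried request returns the stored result instead of repeating the side effect.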

webhook-relay-mcp

MCP server that receives webhooks from Stripe, GitHub, Twilio, normalizes them, and exposes them to agents as subscription-based tools.

Browse the code · Product page

agent-replay

Record and deterministically replay agent interactions. Decouples debugging from live LLM calls, supports diff-mode and step-through debugging.

Browse the code · Product page

otel-cost-exporter

OpenTelemetry-native LLM cost exporter with multi-provider pricing. Converts GenAI semantic spans into USD metrics for Prometheus/OTLP.

Browse the code · Product page

structured-output-repair

Catch and fix malformed LLM structured outputs: strips fences, coerces types, fuzzy-matches keys, and re-prompts if unrepairable.

Browse the code · Product page
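
The first two repair steps named above (stripping fences and coercing types) can be sketched as follows. The helper names and the flat `schema` shape are assumptions for this example, not the structured-output-repair API; fuzzy key matching and re-prompting are omitted.

```typescript
// Hypothetical helpers for illustration; not the structured-output-repair API.
type FieldType = "string" | "number" | "boolean";

// Built at runtime so this sample does not contain a literal code fence.
const FENCE = "`".repeat(3);
const FENCED = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Returns the content of the first fenced block, or the input if none exists.
function stripFences(raw: string): string {
  const inner = raw.match(FENCED)?.[1];
  return (inner ?? raw).trim();
}

// Converts common near-miss values ("3", "true", 7) to the expected type.
function coerce(value: unknown, type: FieldType): unknown {
  if (type === "number" && typeof value === "string" &&
      value.trim() !== "" && !Number.isNaN(Number(value))) {
    return Number(value);
  }
  if (type === "boolean" && (value === "true" || value === "false")) {
    return value === "true";
  }
  if (type === "string" && (typeof value === "number" || typeof value === "boolean")) {
    return String(value);
  }
  return value; // already the right type, or not safely coercible
}

function repair(raw: string, schema: Record<string, FieldType>): Record<string, unknown> | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stripFences(raw));
  } catch {
    return null; // unrepairable here; a caller could re-prompt the model
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return null;
  const source = parsed as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const [key, type] of Object.entries(schema)) {
    out[key] = coerce(source[key], type);
  }
  return out;
}
```

A `null` result is the signal that local repair failed and a re-prompt is the remaining option.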

agent-budget-controller

Real-time cost budget enforcement for agent systems. Pre-flight cost checks, model downgrades, and per-scope blocking with observability.

Browse the code · Product page
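
A pre-flight cost check with model downgrade might look roughly like the sketch below. The `preflight` function, the `Model` shape, and the flat per-1k-token price are assumptions for this example; they are not the agent-budget-controller API, and any prices a caller passes in are the caller's own.

```typescript
// Hypothetical sketch; not the agent-budget-controller API.
interface Model {
  name: string;
  usdPer1kTokens: number; // simplified flat price, assumed for the example
}

type Verdict =
  | { allowed: true; model: string; estimatedUsd: number }
  | { allowed: false; reason: string };

// `models` is ordered from most to least preferred. The first model whose
// estimated cost fits the remaining budget wins; if none fits, the call is blocked.
function preflight(models: Model[], estimatedTokens: number, remainingUsd: number): Verdict {
  for (const m of models) {
    const estimatedUsd = (estimatedTokens / 1000) * m.usdPer1kTokens;
    if (estimatedUsd <= remainingUsd) {
      return { allowed: true, model: m.name, estimatedUsd };
    }
  }
  return { allowed: false, reason: "budget exhausted" };
}
```

Running the check before the LLM call is what makes the enforcement real-time: the request is downgraded or rejected before any cost is incurred.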

mcp-schema-evolution

Tooling for safely evolving MCP tool schemas: diffing, change classification (breaking/non-breaking), CI policy enforcement, and migration guidance.

Browse the code · Product page

media-pipeline-mcp

Chainable media operations (image, audio, video, documents, 3D) as MCP tools with quality gates, cost tracking, and caching.

Browse the code · Product page

invoicing-app

Personal desktop invoicing application: customers, products, invoices, PDF generation, and email via Electron + SQLite.

Browse the code · Product page

agent-mesh

Multi-agent orchestration mesh: intent classification, confidence-gated routing, session management, circuit breaking, and YAML-configured agents.

Browse the code · Product page

rag-eval-pack

RAG evaluation toolkit: faithfulness, answer relevance, context precision/recall, cost tracking, CI gates. Pairs with hybrid-rag-qdrant.

Browse the code · Product page

multi-tenant-mcp

Primitives for serving multiple tenants from a single MCP server: tenant resolution, rate limiting, tool visibility, cost accounting.

Browse the code · Product page

agents-md-kit

Linter, validator, and scaffolding tool for AGENTS.md and SKILL.md files. Enforces consistent structure with 18 lint rules and Zod validation.

Browse the code · Product page

secret-rotation-kit

Zero-downtime multi-key secret rotation: overlapping validity windows, propagation verification, revocation. Adapters for AWS, GCP, Vault, Vercel.

Browse the code · Product page
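
Overlapping validity windows are what allow rotation without downtime: during the overlap, both the old and new key verify, while new signatures use only the newest key. A minimal sketch, with hypothetical names that are not part of the secret-rotation-kit API:

```typescript
// Hypothetical types for illustration; not the secret-rotation-kit API.
interface KeyVersion {
  id: string;
  notBefore: number; // inclusive start of validity (epoch ms)
  notAfter: number;  // exclusive end of validity (epoch ms)
}

// Keys accepted for verification at time `at`: every key whose window contains it.
function acceptedKeys(keys: KeyVersion[], at: number): KeyVersion[] {
  return keys.filter((k) => k.notBefore <= at && at < k.notAfter);
}

// Key used for signing at time `at`: the valid key with the latest start.
function signingKey(keys: KeyVersion[], at: number): KeyVersion | undefined {
  return acceptedKeys(keys, at).reduce<KeyVersion | undefined>(
    (best, k) => (best === undefined || k.notBefore > best.notBefore ? k : best),
    undefined,
  );
}
```

This sketch switches signing to the new key as soon as its window opens. A production system would likely wait until propagation has been verified, which is not modeled here.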

agentic-arch-patterns

A reference book as a repo: runnable TypeScript patterns for agent systems — circuit breakers, orchestrator-worker, idempotency caches, and more.

Browse the code · Product page

mcp-server-doctor

CLI diagnostic and profiling tool for MCP servers — transport negotiation, latency profiling, concurrency testing, and graded report cards.

Browse the code · Product page

agent-memory

Long-term memory layer for AI agents: fact extraction, semantic retrieval, decay policies, contradiction resolution, and pluggable storage backends.

Browse the code · Product page

faas-hot-runtime

Kubernetes-based FaaS runtime with warm pod pools for sub-100ms invocations. Functions exposed as MCP tools for agent integration.

Browse the code · Product page

agent-auth-proxy

Identity-aware proxy for agent-to-service communication. Handles OAuth2 token management, API key vaulting, and scope enforcement.

Browse the code · Product page

llm-judge-toolkit

Calibrated LLM-as-judge library: multi-judge consensus, position bias detection, human calibration, cost tracking, and caching.

Browse the code · Product page

mcp-server-starter-ts

Production MCP server template in TypeScript: pluggable middleware, dual transports, tool auto-discovery, observability baked in.

Browse the code · Product page

voice-agent-kit

Real-time voice agent pipeline: Twilio → STT → MCP agent with vector retrieval → TTS → Twilio. Latency budgets, barge-in, session continuity.

Browse the code · Product page

mcp-load-test

Load testing framework for MCP servers: concurrent user simulation, breakpoint identification, real-time metrics, transport-aware clients.

Browse the code · Product page

agent-chaos

Fault injection toolkit for agent systems: declarative scenarios (YAML/JSON), transparent interceptors, hot reload. Test resilience patterns.

Browse the code · Product page

terraform-mcp-amazon-eks

Drop-in Terraform module to deploy MCP workloads on Amazon EKS with FaaS-style warm pods, sub-100ms invoke, Redis, SQS, and KEDA autoscaling.

Browse the code · Product page

llm-router

Intelligent LLM routing: cost/latency/quality-based selection, fallback chains, budget enforcement, provider-agnostic with MCP integration.

Browse the code · Product page

agent-handoff-protocol

Standardized lifecycle for transferring AI agent conversations: context compression, capability-based routing, transport delivery (MCP/HTTP).

Browse the code · Product page

funcdock

Lightweight serverless platform: run multiple Node.js functions in a single Docker container, each route auto-exposed as an MCP tool.

Browse the code · Product page

circuit-breaker-agents

Circuit breakers for agent-to-tool/agent communication: per-tool isolation, confidence- and cost-based tripping, gradual recovery.

Browse the code · Product page
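
For readers new to the pattern, here is a minimal failure-count circuit breaker with a cooldown and a half-open probe. The class and its parameters are hypothetical and are not the circuit-breaker-agents API; confidence- and cost-based tripping are not modeled.

```typescript
// Hypothetical minimal breaker; not the circuit-breaker-agents API.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, cooldownMs: number, now: () => number = Date.now) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Reports the state, moving from open to half-open once the cooldown has elapsed.
  current(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  // Runs `fn` unless the circuit is open. A success closes the circuit;
  // a failure in half-open state, or too many consecutive failures, opens it.
  call<T>(fn: () => T): T {
    const state = this.current();
    if (state === "open") throw new Error("circuit open");
    try {
      const result = fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      if (state === "half-open" || this.failures >= this.maxFailures) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

Per-tool isolation then amounts to keeping one breaker per tool name, for example in a `Map<string, CircuitBreaker>`, so one failing tool does not block calls to the others.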

session-continuity-kit

Multi-turn session management: conversation windowing, token budgets, compression, handoff. Adapters for Firestore, DynamoDB, Redis.

Browse the code · Product page

tool-use-firewall

Policy enforcement proxy between agents and MCP tools: cost caps, rate limits, argument validation, human approval for destructive ops.

Browse the code · Product page

agent-runbook-generator

CLI that ingests a service repo and produces an operator runbook: alerts, dashboards, failure modes, rollback steps, dependency maps.

Browse the code · Product page

otel-genai-semconv

OpenTelemetry semantic conventions for GenAI observability: instrumented wrappers for OpenAI, Anthropic, Vertex AI, Bedrock.

Browse the code · Product page

llm-cache

Semantic caching layer for LLM calls: embedding-based similarity, model-aware fingerprinting, cost tracking. Supports Redis, DynamoDB, Qdrant.

Browse the code · Product page
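
Embedding-based lookup can be sketched as a cosine-similarity search with a threshold, scoped per model. The `SemanticCache` class and the 0.95 default are assumptions for this example, not the llm-cache API; embeddings are supplied by the caller rather than computed here.

```typescript
// Hypothetical sketch; not the llm-cache API.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class SemanticCache {
  private readonly entries: { model: string; embedding: number[]; response: string }[] = [];
  private readonly threshold: number;

  constructor(threshold = 0.95) {
    this.threshold = threshold;
  }

  set(model: string, embedding: number[], response: string): void {
    this.entries.push({ model, embedding, response });
  }

  // Returns the most similar cached response for the same model,
  // or undefined when nothing reaches the similarity threshold.
  get(model: string, embedding: number[]): string | undefined {
    let best: string | undefined;
    let bestScore = this.threshold;
    for (const e of this.entries) {
      if (e.model !== model) continue; // never reuse another model's answer
      const score = cosine(e.embedding, embedding);
      if (score >= bestScore) {
        best = e.response;
        bestScore = score;
      }
    }
    return best;
  }
}
```

The linear scan is only suitable for small caches; a vector store would replace it at scale.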

context-window-planner

Optimize token allocation within LLM context windows: decides what to include, summarize, or drop based on configurable packing strategies.

Browse the code · Product page
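
One simple packing strategy is a greedy pass by priority: include an item in full if it fits, fall back to its summary if only that fits, and drop it otherwise. The sketch below assumes hypothetical `Item` fields and precomputed token counts; it is not the context-window-planner API.

```typescript
// Hypothetical types for illustration; not the context-window-planner API.
interface Item {
  id: string;
  tokens: number;        // size if included verbatim
  summaryTokens: number; // size if replaced by a summary
  priority: number;      // higher is packed first
}

interface PlanEntry {
  id: string;
  action: "include" | "summarize" | "drop";
}

// Greedy packing: walk items from highest to lowest priority and spend the
// token budget on the fullest form of each item that still fits.
function planContext(items: Item[], budget: number): PlanEntry[] {
  let remaining = budget;
  const byPriority = [...items].sort((a, b) => b.priority - a.priority);
  return byPriority.map((item): PlanEntry => {
    if (item.tokens <= remaining) {
      remaining -= item.tokens;
      return { id: item.id, action: "include" };
    }
    if (item.summaryTokens <= remaining) {
      remaining -= item.summaryTokens;
      return { id: item.id, action: "summarize" };
    }
    return { id: item.id, action: "drop" };
  });
}
```

Greedy packing is not optimal in general, but it is predictable, which makes the resulting prompts easier to debug.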

a2a-reference-ts

Enterprise TypeScript implementation of Google’s Agent-to-Agent (A2A) protocol with a bidirectional A2A↔MCP bridge, OAuth2/JWT, SSE streaming.

Browse the code · Product page

terraform-mcp-gcp-cloudrun

Drop-in Terraform module to deploy MCP servers on GCP Cloud Run with Firestore, Secret Manager, Pub/Sub, and OTel.

Browse the code · Product page

mcp-catalog

Registry server for MCP server discovery across teams. Register, search, browse, and health-check organizational MCP servers — exposed as an MCP server itself.

Browse the code · Product page

mcp-changelog

Automated changelog and migration guide generator for MCP servers: diffs tool schemas between git tags, generates breaking-change summaries, CI-friendly.

Browse the code · Product page

bicycle-brands-models

Structured JSON dataset of bicycle brands, models, and rider height specs. Useful for eCommerce catalogs, sizing recommenders, or data projects.

Browse the code · Product page

terraform-mcp-observability

Drop-in Terraform module for complete observability of MCP agent systems: traces (Phoenix/Langfuse), metrics (Prometheus), alerts, log aggregation.

Browse the code · Product page

Building blocks shipped

agent-auth-proxy

Identity-aware proxy components for agent-to-service auth. The shared schemas in @reaatech/agent-auth-proxy-core, the typed HTTP client @reaatech/agent-auth-proxy-client, and the Fastify server @reaatech/agent-auth-proxy-server give you a complete OAuth2 proxy for agents.

Browse the family

agent-budget-controller

Real-time cost enforcement for LLM agents. @reaatech/agent-budget-types defines the schemas; agent-budget-engine enforces limits; agent-budget-pricing calculates costs; and agent-budget-middleware plugs into Express/Fastify. Seven packages released, all at 0.1.0.

Browse the family

agent-chaos

Fault injection toolkit for agent systems. @reaatech/agent-chaos-core is the middleware engine, agent-chaos-scenarios handles declarative config, and agent-chaos-cli runs chaos experiments.

Browse the family

agent-eval-harness

Complete agent evaluation suite spread over 12 packages. Key ones: @reaatech/agent-eval-harness-types for shared schemas, agent-eval-harness-judge for LLM-as-judge, agent-eval-harness-golden for regression references, and agent-eval-harness-cli as the single entrypoint. All packages work together to score agent trajectories, enforce latency budgets, and gate CI promotions.

Browse the family

agent-handoff-protocol

Standardized agent handoff library. @reaatech/agent-handoff provides the core types, agent-handoff-compression handles context reduction, agent-handoff-routing handles capability-based routing, and agent-handoff-transport handles MCP/HTTP delivery.

Browse the family

agent-memory

Long-term memory layer for AI agents. Ten packages: @reaatech/agent-memory-core provides the interfaces; agent-memory-storage backs them with PostgreSQL/pgvector; agent-memory-policies manages decay and contradictions; agent-memory-retrieval handles semantic search. The unified entrypoint is @reaatech/agent-memory.

Browse the family

agent-mesh

Multi-agent orchestration mesh: 10 packages at 1.0.0. @reaatech/agent-mesh defines the domain types, agent-mesh-classifier routes intents, agent-mesh-router dispatches to MCP agents, and agent-mesh-gateway exposes the REST API. The entire mesh runs on Cloud Run with Firestore sessions and hot-reloaded YAML config.

Browse the family

agent-replay

Record and replay agent interactions deterministically. @reaatech/agent-replay-core is the recording/replay engine; agent-replay-interceptors patches OpenAI/Anthropic SDKs; agent-replay-integrations hooks into LangChain/LangGraph; agent-replay-cli provides the command line. All released at 0.1.0.

Browse the family

Browse the full catalog
