# Vertex AI Budget Guardrails for Multi-Agent Systems
 
**Keep your AI costs predictable with per-agent budget limits and automatic model routing on Vertex AI.**
 
Small businesses running multiple AI agents on Vertex AI often exceed their monthly budget because model calls are unpredictable and there are no per-agent cost controls. This project provides cost-control middleware that plugs into an existing Vertex AI agent setup: it enforces per-agent budget limits, automatically switches to cheaper models when necessary, and provides real-time cost dashboards.
 
## Architecture
 
```
Agent Request → Budget Middleware → BudgetController → SpendStore/PricingEngine → Vertex AI
                                        ↓
                                  BudgetAwareStrategy
                                    (model routing)
                                        ↓
                              Metrics Loop → eval-harness-cost
```
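
To make the flow in the diagram concrete, here is a minimal sketch of how a `BudgetController` could consult a `SpendStore` and a `PricingEngine` before a request is forwarded to Vertex AI. Only the component names come from the diagram. Every method, field, model name, and price below is an assumption made for illustration and may not match the project's real code.

```typescript
// Hypothetical sketch: only the names SpendStore, PricingEngine and
// BudgetController come from the diagram; all members are assumed.

interface SpendStore {
  getSpend(scopeKey: string): number; // USD spent so far in the current period
  addSpend(scopeKey: string, usd: number): void;
}

interface PricingEngine {
  estimateCost(model: string, tokens: number): number; // estimated USD
}

class InMemorySpendStore implements SpendStore {
  private totals = new Map<string, number>();

  getSpend(scopeKey: string): number {
    return this.totals.get(scopeKey) ?? 0;
  }

  addSpend(scopeKey: string, usd: number): void {
    this.totals.set(scopeKey, this.getSpend(scopeKey) + usd);
  }
}

class FlatRatePricing implements PricingEngine {
  private usdPer1kTokens: Record<string, number>;

  constructor(usdPer1kTokens: Record<string, number>) {
    this.usdPer1kTokens = usdPer1kTokens;
  }

  estimateCost(model: string, tokens: number): number {
    const rate = this.usdPer1kTokens[model];
    if (rate === undefined) throw new Error(`No price configured for ${model}`);
    return (tokens / 1000) * rate;
  }
}

type Decision =
  | { allowed: true; estimatedUsd: number }
  | { allowed: false; reason: string };

class BudgetController {
  private store: SpendStore;
  private pricing: PricingEngine;
  private limitUsd: number;

  constructor(store: SpendStore, pricing: PricingEngine, limitUsd: number) {
    this.store = store;
    this.pricing = pricing;
    this.limitUsd = limitUsd;
  }

  // Record the estimated cost only if it still fits within the scope's limit.
  authorize(scopeKey: string, model: string, tokens: number): Decision {
    const estimatedUsd = this.pricing.estimateCost(model, tokens);
    if (this.store.getSpend(scopeKey) + estimatedUsd > this.limitUsd) {
      return { allowed: false, reason: "budget exceeded" };
    }
    this.store.addSpend(scopeKey, estimatedUsd);
    return { allowed: true, estimatedUsd };
  }
}
```

In this sketch the controller records the estimate up front. A real implementation would probably reconcile it against the actual token usage reported in the model response.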
 
## Prerequisites
 
- Node.js >= 22
- pnpm >= 10
- A Google Cloud project with the Vertex AI API enabled
 
## Setup
 
```bash
git clone <repo-url>
cd vertex-budget-guardrails
pnpm install
cp .env.example .env
```
 
Edit `.env` and set `PROJECT_ID` to your Google Cloud project ID (shown in the GCP console). Then start the development server:
 
```bash
pnpm dev
```
 
## Usage
 
Send a budget-scoped LLM request:
 
```bash
curl -H "x-budget-scope-type: org" \
     -H "x-budget-scope-key: my-org" \
     -H "Content-Type: application/json" \
     -d '{"prompt":"Hello"}' \
     http://localhost:3000/api/llm
```
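
The same request can be built from TypeScript. The sketch below is a hypothetical helper, not part of the project: it mirrors the headers, body, and URL of the curl call above, and the function and type names are assumptions.

```typescript
// Hypothetical helper: mirrors the curl example; names are illustrative only.
interface LlmRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildLlmRequest(
  scopeType: string,
  scopeKey: string,
  prompt: string,
  baseUrl: string = "http://localhost:3000",
): LlmRequest {
  return {
    url: `${baseUrl}/api/llm`,
    init: {
      method: "POST", // curl switches to POST when -d is given
      headers: {
        "x-budget-scope-type": scopeType,
        "x-budget-scope-key": scopeKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    },
  };
}

const llmRequest = buildLlmRequest("org", "my-org", "Hello");
```

Passing the result to `fetch(llmRequest.url, llmRequest.init)` sends the request. A caller would likely want to handle a 402 status explicitly, since that is what a stopped budget returns.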
 
Check budget metrics (`change-me` is a placeholder bearer token):
 
```bash
curl -H "Authorization: Bearer change-me" \
     http://localhost:3000/metrics
```
 
## Budget Lifecycle
 
| State     | Utilization | Behavior                          |
|-----------|-------------|-----------------------------------|
| Active    | 0–50%       | Normal operation                  |
| Warned    | 50–80%      | Warning headers added to response |
| Degraded  | 80–100%     | Model downgrade suggested         |
| Stopped   | 100%+       | Requests return 402               |
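
The table can be read as a simple threshold function. The sketch below is one possible interpretation, not the project's implementation: it assumes each state's lower bound is inclusive (so exactly 50% is already Warned), which the table does not specify, and the function names and model arguments are hypothetical.

```typescript
type BudgetState = "active" | "warned" | "degraded" | "stopped";

// Classify a scope by utilization. Assumption: lower bounds are inclusive.
function budgetState(spentUsd: number, limitUsd: number): BudgetState {
  if (limitUsd <= 0) throw new RangeError("limitUsd must be positive");
  const utilization = spentUsd / limitUsd;
  if (utilization >= 1.0) return "stopped";
  if (utilization >= 0.8) return "degraded";
  if (utilization >= 0.5) return "warned";
  return "active";
}

// Hypothetical routing rule: suggest the cheaper model once the budget is
// degraded, and no model at all once it is stopped (the request gets a 402).
function suggestedModel(
  state: BudgetState,
  preferred: string,
  cheaper: string,
): string | null {
  if (state === "stopped") return null;
  return state === "degraded" ? cheaper : preferred;
}
```

For example, `budgetState(85, 100)` yields `"degraded"`, so `suggestedModel` would return the cheaper of the two model names passed in.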
 
## Scripts
 
- `pnpm test` — Run tests with coverage
- `pnpm test:coverage` — Run coverage report
- `pnpm typecheck` — TypeScript type checking
- `pnpm lint` — ESLint
 
## License
 
MIT — see [LICENSE](LICENSE)
 
## Badges
 
[![Build Status](https://img.shields.io/badge/build-passing-brightgreen)]()
[![Coverage](https://img.shields.io/badge/coverage-%3E90%25-brightgreen)]()
[![License](https://img.shields.io/badge/license-MIT-blue)]()