# vLLM Reliability Suite for NetSuite SMB Financial Operations
 
> A durable sync pipeline that uses vLLM to enrich NetSuite records, with circuit breakers and idempotency so no transaction is lost or duplicated.
 
Small and medium businesses running NetSuite need to classify financial transactions, detect anomalies, and enrich vendor data — but building a resilient AI pipeline that tolerates network failures, prevents duplicate writes, and maintains job state is non-trivial. This recipe solves that with a vLLM-powered enrichment server guarded by circuit breakers, idempotency middleware, and session continuity.
 
A tutorial-style reference solution from [reaatech.com](https://reaatech.com) that demonstrates how to build production-grade AI systems with the `@reaatech/*` package family.
 
## Architecture
 
```
┌──────────────┐   Webhook    ┌──────────────────────────────────────────────────┐
│   NetSuite    │────────────▶│          Express Server (:3100)                  │
│  (SMB Org)    │             │  POST /webhooks/netsuite  →  enrich-record      │
└──────────────┘             │  GET  /health                                   │
       ▲                      │  GET  /metrics                                  │
       │                      └──────────┬───────────────────────────────────────┘
       │                                 │
       │        ┌────────────────────────▼────────────────────────┐
       │        │               enrich-record workflow             │
       │        │                                                  │
       │        │  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
       │        │  │ Classify │  │ Anomaly  │  │ Entity       │   │
       │        │  │ Record   │─▶│Detection │─▶│ Extract      │   │
       │        │  └─────┬────┘  └────┬─────┘  └──────┬───────┘   │
       │        │        │            │                │           │
       │        │  ┌─────▼────────────▼────────────────▼───────┐   │
       │        │  │          vLLM Client (OpenAI SDK)         │   │
       │        │  │  ┌─ Circuit Breaker ─ (3 fail / 60s) ─┐  │   │
       │        │  └──────────────────┬──────────────────────┘   │
       │        │                     │                          │
       │        │  ┌──────────────────▼──────────────────────┐   │
       │        │  │           NetSuite Client               │   │
       │        │  │  ┌─ Circuit Breaker ─ (5 fail / 30s) ─┐│   │
       │        │  └────────┬────────────────────┬──────────┘   │
       │        │           │                    │              │
       │        │  ┌────────▼────┐       ┌──────▼──────────┐    │
       │        │  │ GET Record  │       │  PATCH Record   │    │
       │        │  └─────────────┘       └──────┬──────────┘    │
       │        │                               │               │
       │        │  ┌────────────────────────────▼────────────┐  │
       │        │  │  Idempotency Middleware                 │  │
       │        │  │  (MemoryAdapter · TTL dedup · PATCH)    │  │
       │        │  └──────────────────────────────────────────┘  │
       │        │                                                 │
       │        │  ┌──────────────────────────────────────────┐  │
       │        │  │  Session Continuity (Redis)              │  │
       │        │  │  (job state · token budget · overflow)   │  │
       │        │  └──────────────────────────────────────────┘  │
       │        └──────────────────────┬─────────────────────────┘
       │                               │
       │        ┌──────────────────────▼─────────────────────────┐
       │        │     Langfuse Observability (LLM tracing)       │
       └────────│     spans for every classify/extract/patch     │
                └────────────────────────────────────────────────┘
```
 
Requests enter through Express, are validated with Zod, and dispatched to the `enrich-record` workflow. The workflow fetches the raw record from NetSuite (via the circuit-breaker-guarded client), runs it through three vLLM inference steps (classification, anomaly detection, entity extraction), and writes the enriched result back with idempotency guarantees. Every step is traced to Langfuse; job state is persisted in Redis via the session continuity package.
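
The sequence described above can be sketched as a single async function with injected dependencies. This is only an illustration of the step order, not the contents of `src/workflows/enrich-record.ts`: the `EnrichDeps` interface, the record and result shapes, and the way the idempotency key is passed in are all assumptions made for the example.

```typescript
// Hypothetical shapes; the recipe's real types live under src/types/.
interface WebhookEvent {
  internalId: string;
  type: string;
  eventType: string;
}

interface RawRecord {
  internalId: string;
  type: string;
  fields: Record<string, unknown>;
}

interface Enrichment {
  classification: string;
  anomalies: string[];
  entities: string[];
}

// Assumed dependency surface: each member stands in for a guarded client call.
interface EnrichDeps {
  getRecord(internalId: string, type: string): Promise<RawRecord>;
  classify(record: RawRecord): Promise<string>;
  detectAnomalies(record: RawRecord, classification: string): Promise<string[]>;
  extractEntities(record: RawRecord): Promise<string[]>;
  patchRecord(record: RawRecord, enrichment: Enrichment, idempotencyKey: string): Promise<void>;
}

async function enrichRecord(
  deps: EnrichDeps,
  event: WebhookEvent,
  idempotencyKey: string,
): Promise<Enrichment> {
  // 1. Fetch the raw record from NetSuite.
  const record = await deps.getRecord(event.internalId, event.type);

  // 2. Run the three inference steps in the order shown in the diagram.
  const classification = await deps.classify(record);
  const anomalies = await deps.detectAnomalies(record, classification);
  const entities = await deps.extractEntities(record);

  // 3. Write the result back; the key lets the write layer drop duplicates.
  const enrichment: Enrichment = { classification, anomalies, entities };
  await deps.patchRecord(record, enrichment, idempotencyKey);
  return enrichment;
}
```

Passing the clients in as a plain object keeps the orchestration testable with stubs, which is one way a test suite could exercise the step order without a live NetSuite or vLLM endpoint.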
 
## What's included
 
| Package | Role |
|---|---|
| `@reaatech/circuit-breaker-core` | Guards NetSuite API calls (trips after 5 failures, 30s recovery window) and vLLM inference calls (trips after 3 failures, 60s recovery window), with gradual recovery |
| `@reaatech/session-continuity` | Redis-backed session manager for job state, message history, token budget enforcement, and overflow compression |
| `@reaatech/idempotency-middleware` | Deduplicates NetSuite PATCH calls using a memory adapter with configurable TTL and lock timeout |
| `@reaatech/agent-runbook-agent` | AI agent that analyzes failure modes and generates incident-response runbook sections from deployment context |
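
To make the breaker thresholds in the table concrete, here is a minimal, self-contained circuit breaker sketch. It is not the `@reaatech/circuit-breaker-core` API: the class name, constructor arguments, and methods are hypothetical, and it allows a single probe after the recovery window rather than implementing gradual recovery.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical stand-in for a real circuit breaker implementation.
class SimpleCircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private current: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly recoveryTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number,
    recoveryTimeoutMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeoutMs = recoveryTimeoutMs;
    this.now = now;
  }

  // An open breaker becomes half-open once the recovery window has elapsed.
  getState(): BreakerState {
    if (this.current === "open" && this.now() - this.openedAt >= this.recoveryTimeoutMs) {
      this.current = "half-open";
    }
    return this.current;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.current = "closed";
  }

  // Trip on reaching the threshold, or immediately if a half-open probe fails.
  recordFailure(): void {
    this.failures += 1;
    if (this.getState() === "half-open" || this.failures >= this.failureThreshold) {
      this.current = "open";
      this.openedAt = this.now();
    }
  }

  // Reject fast while open; otherwise run the call and record its outcome.
  async execute<T>(call: () => Promise<T>): Promise<T> {
    if (this.getState() === "open") {
      throw new Error("circuit open: call rejected without reaching the upstream");
    }
    try {
      const result = await call();
      this.recordSuccess();
      return result;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }
}

// Thresholds taken from the table above.
const netsuiteBreaker = new SimpleCircuitBreaker(5, 30_000);
const vllmBreaker = new SimpleCircuitBreaker(3, 60_000);
```

The injectable `now` function is a design choice for the sketch: it lets tests advance time without real timers.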
 
## Quickstart
 
```bash
pnpm install
cp .env.example .env          # edit with your credentials
pnpm dev
```
 
## API endpoints
 
- **POST /webhooks/netsuite** — Accepts `{ internalId, type, eventType }`, dispatches async enrichment. Returns `202 { jobId, status }`.
- **GET /health** — Returns overall health including circuit breaker states, Redis connectivity, and vLLM reachability.
- **GET /metrics** — Returns circuit breaker statistics (success/failure counts, state transitions).
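
The webhook contract above can be sketched without any framework code. The recipe validates with Zod; the hand-written check below is a dependency-free stand-in. It assumes all three fields are non-empty strings, and the `handleWebhook` and `enqueue` names as well as the `"accepted"` status value are hypothetical.

```typescript
interface WebhookPayload {
  internalId: string;
  type: string;
  eventType: string;
}

interface WebhookResponse {
  statusCode: number;
  body: Record<string, unknown>;
}

// Returns the typed payload, or null when the body does not match the contract.
function parseWebhookPayload(body: unknown): WebhookPayload | null {
  if (typeof body !== "object" || body === null) return null;
  const { internalId, type, eventType } = body as Record<string, unknown>;
  if (typeof internalId !== "string" || internalId.length === 0) return null;
  if (typeof type !== "string" || type.length === 0) return null;
  if (typeof eventType !== "string" || eventType.length === 0) return null;
  return { internalId, type, eventType };
}

// `enqueue` is assumed to schedule the enrichment and return a job ID
// immediately, so the handler can answer before any inference runs.
function handleWebhook(
  body: unknown,
  enqueue: (payload: WebhookPayload) => string,
): WebhookResponse {
  const payload = parseWebhookPayload(body);
  if (payload === null) {
    return { statusCode: 400, body: { error: "invalid webhook payload" } };
  }
  const jobId = enqueue(payload);
  return { statusCode: 202, body: { jobId, status: "accepted" } };
}
```

Responding with `202` before enrichment finishes keeps the webhook fast even when vLLM is slow or its breaker is open.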
 
## Environment variables
 
| Variable | Description | Required |
|---|---|---|
| `NETSUITE_ACCOUNT_ID` | NetSuite account ID for REST API base URL | Yes |
| `NETSUITE_CONSUMER_KEY` | OAuth 1.0 consumer key | Yes |
| `NETSUITE_CONSUMER_SECRET` | OAuth 1.0 consumer secret | Yes |
| `NETSUITE_TOKEN_ID` | OAuth 1.0 token ID | Yes |
| `NETSUITE_TOKEN_SECRET` | OAuth 1.0 token secret | Yes |
| `NETSUITE_BEARER_TOKEN` | Bearer token (overrides OAuth 1.0 if set) | No |
| `VLLM_BASE_URL` | Base URL of the vLLM inference endpoint | Yes |
| `VLLM_API_KEY` | API key for vLLM (if required) | No |
| `VLLM_MODEL` | Model name to use for inference | Yes |
| `REDIS_URL` | Redis connection string for session/job state | Yes |
| `TRIGGER_API_KEY` | Trigger.dev API key | Yes |
| `TRIGGER_PROJECT_ID` | Trigger.dev project ID | Yes |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key for LLM tracing | Yes |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key | Yes |
| `LANGFUSE_HOST` | Langfuse host URL | Yes |
| `EXPRESS_PORT` | Express server port (default: 3100) | No |
| `AGENT_LLM_PROVIDER` | Runbook agent LLM provider (`openai`, `claude`, `gemini`, `mock`) | No |
| `AGENT_LLM_API_KEY` | API key for the runbook agent LLM | No |
| `AGENT_LLM_MODEL` | Model name for the runbook agent | Yes |
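
As an illustration of how the required and optional variables might be checked at startup, the sketch below validates a small subset of the table by hand. The recipe itself uses a Zod schema in `src/types/config.ts`; the `loadConfig` function, the `AppConfig` shape, and the choice of which variables to include are assumptions for this example.

```typescript
interface AppConfig {
  netsuiteAccountId: string;
  vllmBaseUrl: string;
  vllmModel: string;
  redisUrl: string;
  expressPort: number;
}

// Subset of the variables marked "Yes" in the table above.
const REQUIRED_VARS = [
  "NETSUITE_ACCOUNT_ID",
  "VLLM_BASE_URL",
  "VLLM_MODEL",
  "REDIS_URL",
] as const;

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  // Report every missing variable at once instead of failing on the first.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // EXPRESS_PORT is optional and falls back to the documented default.
  const expressPort = Number(env.EXPRESS_PORT ?? "3100");
  if (!Number.isInteger(expressPort) || expressPort < 1 || expressPort > 65535) {
    throw new Error(`EXPRESS_PORT must be an integer between 1 and 65535`);
  }

  return {
    netsuiteAccountId: env.NETSUITE_ACCOUNT_ID as string,
    vllmBaseUrl: env.VLLM_BASE_URL as string,
    vllmModel: env.VLLM_MODEL as string,
    redisUrl: env.REDIS_URL as string,
    expressPort,
  };
}
```

In a Node process this would be called as `loadConfig(process.env)`, so a misconfigured deployment fails at boot rather than on the first webhook.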
 
## Testing
 
```bash
pnpm test          # vitest run with coverage
```
 
The test suite runs on Vitest with the V8 coverage provider and the `threads` pool, and enforces a 90% threshold on all coverage metrics. MSW mocks HTTP traffic for the NetSuite and vLLM integration tests.
 
## Deployment notes
 
- **Express server**: Runs on the port configured via `EXPRESS_PORT` (default 3100). The server handles graceful shutdown on `SIGTERM`/`SIGINT`, closing the Redis connection and HTTP server.
- **Redis dependency**: The session continuity layer and job state management require a running Redis instance. Configure via `REDIS_URL`.
- **vLLM endpoint**: A running vLLM server (or any OpenAI-compatible inference server) must be reachable at `VLLM_BASE_URL`. Concurrency is capped at 5 simultaneous inference calls with `p-limit`, which queues additional calls until a slot frees up.
- **Next.js**: The toolkit is scaffolded with Next.js for future UI extensions. The enrichment pipeline runs as an Express server alongside it.
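
The concurrency cap on inference calls can be illustrated with a small promise-based limiter. This is a simplified sketch of the idea behind `p-limit`, not its implementation, and `createLimiter` is a hypothetical name; in the recipe the library itself does this job.

```typescript
// Returns a wrapper that runs at most `maxConcurrent` tasks at a time.
function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      // Hand the slot straight to the next waiter; `active` stays unchanged.
      next();
    } else {
      active -= 1;
    }
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= maxConcurrent) {
      // Wait until a finishing task hands over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active += 1;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Cap matching the deployment note above: 5 simultaneous inference calls.
const limitInference = createLimiter(5);
```

Handing a freed slot directly to the next waiter, instead of decrementing and re-incrementing the counter, avoids a window in which a new caller could slip in and push concurrency above the cap.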
 
## Project layout
 
```
src/
  app.ts                Express server, route handlers, DI wiring
  index.ts              Package entry point
  lib/
    netsuite-client.ts   NetSuite REST API client with circuit breaker
    vllm-client.ts       vLLM (OpenAI-compatible) client with circuit breaker
    idempotency.ts       Idempotent PATCH wrapper via @reaatech/idempotency-middleware
    job-state.ts         Redis-backed session manager via @reaatech/session-continuity
    runbook-agent.ts     Failure mode analysis agent via @reaatech/agent-runbook-agent
    observability.ts     Langfuse tracing helpers
  types/
    config.ts            Zod-validated environment config schema
    netsuite.ts          NetSuite record types
    workflow.ts          Webhook payload and enrichment result types
  workflows/
    enrich-record.ts     Core enrichment orchestration workflow
tests/                   Vitest suite mirroring src/ layout
packages/                API references for every dependency
DEV_PLAN.md              Build plan for this recipe
```
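
To show what the idempotent PATCH wrapper in `lib/idempotency.ts` is guarding against, here is a minimal in-memory dedup sketch keyed by an idempotency key with a TTL. It does not use the `@reaatech/idempotency-middleware` API: the `MemoryIdempotencyStore` class and its `run` method are hypothetical, and the sketch omits the lock timeout the package exposes.

```typescript
interface CachedEntry<T> {
  value: T;
  expiresAt: number;
}

// Hypothetical in-memory store; state is lost when the process restarts.
class MemoryIdempotencyStore<T> {
  private readonly entries = new Map<string, CachedEntry<T>>();
  private readonly inFlight = new Map<string, Promise<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async run(key: string, operation: () => Promise<T>): Promise<T> {
    // Replay the stored result while the entry is still within its TTL.
    const cached = this.entries.get(key);
    if (cached && cached.expiresAt > this.now()) {
      return cached.value;
    }

    // Share one execution between concurrent callers using the same key.
    const pending = this.inFlight.get(key);
    if (pending) {
      return pending;
    }

    // Only successful results are cached, so a failed write can be retried.
    const execution = operation()
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, execution);
    return execution;
  }
}
```

A repeated webhook delivery that resolves to the same key would then reuse the first PATCH result instead of writing to NetSuite twice, for as long as the entry has not expired.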
 
## License
 
MIT — see [LICENSE](./LICENSE).