# Databricks LLM Observability for SMB Production AI
> Drop-in OpenTelemetry tracing and cost attribution for every Databricks model call, visualized in Langfuse, so small teams can monitor LLM performance without building custom instrumentation.
## Overview
Small businesses deploying Databricks-hosted LLMs often lack visibility into latency, token usage, and spend across their applications, which makes it hard to debug slowdowns or control costs. This solution wraps Databricks LLM SDK calls with @reaatech/otel-genai-semconv-core to emit OpenTelemetry GenAI spans, routes them to Langfuse via @reaatech/otel-genai-semconv-exporters, and attaches per-span cost data with @reaatech/otel-cost-exporter-core.
## Architecture
```
Databricks SDK → DatabricksWrapper → ModelSpan (SpanBuilder) → CostTracker → LangfuseSpanManager → Langfuse dashboard
```
The observability pipeline:
1. **DatabricksWrapper** — calls Databricks model serving endpoints with latency measurement
2. **ModelSpan** — creates OTel-compliant GenAI spans via @reaatech/otel-genai-semconv-core
3. **CostTracker** — computes token-based cost using @reaatech/otel-cost-exporter-core types
4. **LangfuseSpanManager** — exports spans to Langfuse via @reaatech/otel-genai-semconv-exporters
5. **AlertService** — evaluates p95 latency, error rate, and cost against thresholds
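The token-based cost step (step 3) can be illustrated with a minimal sketch. The interfaces, the `computeCost` function, and the per-1K-token pricing model below are assumptions made for illustration; they are not the project's `CostTracker` or the types exported by @reaatech/otel-cost-exporter-core.

```typescript
// Hypothetical per-1K-token prices in USD. Real rates depend on your
// Databricks pricing and are not specified in this README.
interface ModelPricing {
  inputPer1kUsd: number;
  outputPer1kUsd: number;
}

// Token counts as typically reported in a model serving response.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

interface CostBreakdown {
  inputCostUsd: number;
  outputCostUsd: number;
  totalCostUsd: number;
}

// Cost = (tokens / 1000) * price per 1K tokens, computed separately for
// input and output because the two are usually priced differently.
function computeCost(usage: TokenUsage, pricing: ModelPricing): CostBreakdown {
  const inputCostUsd = (usage.inputTokens / 1000) * pricing.inputPer1kUsd;
  const outputCostUsd = (usage.outputTokens / 1000) * pricing.outputPer1kUsd;
  return {
    inputCostUsd,
    outputCostUsd,
    totalCostUsd: inputCostUsd + outputCostUsd,
  };
}
```

A breakdown of this kind is what would be attached to each span before export, so that Langfuse can attribute spend per call.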
## Prerequisites
- Databricks workspace with model serving enabled
- Langfuse account (cloud or self-hosted)
- Node.js >= 22
- pnpm 10
## Setup
1. Clone the repository:
```bash
git clone <repo-url>
cd databricks-llm-observability-for-smb-production-ai
```
2. Install dependencies:
```bash
pnpm install
```
3. Configure environment variables (copy `.env.example` to `.env` and fill in):
```bash
cp .env.example .env
```
Required variables:
- `DATABRICKS_HOST` — your Databricks workspace hostname (e.g. `your-workspace.cloud.databricks.com`; do NOT include `https://`)
- `DATABRICKS_TOKEN` — Databricks personal access token
- `LANGFUSE_PUBLIC_KEY` — Langfuse public key
- `LANGFUSE_SECRET_KEY` — Langfuse secret key
- `LANGFUSE_BASE_URL` — Langfuse base URL (e.g. `https://cloud.langfuse.com`)
- `OTEL_EXPORTER_OTLP_ENDPOINT` — OTLP endpoint URL (pointing to Langfuse)
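Failing fast on a missing or malformed variable is usually easier to debug than a failed export later. The following is a minimal sketch of such a startup check; `loadEnvConfig` and `EnvConfig` are hypothetical names, not part of this repository, and rejecting a scheme on `DATABRICKS_HOST` (rather than stripping it) is an assumed design choice.

```typescript
// The six variables listed above.
const REQUIRED_VARS = [
  "DATABRICKS_HOST",
  "DATABRICKS_TOKEN",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_BASE_URL",
  "OTEL_EXPORTER_OTLP_ENDPOINT",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type EnvConfig = Record<RequiredVar, string>;

// Hypothetical helper: validates an env-like object and returns trimmed values.
function loadEnvConfig(env: Record<string, string | undefined>): EnvConfig {
  // Treat unset and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }

  const config = {} as EnvConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = (env[name] ?? "").trim();
  }

  // The README asks for a bare hostname, so reject a leading scheme.
  if (/^https?:\/\//i.test(config.DATABRICKS_HOST)) {
    throw new Error("DATABRICKS_HOST must be a bare hostname without https://");
  }
  return config;
}
```

At startup this would be called once with `process.env`, before any client is constructed.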
## Quick Start
```typescript
import { createObservabilityService } from "./src/index.js";

const service = createObservabilityService();

const result = await service.trackModelCall(
  "my-model",
  "databricks-meta-llama-3-1-70b-instruct",
  { messages: [{ role: "user", content: "Hello!" }] }
);

console.log("Cost:", result.costBreakdown);
await service.shutdown();
```
## API Routes
### GET /api/observability
Returns aggregated observability metrics:
```json
{
  "metrics": {
    "totalRequests": 0,
    "p95LatencyMs": 0,
    "totalCostUsd": 0,
    "errorRate": 0
  },
  "timestamp": 1700000000000
}
```
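To show how the four metric fields could be derived, here is a minimal sketch that aggregates per-call records. The `CallRecord` shape, the `aggregate` and `p95` functions, the nearest-rank percentile method, and the treatment of `errorRate` as a fraction between 0 and 1 are all assumptions; the project's actual aggregation is not described in this README.

```typescript
// Hypothetical per-call record; the real record shape may differ.
interface CallRecord {
  latencyMs: number;
  costUsd: number;
  ok: boolean;
}

// Mirrors the "metrics" object in the response above.
interface ObservabilityMetrics {
  totalRequests: number;
  p95LatencyMs: number;
  totalCostUsd: number;
  errorRate: number;
}

// Nearest-rank p95: the smallest sample such that at least 95% of all
// samples are less than or equal to it. Returns 0 for an empty input.
function p95(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1] ?? 0;
}

function aggregate(records: CallRecord[]): ObservabilityMetrics {
  const totalRequests = records.length;
  const failures = records.filter((r) => !r.ok).length;
  return {
    totalRequests,
    p95LatencyMs: p95(records.map((r) => r.latencyMs)),
    totalCostUsd: records.reduce((sum, r) => sum + r.costUsd, 0),
    // Assumed to be a fraction in [0, 1]; avoid dividing by zero.
    errorRate: totalRequests === 0 ? 0 : failures / totalRequests,
  };
}
```

With no recorded calls, this sketch yields all zeros, matching the sample response.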
### POST /api/observability
Evaluates current metrics against alert thresholds and returns any triggered alerts:
```json
{
  "alerts": [],
  "passed": true
}
```
Optional body: `{ "thresholds": { "p95LatencyMsThreshold": 3000 } }`
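The threshold check can be sketched roughly as follows. Only `p95LatencyMsThreshold` appears in this README; the other two threshold names, the `Alert` shape, and the `evaluateAlerts` function are hypothetical and are not taken from the project's `AlertService`.

```typescript
interface Metrics {
  totalRequests: number;
  p95LatencyMs: number;
  totalCostUsd: number;
  errorRate: number;
}

// p95LatencyMsThreshold comes from the README; the other two names are
// hypothetical, chosen to match the metrics the AlertService evaluates.
interface Thresholds {
  p95LatencyMsThreshold?: number;
  errorRateThreshold?: number;
  totalCostUsdThreshold?: number;
}

// Hypothetical alert shape.
interface Alert {
  metric: keyof Metrics;
  value: number;
  threshold: number;
}

function evaluateAlerts(
  metrics: Metrics,
  thresholds: Thresholds
): { alerts: Alert[]; passed: boolean } {
  // Pair each metric with its optional threshold.
  const checks: Array<[keyof Metrics, number | undefined]> = [
    ["p95LatencyMs", thresholds.p95LatencyMsThreshold],
    ["errorRate", thresholds.errorRateThreshold],
    ["totalCostUsd", thresholds.totalCostUsdThreshold],
  ];

  const alerts: Alert[] = [];
  for (const [metric, threshold] of checks) {
    // A metric alerts only when its threshold is set and strictly exceeded.
    if (threshold !== undefined && metrics[metric] > threshold) {
      alerts.push({ metric, value: metrics[metric], threshold });
    }
  }
  return { alerts, passed: alerts.length === 0 };
}
```

In this sketch an unset threshold is simply skipped, so an empty `thresholds` object always passes.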
## Testing
```bash
pnpm test
```
Runs Vitest with coverage. The tests mock all external calls (Databricks API, Langfuse, OTel), so no credentials or network access are needed.
## Project layout
```
app/api/observability/    Next.js API route for metrics + alerts
src/lib/                  services, wrappers, cost tracking, alerting
src/instrumentation.ts    OTel SDK initialization
tests/                    vitest suite with mocks
```
## License
MIT — see [LICENSE](./LICENSE).