@reaatech/otel-genai-semconv-openai

npm v0.1.0

Instruments the OpenAI Node.js SDK to automatically emit OpenTelemetry spans compliant with GenAI semantic conventions. It provides an `OpenAIInstrumentation` class that wraps the client's `chat.completions.create` method to capture request metadata, token usage, and cost metrics.

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Transparent instrumentation for the OpenAI Node.js SDK. Wraps client.chat.completions.create() to emit OpenTelemetry GenAI semantic convention spans with request metadata, token usage, cost tracking, and streaming metrics — no code changes required beyond calling instrument().

Installation

terminal
npm install @reaatech/otel-genai-semconv-openai
# or
pnpm add @reaatech/otel-genai-semconv-openai

Feature Overview

  • Zero-config instrument/uninstrument: call instrument(client) once and every create() call is traced
  • Non-streaming + streaming: both response types are fully instrumented, each with its own attribute set
  • Accurate token counting: tiktoken-based encoding with per-model encoding selection and fallback
  • Cost tracking: calculates llm.cost.* attributes using built-in or custom pricing tables
  • Double-instrumentation guard: calling instrument() twice on the same client is a safe no-op
  • Lifecycle hooks: onStart and onEnd callbacks for custom span attributes
  • Safe uninstrument: restores the original create method
  • Dual ESM/CJS output: works with import and require
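
The wrap, guard, and restore behaviour listed above can be pictured with a small stand-alone sketch. This is not the package's source: `ClientLike`, `instrumentSketch`, and `uninstrumentSketch` are hypothetical names, and a real wrapper would start and end a span where this one only calls `onCall`.

```typescript
// Hypothetical sketch of method wrapping with a double-instrumentation guard.
type CreateFn = (...args: unknown[]) => unknown;

interface ClientLike {
  chat: { completions: { create: CreateFn } };
}

// Remembers the original create() for every client that has been wrapped.
const originals = new WeakMap<ClientLike, CreateFn>();

function instrumentSketch(client: ClientLike, onCall: () => void): void {
  // Guard: an already-wrapped client is left untouched.
  if (originals.has(client)) return;

  const completions = client.chat.completions;
  const original = completions.create;
  originals.set(client, original);

  completions.create = (...args: unknown[]) => {
    onCall(); // a real wrapper would start a span here
    return original.apply(completions, args);
  };
}

function uninstrumentSketch(client: ClientLike): void {
  const original = originals.get(client);
  if (!original) return; // never instrumented, nothing to restore
  client.chat.completions.create = original;
  originals.delete(client);
}
```

Keying the bookkeeping on the client object means one instance of the logic can serve several clients, and each can be restored independently.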

Quick Start

typescript
import { OpenAIInstrumentation } from "@reaatech/otel-genai-semconv-openai";
import OpenAI from "openai";
 
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
 
new OpenAIInstrumentation({ trackCosts: true }).instrument(client);
 
const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "What is OpenTelemetry?" }],
  temperature: 0.7,
  max_tokens: 500,
});
// Each call now emits OTel spans with gen_ai.* attributes

Captured Attributes

Request Attributes

| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.request.model | request.model | Requested model name |
| gen_ai.request.temperature | request.temperature | Sampling temperature |
| gen_ai.request.top_p | request.top_p | Top-p sampling |
| gen_ai.request.max_tokens | request.max_tokens | Max tokens limit |
| gen_ai.request.streaming | request.stream | Streaming flag |
| gen_ai.request.frequency_penalty | request.frequency_penalty | Frequency penalty |
| gen_ai.request.presence_penalty | request.presence_penalty | Presence penalty |
| gen_ai.request.stop_sequences | request.stop | Stop sequences (if array) |
| gen_ai.request.tool_names | request.tools | Tool/function names |
| gen_ai.request.seed | request.seed | Reproducibility seed |
| gen_ai.request.candidates_per_prompt | request.n | Number of choices |
| gen_ai.provider.name | hardcoded | openai |
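
To make the table concrete, here is a hand-written mapper that follows it row by row. It is an illustration only and not the package's `mapOpenAIRequest`; `RequestLike` and `mapRequestSketch` are hypothetical, and skipping fields that are absent from the request is an assumption.

```typescript
// Illustrative mapper built from the request attribute table; not the library's code.
type AttrValue = string | number | boolean | string[];

interface RequestLike {
  model: string;
  stop?: string | string[];
  tools?: { function: { name: string } }[];
  [field: string]: unknown; // scalar options such as temperature, top_p, n
}

// Request field -> span attribute, for the scalar rows of the table.
const SCALAR_FIELDS: Record<string, string> = {
  temperature: "gen_ai.request.temperature",
  top_p: "gen_ai.request.top_p",
  max_tokens: "gen_ai.request.max_tokens",
  stream: "gen_ai.request.streaming",
  frequency_penalty: "gen_ai.request.frequency_penalty",
  presence_penalty: "gen_ai.request.presence_penalty",
  seed: "gen_ai.request.seed",
  n: "gen_ai.request.candidates_per_prompt",
};

function mapRequestSketch(req: RequestLike): Record<string, AttrValue> {
  const attrs: Record<string, AttrValue> = {
    "gen_ai.provider.name": "openai", // hardcoded, per the table
    "gen_ai.request.model": req.model,
  };
  for (const [field, attr] of Object.entries(SCALAR_FIELDS)) {
    const value = req[field];
    // Assumption: fields missing from the request produce no attribute.
    if (typeof value === "number" || typeof value === "boolean") attrs[attr] = value;
  }
  // The table records stop sequences only when request.stop is an array.
  if (Array.isArray(req.stop)) attrs["gen_ai.request.stop_sequences"] = req.stop;
  if (req.tools) attrs["gen_ai.request.tool_names"] = req.tools.map((t) => t.function.name);
  return attrs;
}
```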

Response Attributes

| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.response.model | response.model | Actual model used |
| gen_ai.response.id | response.id | Response identifier |
| gen_ai.response.finish_reasons | response.choices[].finish_reason | Per-choice finish reasons |
| gen_ai.usage.input_tokens | response.usage.prompt_tokens | Input token count |
| gen_ai.usage.output_tokens | response.usage.completion_tokens | Output token count |
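
The main thing to notice in this table is the renaming: OpenAI's prompt_tokens and completion_tokens become input_tokens and output_tokens, and the per-choice finish reasons are collected into one array. A minimal sketch of that mapping, with hypothetical names (`ResponseLike`, `mapResponseSketch`) and the assumption that a missing usage block yields no usage attributes:

```typescript
// Illustrative only; not the package's mapOpenAIResponse.
interface ResponseLike {
  id: string;
  model: string;
  choices: { finish_reason: string }[];
  usage?: { prompt_tokens: number; completion_tokens: number };
}

function mapResponseSketch(
  res: ResponseLike,
): Record<string, string | number | string[]> {
  const attrs: Record<string, string | number | string[]> = {
    "gen_ai.response.model": res.model,
    "gen_ai.response.id": res.id,
    // One entry per choice, in choice order.
    "gen_ai.response.finish_reasons": res.choices.map((c) => c.finish_reason),
  };
  if (res.usage) {
    attrs["gen_ai.usage.input_tokens"] = res.usage.prompt_tokens;
    attrs["gen_ai.usage.output_tokens"] = res.usage.completion_tokens;
  }
  return attrs;
}
```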

Streaming Attributes

| Attribute | Description |
| --- | --- |
| gen_ai.streaming.time_to_first_token_ms | Latency to first chunk |
| gen_ai.streaming.total_duration_ms | Total streaming duration |
| gen_ai.streaming.chunk_count | Number of chunks received |
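
One way such metrics can be gathered is to wrap the stream in an async generator that passes chunks through while timing them. The sketch below is an assumption about the general technique, not the package's implementation; `measureStream` and `StreamMetrics` are hypothetical names, and the injectable `now` clock exists only to make the example deterministic.

```typescript
// Hypothetical pass-through wrapper that measures a stream as it is consumed.
interface StreamMetrics {
  timeToFirstTokenMs: number | undefined; // undefined if no chunk ever arrived
  totalDurationMs: number;
  chunkCount: number;
}

async function* measureStream<T>(
  source: AsyncIterable<T>,
  onDone: (metrics: StreamMetrics) => void,
  now: () => number = Date.now,
): AsyncGenerator<T> {
  const start = now();
  let firstChunkAt: number | undefined;
  let chunkCount = 0;
  try {
    for await (const chunk of source) {
      if (firstChunkAt === undefined) firstChunkAt = now();
      chunkCount += 1;
      yield chunk; // the consumer sees every chunk unchanged
    }
  } finally {
    // Runs when the stream ends, errors, or the consumer stops early.
    onDone({
      timeToFirstTokenMs:
        firstChunkAt === undefined ? undefined : firstChunkAt - start,
      totalDurationMs: now() - start,
      chunkCount,
    });
  }
}
```

Reporting from the `finally` block is what lets the measurement complete even if the caller breaks out of the loop before the stream is exhausted.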

Cost Attributes (when trackCosts: true)

| Attribute | Description |
| --- | --- |
| llm.cost.total | Total cost in USD |
| llm.cost.input | Input token cost |
| llm.cost.output | Output token cost |
| llm.cost.currency | Currency code (always USD) |

Span Events

| Event | When |
| --- | --- |
| gen_ai.system.message | System messages in the request |
| gen_ai.user.message | User messages in the request |
| gen_ai.assistant.message | Assistant messages in the request or response |
| gen_ai.choice | Each choice in the response (with index, finish_reason, message) |

API Reference

OpenAIInstrumentation (class)

Constructor

typescript
new OpenAIInstrumentation({
  captureRequestHeaders?: boolean;
  captureResponseHeaders?: boolean;
  trackCosts?: boolean;
  pricing?: Record<string, PricingInfo>;
  onStart?: (span: Span, request: ChatCompletionCreateParams) => void;
  onEnd?: (span: Span, response: ChatCompletion) => void;
})

Methods

| Method | Description |
| --- | --- |
| instrument(client) | Wrap client.chat.completions.create() with instrumentation |
| uninstrument(client) | Restore the original create() method |

OpenAITokenCounter (class)

Accurate token counting using tiktoken with per-model encoding selection:

typescript
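import { OpenAITokenCounter } from "@reaatech/otel-genai-semconv-openai";

// Example conversation; assumed to use the OpenAI chat message shape.
const messages = [{ role: "user", content: "Hello, world!" }];

const counter = new OpenAITokenCounter();
counter.countTokens("Hello, world!", "gpt-4");    // count for a specific model
counter.countMessagesTokens(messages, "gpt-4");   // count for a conversation
counter.clearCache();                             // clear token cache
counter.free();                                   // free tiktoken encodings (call when done)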

Encoding Selection

| Model Family | Encoding |
| --- | --- |
| gpt-4*, o1* | o200k_base |
| gpt-3.5* | cl100k_base |
| Other | cl100k_base (fallback) |
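
Read as a lookup rule, the table amounts to a prefix check with a default. The sketch below assumes the `*` in each model family is a prefix wildcard; `selectEncodingSketch` is a hypothetical helper, not a function exported by the package.

```typescript
// Hypothetical restatement of the encoding table as a function.
type EncodingName = "o200k_base" | "cl100k_base";

function selectEncodingSketch(model: string): EncodingName {
  // Assumption: "gpt-4*" and "o1*" mean "model name starts with".
  if (model.startsWith("gpt-4") || model.startsWith("o1")) return "o200k_base";
  // gpt-3.5* and every unrecognised model share the cl100k_base fallback.
  return "cl100k_base";
}
```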

Attribute Mappers

Standalone functions for mapping provider data without the full instrumentation class:

typescript
import { mapOpenAIRequest, mapOpenAIResponse, mapOpenAIError } from "@reaatech/otel-genai-semconv-openai";
 
const requestAttrs = mapOpenAIRequest(chatCompletionParams);
const responseAttrs = mapOpenAIResponse(chatCompletionObject);
const errorAttrs = mapOpenAIError(apiError);

Configuration

Custom Pricing

typescript
new OpenAIInstrumentation({
  trackCosts: true,
  pricing: {
    "gpt-4": { input: 0.03, output: 0.06 },
    "gpt-4o-mini": { input: 0.00015, output: 0.0006 },
  },
}).instrument(client);
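
The unit of the pricing values is not stated here. The worked example below assumes USD per 1,000 tokens, which should be verified against the package before relying on the numbers. `PricingSketch` and `costSketch` are hypothetical names used only for this illustration.

```typescript
// Worked arithmetic for the llm.cost.* attributes, under a stated assumption.
interface PricingSketch {
  input: number;
  output: number;
}

// ASSUMPTION: prices are USD per 1,000 tokens; the source does not state the unit.
const TOKENS_PER_PRICE_UNIT = 1000;

function costSketch(
  pricing: PricingSketch,
  inputTokens: number,
  outputTokens: number,
): Record<string, number | string> {
  const input = (inputTokens / TOKENS_PER_PRICE_UNIT) * pricing.input;
  const output = (outputTokens / TOKENS_PER_PRICE_UNIT) * pricing.output;
  return {
    "llm.cost.input": input,
    "llm.cost.output": output,
    "llm.cost.total": input + output,
    "llm.cost.currency": "USD",
  };
}

// 1,000 input tokens and 500 output tokens at the "gpt-4" rates above:
// input = 1.0 * 0.03, output = 0.5 * 0.06, so each side is about 0.03 USD.
const example = costSketch({ input: 0.03, output: 0.06 }, 1000, 500);
```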

Lifecycle Hooks

typescript
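import type { Span } from "@opentelemetry/api";

// getFeatureFlag() and calculateQuality() stand in for application-defined helpers.
// Start times are keyed by span, assuming both hooks receive the same Span object.
const startTimes = new WeakMap<Span, number>();

new OpenAIInstrumentation({
  onStart: (span, request) => {
    startTimes.set(span, Date.now());
    if (request.user) span.setAttribute("user.id", request.user); // user is optional
    span.setAttribute("feature.flag", getFeatureFlag());
  },
  onEnd: (span, response) => {
    span.setAttribute("response.quality_score", calculateQuality(response));
    const startedAt = startTimes.get(span);
    if (startedAt !== undefined) span.setAttribute("response.latency_ms", Date.now() - startedAt);
  },
}).instrument(client);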

Error Type Mapping

The instrumentation classifies errors into the following types:

| Condition | Error Type |
| --- | --- |
| rate limit / 429 | rate_limit_error |
| authentication / 401 | authentication_error |
| invalid / 400 | invalid_request_error |
| not found / 404 | not_found_error |
| server / 500 | server_error |
| Other | unknown_error |
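
The table can be read as an ordered set of checks on the HTTP status and the error message. The sketch below is one plausible reading, not the package's `mapOpenAIError`: the name `classifyErrorSketch`, the case-insensitive substring match, and the order of the checks are all assumptions.

```typescript
// Hypothetical classifier mirroring the error type table.
type ErrorTypeSketch =
  | "rate_limit_error"
  | "authentication_error"
  | "invalid_request_error"
  | "not_found_error"
  | "server_error"
  | "unknown_error";

function classifyErrorSketch(err: {
  status?: number;
  message?: string;
}): ErrorTypeSketch {
  const msg = (err.message ?? "").toLowerCase();
  // Each row matches on either the status code or a keyword in the message.
  if (err.status === 429 || msg.includes("rate limit")) return "rate_limit_error";
  if (err.status === 401 || msg.includes("authentication")) return "authentication_error";
  if (err.status === 400 || msg.includes("invalid")) return "invalid_request_error";
  if (err.status === 404 || msg.includes("not found")) return "not_found_error";
  if (err.status === 500 || msg.includes("server")) return "server_error";
  return "unknown_error";
}
```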

Usage Patterns

Streaming

typescript
const stream = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true,
});
 
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
// Span auto-finalizes with TTFT, duration, and chunk count when the stream ends

Multi-Client

typescript
const instrumentation = new OpenAIInstrumentation({ trackCosts: true });
 
const client1 = new OpenAI({ apiKey: "...", baseURL: "..." });
const client2 = new OpenAI({ apiKey: "...", baseURL: "..." });
 
instrumentation.instrument(client1);
instrumentation.instrument(client2);

Cleanup

typescript
instrumentation.uninstrument(client);
// client.chat.completions.create is now the original, unwrapped method

License

MIT