@reaatech/otel-genai-semconv-openai
Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.
Transparent instrumentation for the OpenAI Node.js SDK. Wraps client.chat.completions.create() to emit OpenTelemetry GenAI semantic convention spans with request metadata, token usage, cost tracking, and streaming metrics. No code changes are required beyond calling instrument().
Installation
npm install @reaatech/otel-genai-semconv-openai
# or
pnpm add @reaatech/otel-genai-semconv-openai
Feature Overview
Zero-config instrument/uninstrument: call instrument(client) once and every create() call is traced
Non-streaming + streaming: both response types are fully instrumented, with different attribute sets
Accurate token counting: tiktoken-based encoding with per-model encoding selection and fallback
Cost tracking: calculates llm.cost.* attributes using built-in or custom pricing tables
Double-instrumentation guard: calling instrument() twice is a safe no-op
Lifecycle hooks: onStart and onEnd callbacks for custom span attributes
Safe uninstrument: restores the original create method
Dual ESM/CJS output: works with import and require
Quick Start
import { OpenAIInstrumentation } from "@reaatech/otel-genai-semconv-openai";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
new OpenAIInstrumentation({ trackCosts: true }).instrument(client);

const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "What is OpenTelemetry?" }],
  temperature: 0.7,
  max_tokens: 500,
});
// Each call now emits OTel spans with gen_ai.* attributes
Captured Attributes
Request Attributes
| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.request.model | request.model | Requested model name |
| gen_ai.request.temperature | request.temperature | Sampling temperature |
| gen_ai.request.top_p | request.top_p | Top-p sampling |
| gen_ai.request.max_tokens | request.max_tokens | Max tokens limit |
| gen_ai.request.streaming | request.stream | Streaming flag |
| gen_ai.request.frequency_penalty | request.frequency_penalty | Frequency penalty |
| gen_ai.request.presence_penalty | request.presence_penalty | Presence penalty |
| gen_ai.request.stop_sequences | request.stop | Stop sequences (if array) |
| gen_ai.request.tool_names | request.tools | Tool/function names |
| gen_ai.request.seed | request.seed | Reproducibility seed |
| gen_ai.request.candidates_per_prompt | request.n | Number of choices |
| gen_ai.provider.name | hardcoded | openai |
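The request attributes listed above amount to a plain function from request fields to span attributes. The following is a minimal sketch of that idea under stated assumptions, not the package's actual mapOpenAIRequest implementation: the RequestLike type and the sketchRequestAttributes name are hypothetical, only a subset of the rows is covered, and skipping undefined fields is an assumption.

```typescript
type AttributeValue = string | number | boolean | string[];

// Hypothetical, trimmed-down request shape used only for illustration.
interface RequestLike {
  model: string;
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
  stop?: string | string[];
  tools?: { function: { name: string } }[];
}

function sketchRequestAttributes(req: RequestLike): Record<string, AttributeValue> {
  const attrs: Record<string, AttributeValue> = {
    "gen_ai.provider.name": "openai", // hardcoded, per the table
    "gen_ai.request.model": req.model,
  };
  // Assumption: optional fields are omitted rather than recorded as undefined.
  if (req.temperature !== undefined) attrs["gen_ai.request.temperature"] = req.temperature;
  if (req.max_tokens !== undefined) attrs["gen_ai.request.max_tokens"] = req.max_tokens;
  if (req.stream !== undefined) attrs["gen_ai.request.streaming"] = req.stream;
  // The table notes that stop sequences are recorded only when given as an array.
  if (Array.isArray(req.stop)) attrs["gen_ai.request.stop_sequences"] = req.stop;
  if (req.tools && req.tools.length > 0) {
    attrs["gen_ai.request.tool_names"] = req.tools.map((t) => t.function.name);
  }
  return attrs;
}
```

For the real mapping, use the exported mapOpenAIRequest function described under Attribute Mappers.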
Response Attributes
| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.response.model | response.model | Actual model used |
| gen_ai.response.id | response.id | Response identifier |
| gen_ai.response.finish_reasons | response.choices[].finish_reason | Per-choice finish reasons |
| gen_ai.usage.input_tokens | response.usage.prompt_tokens | Input token count |
| gen_ai.usage.output_tokens | response.usage.completion_tokens | Output token count |
Streaming Attributes
| Attribute | Description |
| --- | --- |
| gen_ai.streaming.time_to_first_token_ms | Latency to first chunk |
| gen_ai.streaming.total_duration_ms | Total streaming duration |
| gen_ai.streaming.chunk_count | Number of chunks received |
Cost Attributes (when trackCosts: true)
| Attribute | Description |
| --- | --- |
| llm.cost.total | Total cost in USD |
| llm.cost.input | Input token cost |
| llm.cost.output | Output token cost |
| llm.cost.currency | Currency code (always USD) |
Span Events
| Event | When |
| --- | --- |
| gen_ai.system.message | System messages in the request |
| gen_ai.user.message | User messages in the request |
| gen_ai.assistant.message | Assistant messages in the request or response |
| gen_ai.choice | Each choice in the response (with index, finish_reason, message) |
API Reference
OpenAIInstrumentation (class)
Constructor
new OpenAIInstrumentation({
  captureRequestHeaders?: boolean;
  captureResponseHeaders?: boolean;
  trackCosts?: boolean;
  pricing?: Record<string, PricingInfo>;
  onStart?: (span: Span, request: ChatCompletionCreateParams) => void;
  onEnd?: (span: Span, response: ChatCompletion) => void;
})
Methods
| Method | Description |
| --- | --- |
| instrument(client) | Wrap client.chat.completions.create() with instrumentation |
| uninstrument(client) | Restore the original create() method |
OpenAITokenCounter (class)
Accurate token counting using tiktoken with per-model encoding selection:
const counter = new OpenAITokenCounter();
counter.countTokens("Hello, world!", "gpt-4"); // count for a specific model
counter.countMessagesTokens(messages, "gpt-4"); // count for a conversation
counter.clearCache(); // clear token cache
counter.free(); // free tiktoken encodings (call when done)
Encoding Selection
| Model Family | Encoding |
| --- | --- |
| gpt-4*, o1* | o200k_base |
| gpt-3.5* | cl100k_base |
| Other | cl100k_base (fallback) |
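Read as prefix rules, the encoding selection above could be sketched as follows. This is an illustration only: the selectEncoding name, the first-match-wins ordering, and the use of simple prefix matching are assumptions, and OpenAITokenCounter may match model names differently.

```typescript
type EncodingName = "o200k_base" | "cl100k_base";

// Prefix rules in the same order as the table; the first match wins.
const ENCODING_RULES: ReadonlyArray<readonly [string, EncodingName]> = [
  ["gpt-4", "o200k_base"],
  ["o1", "o200k_base"],
  ["gpt-3.5", "cl100k_base"],
];

function selectEncoding(model: string): EncodingName {
  for (const [prefix, encoding] of ENCODING_RULES) {
    if (model.startsWith(prefix)) return encoding;
  }
  return "cl100k_base"; // fallback for unrecognized models
}
```

Keeping the rules in a data table rather than a chain of conditionals makes it straightforward to add a model family without touching the lookup logic.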
Attribute Mappers
Standalone functions for mapping provider data without the full instrumentation class:
import { mapOpenAIRequest, mapOpenAIResponse, mapOpenAIError } from "@reaatech/otel-genai-semconv-openai";

const requestAttrs = mapOpenAIRequest(chatCompletionParams);
const responseAttrs = mapOpenAIResponse(chatCompletionObject);
const errorAttrs = mapOpenAIError(apiError);
Configuration
Custom Pricing
new OpenAIInstrumentation({
  trackCosts: true,
  pricing: {
    "gpt-4": { input: 0.03, output: 0.06 },
    "gpt-4o-mini": { input: 0.00015, output: 0.0006 },
  },
}).instrument(client);
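To make the cost arithmetic concrete, here is a rough sketch of how the llm.cost.* values could be derived from a pricing entry and the token counts. The unit is an assumption: the figures in the example are consistent with USD per 1,000 tokens, but this README does not state it. The sketchCost function and the local PricingInfo stand-in are hypothetical, not the package's implementation.

```typescript
// Local stand-in for the package's PricingInfo type.
// Assumption: prices are USD per 1,000 tokens.
interface PricingInfo {
  input: number;
  output: number;
}

interface CostAttributes {
  "llm.cost.input": number;
  "llm.cost.output": number;
  "llm.cost.total": number;
  "llm.cost.currency": "USD";
}

function sketchCost(
  pricing: PricingInfo,
  inputTokens: number,
  outputTokens: number,
): CostAttributes {
  const input = (inputTokens / 1000) * pricing.input;
  const output = (outputTokens / 1000) * pricing.output;
  return {
    "llm.cost.input": input,
    "llm.cost.output": output,
    "llm.cost.total": input + output,
    "llm.cost.currency": "USD",
  };
}

// 1,200 input tokens and 300 output tokens at the gpt-4 prices above.
const cost = sketchCost({ input: 0.03, output: 0.06 }, 1200, 300);
console.log(cost["llm.cost.total"].toFixed(4)); // "0.0540"
```

If your pricing data uses a different unit, such as USD per million tokens, check which unit the package expects before relying on the reported totals.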
Lifecycle Hooks
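// getFeatureFlag() and calculateQuality() are application-defined helpers.
// Start times are tracked per span, assuming the same span object reaches both hooks.
const startTimes = new WeakMap<object, number>();

new OpenAIInstrumentation({
  onStart: (span, request) => {
    startTimes.set(span, Date.now());
    if (request.user) span.setAttribute("user.id", request.user); // user is optional
    span.setAttribute("feature.flag", getFeatureFlag());
  },
  onEnd: (span, response) => {
    span.setAttribute("response.quality_score", calculateQuality(response));
    const startedAt = startTimes.get(span);
    if (startedAt !== undefined) {
      span.setAttribute("response.latency_ms", Date.now() - startedAt);
    }
  },
}).instrument(client);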
Error Type Mapping
The instrumentation classifies errors into the following types:
| Condition | Error Type |
| --- | --- |
| rate limit / 429 | rate_limit_error |
| authentication / 401 | authentication_error |
| invalid / 400 | invalid_request_error |
| not found / 404 | not_found_error |
| server / 500 | server_error |
| Other | unknown_error |
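The classification above can be pictured as a short chain of checks. The sketch below is an illustration under assumptions, not the package's mapOpenAIError: the ErrorLike shape and sketchClassifyError name are hypothetical, and whether the real code matches on HTTP status, message text, or both, and in what order, is not specified here.

```typescript
type ErrorType =
  | "rate_limit_error"
  | "authentication_error"
  | "invalid_request_error"
  | "not_found_error"
  | "server_error"
  | "unknown_error";

// Hypothetical minimal error shape: an HTTP status and/or a message.
interface ErrorLike {
  status?: number;
  message?: string;
}

function sketchClassifyError(err: ErrorLike): ErrorType {
  const msg = (err.message ?? "").toLowerCase();
  // Checks run in table order; the first matching condition wins.
  if (err.status === 429 || msg.includes("rate limit")) return "rate_limit_error";
  if (err.status === 401 || msg.includes("authentication")) return "authentication_error";
  if (err.status === 400 || msg.includes("invalid")) return "invalid_request_error";
  if (err.status === 404 || msg.includes("not found")) return "not_found_error";
  if (err.status === 500 || msg.includes("server")) return "server_error";
  return "unknown_error";
}
```

In this sketch a 503 with no recognizable message falls through to unknown_error, because only the conditions listed in the table are checked.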
Usage Patterns
Streaming
const stream = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
// Span auto-finalizes with TTFT, duration, and chunk count when the stream ends
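To illustrate how time to first token, total duration, and chunk count can be collected without changing the consumer's loop, here is a minimal sketch that wraps any async iterable. It is not the package's implementation: measureStream, StreamMetrics, and fakeStream are hypothetical names, and a real instrumentation would write these values to the span rather than pass them to a callback.

```typescript
interface StreamMetrics {
  timeToFirstTokenMs: number | null; // null if the stream produced no chunks
  totalDurationMs: number;
  chunkCount: number;
}

// Re-yields every chunk from the source and reports metrics once iteration ends.
async function* measureStream<T>(
  source: AsyncIterable<T>,
  onDone: (metrics: StreamMetrics) => void,
): AsyncGenerator<T> {
  const start = Date.now();
  let firstChunkAt: number | null = null;
  let chunkCount = 0;
  try {
    for await (const chunk of source) {
      if (firstChunkAt === null) firstChunkAt = Date.now();
      chunkCount += 1;
      yield chunk;
    }
  } finally {
    // Runs on normal completion, on error, and when the consumer stops early.
    onDone({
      timeToFirstTokenMs: firstChunkAt === null ? null : firstChunkAt - start,
      totalDurationMs: Date.now() - start,
      chunkCount,
    });
  }
}

// Stand-in for an OpenAI stream, used only to exercise the wrapper.
async function* fakeStream(): AsyncGenerator<string> {
  yield "Once";
  yield " upon";
  yield " a time";
}
```

Putting the reporting in a finally block is the key design choice: metrics are still reported if the consumer breaks out of the loop or the stream throws.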
Multi-Client
const instrumentation = new OpenAIInstrumentation({ trackCosts: true });

const client1 = new OpenAI({ apiKey: "...", baseURL: "..." });
const client2 = new OpenAI({ apiKey: "...", baseURL: "..." });

instrumentation.instrument(client1);
instrumentation.instrument(client2);
Cleanup
instrumentation.uninstrument(client);
// client.chat.completions.create is now the original, unwrapped method
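For readers curious how wrapping, the double-instrumentation guard, and restoration can fit together, the following is a small sketch of the general method-patching pattern. It is an assumption-laden illustration, not this package's source: ClientLike, sketchInstrument, sketchUninstrument, and the WeakMap bookkeeping are all hypothetical.

```typescript
type CreateFn = (params: unknown) => Promise<unknown>;

// Hypothetical minimal client shape exposing only the patched method.
interface ClientLike {
  chat: { completions: { create: CreateFn } };
}

// Remembers the original create() for each patched completions object.
const originals = new WeakMap<object, CreateFn>();

function sketchInstrument(client: ClientLike, onCall: () => void): void {
  const completions = client.chat.completions;
  if (originals.has(completions)) return; // already wrapped: safe no-op

  const original = completions.create;
  originals.set(completions, original);
  completions.create = function (this: unknown, params: unknown) {
    onCall(); // a real implementation would start and end a span around the call
    return original.call(this, params);
  };
}

function sketchUninstrument(client: ClientLike): void {
  const completions = client.chat.completions;
  const original = originals.get(completions);
  if (original) {
    completions.create = original; // put the untouched method back
    originals.delete(completions);
  }
}
```

Keying the bookkeeping on the completions object rather than on the instrumentation instance is what lets one instance handle several clients independently, as in the multi-client example.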
License
MIT