This is an experimental feature available for Enterprise and Self-Hosted deployments only.

Overview

Portkey Gateway can push complete LLM request/response logs to any OpenTelemetry-compatible endpoint, following the experimental semantic conventions for Generative AI. This enables integration with external observability platforms like LangSmith, Datadog, Honeycomb, or any OTLP-compatible collector. Unlike the Analytics Export, this feature exports complete logs including:
  • Full prompt and completion content
  • Tool definitions and function calls
  • All request parameters (temperature, max_tokens, etc.)
  • Complete response metadata

Why Use This Feature?

  • Unified Observability: Consolidate LLM telemetry with your existing monitoring infrastructure
  • External Analysis: Send Portkey logs to specialized GenAI observability tools like LangSmith
  • Standards-Based: Follows OpenTelemetry GenAI semantic conventions for consistent attribute naming
  • Zero Code Changes: Enable via environment variables without modifying application code

Scope

Included

  • Automatic log export to OTLP-compatible HTTP endpoints
  • Full GenAI semantic convention attribute mapping
  • Request parameters, response metadata, and token usage
  • Tool definitions and prompt/completion content
  • LangSmith-compatible attribute overrides

Out of Scope

  • gRPC OTLP transport (HTTP/JSON only)
  • Metrics export (use Analytics Export for that)
  • Custom attribute mapping configuration

Architecture

Data Flow:
  1. LLM request is processed through Portkey Gateway
  2. After receiving the LLM response, a log object is generated
  3. The log is transformed into OTLP span format with GenAI semantic attributes
  4. The span is pushed to the configured external OTLP endpoint

Configuration

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| EXPERIMENTAL_GEN_AI_OTEL_TRACES_ENABLED | Yes | Set to true to enable log pushing |
| EXPERIMENTAL_GEN_AI_OTEL_EXPORTER_OTLP_ENDPOINT | Yes | The OTLP endpoint URL (without the /v1/traces suffix) |
| EXPERIMENTAL_GEN_AI_OTEL_EXPORTER_OTLP_HEADERS | No | Comma-separated headers in key=value format |
| EXPERIMENTAL_GEN_AI_OTEL_RESOURCE_ATTRIBUTES | No | Comma-separated key=value pairs added as OTLP resource attributes |
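The comma-separated key=value format used by the headers and resource-attribute variables can be parsed as shown below. This is an illustrative sketch, not Portkey's actual implementation; the parse_kv_pairs helper name is hypothetical, and splitting only on the first "=" (so values may contain "=", as in padded tokens) is an assumption.

```python
def parse_kv_pairs(raw: str) -> dict:
    """Parse a comma-separated list of key=value pairs into a dict.

    Splits each pair only on the first '=' so values may themselves
    contain '=' characters (assumed behavior, not confirmed by the docs).
    """
    pairs = {}
    for item in raw.split(","):
        item = item.strip()
        if not item or "=" not in item:
            continue  # skip empty or malformed entries
        key, value = item.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs

# Example: two headers for an external OTLP endpoint
headers = parse_kv_pairs("x-api-key=abc123,x-tenant=team-a")
```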

Example Configuration

EXPERIMENTAL_GEN_AI_OTEL_TRACES_ENABLED: "true"
EXPERIMENTAL_GEN_AI_OTEL_EXPORTER_OTLP_ENDPOINT: https://api.smith.langchain.com/otel
EXPERIMENTAL_GEN_AI_OTEL_EXPORTER_OTLP_HEADERS: x-api-key=<your-langsmith-api-key>
The endpoint should be the base OTLP URL. Portkey automatically appends /v1/traces when sending data.

Span Attributes

Logs are exported as OpenTelemetry spans with attributes following the GenAI semantic conventions.

Resource Attributes

| Attribute | Value | Description |
| --- | --- | --- |
| service.name | portkey | Service identifier |
| otel.semconv.version | 1.40.0 | Semantic convention version |
| Custom attributes | Configurable | Additional attributes from EXPERIMENTAL_GEN_AI_OTEL_RESOURCE_ATTRIBUTES |

Common Attributes

These attributes are set on all spans (inference and embeddings):
| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.operation.name | Derived | Operation type: chat, embeddings, or derived from request URL |
| gen_ai.provider.name | Metrics | Semconv-normalized provider name (e.g., openai, anthropic, aws.bedrock, azure.ai.openai) |
| gen_ai.system | Metrics | Backward-compatible alias for gen_ai.provider.name |
| gen_ai.request.model | Request body | Model identifier |
| gen_ai.response.model | Response body | Model used for generation |
| server.address | Request URL | Provider endpoint hostname |
| server.port | Request URL | Server port (derived from URL, 443 for HTTPS) |
| error.type | Response | HTTP status code (for errors ≥ 300) |
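The server.address and server.port attributes can be derived from the provider URL roughly as follows. This is a sketch of the described behavior (port falls back to the scheme default when the URL has no explicit port), not Portkey's source code; the server_attrs helper is hypothetical.

```python
from urllib.parse import urlparse

def server_attrs(request_url: str) -> dict:
    """Derive server.address / server.port from the provider endpoint URL."""
    parsed = urlparse(request_url)
    # Use the explicit port if present, otherwise the scheme's default
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return {"server.address": parsed.hostname, "server.port": port}

server_attrs("https://api.openai.com/v1/chat/completions")
# → {"server.address": "api.openai.com", "server.port": 443}
```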

Inference Attributes

These attributes are set on inference (chat completion) spans:
| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.request.max_tokens | Request body | Maximum tokens to generate (from max_tokens or max_completion_tokens) |
| gen_ai.request.temperature | Request body | Sampling temperature |
| gen_ai.request.top_p | Request body | Top-p sampling parameter |
| gen_ai.request.top_k | Request body | Top-k sampling parameter |
| gen_ai.request.stop_sequences | Request body | Stop sequences |
| gen_ai.request.frequency_penalty | Request body | Frequency penalty |
| gen_ai.request.presence_penalty | Request body | Presence penalty |
| gen_ai.request.seed | Request body | Random seed |
| gen_ai.request.choice.count | Request body | Number of choices requested (only set when > 1) |
| gen_ai.response.id | Response body | Response identifier |
| gen_ai.response.finish_reasons | Response body | Normalized finish reasons (e.g., tool_use/tool_calls/function_call → tool_call) |
| gen_ai.usage.input_tokens | Response usage | Input token count |
| gen_ai.usage.output_tokens | Response usage | Output token count |
| gen_ai.usage.cache_creation.input_tokens | Response usage | Tokens written to prompt cache |
| gen_ai.usage.cache_read.input_tokens | Response usage | Tokens read from prompt cache |
| gen_ai.conversation.id | Request/Response | Conversation or thread identifier |
| gen_ai.output.type | Request body | Output format type: json, image, speech, or text |

Embeddings Attributes

These attributes are set on embeddings spans:
| Attribute | Source | Description |
| --- | --- | --- |
| gen_ai.request.encoding_formats | Request body | Requested encoding formats |
| gen_ai.embeddings.dimension.count | Request body | Embedding dimensions |
| gen_ai.usage.input_tokens | Response usage | Input token count |

Message Attributes

Messages are exported using structured semconv 1.40.0 format with role-based grouping and multimodal part normalization:
| Attribute | Description |
| --- | --- |
| gen_ai.input.messages | Array of input messages (excluding system), each with role and parts (text, tool_call, tool_call_response, image, audio, file, reasoning) |
| gen_ai.system_instructions | System message content extracted as instruction parts |
| gen_ai.output.messages | Array of output messages, each with role, parts, and finish_reason |
| gen_ai.tool.definitions | Array of tool definitions from request |
Tool call semantics are normalized across providers: tool_use (Anthropic), tool_calls (OpenAI), and function_call are all mapped to the standard tool_call type.
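The tool-call normalization described above amounts to a small alias mapping. The sketch below illustrates that mapping as stated in this document; the table and function names are hypothetical, not Portkey internals.

```python
# Provider-specific part types that all map to the semconv "tool_call" type
TOOL_CALL_ALIASES = {
    "tool_use": "tool_call",       # Anthropic
    "tool_calls": "tool_call",     # OpenAI
    "function_call": "tool_call",  # legacy OpenAI function calling
}

def normalize_part_type(part_type: str) -> str:
    """Map a provider-specific message part type to its semconv equivalent."""
    return TOOL_CALL_ALIASES.get(part_type, part_type)
```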

Span Structure

Each log is exported as a span with the following structure:
{
  "resourceSpans": [{
    "resource": {
      "attributes": [
        { "key": "service.name", "value": { "stringValue": "portkey" } },
        { "key": "otel.semconv.version", "value": { "stringValue": "1.40.0" } }
      ]
    },
    "scopeSpans": [{
      "scope": { "name": "custom.genai.instrumentation" },
      "spans": [{
        "traceId": "<32-char-hex>",
        "spanId": "<16-char-hex>",
        "parentSpanId": "<optional-16-char-hex>",
        "name": "chat gpt-4o",
        "kind": 3,
        "startTimeUnixNano": "<nanoseconds>",
        "endTimeUnixNano": "<nanoseconds>",
        "status": { "code": 1, "message": "200" },
        "attributes": [/* GenAI attributes */]
      }]
    }]
  }]
}
  • Span name follows the format {operation_name} {model} (e.g., chat gpt-4o, embeddings text-embedding-3-small)
  • Span kind is CLIENT (3), reflecting that the gateway acts as a client to the LLM provider
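The span naming rule above can be expressed as a one-line formatter. This is a minimal sketch of the stated convention; the span_name helper is illustrative, and falling back to the bare operation name when no model is present is an assumption.

```python
SPAN_KIND_CLIENT = 3  # OTLP enum value for SpanKind.CLIENT

def span_name(operation: str, model=None) -> str:
    """Build the span name per the GenAI convention: '{operation} {model}'."""
    return f"{operation} {model}" if model else operation

span_name("chat", "gpt-4o")                          # → "chat gpt-4o"
span_name("embeddings", "text-embedding-3-small")    # → "embeddings text-embedding-3-small"
```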

Trace Context

The feature preserves trace context from incoming requests:
  • Trace ID: Uses Portkey’s trace ID if valid (32 hex chars), otherwise generates a new one
  • Span ID: Uses Portkey’s span ID if valid (16 hex chars), otherwise generates a new one
  • Parent Span ID: Preserved from parent request if available
This allows correlation with upstream traces in distributed tracing scenarios.
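The validate-or-generate logic for trace and span IDs can be sketched as below. This is an assumed implementation of the rules stated above (lowercase hex validation and random regeneration are assumptions); the helper names are hypothetical.

```python
import re
import secrets

HEX32 = re.compile(r"^[0-9a-f]{32}$")  # valid trace ID: 32 hex chars
HEX16 = re.compile(r"^[0-9a-f]{16}$")  # valid span ID: 16 hex chars

def resolve_trace_id(incoming=None) -> str:
    """Reuse the incoming trace ID if it is valid 32-char hex; else generate one."""
    if incoming and HEX32.match(incoming.lower()):
        return incoming.lower()
    return secrets.token_hex(16)  # 16 random bytes → 32 hex chars

def resolve_span_id(incoming=None) -> str:
    """Reuse the incoming span ID if it is valid 16-char hex; else generate one."""
    if incoming and HEX16.match(incoming.lower()):
        return incoming.lower()
    return secrets.token_hex(8)   # 8 random bytes → 16 hex chars
```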

Error Handling

  • Export Failures: Logged internally; does not affect request processing
  • Invalid Endpoints: Connection errors are caught and logged
  • Malformed Responses: Error responses from the OTLP endpoint are logged with status text
Export failures do not block or retry the LLM request. Logs may be lost if the external endpoint is unavailable.

Performance Considerations

  • Async Processing: Log export runs in parallel with other post-request handlers
  • No Request Latency Impact: Export happens after the response is returned
  • HTTP/JSON Protocol: Uses standard HTTP POST with JSON payload

Security Considerations

When enabled, request/response content, including prompts and completions, is sent to the external endpoint. Ensure your endpoint is trusted and properly secured.
  • Authentication: Use the headers configuration to pass API keys or bearer tokens
  • Data Sensitivity: Prompt and completion content is included in exports
  • Network Security: Use HTTPS endpoints for production deployments

Integration with LangSmith

Portkey exports spans using the semconv 1.40.0 structured message format (gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions), which is compatible with LangSmith and other modern GenAI observability tools. For LangSmith setup, refer to LangSmith OpenTelemetry documentation.

Verification

To verify the feature is working:
  1. Enable the feature with your endpoint configuration
  2. Make an LLM request through Portkey Gateway
  3. Check your external observability platform for incoming spans
  4. Verify spans contain expected GenAI attributes
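To check step 4 programmatically, you can flatten the GenAI attributes out of a received OTLP JSON payload (the structure shown in the Span Structure section). This is a minimal inspection sketch under the assumption that each attribute value is a single-key object such as {"stringValue": ...}; the genai_attributes helper is hypothetical.

```python
def genai_attributes(otlp_payload: dict) -> dict:
    """Flatten span attributes out of an OTLP/JSON trace payload."""
    attrs = {}
    for rs in otlp_payload.get("resourceSpans", []):
        for ss in rs.get("scopeSpans", []):
            for span in ss.get("spans", []):
                for attr in span.get("attributes", []):
                    value = attr.get("value", {})
                    # OTLP JSON wraps values, e.g. {"stringValue": "chat"}
                    attrs[attr["key"]] = next(iter(value.values()), None)
    return attrs

# Usage: confirm an expected attribute is present on the exported span
payload = {"resourceSpans": [{"scopeSpans": [{"spans": [{"attributes": [
    {"key": "gen_ai.operation.name", "value": {"stringValue": "chat"}}
]}]}]}]}
assert "gen_ai.operation.name" in genai_attributes(payload)
```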

Known Limitations

The OpenTelemetry GenAI semantic conventions are experimental and subject to change. Attribute names may evolve as the specification stabilizes.
  • HTTP Only: gRPC OTLP protocol is not supported
  • No Batching: Each log is pushed individually (no span batching)
  • No Retry Logic: Failed exports are not retried
  • Experimental Conventions: GenAI semantic conventions may change
Last modified on March 25, 2026