Tracing (OpenTelemetry / OTLP)

Enprompta exposes a standard OTLP/HTTP trace endpoint. Any OpenTelemetry SDK, OpenInference, or OpenLLMetry exporter can send traces with no custom code — just point your existing instrumentation at Enprompta with two environment variables.

1. Endpoint & authentication

The OTLP/HTTP trace endpoint accepts the standard OTLP /v1/traces path:

Endpoint
POST https://enprompta.com/api/ingest/otlp/v1/traces

Authenticate with an API key sent as a Bearer token. The key must have the traces:write scope — create one in your dashboard under API Keys.

Authorization header
Authorization: Bearer ep_your_api_key

Both OTLP wire formats are supported: application/x-protobuf (the OTLP default) and application/json.

2. Quick start (any OTel exporter)

If you already use OpenTelemetry — directly, or via OpenInference or OpenLLMetry auto-instrumentation — you don't need an Enprompta SDK. Set the two standard OTLP environment variables and your traces flow to Enprompta:

Environment variables
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://enprompta.com/api/ingest/otlp/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_HEADERS="Authorization=Bearer ep_your_api_key"

That's it — any instrumentation that emits the OpenInference (llm.*) or OpenTelemetry GenAI (gen_ai.*) semantic conventions is understood automatically (see section 4).

For example, with the Python OpenTelemetry SDK the standard OTLP HTTP exporter picks those variables up:

Python (OpenTelemetry SDK)
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider()
# Reads OTEL_EXPORTER_OTLP_TRACES_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_HEADERS from the environment
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
# Register the provider globally so instrumentation libraries pick it up
trace.set_tracer_provider(provider)

# Now add your LLM instrumentation (OpenInference / OpenLLMetry) as usual —
# its spans are exported straight to Enprompta.
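
Before starting the exporter, it can help to sanity-check the two variables. The following is a minimal sketch using only the standard library. The helper names parse_otlp_headers and check_otlp_env are hypothetical (they are not part of any SDK), and the parsing assumes the comma-separated key=value header format used by the standard OTLP exporter environment variables.

```python
import os
from urllib.parse import unquote, urlsplit


def parse_otlp_headers(raw: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict with lowercase keys and URL-decoded values."""
    headers = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split("=", 1)
        headers[key.strip().lower()] = unquote(value.strip())
    return headers


def check_otlp_env(env=None) -> list:
    """Return a list of problems; an empty list means the config looks sane."""
    env = os.environ if env is None else env
    problems = []

    endpoint = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "")
    parts = urlsplit(endpoint)
    if parts.scheme != "https" or not parts.netloc:
        problems.append("endpoint must be an absolute https URL")
    elif not parts.path.endswith("/v1/traces"):
        problems.append("endpoint path should end with /v1/traces")

    headers = parse_otlp_headers(env.get("OTEL_EXPORTER_OTLP_TRACES_HEADERS", ""))
    auth = headers.get("authorization", "")
    if not auth.startswith("Bearer ") or len(auth) <= len("Bearer "):
        problems.append("headers must include Authorization=Bearer <api key>")

    return problems


if __name__ == "__main__":
    for problem in check_otlp_env():
        print("config problem:", problem)
```

Running the script in the same shell that launches your application prints one line per detected problem and nothing when both variables look valid. It only checks the shape of the values; it does not contact Enprompta or validate the key.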

3. Send a trace with cURL

A minimal OTLP/JSON request, useful for verifying connectivity. The traceId is 32 hex characters (16 bytes), the spanId is 16 hex characters (8 bytes), and timestamps are Unix nanoseconds encoded as strings:

OTLP/JSON via cURL
curl -X POST https://enprompta.com/api/ingest/otlp/v1/traces \
  -H "Authorization: Bearer ep_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "resourceSpans": [{
      "resource": { "attributes": [
        { "key": "enprompta.environment", "value": { "stringValue": "production" } }
      ]},
      "scopeSpans": [{
        "spans": [{
          "traceId": "5b8efff798038103d269b633813fc60c",
          "spanId": "eee19b7ec3c1b174",
          "name": "chat",
          "startTimeUnixNano": "1700000000000000000",
          "endTimeUnixNano": "1700000001200000000",
          "status": { "code": 1 },
          "attributes": [
            { "key": "gen_ai.system", "value": { "stringValue": "openai" } },
            { "key": "gen_ai.request.model", "value": { "stringValue": "gpt-4o" } },
            { "key": "gen_ai.prompt", "value": { "stringValue": "Summarise this ticket." } },
            { "key": "gen_ai.completion", "value": { "stringValue": "The customer reports..." } },
            { "key": "gen_ai.usage.prompt_tokens", "value": { "intValue": 120 } },
            { "key": "gen_ai.usage.completion_tokens", "value": { "intValue": 48 } }
          ]
        }]
      }]
    }]
  }'

On success the endpoint returns 200 with an OTLP ExportTraceServiceResponse (an empty partialSuccess object). Spans are persisted asynchronously and appear under Observability.
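
The same connectivity check can be scripted. The sketch below mirrors the cURL request using only the Python standard library; build_payload and send are hypothetical helper names, and the model name, span name, and default duration are placeholder values copied from the example above rather than anything the endpoint requires.

```python
import json
import secrets
import time
import urllib.request

ENDPOINT = "https://enprompta.com/api/ingest/otlp/v1/traces"


def _attr(key, value):
    """Encode one attribute as an OTLP/JSON KeyValue (ints and strings only)."""
    if isinstance(value, int):
        return {"key": key, "value": {"intValue": value}}
    return {"key": key, "value": {"stringValue": str(value)}}


def build_payload(prompt, completion, prompt_tokens, completion_tokens,
                  model="gpt-4o", duration_ns=1_200_000_000):
    """Build a single-span OTLP/JSON body shaped like the cURL example."""
    end_ns = time.time_ns()
    span = {
        "traceId": secrets.token_hex(16),  # 16 random bytes -> 32 hex chars
        "spanId": secrets.token_hex(8),    # 8 random bytes -> 16 hex chars
        "name": "chat",
        "startTimeUnixNano": str(end_ns - duration_ns),
        "endTimeUnixNano": str(end_ns),
        "status": {"code": 1},  # 1 = STATUS_CODE_OK
        "attributes": [
            _attr("gen_ai.system", "openai"),
            _attr("gen_ai.request.model", model),
            _attr("gen_ai.prompt", prompt),
            _attr("gen_ai.completion", completion),
            _attr("gen_ai.usage.prompt_tokens", prompt_tokens),
            _attr("gen_ai.usage.completion_tokens", completion_tokens),
        ],
    }
    resource = {"attributes": [_attr("enprompta.environment", "production")]}
    return {"resourceSpans": [{"resource": resource,
                               "scopeSpans": [{"spans": [span]}]}]}


def send(payload, api_key):
    """POST the payload; returns (status, body). urlopen raises HTTPError on 4xx/5xx."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read().decode("utf-8")
```

Calling send(build_payload("Summarise this ticket.", "The customer reports...", 120, 48), "ep_your_api_key") with a real key should return status 200 if the key and endpoint are set up as described above.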

4. Semantic conventions

Enprompta reads span attributes from both the OpenInference (llm.*) and OpenTelemetry GenAI (gen_ai.*) conventions. Provide either; if both are present, llm.* wins.

Field                 OpenInference                OpenTelemetry GenAI
Provider              llm.provider                 gen_ai.system
Model                 llm.model                    gen_ai.request.model
Input / prompt        llm.input.messages           gen_ai.prompt
Output / completion   llm.output.messages          gen_ai.completion
Input tokens          llm.token_count.prompt       gen_ai.usage.prompt_tokens
Output tokens         llm.token_count.completion   gen_ai.usage.completion_tokens
Latency (ms)          llm.latency_ms               - (else derived from span start/end)
Cost                  llm.cost                     - (else priced from model)

Cost & latency are filled in for you. If a span omits llm.cost, Enprompta prices it from the provider/model. If it omits llm.latency_ms, latency is derived from the span's start and end times. Any attribute not in the table above is kept on the trace as metadata.
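
To make the precedence and fallback rules concrete, here is an illustrative sketch of how a span's attributes could be resolved into the fields in the table. It is a reading of the rules documented in this section, not Enprompta's actual implementation; resolve_fields and FIELD_KEYS are hypothetical names, and cost is simply left as None when llm.cost is absent because model-based pricing happens on the server.

```python
# Field name -> (OpenInference key, OpenTelemetry GenAI key), as in the table above.
FIELD_KEYS = {
    "provider": ("llm.provider", "gen_ai.system"),
    "model": ("llm.model", "gen_ai.request.model"),
    "input": ("llm.input.messages", "gen_ai.prompt"),
    "output": ("llm.output.messages", "gen_ai.completion"),
    "input_tokens": ("llm.token_count.prompt", "gen_ai.usage.prompt_tokens"),
    "output_tokens": ("llm.token_count.completion", "gen_ai.usage.completion_tokens"),
}


def resolve_fields(attrs, start_ns, end_ns):
    """Resolve a flat attribute dict into table fields; llm.* wins over gen_ai.*."""
    fields = {}
    known = {"llm.latency_ms", "llm.cost"}
    for name, (openinference_key, genai_key) in FIELD_KEYS.items():
        known.update((openinference_key, genai_key))
        if openinference_key in attrs:
            fields[name] = attrs[openinference_key]
        else:
            fields[name] = attrs.get(genai_key)

    # Latency: explicit attribute if present, else derived from the span times.
    fields["latency_ms"] = attrs.get("llm.latency_ms", (end_ns - start_ns) / 1_000_000)
    # Cost: explicit attribute if present; None here stands in for server-side pricing.
    fields["cost"] = attrs.get("llm.cost")
    # Everything not listed in the table is kept as metadata.
    fields["metadata"] = {k: v for k, v in attrs.items() if k not in known}
    return fields
```

With the timestamps from the cURL example (1700000000000000000 and 1700000001200000000) and no llm.latency_ms attribute, the derived latency is 1200 ms.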

5. Linking prompts, sessions & environments

Set these enprompta.* span (or resource) attributes to connect a trace to the rest of your workspace:

Attribute                              Purpose
enprompta.environment                  Environment label (defaults to "production").
enprompta.prompt_id                    Link the trace to a registered prompt.
enprompta.prompt_version_id            Link to a specific prompt version.
enprompta.session_id (or session.id)   Group multi-turn calls into one session.
enprompta.project_id                   Assign the trace to a project.

Resource-level attributes are inherited by every span in the batch (span-level values override them) — so set enprompta.environment and enprompta.project_id once on the resource.
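
The inheritance rule can be pictured as a simple layered merge. The sketch below is an illustration of the documented behavior, not Enprompta's code; flatten and effective_attributes are hypothetical helpers that operate on OTLP/JSON attribute lists like those in the cURL example, and only scalar values are handled.

```python
def flatten(attributes):
    """Convert an OTLP/JSON attribute list into a plain {key: value} dict."""
    flat = {}
    for item in attributes:
        # Each value is a one-entry dict such as {"stringValue": "production"}.
        flat[item["key"]] = next(iter(item["value"].values()), None)
    return flat


def effective_attributes(resource_attrs, span_attrs):
    """Layer defaults, then resource attributes, then span attributes (last wins)."""
    merged = {"enprompta.environment": "production"}  # documented default
    merged.update(flatten(resource_attrs))
    merged.update(flatten(span_attrs))
    return merged


if __name__ == "__main__":
    resource = [
        {"key": "enprompta.environment", "value": {"stringValue": "staging"}},
        {"key": "enprompta.project_id", "value": {"stringValue": "proj_example"}},
    ]
    span = [{"key": "enprompta.session_id", "value": {"stringValue": "sess_example"}}]
    print(effective_attributes(resource, span))
```

In the usage example, the span inherits the environment and project from the resource and adds its own session ID; the IDs shown are made-up placeholders.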

6. Quotas, limits & responses

  • Monthly trace quota. Free includes 5,000 traces/month; Pro 200,000. Over quota, the endpoint returns 403 with a partialSuccess.rejectedSpans count — a permanent (non-retryable) signal per the OTLP spec, so exporters drop the batch rather than retry.
  • Batch size. Up to 100 spans are accepted per request. If a batch exceeds that, the extra spans are reported in partialSuccess.rejectedSpans; send larger volumes across multiple batches.
  • Errors. A span with OTLP status code ERROR is recorded as a failed trace (its status message is preserved).
  • Auth failures. A missing, invalid, or expired key returns 401; a valid key that lacks the traces:write scope returns 403.
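
If you send spans yourself rather than through an OTel exporter, the limits above suggest a small amount of client-side handling. The following sketch splits spans into batches of at most 100, classifies response codes, and reads the rejected-span count. The function names are hypothetical, and the retryable set (429, 502, 503, 504) is taken from the OTLP/HTTP specification rather than from this page, so treat it as an assumption about how Enprompta behaves for those codes.

```python
import json

MAX_SPANS_PER_REQUEST = 100
# Status codes the OTLP/HTTP spec marks as retryable; all other errors are permanent.
RETRYABLE_STATUS = {429, 502, 503, 504}


def chunk_spans(spans, size=MAX_SPANS_PER_REQUEST):
    """Yield consecutive slices of at most `size` spans."""
    for start in range(0, len(spans), size):
        yield spans[start:start + size]


def classify_response(status):
    """Map an HTTP status to 'ok', 'retry', or 'drop'."""
    if status == 200:
        return "ok"
    if status in RETRYABLE_STATUS:
        return "retry"
    return "drop"  # includes 401 (bad key) and 403 (missing scope or over quota)


def rejected_count(body):
    """Read partialSuccess.rejectedSpans from a JSON response body (0 if absent)."""
    if not body:
        return 0
    partial = json.loads(body).get("partialSuccess") or {}
    # int64 fields may arrive as JSON strings, so normalise with int().
    return int(partial.get("rejectedSpans", 0))
```

For example, 250 spans would go out as three requests of 100, 100, and 50 spans, and a 403 response would be dropped rather than retried.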

Prefer a managed client? The Python and TypeScript SDKs can also send traces via the native /api/v1/traces endpoint.
