Amazon Bedrock OpenTelemetry Observability LLM AgentCore

Instrumenting Amazon Bedrock, Bedrock Agents, and AgentCore with OpenTelemetry

Gorakhnath Yadav

June 22, 2026

17 min read


Amazon Bedrock, Bedrock Agents, and AgentCore emitting OpenTelemetry gen_ai spans and token-cost metrics into OpenObserve

TL;DR: To instrument Amazon Bedrock with OpenTelemetry, you put the same gen_ai.* semantic conventions on three different surfaces: raw model calls through boto3, managed Bedrock Agents through the InvokeAgent API, and agents running on Bedrock AgentCore. Each one emits spans with gen_ai.provider.name=aws.bedrock, the model id, and token counts, plus the gen_ai.client.token.usage metric. Point those exports at an OTLP backend like OpenObserve instead of only CloudWatch, add a small cost layer that turns tokens into dollars, and you get one dashboard for spend and latency across every way your app talks to Bedrock.

Three Bedrock surfaces (model calls, managed Agents, AgentCore) all map onto the same gen_ai.* spans and metrics.
gen_ai.system is deprecated. The current attribute is gen_ai.provider.name, and Bedrock's value is aws.bedrock.
Token attributes plus a price map keyed by model id give you cost in dollars per call, per session, and per user.
The conventions are still experimental, so pin versions and opt in with OTEL_SEMCONV_STABILITY_OPT_IN.

Three ways to call Bedrock, one set of conventions

Amazon Bedrock is not one API. An application reaches it through at least three doors, and each produces a different shape of telemetry if you let the vendor SDK decide.

The first door is a direct model call: boto3's bedrock-runtime client running Converse or InvokeModel against a model id. The second is a managed Bedrock Agent, called through the bedrock-agent-runtime InvokeAgent API, where AWS runs the orchestration loop, the tool calls, and the knowledge-base lookups for you. The third is Bedrock AgentCore, the runtime where you deploy your own agent code and AWS gives it a session, memory, and identity.

Left alone, those three produce three unrelated views. The point of the OpenTelemetry GenAI semantic conventions is to collapse them into one. A direct call, an agent invocation, and an AgentCore run all emit spans from the same gen_ai.* namespace, with the same attribute for the provider, the same attributes for token usage, and the same metric for token counts. You read all three the same way, and you can put total Bedrock spend on a single chart regardless of which door the request came through. If you have already set this up for OpenAI, this is the same model applied to Bedrock; the companion guide on monitoring OpenAI API costs with OpenTelemetry covers the OpenAI side, and the conventions are identical here.

Three Bedrock surfaces, direct model calls, managed Bedrock Agents, and AgentCore, all emitting gen_ai spans into one OTLP backend

The gen_ai.* cheat sheet for Bedrock (v1.41)

Before any code, here is the vocabulary. These come from the OpenTelemetry GenAI semantic conventions as of the v1.41 release. One thing to get right up front: everything in this section is still marked Development, the current word for experimental. Attribute names have already changed once, so treat them as a moving target and pin your instrumentation versions.

A Bedrock model span carries these attributes:

Attribute	What it holds	Example
`gen_ai.operation.name`	The operation	`chat`, `invoke_agent`, `execute_tool`
`gen_ai.provider.name`	The provider (replaces `gen_ai.system`)	`aws.bedrock`
`gen_ai.request.model`	The model id requested	`anthropic.claude-3-5-sonnet-20241022-v2:0`
`gen_ai.response.model`	The model id that answered	`amazon.nova-pro-v1:0`
`gen_ai.usage.input_tokens`	Input tokens (includes cached)	`1820`
`gen_ai.usage.output_tokens`	Output tokens (includes reasoning)	`340`
`aws.bedrock.guardrail.id`	Guardrail applied, when present	`abc123`
`aws.bedrock.knowledge_base.id`	Knowledge base queried, when present	`kb-456`

The span name is not free-form. The convention builds it from {gen_ai.operation.name} {gen_ai.request.model}, so a chat call against a model renders as chat anthropic.claude-3-5-sonnet-20241022-v2:0, and a managed agent invocation renders as invoke_agent <agent-id>. That naming is why you can group spans by operation without parsing free text.
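As a minimal sketch of that naming rule, the helpers below build a span name from an operation and an optional target, and recover the operation from a name for grouping. The function names genai_span_name and operation_of are hypothetical, introduced only for illustration; the instrumentation builds these names for you.

```python
def genai_span_name(operation, target=None):
    """Build '{operation} {target}', or just the operation when no target is known."""
    return f"{operation} {target}" if target else operation


def operation_of(span_name):
    """Recover the operation (the first space-separated token) from a span name."""
    return span_name.split(" ", 1)[0]


name = genai_span_name("chat", "anthropic.claude-3-5-sonnet-20241022-v2:0")
print(name)
print(operation_of(name))
```

Because the operation is always the first token, grouping by operation is a split rather than a pattern match.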

The single most important change from older blog posts and older instrumentations: gen_ai.system is deprecated and replaced by gen_ai.provider.name. For Bedrock the value moved from gen_ai.system=aws.bedrock to gen_ai.provider.name=aws.bedrock. During the transition you will see spans with one, the other, or both, depending on the instrumentation version. To make a given instrumentation emit the current shape rather than its frozen default, set:

OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental
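For data that was written before you opted in, a query or post-processing step can read the current attribute first and fall back to the deprecated one. This is a rough sketch over a plain attribute dictionary; provider_of is a hypothetical helper, not part of any SDK.

```python
def provider_of(attributes):
    """Prefer gen_ai.provider.name; fall back to the deprecated gen_ai.system."""
    return attributes.get("gen_ai.provider.name") or attributes.get("gen_ai.system")


new_span = {"gen_ai.provider.name": "aws.bedrock"}
old_span = {"gen_ai.system": "aws.bedrock"}
both = {"gen_ai.provider.name": "aws.bedrock", "gen_ai.system": "aws.bedrock"}

for attrs in (new_span, old_span, both):
    print(provider_of(attrs))
```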

On the metrics side, two instruments do most of the work. gen_ai.client.token.usage is a histogram in units of {token}, split by a gen_ai.token.type attribute whose value is input or output, so one metric gives you input and output token distributions per model. gen_ai.client.operation.duration is a histogram in seconds for call latency. Both carry gen_ai.provider.name and gen_ai.request.model, so you slice them by model and by surface the same way you slice the spans. If the three-signal model is new to you, OpenTelemetry for LLMs walks through how spans, metrics, and the GenAI conventions fit together.
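To make the shape of gen_ai.client.token.usage concrete, here is an illustrative sketch of how one call turns into two histogram recordings, one per token type. The instrumentation does this for you; token_usage_points is a hypothetical helper that only shows the value and attribute pairs involved.

```python
def token_usage_points(model, input_tokens, output_tokens, provider="aws.bedrock"):
    """Return the (value, attributes) pairs one call contributes to the token histogram."""
    base = {"gen_ai.provider.name": provider, "gen_ai.request.model": model}
    return [
        (input_tokens, {**base, "gen_ai.token.type": "input"}),
        (output_tokens, {**base, "gen_ai.token.type": "output"}),
    ]


for value, attrs in token_usage_points("amazon.nova-pro-v1:0", 1820, 340):
    print(value, attrs["gen_ai.token.type"], attrs["gen_ai.request.model"])
```

Splitting on gen_ai.token.type rather than using two metrics is what lets a single panel chart input and output side by side.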

Anatomy of a Bedrock gen_ai chat span showing operation name, provider name aws.bedrock, model id, and input and output token attributes

The demo app and sending telemetry to OpenObserve

The running example for the rest of this guide is a small support assistant. It answers customer questions by calling Bedrock directly for simple replies, handing harder cases to a managed Bedrock Agent that can search a knowledge base, and running a long-lived planning agent on AgentCore for multi-step work. One application, all three Bedrock surfaces.

Every part of it exports over OTLP, so the backend configuration is shared. OpenObserve is OpenTelemetry-native: it accepts traces, metrics, and logs on one OTLP endpoint, which means there is no Bedrock-specific integration to install. You point the standard environment variables at it and the data lands. This is the same OTLP path described in OpenTelemetry backends and OTLP support.

OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-host/api/your_org
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic <base64(email:password)>

Two details that cause most first-run failures. The endpoint is the base organization path with no trailing slash; the OpenTelemetry SDK appends /v1/traces and /v1/metrics itself, and a trailing slash produces a 404. The auth header is HTTP Basic, where the token is the base64 of your email:password pair, which you generate with echo -n 'you@example.com:your-password' | base64. For a self-hosted instance, swap the host for http://localhost:5080 and keep the /api/your_org path. A successful ingest returns HTTP 200.
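Both failure modes can be avoided by generating the three variables instead of typing them. The sketch below uses only the standard library; otlp_env is a hypothetical helper and the host, organization, and credentials are placeholders.

```python
import base64


def otlp_env(host, org, email, password):
    """Build the OTLP environment variables with no trailing slash and a Basic auth header."""
    endpoint = f"{host.rstrip('/')}/api/{org.strip('/')}"
    token = base64.b64encode(f"{email}:{password}".encode("utf-8")).decode("ascii")
    return {
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization=Basic {token}",
    }


# Placeholder values; read real credentials from a secret store, not source code.
for key, value in otlp_env("https://your-host/", "your_org", "you@example.com", "your-password").items():
    print(f"{key}={value}")
```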

Part 1: Tracing raw Bedrock calls with boto3

A direct Bedrock call goes through boto3, and boto3 is instrumented by the botocore instrumentation, which recognizes Bedrock operations and emits gen_ai.* spans for them. There is no Bedrock-specific package to add.

pip install opentelemetry-instrumentation-botocore opentelemetry-exporter-otlp boto3

import boto3
from opentelemetry.instrumentation.botocore import BotocoreInstrumentor

BotocoreInstrumentor().instrument()

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "Why was my order delayed?"}]}],
)
print(response["output"]["message"]["content"][0]["text"])

That single call produces a span named chat anthropic.claude-3-5-sonnet-20241022-v2:0 with gen_ai.provider.name=aws.bedrock, the request and response model ids, and gen_ai.usage.input_tokens and gen_ai.usage.output_tokens read straight from the Bedrock response. It also records the gen_ai.client.token.usage and gen_ai.client.operation.duration metrics for the same call.

One choice matters here: use Converse, not InvokeModel, if you want dependable telemetry. The botocore instrumentation has full span, event, and metric coverage for Converse and ConverseStream across Bedrock models. InvokeModel and InvokeModelWithResponseStream are instrumented only for a subset of models, because each model family has its own request and response body shape and the instrumentation has to understand each one to pull token counts out. Converse normalizes the shape, so one instrumentation reads tokens and finish reasons the same way for every model behind it. If you are stuck on InvokeModel for a model that Converse does not support, expect to parse the response body and set the token attributes yourself on a span you control.

One practical note: Bedrock model ids retire. The examples here use a Claude id for readability, but if Converse returns ResourceNotFoundException: This model version has reached the end of its life, the id is retired. Swap in a current one (for example amazon.nova-lite-v1:0) from aws bedrock list-foundation-models. The instrumentation is identical regardless of model.

Raw Bedrock Converse span in OpenObserve showing gen_ai.provider.name aws.bedrock, the model id, and input and output token counts

Part 2: Tracing managed Bedrock Agents

A managed Bedrock Agent is different from a direct call. You do not run the loop; you call InvokeAgent on the bedrock-agent-runtime client and AWS orchestrates the model calls, the action-group tool invocations, and any knowledge-base retrieval on its side. From your process, it is one API call that hides a multi-step run.

Here is the catch, and it is the single most useful thing to know about this surface: out of the box the botocore instrumentation traces InvokeAgent only as a generic AWS-call span (rpc.method=InvokeAgent), with no gen_ai.* attributes at all. The GenAI conventions it applies so cleanly to bedrock-runtime Converse do not extend to bedrock-agent-runtime. So by default a managed agent is a black box: one bare RPC span and nothing about the models or tools it ran inside.

To get gen_ai-level visibility, ask AWS for the detail. InvokeAgent accepts enableTrace=True, and the response stream then includes the agent's orchestration trace: each internal model invocation with its token usage, plus tool and knowledge-base steps, across the preprocessing and orchestration phases. You map that returned data onto gen_ai.* spans yourself: an invoke_agent parent with a chat child per model invocation.

import boto3
from opentelemetry import trace

tracer = trace.get_tracer("support-assistant")
agent = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

with tracer.start_as_current_span("invoke_agent") as parent:
    parent.set_attribute("gen_ai.operation.name", "invoke_agent")
    parent.set_attribute("gen_ai.provider.name", "aws.bedrock")
    parent.set_attribute("session.id", "user-7421-session-3")

    stream = agent.invoke_agent(
        agentId="AGENT123", agentAliasId="TSTALIASID",
        sessionId="user-7421-session-3",
        inputText="Why was my order delayed?",
        enableTrace=True,
    )
    for event in stream["completion"]:
        if "trace" not in event:
            continue
        tr = event["trace"]["trace"]
        for phase in ("preProcessingTrace", "orchestrationTrace"):
            usage = tr.get(phase, {}).get("modelInvocationOutput", {}).get("metadata", {}).get("usage")
            if not usage:
                continue
            with tracer.start_as_current_span("chat") as s:
                s.set_attribute("gen_ai.operation.name", "chat")
                s.set_attribute("gen_ai.provider.name", "aws.bedrock")
                # Hard-coded for the demo: use the foundation model id your agent is configured with.
                s.set_attribute("gen_ai.request.model", "amazon.nova-lite-v1:0")
                s.set_attribute("gen_ai.usage.input_tokens", usage["inputTokens"])
                s.set_attribute("gen_ai.usage.output_tokens", usage["outputTokens"])

The result is a real gen_ai trace for a managed agent: an invoke_agent span with gen_ai.provider.name=aws.bedrock and child chat spans carrying the actual token counts AWS reports for the agent's internal calls. That is exactly the kind of agent call chain that monitoring AI agents in production is about: a slow or failing agent stops being a black hole in your latency. If you also leave BotocoreInstrumentor on, you will see its bare InvokeAgent RPC span sitting next to these gen_ai spans; drop it on this path if you want only the gen_ai view.
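The extraction step can also be pulled out of the boto3 loop so it is testable against recorded events. This is a sketch that assumes the event shape read by the loop above (trace, then trace, then a phase, then modelInvocationOutput.metadata.usage); iter_model_usage is a hypothetical helper and the sample events are hand-written, not captured from a real agent.

```python
PHASES = ("preProcessingTrace", "orchestrationTrace")


def iter_model_usage(events):
    """Yield (phase, input_tokens, output_tokens) for each trace event that reports usage."""
    for event in events:
        trace_part = event.get("trace", {}).get("trace", {})
        for phase in PHASES:
            usage = (
                trace_part.get(phase, {})
                .get("modelInvocationOutput", {})
                .get("metadata", {})
                .get("usage")
            )
            if usage:
                yield phase, usage["inputTokens"], usage["outputTokens"]


# Hand-written sample: one non-trace event and one orchestration trace event with usage.
sample_events = [
    {"chunk": {"bytes": b"partial answer"}},
    {"trace": {"trace": {"orchestrationTrace": {"modelInvocationOutput": {
        "metadata": {"usage": {"inputTokens": 1820, "outputTokens": 340}}}}}}},
]

print(list(iter_model_usage(sample_events)))
```

Keeping the parsing separate from span creation means a change in the trace payload shows up as a failing unit test rather than as silently missing chat spans.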

Managed Bedrock Agent trace in OpenObserve: an invoke_agent span with gen_ai.provider.name aws.bedrock and a child chat span carrying token counts

Note sessionId and the matching session.id attribute. Pass a stable value and every invocation for that conversation shares it, which is what lets you group an entire multi-turn session in OpenObserve later.

Part 3: Bedrock AgentCore observability

AgentCore is the third surface and the one with the most moving parts. Unlike a managed Bedrock Agent, AgentCore is not an agent itself; it is a managed runtime that hosts agent code you write, whether you build that agent with the Strands Agents SDK, another framework, or plain Python. AWS instruments the runtime through ADOT, the AWS Distro for OpenTelemetry, and the path is auto-instrumentation rather than code changes.

aws-opentelemetry-distro>=0.10.0
boto3

You launch the agent through the opentelemetry-instrument wrapper, which loads the distro and instruments the runtime, boto3, and the supported agent libraries without touching your code:

CMD ["opentelemetry-instrument", "python", "main.py"]

For an agent running outside the managed runtime, or when you want it to export somewhere other than CloudWatch, the configuration is environment variables:

AGENT_OBSERVABILITY_ENABLED=true
OTEL_PYTHON_DISTRO=aws_distro
OTEL_PYTHON_CONFIGURATOR=aws_configurator
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-host/api/your_org
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic <base64(email:password)>
DISABLE_ADOT_OBSERVABILITY=true

DISABLE_ADOT_OBSERVABILITY=true is the switch that matters for this guide. By default ADOT routes AgentCore telemetry to CloudWatch GenAI Observability. Setting it true turns off the automatic CloudWatch path so your standard OTEL_EXPORTER_OTLP_ENDPOINT takes over and the traces flow to OpenObserve instead. You are not locked into one backend.

Two propagation details make AgentCore telemetry actually useful. Sessions propagate through the X-Amzn-Bedrock-AgentCore-Runtime-Session-Id header, mirrored into OpenTelemetry baggage as session.id, so every span from one user's session shares a key you can filter on. Trace context propagates through the W3C traceparent header (AgentCore also understands X-Ray's X-Amzn-Trace-Id), so when your application starts a span and then calls the agent, the agent's spans nest under yours and the whole request reads as one trace.
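For reference, this is what those two headers look like on the wire. In practice the OpenTelemetry propagator injects traceparent for you; the sketch below only illustrates the W3C format (version, 32 hex characters of trace id, 16 of span id, 2 of flags) next to the session header. propagation_headers is a hypothetical helper and the ids are example values.

```python
import re

SESSION_HEADER = "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id"
TRACEPARENT_RE = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")


def propagation_headers(trace_id, span_id, session_id, sampled=True):
    """Build the W3C traceparent header and the AgentCore session header."""
    flags = 1 if sampled else 0
    return {
        "traceparent": f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}",
        SESSION_HEADER: session_id,
    }


headers = propagation_headers(
    trace_id=0x4BF92F3577B34DA6A3CE929D0E0E4736,
    span_id=0x00F067AA0BA902B7,
    session_id="user-7421-session-3",
)
print(headers["traceparent"])
print(bool(TRACEPARENT_RE.match(headers["traceparent"])))
```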

If you build the agent itself with the Strands Agents SDK, it has its own OpenTelemetry setup through StrandsTelemetry, which emits gen_ai.* spans for the agent loop, the Bedrock model calls, and tool invocations. One caveat worth knowing: its OTLP exporter is HTTP only, so point it at an OTLP/HTTP endpoint (port 4318 on a stock OpenTelemetry Collector, or the OpenObserve /api/your_org path used throughout this guide), not a gRPC endpoint.

import os
from strands.telemetry import StrandsTelemetry

os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://your-host/api/your_org"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = "Authorization=Basic <base64(email:password)>"

StrandsTelemetry().setup_otlp_exporter()

A single run then produces the full agent loop as one trace: an invoke_agent parent, an execute_event_loop_cycle per turn, a chat span for each Bedrock model call with its token counts, and an execute_tool span for each tool the agent runs. This is the gen_ai-level agent visibility that the managed InvokeAgent API only gives you through enableTrace.

AgentCore Strands agent trace in OpenObserve: an invoke_agent span over event-loop cycles, chat model spans, and an execute_tool span

One honest inconsistency to plan for: Strands sets gen_ai.provider.name to strands-agents (the framework that emitted the span), not aws.bedrock (the backend the model actually ran on), while the botocore instrumentation for raw calls sets aws.bedrock. The model id is correct on both (amazon.nova-lite-v1:0), so the reliable cross-surface key is gen_ai.request.model, not gen_ai.provider.name. This is the experimental-conventions churn from the cheat sheet showing up in practice: two instrumentations, written against the spec at different points, disagree on the same field. If you want one consistent value, rewrite it server-side in an OpenObserve ingestion pipeline (strands-agents to aws.bedrock) before storage.
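The rewrite itself is a one-line mapping. The sketch below shows the logic in Python over a plain attribute dictionary; it is not OpenObserve pipeline syntax, and normalize_provider and PROVIDER_ALIASES are hypothetical names. It also assumes every Strands agent in your fleet runs on Bedrock, which is the only case where this rewrite is correct.

```python
# Assumption: every span tagged strands-agents was served by a Bedrock model.
PROVIDER_ALIASES = {"strands-agents": "aws.bedrock"}


def normalize_provider(attributes):
    """Return a copy of the attributes with known provider aliases rewritten."""
    out = dict(attributes)
    provider = out.get("gen_ai.provider.name")
    if provider in PROVIDER_ALIASES:
        out["gen_ai.provider.name"] = PROVIDER_ALIASES[provider]
    return out


strands_span = {"gen_ai.provider.name": "strands-agents", "gen_ai.request.model": "amazon.nova-lite-v1:0"}
print(normalize_provider(strands_span))
```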

From tokens to dollars: the Bedrock cost layer

Token counts are on every span, but nobody budgets in tokens. The conversion to dollars is a small amount of code you own, because pricing is yours to maintain and changes more often than any library ships.

Keep a price map keyed by Bedrock model id, with an input and an output rate per million tokens, and write the cost onto the current span right after the Bedrock response, while the span is still being recorded. That adds one attribute, llm.cost.usd, so every Bedrock span (direct call, agent, or AgentCore) carries its own cost.

from opentelemetry import trace

# Rates are per 1M tokens. Pull current numbers from the Bedrock pricing
# page for the model ids you actually use; these are placeholders.
PRICES = {
    "anthropic.claude-3-5-sonnet-20241022-v2:0": {"in": 3.00, "out": 15.00},
    "amazon.nova-pro-v1:0": {"in": 0.80, "out": 3.20},
}


def record_cost(model_id, usage):
    price = PRICES.get(model_id)
    if not price:
        return
    cost = (
        usage["inputTokens"] * price["in"]
        + usage["outputTokens"] * price["out"]
    ) / 1_000_000
    trace.get_current_span().set_attribute("llm.cost.usd", round(cost, 6))


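# Call this while the span that should carry the cost is current, for example
# inside your own wrapping span around the Converse call. The instrumented
# botocore span may already have ended once converse() returns.
record_cost("anthropic.claude-3-5-sonnet-20241022-v2:0", response["usage"])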

The pattern, not the exact call site, is what matters: cost is derived from token counts and the model id, written once per span, and never hard-coded into the application. If you would rather compute it centrally instead of at each call, read the gen_ai.usage.* attributes off the span in a custom exporter (where you control a writable copy) before it ships, but the call-site approach above is the one that works with the stock SDK. Record it as a span attribute for per-call cost and, if you want aggregate counters and distributions, also emit it as a metric the way the LLM cost monitoring guide lays out, with a counter for cumulative dollars and a histogram for per-call cost. Because the cost rides on the span next to session.id and any user id you attach, you get spend per model, per session, and per user from the same data.
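As a worked example of the arithmetic, take the token counts from the cheat sheet (1820 input, 340 output) and the placeholder Claude rates from the price map. The rates are the same placeholders as above, not real prices, and cost_usd is a hypothetical helper.

```python
def cost_usd(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars, with rates expressed per one million tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# 1820 * 3.00 = 5460 and 340 * 15.00 = 5100, so the call costs 10560 / 1e6 dollars.
cost = cost_usd(1820, 340, in_rate=3.00, out_rate=15.00)
print(round(cost, 6))  # 0.01056
```

At about one cent per call, the output tokens account for nearly half the cost despite being under a fifth of the volume, which is why the two rates are kept separate.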

A note specific to Bedrock: if you run a model on a provisioned-throughput or custom-model endpoint, per-token pricing does not describe your real cost, which is the reserved capacity. Keep those model ids out of the per-token map and track their cost separately, or the dashboard will misstate spend by pricing reserved capacity as if it were token volume.
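One way to enforce that separation is an explicit exclusion set checked before the price lookup. This is a sketch: per_token_cost and RESERVED_CAPACITY_MODELS are hypothetical names, the provisioned model id is an invented placeholder, and the rates are the same placeholders used earlier.

```python
PER_TOKEN_PRICES = {
    "amazon.nova-pro-v1:0": {"in": 0.80, "out": 3.20},  # placeholder rates per 1M tokens
}

# Hypothetical id standing in for a provisioned-throughput or custom-model endpoint.
RESERVED_CAPACITY_MODELS = {"my-provisioned-model"}


def per_token_cost(model_id, input_tokens, output_tokens):
    """Return the per-token cost in dollars, or None when per-token pricing does not apply."""
    if model_id in RESERVED_CAPACITY_MODELS:
        return None  # billed as reserved capacity; track separately
    price = PER_TOKEN_PRICES.get(model_id)
    if price is None:
        return None  # unknown model: better no cost than a wrong one
    return (input_tokens * price["in"] + output_tokens * price["out"]) / 1_000_000


print(per_token_cost("amazon.nova-pro-v1:0", 1000, 500))
print(per_token_cost("my-provisioned-model", 1000, 500))
```

Returning None rather than 0 keeps excluded calls distinguishable from genuinely free ones when you sum the attribute later.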

A Bedrock cost and latency dashboard in OpenObserve

The payoff of putting all three surfaces on the same conventions is a single dashboard. Every Bedrock span carries gen_ai.request.model, gen_ai.operation.name, and now llm.cost.usd, so you build panels by grouping on those fields with no per-surface special casing. Group and join on gen_ai.request.model rather than gen_ai.provider.name: as the AgentCore section showed, the model id is consistent across all three surfaces while the provider name is not (unless you normalize it in a pipeline).

A useful starter layout:

Total Bedrock spend over time, summing llm.cost.usd, broken out by gen_ai.request.model so you see which model dominates the bill.
Spend by surface, grouping on gen_ai.operation.name (chat for direct calls, invoke_agent for managed agents) so you know whether agents or raw calls drive cost.
Call latency, charting the gen_ai.client.operation.duration histogram per model, so a slow model is visible next to its cost.
Token volume, charting gen_ai.client.token.usage split by gen_ai.token.type, to catch a prompt that quietly grew its input.

Bedrock cost and latency dashboard in OpenObserve with panels for spend by model, token volume, p95 latency, and spend over time

The investigation flow falls out of the shared keys. A spend spike on the cost panel filters to one model id; you pivot to traces for that model, filter by session.id to find the session that ran up the bill, and read its spans to see whether a runaway agent loop or an oversized prompt caused it. That pivot across signals on shared identifiers is the logs, metrics, and traces correlation workflow, with Bedrock's own attributes doing the joining instead of timestamps you line up by hand.
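The same pivot can be sketched offline over exported span attributes: sum llm.cost.usd by model, then by session, and read off the largest. The span records and cost figures below are invented for illustration, and spend_by is a hypothetical helper rather than an OpenObserve API.

```python
from collections import defaultdict


def spend_by(spans, key):
    """Sum llm.cost.usd grouped by the given attribute, skipping spans without a cost."""
    totals = defaultdict(float)
    for attrs in spans:
        cost = attrs.get("llm.cost.usd")
        if cost is None:
            continue
        totals[attrs.get(key, "unknown")] += cost
    return dict(totals)


# Invented sample data in the attribute shape this guide produces.
spans = [
    {"gen_ai.request.model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
     "session.id": "user-7421-session-3", "llm.cost.usd": 0.01056},
    {"gen_ai.request.model": "amazon.nova-lite-v1:0",
     "session.id": "user-7421-session-3", "llm.cost.usd": 0.0004},
    {"gen_ai.request.model": "amazon.nova-lite-v1:0",
     "session.id": "user-9000-session-1", "llm.cost.usd": 0.0002},
    {"gen_ai.request.model": "amazon.nova-lite-v1:0",
     "session.id": "user-9000-session-1"},  # no cost recorded
]

by_model = spend_by(spans, "gen_ai.request.model")
by_session = spend_by(spans, "session.id")
print(max(by_model, key=by_model.get))
print(max(by_session, key=by_session.get))
```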

Keeping prompts and PII out of telemetry

Bedrock telemetry is structural by default, and that is the safe default. The spans record durations, model ids, provider, token counts, and your derived cost. They do not record the prompt text or the model's output, because the GenAI conventions treat message content as sensitive and leave the content attributes (gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions) off unless you opt in.

Leave them off unless your observability pipeline is approved to store what your prompts contain, which for a support assistant is customer data. If you enable content capture in a library, for example OpenLLMetry's TRACELOOP_TRACE_CONTENT, do it knowing the message bodies will land in your backend. When you need a middle ground, OpenObserve ingestion pipelines can redact specific fields server-side before storage, so you can keep tool details for debugging while scrubbing the parts you cannot retain. If OpenTelemetry itself is new to you, what is OpenTelemetry covers how attributes and the SDK fit together before you start turning content capture on.
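The redaction rule amounts to dropping a known set of keys and keeping everything else. The sketch below shows that logic in Python over a plain attribute dictionary; it is not OpenObserve pipeline syntax, and strip_content is a hypothetical helper.

```python
# The content attributes named by the GenAI conventions, as listed above.
CONTENT_ATTRIBUTES = frozenset({
    "gen_ai.input.messages",
    "gen_ai.output.messages",
    "gen_ai.system_instructions",
})


def strip_content(attributes):
    """Return a copy of the attributes without prompt, output, or system-instruction content."""
    return {k: v for k, v in attributes.items() if k not in CONTENT_ATTRIBUTES}


span_attrs = {
    "gen_ai.request.model": "amazon.nova-lite-v1:0",
    "gen_ai.usage.input_tokens": 1820,
    "gen_ai.input.messages": "[customer message body]",
    "gen_ai.output.messages": "[model reply body]",
}
print(strip_content(span_attrs))
```

Structural fields such as the model id and token counts survive, so cost and latency panels keep working after redaction.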

See your Bedrock spend in OpenObserve

Amazon Bedrock already emits OpenTelemetry through the botocore instrumentation and ADOT, and OpenObserve ingests those traces and metrics on one OTLP endpoint, correlates them by session.id, and stores them next to the rest of your infrastructure observability. You do not need a separate LLM tool or the default CloudWatch path. Set the telemetry environment variables, point them at your instance, add the small cost layer, and your Bedrock spend across direct calls, agents, and AgentCore shows up on one dashboard. For the operational side once the data is flowing, the companion guide on monitoring AWS Bedrock covers the dashboards and alerts to build on top of it. Start free with OpenObserve Cloud and send your first Bedrock trace in minutes.

Frequently Asked Questions

What is the difference between gen_ai.system and gen_ai.provider.name for Bedrock?

They name the same thing, but gen_ai.system is deprecated. The OpenTelemetry GenAI conventions replaced it with gen_ai.provider.name, and for Amazon Bedrock the value is aws.bedrock. Older instrumentations still emit gen_ai.system=aws.bedrock, so during the transition you may see both on a span. Filter dashboards on gen_ai.provider.name where it exists and fall back to gen_ai.system for older data, or set OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental so your instrumentation emits the current attribute names.

How do I instrument Bedrock Agents and AgentCore with OpenTelemetry?

For managed Bedrock Agents called through the bedrock-agent-runtime InvokeAgent API, the botocore instrumentation emits only a generic AWS-call span with no gen_ai.* attributes. To get gen_ai spans, pass enableTrace=True and map the returned orchestration trace onto an invoke_agent parent span with a chat child span per model invocation. For AgentCore, AWS ships the aws-opentelemetry-distro (ADOT) and you launch the agent with opentelemetry-instrument, which auto-instruments the runtime and propagates the session through the X-Amzn-Bedrock-AgentCore-Runtime-Session-Id header. Both export over OTLP, so setting OTEL_EXPORTER_OTLP_ENDPOINT to OpenObserve (and DISABLE_ADOT_OBSERVABILITY=true to skip the default CloudWatch path) sends the traces to your own backend instead of only CloudWatch.

Should I use Converse or InvokeModel for Bedrock telemetry?

Use Converse when you want reliable telemetry. The botocore instrumentation has full span, event, and metric coverage for Converse and ConverseStream, including token usage, across Bedrock models. InvokeModel and InvokeModelWithResponseStream are instrumented only for a subset of models because each model family has its own request and response body shape. Converse normalizes that shape, so a single instrumentation reads token counts and finish reasons the same way for every model behind it.

How do I calculate Amazon Bedrock cost in dollars from OpenTelemetry token attributes?

The spans carry gen_ai.usage.input_tokens and gen_ai.usage.output_tokens and the model id in gen_ai.request.model. Keep a price map keyed by Bedrock model id with an input and output rate per million tokens, then compute cost as (input_tokens * input_rate + output_tokens * output_rate) / 1e6 and write it onto the span as a custom attribute such as llm.cost.usd. Set the attribute at the call site, while the span is still recording, so every span carries its own cost, and OpenObserve sums it per model, per session, or per user.

Are the OpenTelemetry GenAI semantic conventions stable yet?

No. As of the v1.41 semantic-conventions release the GenAI spans, metrics, and attributes are still marked Development, which is the current term for experimental. Attribute names have already changed once (gen_ai.system to gen_ai.provider.name). Pin your instrumentation versions, expect attribute churn between releases, and use OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental to opt a given instrumentation into the newest shape rather than its frozen default.

About the Author

Gorakhnath Yadav

Gorakhnath is a passionate developer advocate, working on bridging the gap between developers and the tools they use. He focuses on building communities and creating content that empowers developers to build better software.
