Anthropic (Python) → OpenObserve
Automatically capture token usage, latency, and model metadata for every Claude API call in your Python application.
Prerequisites
- Python 3.8+
- An OpenObserve account (cloud or self-hosted)
- Your OpenObserve organisation ID and Base64-encoded auth token
- An Anthropic API key
Installation
Configuration
Create a .env file in your project root:
# OpenObserve instance URL
# Default for self-hosted: http://localhost:5080
OPENOBSERVE_URL=https://api.openobserve.ai/
# Your OpenObserve organisation slug or ID
OPENOBSERVE_ORG=your_org_id
# Basic auth token — Base64-encoded "email:password"
OPENOBSERVE_AUTH_TOKEN="Basic <your_base64_token>"
# Anthropic API key
ANTHROPIC_API_KEY=your-anthropic-key
Instrumentation
Call AnthropicInstrumentor().instrument() before any Anthropic client is created.
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor
from openobserve import openobserve_init
# Instrument before importing the Anthropic client
AnthropicInstrumentor().instrument()
openobserve_init()
from anthropic import Anthropic
client = Anthropic()
# Messages API
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "What is distributed tracing?"}],
)
print(response.content[0].text)
System prompt
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=512,
system="You are a concise technical assistant.",
messages=[{"role": "user", "content": "Explain a trace span in one sentence."}],
)
print(response.content[0].text)
Streaming
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=256,
messages=[{"role": "user", "content": "Write a haiku about observability."}],
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
Async client
from anthropic import AsyncAnthropic
import asyncio
async_client = AsyncAnthropic()
async def main():
response = await async_client.messages.create(
model="claude-sonnet-4-6",
max_tokens=256,
messages=[{"role": "user", "content": "Hello async!"}],
)
print(response.content[0].text)
asyncio.run(main())
What Gets Captured
| Attribute | Description |
|---|---|
gen_ai_request_model |
Requested model (e.g. claude-haiku-4-5-20251001) |
gen_ai_response_model |
Actual model version used |
gen_ai_response_id |
Unique response ID from Anthropic |
gen_ai_system |
Provider identifier (Anthropic) |
gen_ai_usage_input_tokens |
Input tokens consumed |
gen_ai_usage_output_tokens |
Output tokens generated |
gen_ai_usage_cache_creation_input_tokens |
Tokens written to the prompt cache |
gen_ai_usage_cache_read_input_tokens |
Tokens read from the prompt cache |
llm_usage_tokens_total |
Total tokens consumed |
llm_usage_cost_input |
Estimated input cost in USD |
llm_usage_cost_output |
Estimated output cost in USD |
llm_usage_cost_total |
Estimated total cost in USD |
duration |
End-to-end request latency |
error |
Exception details if the request failed |
Viewing Traces
- Log in to OpenObserve and navigate to Traces in the left sidebar
- Click any span to inspect token counts, latency, and full request metadata

Next Steps
With Anthropic instrumented, every Claude API call in your application is automatically recorded in OpenObserve. From here you can track token usage and cost per conversation turn, monitor cache hit rates to optimise prompt caching savings, and correlate Claude spans with the rest of your application traces.