xAI Grok → OpenObserve
Automatically capture token usage, latency, and model metadata for every xAI Grok inference call in your Python application. xAI exposes an OpenAI-compatible API, so instrumentation uses the standard OpenAI instrumentor pointed at the xAI endpoint.
Prerequisites
- Python 3.9+
- An OpenObserve account (cloud or self-hosted)
- Your OpenObserve organisation ID and Base64-encoded auth token
- An xAI API key
Installation
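Install the client libraries used in this guide. The package names below are inferred from the imports in the instrumentation snippet (in particular, the PyPI name of the `openobserve` helper is an assumption; verify it against your OpenObserve documentation):

```shell
pip install openai python-dotenv openinference-instrumentation-openai openobserve
```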
Configuration
Create a .env file in your project root:
```
# OpenObserve instance URL
# Default for self-hosted: http://localhost:5080
OPENOBSERVE_URL=https://api.openobserve.ai/

# Your OpenObserve organisation slug or ID
OPENOBSERVE_ORG=your_org_id

# Basic auth token — Base64-encoded "email:password"
OPENOBSERVE_AUTH_TOKEN=Basic <your_base64_token>

# xAI API key
XAI_API_KEY=your-xai-api-key
```
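The auth token is simply the string `email:password` Base64-encoded and prefixed with `Basic `. A minimal sketch for generating it (the function name is illustrative, not part of any library):

```python
import base64


def make_basic_auth_token(email: str, password: str) -> str:
    """Encode OpenObserve credentials as a Basic auth token."""
    raw = f"{email}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(make_basic_auth_token("user@example.com", "secret"))
```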
Instrumentation
Call OpenAIInstrumentor().instrument() before creating the OpenAI client. Point the client at the xAI base URL and pass your xAI API key.
```python
import os

from dotenv import load_dotenv

load_dotenv()

# Instrument BEFORE creating the OpenAI client so every call is traced
from openinference.instrumentation.openai import OpenAIInstrumentor
from openobserve import openobserve_init

OpenAIInstrumentor().instrument()
openobserve_init()

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-3-mini",
    messages=[{"role": "user", "content": "Explain distributed tracing in one sentence."}],
)
print(response.choices[0].message.content)
```
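The token counts the instrumentor records are also available directly on the response, via the OpenAI-compatible `usage` block. A sketch of reading them, using a stubbed response object so the snippet runs without an API key:

```python
from types import SimpleNamespace

# Stub mirroring the usage schema on an OpenAI-compatible response;
# in real code, use the `response` returned by chat.completions.create
response = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=34, total_tokens=46)
)

u = response.usage
print(f"prompt={u.prompt_tokens} completion={u.completion_tokens} total={u.total_tokens}")
```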
What Gets Captured
| Attribute | Description |
|---|---|
| llm_model_name | Model name (e.g. grok-3-mini) |
| llm_provider | xai |
| llm_system | openai (the client library used) |
| llm_token_count_prompt | Tokens in the prompt |
| llm_token_count_completion | Tokens in the response |
| llm_token_count_total | Total tokens consumed |
| llm_token_count_completion_details_reasoning | Reasoning tokens used (for reasoning models) |
| llm_token_count_prompt_details_cache_read | Prompt tokens served from cache |
| openinference_span_kind | LLM |
| operation_name | ChatCompletion |
| duration | End-to-end request latency |
| span_status | OK or error status |
Viewing Traces
- Log in to OpenObserve and navigate to Traces
- Click any span to inspect token counts and the full request payload
- Filter by llm_model_name to compare Grok model versions side by side

Next Steps
With xAI Grok instrumented, every inference call is recorded in OpenObserve. From here you can monitor latency per model version, track token usage, and set alerts on error spans.