Introducing the OpenObserve Kubernetes Operator: Observability as Code




TL;DR: The OpenObserve Kubernetes Operator brings Infrastructure as Code principles to your observability stack. Manage alerts, pipelines, functions, destinations, and templates as native Kubernetes resources with GitOps workflows. Available in OpenObserve Enterprise Edition, free for up to 200GB ingestion per day.
Platform teams scaling Kubernetes deployments face a specific problem: managing observability configurations across environments creates operational overhead. Manual UI configuration and ad-hoc API scripts lead to configuration drift, missing audit trails, and changes that are hard to review or roll back.
Organizations need to manage observability the same way they manage applications: declaratively, with version control, and automated deployments.
The OpenObserve Kubernetes Operator (o2-k8s-operator) transforms observability management into a Kubernetes-native experience. Define your entire observability stack as YAML manifests. Version control everything. Deploy with GitOps tools like ArgoCD or Flux.
Key capabilities:
Fully Declarative: Define alerts, pipelines, functions, templates, and destinations as YAML. No UI clicking or ad-hoc scripts.
GitOps Ready: Version control everything. Review changes through pull requests. Automate deployments with CI/CD pipelines.
Multi-Instance Support: Manage multiple OpenObserve Enterprise instances (dev, test, prod) from a single Kubernetes cluster with isolated configurations.
Real-Time Status: Get instant feedback on sync status, errors, and resource health through Kubernetes status conditions.
Important: The operator works exclusively with OpenObserve Enterprise Edition. Enterprise includes a free tier of up to 200GB ingestion per day.
The operator introduces six Custom Resource Definitions (CRDs):
Connect to OpenObserve Enterprise instances with secure credential handling:
apiVersion: openobserve.ai/v1alpha1
kind: OpenObserveConfig
metadata:
  name: production
spec:
  endpoint: https://api.openobserve.ai
  organization: my-org
  credentialsSecretRef:
    name: o2-credentials
  tlsVerify: true
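The credentialsSecretRef points at a standard Kubernetes Secret. A minimal sketch might look like the following; the key names (username, password) are assumptions here, so check the operator documentation for the exact keys it expects:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: o2-credentials
type: Opaque
stringData:
  # Key names are illustrative; confirm the exact keys in the operator docs.
  username: alerts@example.com
  password: changeme
```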
Define alerts with SQL or PromQL queries, flexible scheduling, and deduplication:
apiVersion: openobserve.ai/v1alpha1
kind: Alert
metadata:
  name: high-error-rate
spec:
  configRef:
    name: production
  streamName: application-logs
  streamType: logs
  enabled: true
  queryCondition:
    type: custom
    sql: "SELECT COUNT(*) as count FROM default WHERE level='error'"
    aggregation:
      function: count
      having:
        column: count
        operator: GreaterThan
        value: 100
  duration: 5
  frequency: 1
  destinations:
    - slack-alerts
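To make the alert semantics concrete, here is an illustrative Python model (not operator code) of the evaluation this alert defines: count error-level rows and trigger when the count exceeds the having threshold of 100.

```python
def should_trigger(rows, threshold=100):
    """Return True when the count of error-level rows exceeds the threshold,
    mirroring the alert's aggregation (count) and having (GreaterThan 100)."""
    count = sum(1 for row in rows if row.get("level") == "error")
    return count > threshold

# 150 error rows out of 200 -> the alert would fire
rows = [{"level": "error"}] * 150 + [{"level": "info"}] * 50
print(should_trigger(rows))  # True
```

In the real system, OpenObserve runs the SQL query on the configured frequency and applies the threshold server-side; this sketch only shows the logic being expressed.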
Create reusable templates for Slack, PagerDuty, email, or webhooks:
apiVersion: openobserve.ai/v1alpha1
kind: OpenObserveAlertTemplate
metadata:
  name: slack-template
spec:
  configRef:
    name: production
  name: slack-webhook-template
  type: http
  title: "🚨 Alert: {alert_name}"
  body: |
    {
      "text": "Alert Triggered",
      "blocks": [
        {
          "type": "section",
          "text": {
            "type": "mrkdwn",
            "text": "*Alert:* {alert_name}\n*Stream:* {stream_name}\n*Time:* {triggered_at}"
          }
        }
      ]
    }
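The {alert_name}-style placeholders are substituted by OpenObserve when an alert fires. A rough Python sketch of that substitution (the exact server-side semantics are an assumption; the variable names come from the template above):

```python
def render_template(body: str, variables: dict) -> str:
    """Replace {placeholder} tokens with their values, as OpenObserve does
    server-side when rendering a template for a fired alert."""
    for name, value in variables.items():
        body = body.replace("{" + name + "}", str(value))
    return body

rendered = render_template(
    "*Alert:* {alert_name}\n*Stream:* {stream_name}",
    {"alert_name": "high-error-rate", "stream_name": "application-logs"},
)
print(rendered)
```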
Route alerts to Slack, PagerDuty, email, SNS, Splunk, Elasticsearch, and more:
apiVersion: openobserve.ai/v1alpha1
kind: OpenObserveDestination
metadata:
  name: slack-alerts
spec:
  configRef:
    name: production
  name: slack-destination
  type: http
  url: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
  method: post
  headers:
    Content-Type: application/json
  template: slack-template
Write VRL (Vector Remap Language) functions with built-in testing:
apiVersion: openobserve.ai/v1alpha1
kind: OpenObserveFunction
metadata:
  name: data-enricher
spec:
  configRef:
    name: production
  name: enrich-logs
  function: |
    .processed_at = now()
    .environment = "production"
    if exists(.error) {
      .severity = "high"
    }
    .
  test:
    enabled: true
    input:
      - error: "Connection timeout"
        message: "Service unavailable"
    output:
      - error: "Connection timeout"
        message: "Service unavailable"
        processed_at: "2024-01-01T00:00:00Z"
        environment: "production"
        severity: "high"
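For readers unfamiliar with VRL, here is a rough Python analogue of what the function above does. This is not operator code; VRL runs inside OpenObserve's pipeline engine:

```python
from datetime import datetime, timezone

def enrich(event: dict) -> dict:
    """Python analogue of the VRL function: stamp processing time and
    environment, and raise severity when an error field is present."""
    event = dict(event)
    event["processed_at"] = datetime.now(timezone.utc).isoformat()
    event["environment"] = "production"
    if "error" in event:  # mirrors `if exists(.error)`
        event["severity"] = "high"
    return event

out = enrich({"error": "Connection timeout", "message": "Service unavailable"})
print(out["severity"])  # high
```

The test block in the CRD serves the same purpose declaratively: the operator can verify the function's output against expected events before syncing it.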
Build data processing pipelines with node-based architecture:
apiVersion: openobserve.ai/v1alpha1
kind: OpenObservePipeline
metadata:
  name: error-log-processor
spec:
  configRef:
    name: production
  name: error-log-processor
  description: "Process error logs and route to multiple destinations"
  enabled: true
  org: default
  # Real-time source
  source:
    streamName: "application-logs"
    streamType: "logs"
    sourceType: "realtime"
  # Processing nodes
  nodes:
    - id: "filter-errors"
      type: "condition"
      config:
        conditions:
          or:
            - column: "level"
              operator: "="
              value: "error"
            - column: "status_code"
              operator: ">="
              value: "500"
    - id: "enrich-data"
      type: "function"
      config:
        function: "log-enricher"
    - id: "error-output"
      type: "stream"
      config:
        org_id: "default"
        stream_name: "critical_errors"
        stream_type: "logs"
  # Data flow
  edges:
    - source: "source"
      target: "filter-errors"
    - source: "filter-errors"
      target: "enrich-data"
      condition: true
    - source: "enrich-data"
      target: "error-output"
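The filter-errors condition node expresses a simple OR predicate. An illustrative Python sketch of that predicate (not operator code; the pipeline evaluates it server-side):

```python
def matches_filter(event: dict) -> bool:
    """Pass events where level == "error" OR status_code >= 500,
    as the "filter-errors" condition node above specifies."""
    return (
        event.get("level") == "error"
        or int(event.get("status_code", 0)) >= 500
    )

events = [
    {"level": "info", "status_code": 200},
    {"level": "error", "status_code": 200},
    {"level": "warn", "status_code": 503},
]
print([e for e in events if matches_filter(e)])
```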
Pipeline capabilities include real-time stream sources, condition-based filtering, VRL function transforms, and routing output to one or more streams.
Scenario: Platform team maintains consistent alerting across 50+ microservices in dev, test, and production.
Implementation: Store alert manifests in Git, promote changes across environments through pull requests, and roll back bad changes with git revert.
Result: Zero configuration drift, full audit trail, 90% reduction in alert management overhead.
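With GitOps tooling such as ArgoCD, a directory of operator manifests can be kept in sync automatically. A minimal sketch (the repository URL, path, and namespace are placeholders, not from this project):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: observability-config
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/observability-manifests  # placeholder
    targetRevision: main
    path: alerts/production
  destination:
    server: https://kubernetes.default.svc
    namespace: observability
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```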
Scenario: SaaS platform needs isolated observability per customer environment.
Implementation: Create one OpenObserveConfig per customer environment and scope each customer's alerts, pipelines, and destinations to it via configRef.
Result: Secure multi-tenancy with simplified operations.
Scenario: DevOps team needs alerts to create PagerDuty incidents, post to Slack, and send email summaries.
Implementation: Define an OpenObserveDestination for each channel and list all of them in the alert's destinations field.
Result: Consistent notifications across all channels with zero manual configuration.
Performance tuning (via ConfigMap):
ALERT_CONTROLLER_CONCURRENCY: "5"
O2_RATE_LIMIT_RPS: "50"
O2_MAX_CONNS_PER_HOST: "20"
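Wrapped in a ConfigMap, those settings might look like the following; the ConfigMap name and namespace here are assumptions, so check the deployment manifests for the names the operator actually reads:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: o2-operator-config        # name/namespace are assumptions
  namespace: o2-operator-system
data:
  ALERT_CONTROLLER_CONCURRENCY: "5"
  O2_RATE_LIMIT_RPS: "50"
  O2_MAX_CONNS_PER_HOST: "20"
```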
The operator exposes /healthz, /readyz, /startup, and /metrics endpoints for health checks and monitoring.
1. Deploy the operator:
git clone https://github.com/openobserve/o2-k8s-operator
cd o2-k8s-operator
./deploy.sh
2. Configure connection:
kubectl apply -f configs/prod/o2prod-config.yaml
3. Deploy your first alert:
kubectl apply -f samples/alerts/high-cpu-alert.yaml
4. Check status:
kubectl get alerts
kubectl describe alert high-cpu-alert
Your alert now syncs automatically with OpenObserve Enterprise.
The operator continuously reconciles your desired state (Kubernetes resources) with the actual state (OpenObserve configurations), applying changes as zero-downtime updates.
The operator shifts observability management from manual to automated:
Manual → Automated
GUI-driven → Code-driven
Scattered → Centralized
Undocumented → Version-controlled
Fragile → Reliable
Platform teams apply the same engineering practices to observability that they use for applications: code review, testing, CI/CD, and automated rollbacks.
Documentation:
Community:
The OpenObserve Kubernetes Operator (v1.0.6) brings observability as code to platform engineering teams. Whether managing a small development cluster or observability at scale across hundreds of services, the operator provides the foundation for reliable, automated, and auditable operations.
Get Started with OpenObserve: https://openobserve.ai/downloads/

I'm a Solution Architect and Observability Engineer with over 10 years of experience helping organizations build resilient, transparent systems. As a Certified Splunk Consultant, I've spent my career turning data into actionable insights that drive real business outcomes. I'm passionate about open source observability tools and believe that robust monitoring is the foundation of modern infrastructure. I share practical strategies, lessons learned, and hands-on guidance from the trenches of enterprise observability.