Integration with Apache Airflow
This guide explains how to monitor Apache Airflow using the OpenTelemetry Collector (otelcol) and export logs, metrics, and traces to OpenObserve for visualization.
Overview
Apache Airflow is a workflow automation and orchestration tool widely used for ETL pipelines, ML workflows, and data engineering tasks. Monitoring Airflow is critical for ensuring workflow reliability, debugging issues, and tracking system performance.
With OpenTelemetry and OpenObserve, you gain real-time observability into Airflow DAG runs, task execution, scheduler activity, and worker performance.
Steps to Integrate
Prerequisites
- OpenObserve account (Cloud or Self-Hosted)
- Apache Airflow installed and running
- Basic understanding of Airflow configs (airflow.cfg)
- OpenTelemetry Collector installed
Step 1: Configure Airflow for OpenTelemetry
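Edit airflow.cfg to enable OTel metrics: set otel_on = True and make sure Airflow sends its metrics to the Collector's OTLP HTTP endpoint at localhost:4318.
Restart the Airflow services after updating the config so the change takes effect.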
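Once the config is saved, a short script can confirm that the metrics settings are what the Collector expects. The following is a minimal sketch, not part of Airflow itself. It assumes the options live in a [metrics] section under the names otel_on, otel_host, and otel_port (as in recent Airflow releases), and the function name check_otel_metrics and the file path are hypothetical.

```python
import configparser
from pathlib import Path


def check_otel_metrics(cfg_path):
    """Return a list of problems found in the metrics settings of airflow.cfg."""
    # ConfigParser.read() silently skips missing files, so check explicitly.
    if not Path(cfg_path).is_file():
        return [f"{cfg_path} not found"]

    # Disable interpolation so '%' characters elsewhere in the file are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)

    # Assumption: the OTel options live in the [metrics] section.
    if not parser.has_section("metrics"):
        return ["no [metrics] section"]

    problems = []
    metrics = parser["metrics"]
    if not metrics.getboolean("otel_on", fallback=False):
        problems.append("otel_on is not enabled")

    host = metrics.get("otel_host", fallback="")
    port = metrics.get("otel_port", fallback="")
    if (host, port) != ("localhost", "4318"):
        problems.append(
            f"metrics target is {host or '?'}:{port or '?'}, expected localhost:4318"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical path; point this at your own airflow.cfg.
    for problem in check_otel_metrics("airflow.cfg"):
        print("WARN:", problem)
```

An empty result means the three settings match this guide. If the Collector runs on another host or port, adjust the expected values accordingly.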
Step 2: Install OpenTelemetry Collector
- Download and install the OTel Collector.
- Verify the installation.
Step 3: Get OpenObserve Endpoint and Token
- In OpenObserve: go to Data Sources → Otel Collector
- Copy the Ingestion URL and Access Token
Step 4: Configure OpenTelemetry Collector
- Create or edit the Collector config file.
- Add the Airflow configuration:
receivers:
  filelog/std:
    include:
      - /airflow/logs/*/*.log
      - /airflow/logs/scheduler/*/*/*/*.log
    start_at: beginning
  otlp:
    protocols:
      grpc:
      http:
processors:
  batch:
exporters:
  otlphttp/openobserve:
    endpoint: OPENOBSERVE_ENDPOINT
    headers:
      Authorization: "OPENOBSERVE_TOKEN"
      stream-name: airflow
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/openobserve]
    logs:
      receivers: [filelog/std, otlp]
      processors: [batch]
      exporters: [otlphttp/openobserve]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/openobserve]
Replace placeholders with your OpenObserve details:
- OPENOBSERVE_ENDPOINT → API endpoint (e.g., https://api.openobserve.ai)
- OPENOBSERVE_TOKEN → Access token
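Before starting the Collector, it can help to confirm that the include patterns of the filelog/std receiver actually match files on the Airflow host. The sketch below uses Python's glob module as a rough stand-in for the Collector's own matcher. The two patterns are copied from the configuration above, while the helper name count_matches is hypothetical. Python's glob skips hidden files and may differ from the Collector in edge cases, so treat a mismatch as a hint to investigate.

```python
import glob
import os

# Include patterns copied from the filelog/std receiver in the Collector config.
INCLUDE_PATTERNS = [
    "/airflow/logs/*/*.log",
    "/airflow/logs/scheduler/*/*/*/*.log",
]


def count_matches(patterns):
    """Map each glob pattern to the number of regular files it currently matches."""
    return {
        pattern: sum(1 for path in glob.glob(pattern) if os.path.isfile(path))
        for pattern in patterns
    }


if __name__ == "__main__":
    for pattern, count in count_matches(INCLUDE_PATTERNS).items():
        status = "ok" if count else "NO MATCHES"
        print(f"{status:>10}  {count:5d}  {pattern}")
```

A pattern that reports no matches usually means the Airflow log directory is somewhere other than /airflow/logs, in which case the include entries need to be adjusted.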
Step 5: Start OpenTelemetry Collector
sudo systemctl start otel-collector
sudo systemctl status otel-collector
journalctl -u otel-collector -f
Check logs to confirm data is being sent to OpenObserve.
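In addition to reading the service logs, a quick way to see whether the Collector's OTLP receiver is reachable is to open a TCP connection to its ports. This is a minimal sketch that assumes the receiver uses the standard OTLP ports, 4317 for gRPC and 4318 for HTTP, on the local machine. The helper name is_listening is hypothetical. A successful connection only shows that something is listening there, not that data is reaching OpenObserve.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: standard OTLP ports on the local machine.
    for name, port in (("OTLP gRPC", 4317), ("OTLP HTTP", 4318)):
        state = "listening" if is_listening("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```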
Step 6: Visualize Logs in OpenObserve
- Go to Streams → airflow in OpenObserve to query logs.
Airflow logs collected include DAG execution logs, scheduler logs, worker logs, and task execution logs.
Prebuilt Dashboards
Prebuilt Airflow dashboards are available. You can download the JSON file and import it.
Troubleshooting
- No Logs in OpenObserve
  - Ensure filelog receiver paths match your Airflow log directory.
  - Verify the Collector service is running.
- Metrics Not Visible
  - Check otel_on = True in airflow.cfg.
  - Confirm Airflow is sending metrics to localhost:4318.
- Collector Fails to Start
  - Run a dry check of the Collector config.
  - Fix syntax errors or missing receivers.
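One common cause of a failed start is a pipeline that names a receiver, processor, or exporter that is never defined. The sketch below illustrates that check on a config that has already been loaded into a Python dict (for example with a YAML parser, which is assumed and not shown). The function name undefined_components is hypothetical, and the sample dict is a hand-written fragment with one deliberate mistake. It does not cover extensions or connectors and is not a substitute for the Collector's own validation.

```python
def undefined_components(config):
    """Return (pipeline, kind, name) for each pipeline entry with no definition."""
    missing = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for pipeline_name, pipeline in pipelines.items():
        for kind in ("receivers", "processors", "exporters"):
            defined = config.get(kind) or {}
            for name in (pipeline or {}).get(kind) or []:
                if name not in defined:
                    missing.append((pipeline_name, kind, name))
    return missing


# Hand-written fragment shaped like the Collector config in this guide,
# with the filelog/std receiver deliberately left undefined.
config = {
    "receivers": {"otlp": {"protocols": {"grpc": None, "http": None}}},
    "processors": {"batch": None},
    "exporters": {"otlphttp/openobserve": {"endpoint": "OPENOBSERVE_ENDPOINT"}},
    "service": {
        "pipelines": {
            "logs": {
                "receivers": ["filelog/std", "otlp"],
                "processors": ["batch"],
                "exporters": ["otlphttp/openobserve"],
            }
        }
    },
}

print(undefined_components(config))
# prints [('logs', 'receivers', 'filelog/std')]
```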