Architecture diagram showing n8n sending metrics and traces through an OTel Collector to OpenObserve

n8n Monitoring with OpenTelemetry and OpenObserve

TL;DR

  • n8n monitoring in production requires more than the built-in execution UI: there are no external metrics, no alerting, and no cross-workflow correlation out of the box.
  • Self-hosted n8n exposes Prometheus metrics via a single environment variable. OTel trace export requires a custom Docker image that loads a tracing bootstrap file before n8n starts.
  • An OTel Collector sits between n8n and OpenObserve, scraping n8n's Prometheus endpoint and forwarding OTLP traces in one pipeline.
  • For services that trigger n8n workflows via webhooks, the openobserve-telemetry-sdk wraps those calls in OTel spans automatically. This works regardless of whether n8n is self-hosted or Cloud.
  • OpenObserve has an official n8n integration guide covering both the Prometheus metrics path and the webhook instrumentation path.

n8n monitoring architecture: OTLP Traces push from n8n to OTel Collector, Prometheus scrape from OTel Collector to n8n, forwarding to OpenObserve

Why n8n workflows are hard to observe

n8n gives you an execution list and per-run logs in the UI. That covers debugging: click into a failed run, see which node errored, inspect the input and output. For a development environment, that is enough.

In production it falls short. The execution list is not queryable from outside n8n. Alerting on failure rate is not possible. Cross-workflow correlation, queue depth comparisons, and week-over-week trend analysis are all outside its scope. If a node silently times out and triggers a retry, the UI shows the retry succeeded. The underlying latency problem is invisible.

n8n is a Node.js application, which means standard observability tooling applies. The gap is configuration, not capability. For workflow automation at any production scale, full-stack observability connecting logs, metrics, and traces matters as much as it does for any other service in your stack.

What n8n gives you out of the box

Prometheus metrics endpoint

Self-hosted n8n includes a Prometheus metrics endpoint, disabled by default. Set N8N_METRICS=true to enable it at /metrics. The endpoint exposes counters and histograms covering execution activity across all workflows.

Key metrics the endpoint exposes:

| Metric | Type | What it tells you |
|---|---|---|
| n8n_workflow_execution_duration_seconds | Histogram | Per-execution wall time; status label: success or failed; mode label: manual, webhook, or trigger |
| n8n_active_workflow_count | Gauge | Number of active workflows |
| n8n_scaling_mode_queue_jobs_waiting | Gauge | Jobs queued but not yet picked up (worker mode) |
| n8n_scaling_mode_queue_jobs_active | Gauge | Jobs being processed by workers (worker mode) |
| n8n_scaling_mode_queue_jobs_completed | Counter | Total completed jobs since instance start (worker mode) |
| n8n_scaling_mode_queue_jobs_failed | Counter | Total failed jobs since instance start (worker mode) |

The last four metrics require N8N_METRICS_INCLUDE_QUEUE_METRICS=true set as an environment variable on your n8n container, alongside N8N_METRICS=true.
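
To make the table concrete, here is a minimal Python sketch of how a line of /metrics output breaks down into a metric name, labels, and a value. The sample text, including its HELP line and every number in it, is invented for illustration; real output depends on your n8n version and workload. The parser is deliberately simplified and handles only the plain label syntax shown here, not the full Prometheus exposition format.

```python
import re

# Hypothetical sample of /metrics output; all values are invented.
SAMPLE = """\
# HELP n8n_active_workflow_count Illustrative help text.
# TYPE n8n_active_workflow_count gauge
n8n_active_workflow_count 12
n8n_scaling_mode_queue_jobs_waiting 3
n8n_workflow_execution_duration_seconds_count{status="success",mode="webhook"} 40
n8n_workflow_execution_duration_seconds_count{status="failed",mode="webhook"} 2
"""

# name, optional {labels}, whitespace, value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def failed_executions(samples):
    """Sum the execution count across all series labelled status="failed"."""
    return sum(
        value
        for name, labels, value in samples
        if name == "n8n_workflow_execution_duration_seconds_count"
        and labels.get("status") == "failed"
    )
```

Calling failed_executions(parse_metrics(SAMPLE)) adds up the failed-status series, which is the same slice of data the failure-rate alert later in this post is built on.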

Log streaming

n8n Enterprise includes log streaming: execution events forwarded in real time to a syslog server or a generic webhook. It gives you event-level data outside the UI and integrates with log aggregators. This is an Enterprise-only feature.

Self-hosted n8n: Prometheus metrics and OTel traces

For self-hosted n8n, you get both metrics and traces. Prometheus metrics need only an environment variable. OTel traces require a small custom Docker image that installs the OTel SDK and loads a bootstrap file before n8n starts. Neither approach modifies n8n's workflow logic.

Step 1: Enable Prometheus metrics

Add these environment variables to your n8n container. No custom image needed for this step:

N8N_METRICS=true
N8N_METRICS_INCLUDE_QUEUE_METRICS=true

Verify the endpoint is live after restarting:

curl http://localhost:5678/metrics | grep n8n_
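
If you would rather script that check, the following Python sketch fetches the endpoint and reports which of the metric families named in this post are missing. The function names and the default URL are assumptions for illustration; the metric list is taken from the table earlier in this post and may need adjusting for your n8n version.

```python
from urllib.request import urlopen

# Metric families named in this post; adjust to the ones you rely on.
EXPECTED = [
    "n8n_workflow_execution_duration_seconds",
    "n8n_active_workflow_count",
    "n8n_scaling_mode_queue_jobs_waiting",
    "n8n_scaling_mode_queue_jobs_active",
    "n8n_scaling_mode_queue_jobs_completed",
    "n8n_scaling_mode_queue_jobs_failed",
]


def fetch_metrics(url="http://localhost:5678/metrics", timeout=5):
    """Download the raw Prometheus text from the n8n metrics endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def missing_metrics(text, expected=EXPECTED):
    """Return the expected metric families that never appear in the output."""
    present = set()
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        sample_name = line.split("{", 1)[0].split(" ", 1)[0]
        for family in expected:
            # Histogram families show up with _bucket, _sum, and _count suffixes.
            if sample_name == family or sample_name.startswith(family + "_"):
                present.add(family)
    return [family for family in expected if family not in present]
```

Running missing_metrics(fetch_metrics()) against a live instance returns an empty list when every family is present. If only the queue metrics are listed as missing, the likely cause is that N8N_METRICS_INCLUDE_QUEUE_METRICS is not set or the instance is not running in worker mode.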

Step 2: Add OTel auto-instrumentation

n8n does not ship with the OTel SDK initialized, so traces require a bootstrap file loaded before the process starts via NODE_OPTIONS=--require. This means building a small custom image on top of the official one.

Create tracing.js:

'use strict';

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://otel-collector:4318/v1/traces',
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-fs': { enabled: false },
    }),
  ],
});

sdk.start();

process.on('SIGTERM', () => {
  sdk.shutdown().finally(() => process.exit(0));
});

Filesystem instrumentation is disabled to suppress span noise from n8n's internal file operations.

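Create a Dockerfile that installs the OTel packages into /opt/otel, next to tracing.js. When NODE_OPTIONS loads the bootstrap file from that directory, Node.js resolves its require() calls from /opt/otel/node_modules: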

FROM n8nio/n8n:2.19.3

USER root

RUN mkdir -p /opt/otel && npm install --prefix /opt/otel \
  @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http

COPY tracing.js /opt/otel/tracing.js

USER node

Step 3: Configure the OTel Collector

The Collector scrapes the Prometheus endpoint for metrics and receives OTLP spans from n8n. Both streams go to OpenObserve. The OpenTelemetry Collector Contrib distribution includes both the Prometheus receiver and OTLP exporter needed here.

Before writing the config, grab your OTLP endpoint and authorization header from the OpenObserve UI under Data Sources → Custom → Logs/Traces/Metrics → OTEL Collector. The page shows the exact endpoint URL and the pre-encoded Authorization header for your organization, so you don't have to construct them by hand.

OpenObserve Data Sources UI showing the OTEL Collector endpoint and Authorization header to copy into the exporter config

Create otel-collector-config.yaml:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: n8n
          scrape_interval: 30s
          static_configs:
            - targets: ['n8n:5678']
          metrics_path: /metrics

  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s

exporters:
  otlphttp/openobserve:
    endpoint: <your-openobserve-otlp-endpoint>
    headers:
      Authorization: <your-authorization-header>

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [otlphttp/openobserve]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/openobserve]

Paste the endpoint and Authorization header straight from the Data Sources page above. For a full breakdown of OTLP exporter options, see Getting Started with OpenTelemetry OTLP Exporters.

Step 4: Wire it together in Docker Compose

services:
  n8n:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "5678:5678"
    environment:
      - N8N_METRICS=true
      - N8N_METRICS_INCLUDE_QUEUE_METRICS=true
      - NODE_OPTIONS=--require /opt/otel/tracing.js
      - OTEL_SERVICE_NAME=n8n
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - otel-collector

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"
      - "4318:4318"

volumes:
  n8n_data:

Start the stack:

docker compose up -d

Allow 30-60 seconds for the first Prometheus scrape to complete and the first workflow execution spans to appear in OpenObserve.
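
To check the traces pipeline without waiting for a workflow to run, you can push a single hand-built span at the Collector's OTLP/HTTP receiver. The sketch below is one way to do that in Python using only the standard library. The service name, span name, and scope name are placeholders chosen for this example, and it assumes port 4318 is published on localhost as in the Compose file above.

```python
import json
import secrets
import time
from urllib.request import Request, urlopen


def build_test_span(service_name="n8n-smoke-test", span_name="collector.smoke_test"):
    """Build a one-span OTLP/HTTP JSON payload with random trace and span IDs."""
    end_ns = time.time_ns()
    start_ns = end_ns - 5_000_000  # pretend the span lasted 5 ms
    return {
        "resourceSpans": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeSpans": [
                    {
                        "scope": {"name": "manual-smoke-test"},
                        "spans": [
                            {
                                "traceId": secrets.token_hex(16),  # 32 hex characters
                                "spanId": secrets.token_hex(8),  # 16 hex characters
                                "name": span_name,
                                "kind": 1,  # SPAN_KIND_INTERNAL
                                "startTimeUnixNano": str(start_ns),
                                "endTimeUnixNano": str(end_ns),
                            }
                        ],
                    }
                ],
            }
        ]
    }


def send_test_span(endpoint="http://localhost:4318/v1/traces", timeout=5):
    """POST the payload to the Collector and return the HTTP status code."""
    body = json.dumps(build_test_span()).encode("utf-8")
    req = Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A 200 from send_test_span() only tells you the Collector accepted the span. Search for the placeholder service name in the OpenObserve Traces view to confirm the exporter half of the pipeline works too.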

What you see in OpenObserve

In Metrics, n8n_workflow_execution_duration_seconds_count broken down by status gives you success and failure counts. n8n_workflow_execution_duration_seconds_bucket is the input for p50/p95/p99 latency via histogram_quantile. n8n_scaling_mode_queue_jobs_waiting shows whether your worker queue is building up.

OpenObserve metrics explorer showing n8n_workflow_execution_duration_seconds_count broken down by status

In Traces, each workflow execution appears as a root span with child spans for individual nodes. For workflows calling external services, the outbound HTTP spans appear under the relevant node span, showing exactly how long the downstream call took.

Instrument the services that call n8n

Whether you use n8n Cloud or self-hosted, any service that triggers n8n via webhooks can be instrumented independently using the openobserve-telemetry-sdk. This gives you traces on every webhook invocation: latency, HTTP status codes, and error details, all in OpenObserve without touching n8n's infrastructure.

This is the approach covered in OpenObserve's official n8n integration guide.

Webhook caller instrumentation architecture: service with openobserve-telemetry-sdk sending OTel spans to OpenObserve via n8n webhook calls

Install and configure

pip install openobserve-telemetry-sdk opentelemetry-api requests python-dotenv

Create a .env file:

OPENOBSERVE_URL=https://api.openobserve.ai/
OPENOBSERVE_ORG=your_org_id
OPENOBSERVE_AUTH_TOKEN=Basic <your_base64_token>
N8N_BASE_URL=http://localhost:5678
N8N_WEBHOOK_ID=your-webhook-path

Get OPENOBSERVE_ORG from the OpenObserve UI under your account settings. OPENOBSERVE_AUTH_TOKEN is the word Basic, a space, and then the Base64 encoding of email:password.
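
If you need to build the token by hand, a small Python helper along these lines will do it. The function name is made up for this example, and the credentials shown in the usage note are placeholders.

```python
import base64


def basic_auth_token(email, password):
    """Return 'Basic ' followed by the Base64 encoding of email:password."""
    raw = f"{email}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

For example, basic_auth_token("user@example.com", "secret") produces the full value to paste after OPENOBSERVE_AUTH_TOKEN= in the .env file. Treat the result like a password: Base64 is an encoding, not encryption.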

Wrap webhook calls in OTel spans

from dotenv import load_dotenv
load_dotenv()  # load the OPENOBSERVE_* and N8N_* values from .env first

from openobserve import openobserve_init
openobserve_init()  # reads the OPENOBSERVE_* env vars loaded above

from opentelemetry import trace
import os
import requests

tracer = trace.get_tracer(__name__)

base_url = os.environ.get("N8N_BASE_URL", "http://localhost:5678")
webhook_id = os.environ["N8N_WEBHOOK_ID"]  # raises KeyError if unset


def trigger_webhook(payload: dict):
    """POST the payload to the n8n webhook inside a span that records the outcome."""
    with tracer.start_as_current_span("n8n.webhook_trigger") as span:
        span.set_attribute("n8n.webhook_id", webhook_id)
        # Record the field names only; payload values are not attached to the span.
        span.set_attribute("n8n.payload_keys", str(list(payload.keys())))
        resp = requests.post(
            f"{base_url}/webhook/{webhook_id}",
            headers={"Content-Type": "application/json"},
            json=payload,
            timeout=30,
        )
        span.set_attribute("n8n.status_code", resp.status_code)
        # resp.ok is True for status codes below 400.
        span.set_attribute("span_status", "OK" if resp.ok else "ERROR")
        return resp


result = trigger_webhook({"message": "Hello from my service"})
print(result.status_code, result.text)

openobserve_init() reads the OPENOBSERVE_* env vars and configures the OTel SDK to export directly to OpenObserve. Every call to trigger_webhook produces a span with these attributes:

| Attribute | Value |
|---|---|
| n8n.webhook_id | The webhook path being called |
| n8n.payload_keys | Field names in the request payload |
| n8n.status_code | HTTP response code from n8n |
| span_status | OK or ERROR |

For services that trigger n8n in multiple places, wrap each call site the same way. If your service already uses OpenTelemetry for its own instrumentation, the n8n webhook span becomes a child of the existing trace automatically, giving you end-to-end context from your application into n8n. For a broader look at how OTel context propagation works across services, see What is OpenTelemetry?.
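
Across a service boundary, that parent-child link travels in the W3C traceparent HTTP header. The Python sketch below shows the shape of that header using only the standard library. It is an illustration of the format, not what the SDK does internally, and the helper names are invented for this example. Whether a given request actually carries the header depends on how your service's HTTP client is instrumented.

```python
import re
import secrets

# version 00, 32-hex trace ID, 16-hex parent span ID, 2-hex flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id=None, span_id=None, sampled=True):
    """Format a version-00 traceparent header value."""
    trace_id = trace_id or secrets.token_hex(16)
    span_id = span_id or secrets.token_hex(8)
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Split a traceparent value into its parts, or return None if malformed."""
    match = TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def child_traceparent(parent_value):
    """Keep the parent's trace ID and sampling flag, but mint a new span ID."""
    parent = parse_traceparent(parent_value)
    if parent is None:
        return make_traceparent()  # no valid parent: start a fresh trace
    return make_traceparent(trace_id=parent["trace_id"], sampled=parent["sampled"])
```

The point to take away is that the trace ID stays constant along the whole call chain while each hop contributes its own span ID. That shared trace ID is what lets a trace view stitch the caller's span and the downstream spans together.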

What you see in OpenObserve

In the Traces explorer, filter by span name n8n.webhook_trigger. You can see latency distribution across all webhook calls, which webhook IDs are failing, and the status code n8n returned for each non-2xx response.

OpenObserve Traces explorer filtered to n8n.webhook_trigger spans showing latency, status, and webhook_id attributes

This pairs naturally with the self-hosted Prometheus metrics if you run both: metrics show what is happening inside n8n, traces show what your application sees when it calls n8n. For AI workflows, the same SDK and the same pipeline handle LLM observability alongside your workflow traces.

What to alert on

With metrics in OpenObserve, these four signals cover most production failure modes.

Execution failure rate rising. In OpenObserve, create a scheduled query alert:

rate(n8n_workflow_execution_duration_seconds_count{status="failed"}[5m]) > 0.1

A threshold of 0.1 is roughly 6 failures per minute. Start with > 0 to catch any failure during initial setup, then tune to your baseline.
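
The arithmetic behind that threshold is worth spelling out, because rate() reports a per-second value. The helpers below are a back-of-the-envelope Python sketch with made-up names. The average_rate function is a simplification: the real PromQL rate() also handles counter resets and extrapolates to the window edges.

```python
def per_second_to_per_minute(rate_per_second):
    """Convert a per-second rate, as returned by rate(), to events per minute."""
    return rate_per_second * 60


def threshold_for(failures_per_minute):
    """Turn a tolerable failures-per-minute figure into a rate() threshold."""
    return failures_per_minute / 60


def average_rate(count_start, count_end, window_seconds):
    """Approximate a counter's per-second rate from two readings in a window."""
    return (count_end - count_start) / window_seconds
```

So if you decide that more than one failure per minute deserves a page, threshold_for(1) gives the value to put on the right-hand side of the alert expression, roughly 0.0167.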

Execution duration spiking. Track p95 latency using the histogram:

histogram_quantile(0.95, rate(n8n_workflow_execution_duration_seconds_bucket[10m]))

Baseline this over a week of normal operation and alert when it exceeds 2x.
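
It helps to know how a percentile is estimated from bucket counts, because the result is only as precise as the bucket boundaries. The Python sketch below imitates the linear interpolation that histogram_quantile performs. It is a simplified illustration, not the Prometheus implementation, and the bucket bounds and counts in the usage note are invented. The exceeds_baseline helper is likewise a made-up name for the 2x comparison described above.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) pairs.

    buckets must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations in the window
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            bucket_size = count - lower_count
            if bucket_size == 0:
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / bucket_size
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


def exceeds_baseline(current, baseline, factor=2.0):
    """True when the current value is more than factor times the baseline."""
    return current > baseline * factor
```

With buckets [(0.5, 50), (1.0, 80), (2.5, 95), (5.0, 100), (math.inf, 100)], the 0.9 quantile lands two thirds of the way through the 1.0 to 2.5 bucket, so the estimate is 2.0 seconds. Any execution slower than the largest finite bucket is reported as that bound, so a p95 that sits exactly on the top bucket boundary is a hint that the true value may be higher.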

Queue backlog building up. For n8n in worker mode:

n8n_scaling_mode_queue_jobs_waiting > 100

A queue that keeps growing without draining means workers are falling behind. Adjust the threshold to your normal operating depth.
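
A static threshold catches a deep queue, but "growing without draining" is a statement about the trend. As a rough illustration of that distinction, the Python sketch below checks whether a series of queue-depth readings rises at every step. The function name and the min_growth parameter are assumptions made for this example; it is not an OpenObserve feature, just the logic you would want an alert condition to express.

```python
def is_backlog_growing(samples, min_growth=1):
    """True if each reading is at least min_growth above the previous one."""
    if len(samples) < 2:
        return False  # a single reading says nothing about the trend
    return all(
        later - earlier >= min_growth
        for earlier, later in zip(samples, samples[1:])
    )
```

Readings of 10, 20, 35, 60 count as growing, while 10, 50, 20, 30 do not, because the queue drained between the second and third reading even though it briefly spiked.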

Dead-man alert. Alert when no executions have run in a window where you normally see activity:

rate(n8n_workflow_execution_duration_seconds_count[10m]) == 0

Scope this to your expected active periods. A zero execution rate during a busy window means something has stopped scheduling workflows.
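
The scoping logic can be sketched in a few lines of Python. The business-hours window (08:00 to 18:00, Monday to Friday) and the function name are assumptions for illustration; substitute the periods when your workflows normally run.

```python
from datetime import datetime, time


def dead_man_should_fire(
    execution_rate,
    now,
    start=time(8, 0),
    end=time(18, 0),
    weekdays=(0, 1, 2, 3, 4),  # Monday is 0
):
    """Fire only when the rate is zero inside the expected active window."""
    in_window = now.weekday() in weekdays and start <= now.time() < end
    return in_window and execution_rate == 0
```

A zero rate at 10:00 on a Monday fires; the same zero rate on a Saturday or at 22:00 does not, which keeps the alert from paging you during periods that are quiet by design.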

For services using the webhook instrumentation path, add an alert on webhook error rate directly in OpenObserve using the span attributes recorded on n8n.webhook_trigger. Filter traces where span_status = ERROR (equivalently, where n8n.status_code is 400 or higher) and alert when the error count crosses your threshold.

OpenObserve alert configuration for n8n execution failure rate using PromQL scheduled query

Try OpenObserve Cloud

OpenObserve Cloud accepts OTLP over HTTP with no infrastructure to manage. Point your OTel Collector exporter or the openobserve-telemetry-sdk at your cloud endpoint, and n8n metrics and traces start appearing immediately. The free tier covers enough ingestion to get started. Sign up at cloud.openobserve.ai.

About the Author

Gorakhnath Yadav

Gorakhnath is a passionate developer advocate, working on bridging the gap between developers and the tools they use. He focuses on building communities and creating content that empowers developers to build better software.
