Convert Raw Logs into Metrics with OpenObserve Pipelines



Most teams begin their observability journey with logs. They’re easy to add, they tell you exactly what happened, and when something breaks, logs are usually the first place you look.
Logs capture individual events, but those events often include metric data points: timestamps, status codes, error flags, and latency values. Each log entry represents a single point in time, and together, they form a time series.
As systems scale, teams want dashboards that show trends: error rates over time, request volume per minute, latency percentiles. While this data exists inside logs, building dashboards directly on raw event streams means aggregating high-volume data repeatedly, which quickly becomes slow and inefficient.
The interesting part is that the problem usually isn’t a lack of data. It’s that teams are asking metric questions while still relying entirely on logs. The better approach is to extract metrics from event logs and store them as first-class time-series data.
In this article, we will cover how to convert logs into metrics using a scheduled pipeline in OpenObserve, step by step.
Logs and metrics are often talked about together, but they exist for very different reasons.
| Aspect | Logs | Metrics |
|---|---|---|
| What they represent | Individual events | Aggregated summaries |
| Level of detail | Very detailed (per event / per request) | High-level trends and counts |
| Cardinality | High | Low |
| Typical questions answered | What exactly happened? | How often did it happen? How bad is it? |
| Best used for | Debugging, root cause analysis | Monitoring, alerting, dashboards |
| Query cost | Expensive at scale | Cheap and fast |
When you try to use logs as a substitute for metrics, you end up paying the cost of high cardinality for questions that don’t need that level of detail. The solution isn’t to get rid of logs. It’s to derive the right metrics from them. This is where pipelines come into the picture.
A pipeline in OpenObserve is a configurable data processing workflow that determines how incoming data is handled after ingestion. Pipelines broadly fall into two categories, based on when that logic is applied: real-time pipelines, which transform data as it is ingested, and scheduled pipelines, which run a query against already-stored data at a fixed interval. Converting logs into metrics is a job for the scheduled kind.
For a logs-to-metrics use case, that logic is usually straightforward. You read log events, filter out what matters, aggregate them over time, and write the result as a metric.
In practice, the flow looks like this. Your application emits logs, which are ingested and stored in a log stream. A scheduled pipeline runs every minute or every few minutes and reads logs from the previous window. It filters and aggregates those logs and writes the result into a metric stream.
Once that metric stream exists, dashboards and alerts read from it directly. Logs are still there when you need to debug, but they’re no longer powering every operational query.
A scheduled pipeline is simple but explicit. It consists of:
- a source query that reads from a log stream and aggregates the events,
- a schedule that controls how often the query runs and over what window,
- an optional transformation step, and
- a destination stream where the results are written.

The aggregation window is defined by the pipeline schedule itself. Each execution processes logs from the previous window and produces one or more metric datapoints.
Let’s walk through how this actually works in practice, using a scheduled pipeline to convert Kubernetes logs into a metric stream.
Prerequisites:
- A running OpenObserve instance (Cloud or self-hosted).
- A log stream to read from: your application and infrastructure emit logs, which are ingested into OpenObserve and stored in a log stream.
For this demo, we'll use sample Kubernetes logs:
```bash
# Download and extract sample Kubernetes logs
curl -L https://zinc-public-data.s3.us-west-2.amazonaws.com/zinc-enl/sample-k8s-logs/k8slog_json.json.zip -o k8slog_json.json.zip
unzip k8slog_json.json.zip
```
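If you are following along with the sample data, you still need to push it into a log stream before the pipeline has anything to read. A minimal sketch using OpenObserve's JSON ingestion endpoint, assuming a local instance at `localhost:5080`, the `default` organization, default root credentials, and a target stream named `kubernetes_logs` (adjust all of these to your setup):

```bash
# Ingest the extracted JSON logs into a stream named "kubernetes_logs"
# via OpenObserve's bulk JSON ingestion endpoint.
curl http://localhost:5080/api/default/kubernetes_logs/_json \
  -i -u "root@example.com:Complexpass#123" \
  -d "@k8slog_json.json"
```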
In a Kubernetes setup, these logs typically contain fields like:
- `_timestamp`
- `code` (HTTP status code)
- `kubernetes_container_name`
- `kubernetes_labels_app`
- `kubernetes_host`

At this stage, the data is raw and event-oriented. Each log line represents a single occurrence.

Before writing any pipeline, it’s important to be clear about the metric you want.
For example: the total number of HTTP requests per app per minute, or the number of 5xx responses per app per minute.
This decision determines which logs you filter, how you aggregate them, and which fields become metric labels.
Example metrics:
- `k8s_http_requests_total`: total requests per app per minute
- `k8s_http_errors_total`: total 5xx responses per app per minute

Create a scheduled pipeline that runs at a fixed interval: for example, every 1 minute. At each run, the pipeline will read logs from the previous window, filter and aggregate them, and write the result into a metric stream.

1. In the pipeline's source (query) node, define the SQL that aggregates logs into metrics. For example, HTTP request count per app:
```sql
SELECT
  'k8s_http_requests_total' AS "__name__",
  'counter' AS "__type__",
  COUNT(*) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```
Explanation:
- `__name__` → metric name
- `__type__` → metric type (counter)
- `value` → number of requests in this window
- `app` & `namespace` → metric labels

Example: HTTP 5xx error count per app
```sql
SELECT
  'k8s_http_errors_total' AS "__name__",
  'counter' AS "__type__",
  COUNT(*) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
WHERE code >= 500
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```
The `WHERE code >= 500` clause filters the logs so that only server errors are counted.
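Counters aren't the only option. If your logs carry a numeric field, you can produce a gauge the same way. The sketch below assumes a hypothetical `took_ms` latency field in the log records; substitute whatever duration field your logs actually emit:

```sql
-- Average request latency per app per window (gauge).
-- 'took_ms' is a placeholder field name; use the latency field your logs contain.
SELECT
  'k8s_http_request_duration_ms_avg' AS "__name__",
  'gauge' AS "__type__",
  AVG(took_ms) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```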
2. Before saving, run the SQL query once to validate the output. You should see rows that include __name__, __type__, and value, along with the expected label fields.

3. Define the interval/frequency at which the pipeline should run, and save.
After the source query, you can optionally add a transformation node.
This is where you might:
- rename or normalize label fields,
- drop fields you don't need in the metric output, or
- derive additional values before they're written.

For simple logs-to-metrics use cases, the SQL query alone is often sufficient, but transformations give you flexibility when needed.
Finally, configure the destination node to write the output into a metric stream.

Connect the nodes according to the data flow, give your pipeline a name, and save it.

Ingest new log data and wait for the scheduled pipeline to execute based on the configured interval.
Once the pipeline runs, a new metric stream is created using the destination name you provided. Verify that the metric records contain the expected metric name, type, values, and labels.
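One way to spot-check the output is to query the new metric stream directly, if your OpenObserve version allows SQL over metric streams (for example, in a dashboard panel's custom SQL mode). A minimal sketch, assuming the destination stream was named `k8s_http_requests_total`:

```sql
-- Inspect the most recent datapoints written by the pipeline.
SELECT *
FROM k8s_http_requests_total
ORDER BY _timestamp DESC
LIMIT 10
```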

If something goes wrong, OpenObserve gives you a few clear places to look.
First, make sure usage reporting is enabled by setting `ZO_USAGE_REPORTING_ENABLED=true`. This allows OpenObserve to record pipeline execution details and surface meaningful error information. (You can refer to the usage reporting guide for setup details.)
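How you set this depends on how OpenObserve is deployed. A sketch for a Docker-based setup (credentials, ports, and image tag are illustrative; adjust to your deployment):

```bash
# Start OpenObserve with usage reporting enabled so pipeline runs and errors are recorded.
docker run -d --name openobserve \
  -p 5080:5080 \
  -e ZO_ROOT_USER_EMAIL="root@example.com" \
  -e ZO_ROOT_USER_PASSWORD="Complexpass#123" \
  -e ZO_USAGE_REPORTING_ENABLED=true \
  public.ecr.aws/zinclabs/openobserve:latest
```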
When a scheduled pipeline runs and encounters an error, the failure details are written to the error stream. This is where you’ll find messages about missing fields, invalid metric formats, or query execution issues.
You can also inspect the triggers stream, which records each scheduled execution of the pipeline. This helps you confirm whether the pipeline is running on schedule and whether it’s actually reading data from the source stream.
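If you prefer querying these streams directly, they are regular streams and can be searched like any other. A sketch, assuming the usage data is recorded under the `_meta` organization in streams named `triggers` and `errors` (the organization and stream names may differ in your deployment; check the usage reporting guide):

```sql
-- Recent pipeline trigger records; query the "errors" stream the same way
-- to see failure details instead.
SELECT *
FROM triggers
ORDER BY _timestamp DESC
LIMIT 20
```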
In the UI, failed runs are highlighted with a pipeline failure indicator, along with the associated error message. This makes it easy to quickly spot what went wrong and iterate on the pipeline configuration.

When setting up scheduled pipelines to convert logs into metrics, a few common errors can prevent the pipeline from writing data into a metric stream. Most of these issues are related to missing or incorrectly defined metric fields.
`error in ingesting metrics missing __name__`

This error means the pipeline output does not include the `__name__` field, which is required to identify the metric.
How to fix it:
- Ensure your query outputs `__name__` as a string field.
- Check the exact spelling of `__name__` (including underscores).
- Confirm the SELECT list actually includes the `__name__` column.

Example:
```sql
SELECT
  'k8s_http_requests_total' AS "__name__",
  ...
```
`error in ingesting metrics missing __type__`

This indicates the metric type is not being set.
How to fix it:
- Add a `__type__` field to your query output.
- Set it to a valid metric type such as `counter` or `gauge`.

Example:
```sql
SELECT
  'counter' AS "__type__",
  ...
```
`error in ingesting metrics missing value`

This error occurs when the metric datapoint itself is missing.
How to fix it:
- Include a `value` field in your query output.
- Compute it with an aggregation such as `COUNT(*)`, `SUM()`, or `AVG()`.
- The field name `value` must be spelled exactly.

Example:
```sql
COUNT(*) AS "value"
```
`DerivedStream has reached max retries of 3`

This message means the scheduled pipeline failed multiple times due to one or more of the issues above.
What’s happening:
The failing run has been retried up to the maximum of three times; subsequent runs will keep failing until the underlying issue is resolved.

How to fix it:
- Check the error stream for the specific failure message.
- Make sure `__name__`, `__type__`, and `value` are all present and correct in the query output.

Once the underlying issue is fixed, the pipeline will automatically resume on its next run.
Sometimes the pipeline runs successfully, but no metrics are produced. In this case, the issue is often not with the pipeline logic itself, but with the source data.
What’s happening:
This typically means there are no logs available in the source stream for the selected time window.
How to fix it:
- Confirm that logs are actually being ingested into the source stream.
- Run the pipeline's SQL query manually over the same time window and check that it returns rows.
- If the stream is quiet, ingest fresh data and wait for the next scheduled run.
Once logs are confirmed in the source stream and the query returns rows, the scheduled pipeline will begin producing metrics on the next run.
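A quick sanity check is to count events in the source stream for roughly the same window the pipeline would have queried. A sketch, assuming the source stream is `kubernetes_logs` (set the time range in the UI, or via a `_timestamp` filter, to match the pipeline window):

```sql
-- If this returns 0 for the pipeline's window, the pipeline has nothing to aggregate.
SELECT COUNT(*) AS events
FROM kubernetes_logs
```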
Scheduled pipelines provide a practical bridge between raw logs and meaningful metrics. They let you keep the detail and flexibility of logs while extracting the signals you actually need for dashboards, alerts, and SLOs.
Instead of repeatedly scanning high-cardinality log data, scheduled pipelines summarize it once, over clear time windows, and store the result in a form that scales. This makes operational views faster, alerts more reliable, and system behavior easier to reason about.
Most importantly, this approach doesn’t require new instrumentation or a major redesign. It works with the data you already have. If you find yourself building dashboards or alerts directly on top of logs, that’s usually a sign that it’s time to introduce this missing layer.
Once you’ve successfully converted logs into metrics using a scheduled pipeline, there are a few natural directions to build on this foundation, such as powering dashboards, alerts, and SLOs from the new metric streams instead of raw logs.