Monitoring Zookeeper with OpenTelemetry Setup

October 2, 2024 by OpenObserve Team

ZooKeeper isn't just another tool; it's the conductor, the timekeeper, and the librarian for distributed applications. It maintains order, synchronizes actions, and ensures everyone has the right information at the right time.

Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services in distributed applications. 

Given its crucial role in managing distributed systems, monitoring ZooKeeper effectively is essential. Proper monitoring helps detect issues early, maintain system health, and optimize resource usage.

Benefits of Using OpenTelemetry

OpenTelemetry is an open-source observability framework that provides a standardized way to collect, process, and export telemetry data, including metrics, logs, and traces. The benefits of using OpenTelemetry for monitoring ZooKeeper include:

  • Vendor-Neutral: OpenTelemetry supports multiple backends, allowing organizations to choose their preferred observability tools without being locked into a single vendor.
  • Unified Data Collection: It enables the collection of various telemetry data types, offering a holistic view of application performance and health.
  • Flexibility and Extensibility: OpenTelemetry can be easily extended to support custom metrics and traces, adapting to specific monitoring needs.
  • Community Support: As a widely adopted framework, OpenTelemetry benefits from a large community, ensuring ongoing development, support, and integration with other tools.

Tools and Technologies Involved in Monitoring

To effectively monitor ZooKeeper using OpenTelemetry, several tools and technologies are involved:

  • Apache ZooKeeper: The core service being monitored, responsible for managing distributed applications.
  • OpenTelemetry Collector: A component that receives, processes, and exports telemetry data from ZooKeeper.
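  • Prometheus: A monitoring system and time-series database. In this setup, ZooKeeper exposes its metrics in the Prometheus format and the Collector's Prometheus receiver scrapes them, so a separate Prometheus server is not required.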
  • OpenObserve: An observability platform that enables visualization and analysis of telemetry data, enhancing insights into system performance.

OpenObserve’s Expertise & Advantages in OpenTelemetry

OpenObserve is a powerful observability platform that complements OpenTelemetry by providing advanced features for data visualization and analysis. Its advantages include:

  • Cost-Effective Storage: OpenObserve offers significant cost savings on storage compared to traditional solutions, making it an attractive option for managing large volumes of telemetry data.
  • User-Friendly Interface: The platform provides an intuitive interface for exploring and analyzing metrics, making it easier for teams to derive actionable insights.
  • Real-Time Analytics: OpenObserve supports real-time data processing, allowing users to monitor ZooKeeper performance and respond to issues as they arise.
  • Seamless Integration: OpenObserve integrates smoothly with OpenTelemetry, enabling users to leverage the strengths of both tools for comprehensive observability.

Let’s walk through each step of monitoring ZooKeeper with OpenTelemetry in detail.

Prerequisites

Before setting up monitoring for ZooKeeper using OpenTelemetry, ensure that you meet the following prerequisites:

ZooKeeper Version Compatibility

The monitoring setup described in this guide requires ZooKeeper version 3.6.0 or later. Earlier versions may not provide the necessary metrics or compatibility with the Prometheus exporter. Ensure that your ZooKeeper deployment meets this version requirement.

Pre-configuring the OpenTelemetry Collector

The OpenTelemetry Collector is a crucial component in the monitoring setup, responsible for receiving, processing, and exporting telemetry data from ZooKeeper. 

Before proceeding with the configuration, make sure you have the OpenTelemetry Collector installed and pre-configured to handle the necessary tasks.

The pre-configuration steps for the OpenTelemetry Collector include:

  1. Installing the OpenTelemetry Collector: Follow the official installation guide for your operating system to set up the OpenTelemetry Collector.
  2. Configuring the Collector: Create a configuration file (e.g., config.yaml) that defines the receivers, processors, and exporters for your monitoring setup. This file will be used in subsequent steps to configure the ZooKeeper receiver and export metrics to the desired destination, such as OpenObserve or other observability platforms.
  3. Ensuring the Collector is Running: Verify that the OpenTelemetry Collector is running and ready to receive data from ZooKeeper. You can start the Collector using the appropriate command for your operating system, as specified in the installation guide.

By meeting these prerequisites, you'll have a solid foundation for setting up monitoring for ZooKeeper using OpenTelemetry and OpenObserve. 

The next steps will guide you through configuring ZooKeeper to use the Prometheus exporter and setting up the ZooKeeper receiver in the OpenTelemetry Collector.

Configuring ZooKeeper to Use Prometheus Exporter

To enable monitoring of ZooKeeper using OpenTelemetry, you need to configure ZooKeeper to expose metrics in a format that can be scraped by the OpenTelemetry Collector. In this setup, we'll use the Prometheus exporter provided by ZooKeeper.

Adding PrometheusMetricsProvider to zoo.cfg

  1. Open the zoo.cfg configuration file for your ZooKeeper installation.
  2. Add the following line to enable the Prometheus Metrics Provider:

metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider

  3. Set the HTTP port for the Prometheus exporter (default is 7000):

metricsProvider.httpPort=7000

  4. Save the changes to the zoo.cfg file.
  5. Restart ZooKeeper for the changes to take effect.
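
After the restart, it can be useful to confirm that the exporter is answering before touching the Collector. The following is a minimal sketch, assuming the metrics are served at http://<host>:7000/metrics in the Prometheus text format. The function names and the SAMPLE text are illustrative, not output captured from a real server.

```python
from typing import Set
from urllib.request import urlopen


def parse_metric_names(exposition_text: str) -> Set[str]:
    """Return the metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line is either `name value` or `name{labels} value`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metric_names(host: str, port: int = 7000, timeout: float = 5.0) -> Set[str]:
    """Fetch the metrics page from one ZooKeeper node and list its metric names."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return parse_metric_names(resp.read().decode("utf-8"))


# Hypothetical sample text, used only to demonstrate the parser offline.
SAMPLE = """\
# HELP znode_count znode_count
# TYPE znode_count gauge
znode_count 5.0
avg_latency 0.0
election_time{quantile="0.5"} NaN
"""

if __name__ == "__main__":
    print(sorted(parse_metric_names(SAMPLE)))
```

Calling fetch_metric_names("192.168.10.32") against a running node should return a non-empty set; an empty result or a connection error suggests that the provider is not enabled or that the port differs from the one in zoo.cfg.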

Reference to Official ZooKeeper Documentation

For detailed configuration steps and additional options, refer to the official ZooKeeper documentation on monitoring:

ZooKeeper Monitoring

This documentation provides more information on enabling the Prometheus exporter, configuring the HTTP port, and understanding the available metrics.

By following these steps and enabling the PrometheusMetricsProvider in ZooKeeper, you'll ensure that the necessary metrics are exposed and ready to be scraped by the OpenTelemetry Collector for monitoring and analysis in OpenObserve.

Setting Up the ZooKeeper Receiver

To effectively monitor ZooKeeper, you need to set up a receiver for it in the OpenTelemetry Collector. In this guide, the "ZooKeeper receiver" is the Collector's Prometheus receiver, configured to scrape metrics from your ZooKeeper instances.

Installation Locations of the Config File

The configuration file for the OpenTelemetry Collector, typically named config.yaml, is located in different directories depending on your operating system. Ensure you know the correct path to modify the configuration as needed:

  • Windows: C:\Program Files\OpenObserve OpenTelemetry Collector\config.yaml
  • Linux: /opt/OpenObserve-otel-collector/config.yaml

Once you have located the configuration file, you will proceed to configure the ZooKeeper receiver in the OpenTelemetry Collector. 

This involves specifying the necessary attributes and scrape targets to collect metrics from your ZooKeeper instances. The following sections will guide you through the receiver configuration process.

Receiver Configuration

In this section, we'll walk through the steps to configure the Prometheus receiver in the OpenTelemetry Collector configuration file. This receiver will be responsible for scraping metrics from your ZooKeeper instances.

Steps to Configure the Prometheus Receiver

  1. Open the OpenTelemetry Collector configuration file (e.g., config.yaml) located in the appropriate directory based on your operating system.
  2. Under the receivers section, add the configuration for the Prometheus receiver:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'zookeeper'
          static_configs:
            - targets: ['192.168.10.32:7000', '192.168.10.33:7000', '192.168.10.34:7000']

  3. Adjust the targets attribute to include the correct IP addresses and ports of your ZooKeeper instances. Ensure that the port matches the one specified in the zoo.cfg file when enabling the Prometheus exporter (default is 7000).
  4. Save the changes to the configuration file.
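
If the ensemble membership changes often, the targets line can be generated from a host list rather than edited by hand. This is a small sketch under that assumption; build_targets and render_targets_line are hypothetical helper names, and the addresses are the example ones used above.

```python
from typing import List


def build_targets(hosts: List[str], port: int = 7000) -> List[str]:
    """Pair every ZooKeeper host with the metrics port set in zoo.cfg."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return [f"{host}:{port}" for host in hosts]


def render_targets_line(targets: List[str]) -> str:
    """Render the list in the same form used under static_configs."""
    quoted = ", ".join(f"'{t}'" for t in targets)
    return f"- targets: [{quoted}]"


if __name__ == "__main__":
    hosts = ["192.168.10.32", "192.168.10.33", "192.168.10.34"]
    print(render_targets_line(build_targets(hosts)))
```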

Attributes to Set for the Receiver

The Prometheus receiver in the OpenTelemetry Collector supports various attributes to customize its behavior. Some key attributes include:

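  • config: Embeds a standard Prometheus configuration, such as scrape_configs.
  • scrape_interval and metrics_path (inside a scrape job): Control how often, and from which HTTP path, the targets are scraped.
  • static_configs and targets (inside a scrape job): List the host:port addresses to scrape.

The receiver pulls metrics from its targets, so there is no listening endpoint to configure on the receiver itself.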

Refer to the OpenTelemetry Collector documentation for a complete list of available attributes and their descriptions.

Example Configuration File

Here's an example of a complete config.yaml file with the Prometheus receiver configured:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'zookeeper'
          static_configs:
            - targets: ['192.168.10.32:7000', '192.168.10.33:7000', '192.168.10.34:7000']

processors:
  batch:
  resourcedetection:

exporters:
  otlp:
    endpoint: api.openobserve.ai:4317
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      # batch is listed last so that it groups metrics after they have been enriched
      processors: [resourcedetection, batch]
      exporters: [otlp]

This configuration sets up the Prometheus receiver to scrape metrics from the specified ZooKeeper instances, processes the metrics using the batch and resource detection processors, and exports the metrics to OpenObserve using the OTLP exporter. 
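
A common mistake in a config like this is a pipeline that names a component which is never defined under receivers, processors, or exporters. The sketch below shows one way to check for that. It assumes the YAML has already been loaded into a plain dictionary by a parser of your choice, and undefined_components is a hypothetical helper, not part of the Collector.

```python
from typing import List


def undefined_components(config: dict) -> List[str]:
    """List pipeline entries that refer to components missing from the config."""
    problems = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            defined = config.get(section) or {}
            for name in pipeline.get(section, []):
                if name not in defined:
                    kind = section[:-1]  # "exporters" -> "exporter"
                    problems.append(f"{pipeline_name}: {kind} '{name}' is not defined")
    return problems


if __name__ == "__main__":
    # Dictionary form of the example above, with one deliberate mistake.
    example = {
        "receivers": {"prometheus": {}},
        "processors": {"batch": None, "resourcedetection": None},
        "exporters": {"otlp": {}},
        "service": {
            "pipelines": {
                "metrics": {
                    "receivers": ["prometheus"],
                    "processors": ["resourcedetection", "batch"],
                    "exporters": ["otlp", "missing_exporter"],
                }
            }
        },
    }
    for problem in undefined_components(example):
        print(problem)
```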

The following section will guide you through the processor configuration process.

Processor Configuration

In this section, we will configure the processors in the OpenTelemetry Collector to enhance the handling of metrics collected from ZooKeeper. Two essential processors are the Resource Detection Processor and the Batch Processor.

Resource Detection Processor: Distinguishing Metrics from Multiple Hosts

The Resource Detection Processor is crucial for identifying and labeling metrics from different ZooKeeper instances. This processor helps ensure that metrics are tagged with the appropriate resource attributes, making it easier to distinguish between metrics originating from various hosts.

To configure the Resource Detection Processor, add the following to your config.yaml file under the processors section:

processors:
  resourcedetection:
    detectors: [system]

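This configuration uses the system detector, which automatically identifies and labels host attributes such as the hostname and operating system. Note that the detector describes the host on which the Collector runs. When a single Collector scrapes several ZooKeeper nodes, the individual nodes remain distinguishable through the scrape target (instance) recorded by the Prometheus receiver.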

Batch Processor: Bundling Metrics from Multiple Receivers

The Batch Processor is used to optimize the processing of metrics by bundling them together before exporting. This can improve performance and reduce the load on the destination by minimizing the number of outgoing requests.

To configure the Batch Processor, add the following to your config.yaml file:

processors:
  batch:
    timeout: 5s
    send_batch_size: 100

In this configuration:

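  • timeout: Specifies the maximum time to wait before sending a batch, regardless of its size.
  • send_batch_size: Defines the number of metric data points that triggers a send, regardless of the timeout. It is a threshold rather than a hard upper limit; a separate send_batch_max_size setting caps the batch size.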
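
The interaction between the two settings can be illustrated with a simplified model. This is a sketch of the idea only, not the Collector's implementation: BatchSketch is a hypothetical class, time is passed in explicitly, and the timer here starts when the first item of a batch arrives.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class BatchSketch:
    send_batch_size: int = 100   # size threshold that triggers a send
    timeout_s: float = 5.0       # maximum wait before sending a partial batch
    pending: List[Any] = field(default_factory=list)
    flushed: List[List[Any]] = field(default_factory=list)
    first_arrival: Optional[float] = None

    def add(self, item: Any, now: float) -> None:
        """Queue one item and send immediately once the size threshold is met."""
        if not self.pending:
            self.first_arrival = now
        self.pending.append(item)
        if len(self.pending) >= self.send_batch_size:
            self._flush()

    def tick(self, now: float) -> None:
        """Send a partial batch if it has waited for at least timeout_s."""
        if self.pending and now - self.first_arrival >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        self.flushed.append(self.pending)
        self.pending = []
        self.first_arrival = None


if __name__ == "__main__":
    batcher = BatchSketch(send_batch_size=3, timeout_s=5.0)
    for i, t in enumerate([0.0, 1.0, 4.5, 6.0]):
        batcher.add(f"point-{i}", now=t)
    batcher.tick(now=11.0)  # the last point has now waited 5 seconds
    print(batcher.flushed)
```

A low send_batch_size or timeout produces many small requests, while high values reduce request volume at the cost of delaying data, which is the trade-off to keep in mind when tuning these two settings.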

For more detailed information on the available options and best practices for configuring processors, refer to the official OpenTelemetry Collector documentation.

The following section will guide you through the exporter configuration process.

Exporter Configuration

In this section, we will configure the OpenTelemetry Collector to export metrics to various destinations, including OpenObserve, using the OpenTelemetry Protocol (OTLP) exporter.

Exporting Metrics to Destinations (e.g., OpenObserve) Using OTLP Exporter

The OTLP exporter is a versatile component that allows the OpenTelemetry Collector to send telemetry data to OTLP-compatible backends. Here’s how to configure it for OpenObserve:

  1. OpenObserve Configuration:
    To export metrics to OpenObserve over OTLP HTTP, add the following configuration under the exporters section of your config.yaml file. The otlphttp exporter is used here because the endpoint is an HTTP URL; the otlp exporter speaks gRPC and expects a host:port address instead.

exporters:
  otlphttp/openobserve:
    metrics_endpoint: https://api.openobserve.ai/api/<your-org>/v1/metrics
    headers:
      Authorization: "Basic <your_base64_encoded_credentials>"
      stream-name: "default"

Replace <your-org> with your organization name and <your_base64_encoded_credentials> with the appropriate credentials for authentication.

Supported Destinations and Their Configurations

The OpenTelemetry Collector supports various destinations for exporting telemetry data. Here are some common configurations:

  • OpenObserve:
    • Protocol: OTLP HTTP or OTLP gRPC
    • Endpoints:
      • OTLP HTTP: https://api.openobserve.ai/api/<your-org>/v1/metrics
      • OTLP gRPC: localhost:5081
    • Authentication: Basic Auth with encoded credentials

For detailed configuration options and best practices, refer to the OpenObserve and OpenTelemetry Collector documentation.

By configuring the OTLP exporter correctly, you can ensure that your ZooKeeper metrics are sent to the desired observability platforms.

The following section will guide you through the process of setting up the pipeline.

Setting up the Pipeline

In this section, we will configure the OpenTelemetry Collector pipeline to include the ZooKeeper receiver, ensuring that the collected metrics are processed and exported correctly.

Adding the Configured Receiver to the Pipeline

To set up the pipeline, you need to specify the receivers, processors, and exporters in the service section of your config.yaml file. Here’s how to add the configured ZooKeeper receiver to the pipeline:

  1. Open the config.yaml file where you have defined the Prometheus receiver and processors.
  2. Locate the service section and ensure it includes the pipeline configuration. It should look similar to the following:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, batch]
      exporters: [otlphttp/openobserve]

In this configuration:

  • The metrics pipeline is defined to include the prometheus receiver.
  • The resourcedetection and batch processors are applied to the collected metrics, with batch last so that it groups metrics after they have been enriched.
  • The metrics are exported to OpenObserve using the otlphttp/openobserve exporter defined above. Every name listed in the pipeline must match a component defined elsewhere in the file.
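
The Authorization header configured for the OpenObserve exporter carries Base64-encoded credentials. As a minimal sketch, assuming the usual HTTP Basic scheme where the user name and password are joined with a colon before encoding, the value could be produced as follows. The credentials shown are placeholders.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder credentials; substitute the ones for your OpenObserve account.
    print(basic_auth_header("user@example.com", "your-password"))
```

The printed string is what goes into the Authorization entry in place of "Basic <your_base64_encoded_credentials>". Base64 is an encoding, not encryption, so treat the resulting config file as a secret.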

Instructions for Enabling the ZooKeeper Receiver

After configuring the pipeline, follow these steps to ensure the ZooKeeper receiver is enabled and functioning:

  1. Save the Changes: After modifying the config.yaml file, save your changes.
  2. Restart the OpenTelemetry Collector: To apply the new configuration, restart the OpenTelemetry Collector. You can do this using the appropriate command for your operating system, such as:
    • Windows:

net stop "OpenObserve OpenTelemetry Collector"
net start "OpenObserve OpenTelemetry Collector"

    • Linux:

sudo systemctl restart OpenObserve-otel-collector

  3. Verify the Receiver is Active: After restarting, check the logs of the OpenTelemetry Collector to confirm that the ZooKeeper receiver is active and successfully scraping metrics. Look for log entries indicating that the receiver is running and the scrape targets are being processed.
  4. Monitor the Metrics: Once the receiver is enabled, you can navigate to your observability platform (OpenObserve) to verify that ZooKeeper metrics are being received and displayed correctly.

By following these steps, you ensure that metrics are collected, processed, and exported to your chosen observability platforms.

The following section will guide you through the process of validating metrics.

Validating Metrics

After setting up the OpenTelemetry Collector to monitor ZooKeeper, it’s essential to validate that the metrics are being collected and exported correctly. This section outlines how to check the metrics in OpenObserve and ensure everything is functioning as expected.

  1. Log in to OpenObserve: Access your OpenObserve instance by navigating to the web interface and logging in with your credentials.
  2. Navigate to the Metrics Section: Once logged in, locate the metrics dashboard or metrics details page. This is typically found in the main menu under "Metrics" or "Observability."
  3. Select the Relevant Workspace: If you have multiple workspaces, ensure you select the workspace where your ZooKeeper metrics are being sent.

Searching for Specific ZooKeeper Metric Names

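  1. Use the Search Functionality: In the metrics dashboard, utilize the search bar to filter for specific ZooKeeper metric names. The PrometheusMetricsProvider exposes names without a zookeeper_ prefix, and the exact set depends on your ZooKeeper version, so confirm the names against the output of the metrics endpoint. Common metric names you might look for include:
    • outstanding_requests
    • avg_latency
    • num_alive_connections
    • global_sessions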
  2. Examine the Metrics: After entering the metric names, review the displayed metrics to ensure they are being populated with data. You should see time-series graphs or tables showing the metrics over time.
  3. Check for Anomalies: Look for any unusual spikes or drops in the metrics, which could indicate issues with your ZooKeeper instances or the monitoring setup.

Viewing and Adjusting the Emitted Metrics

  1. Review What Is Emitted: With the Prometheus receiver, the set of emitted metrics is determined by what each ZooKeeper instance exposes on its metrics endpoint. The Collector does not write a separate metadata file at runtime; the endpoint output itself lists each metric's name, type, and labels.
  2. Adjust Names or Labels (if necessary): If you need to rename metrics or add labels, do so in the Collector configuration, for example with metric_relabel_configs in the scrape job or with a processor such as attributes or metricstransform, where your Collector distribution includes it. Ensure that any changes align with your monitoring and observability requirements.
  3. Save Changes and Restart the Collector: After editing config.yaml, save your changes and restart the OpenTelemetry Collector to apply the updates.

By following these steps, you can validate that your ZooKeeper metrics are being collected correctly. 

The following section will guide you through the process of viewing and analyzing metrics.

Viewing and Analyzing Metrics

Once you have validated that metrics are being collected from ZooKeeper, the next step is to view and analyze these metrics effectively. This section will cover the list of metrics you can expect to see and how to create a dashboard using OpenObserve's features.

List of Metrics Scraped by the ZooKeeper Receiver

The ZooKeeper receiver collects a variety of metrics that provide insights into the performance and health of your ZooKeeper instances. The names below are those typically exposed by the PrometheusMetricsProvider; confirm them against your own metrics endpoint, since the set varies by ZooKeeper version:

  1. outstanding_requests: The number of queued requests waiting to be processed by the ZooKeeper server.
  2. avg_latency: The average latency of requests, measured in milliseconds (min_latency and max_latency are also exposed). This can help identify performance bottlenecks.
  3. num_alive_connections: The number of active connections to the ZooKeeper server.
  4. global_sessions: The number of active sessions currently managed by the ZooKeeper ensemble.
  5. election_time: The time spent in leader election, which can indicate stability within the cluster.
  6. approximate_data_size: The approximate size of the data being managed by ZooKeeper, providing insights into resource usage.
  7. watch_count: The number of active watches set by clients, which can affect performance.

These metrics can help you monitor the overall health of your ZooKeeper setup and identify any potential issues that may arise.

Creating a Dashboard for the Metrics Using OpenObserve's Dashboard Features

Creating a dashboard in OpenObserve allows you to visualize and analyze the metrics collected from ZooKeeper effectively. Follow these steps to set up a dashboard:

  1. Access the Dashboard Feature: Log in to your OpenObserve instance and navigate to the dashboard section, usually found in the main menu.
  2. Create a New Dashboard: Click on the option to create a new dashboard.
  3. Add Metrics to the Dashboard: Use the search functionality to find the ZooKeeper metrics you want to include.
  4. Customize the Dashboard Layout: Arrange the metrics on the dashboard according to your preferences. You can resize and move the visualizations to create a layout that best suits your monitoring needs.
  5. Save the Dashboard: Once you have added and arranged the metrics, save the dashboard. You can now access it anytime to monitor the performance of your ZooKeeper instances.

By following these steps, you can effectively view and analyze the metrics scraped by the ZooKeeper receiver in OpenObserve, enabling insights into the performance of your ZooKeeper setup.

The final section will guide you through alerting configuration.

Alerting Configuration

By configuring alerts based on specific metrics, you can proactively address issues before they escalate. This section outlines how to set up alerting rules for ZooKeeper metrics in OpenObserve.

Effective Configuration of Alerts

  1. Identify Key Metrics for Alerting: Determine which ZooKeeper metrics are critical for your application's performance and reliability. Common metrics to monitor include:
    • outstanding_requests: Number of queued requests.
    • avg_latency: Average latency of requests.
    • num_alive_connections: Number of active connections.
    • global_sessions: Number of active sessions.
    • watch_count: Number of active watches.
  2. Define Alert Thresholds: Establish thresholds for each key metric that, when exceeded, will trigger alerts. For example:
    • Trigger an alert if average latency exceeds 100 milliseconds.
    • Alert if the number of active connections exceeds the maximum allowed (e.g., 60).
  3. Set Alert Severity Levels: Classify alerts by severity (e.g., critical, warning) to prioritize responses. Critical alerts may require immediate action, while warnings can indicate potential issues that should be monitored.

Setting Alerting Rules for ZooKeeper Metrics in OpenObserve

To configure alerting rules for ZooKeeper metrics in OpenObserve, follow these steps:

  1. Access the Alerting Configuration: Log in to your OpenObserve instance and navigate to the alerting configuration section.
  2. Create New Alerting Rules:
    • Click on the option to add a new alert rule.
    • Specify the metric you want to monitor (e.g., avg_latency).
  3. Define the Alert Condition: Enter the condition for triggering the alert. For example, to alert on high latency:

alert: HighZooKeeperLatency
expr: avg_latency > 100
for: 5m
labels:
  severity: warning
annotations:
  summary: "High latency detected for ZooKeeper instance {{ $labels.instance }}"
  description: "The average latency for ZooKeeper instance {{ $labels.instance }} has exceeded 100ms."

  4. Set Notification Channels: Configure how you want to be notified when an alert is triggered. This could include email notifications, Slack messages, or integrations with incident management tools.
  5. Save and Activate the Alerting Rules: After defining the rules, save your configurations and activate the alerts. Ensure that the alerting mechanism is functioning correctly by testing it with known conditions.
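
The for: 5m clause in the rule above means that the condition must hold continuously for five minutes before the alert fires, which filters out short spikes. The following sketch illustrates that idea on a list of samples. It is a simplified model, not how OpenObserve or Prometheus evaluates rules internally, and is_firing is a hypothetical helper.

```python
from typing import List, Tuple


def is_firing(samples: List[Tuple[float, float]], threshold: float, for_seconds: float) -> bool:
    """Decide whether an alert should fire at the time of the latest sample.

    samples holds (timestamp_seconds, value) pairs sorted by time. The alert
    fires only if every sample since some point at least for_seconds ago has
    been above the threshold.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # start of the current breach
        else:
            breach_start = None  # any recovery resets the clock
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= for_seconds


if __name__ == "__main__":
    # Hypothetical latency samples, one per minute, in milliseconds.
    latency = [(0, 120), (60, 130), (120, 90), (180, 110), (240, 150),
               (300, 140), (360, 130), (420, 125), (480, 101)]
    print(is_firing(latency, threshold=100, for_seconds=300))
```

In this example the dip at the 120-second mark resets the clock, so the breach is counted from 180 seconds onward rather than from the first sample.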

Example Alerting Rules

Here are some example alerting rules you might consider implementing:

- alert: ZooKeeperServerDown
  expr: up == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "ZooKeeper server is down"
    description: "The ZooKeeper server at instance {{ $labels.instance }} is not reachable."

- alert: TooManyConnections
  expr: num_alive_connections > 60
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Too many connections to ZooKeeper"
    description: "The number of active connections to ZooKeeper instance {{ $labels.instance }} exceeds the limit."

- alert: HighZnodeCount
  expr: znode_count > 1000000
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "High znode count"
    description: "The znode count for ZooKeeper instance {{ $labels.instance }} has exceeded 1,000,000."

By effectively configuring alerts for your ZooKeeper metrics, you can ensure timely responses to potential issues.

Conclusion

In this guide, we have explored the benefits and steps involved in monitoring Apache ZooKeeper using OpenTelemetry and OpenObserve. 

By leveraging these powerful tools, you can gain comprehensive insights into the health and performance of your ZooKeeper instances, ensuring the reliability and efficiency of your distributed applications.

Some key takeaways from this guide:

  • OpenTelemetry provides a vendor-neutral and flexible way to collect and export telemetry data from ZooKeeper, while OpenObserve offers a cost-effective and user-friendly platform for visualizing and analyzing this data.
  • By configuring the ZooKeeper receiver in the OpenTelemetry Collector and exporting metrics to OpenObserve, you can monitor critical metrics such as request latency, connection counts, session timeouts, and more.
  • Setting up alerting rules in OpenObserve allows you to proactively detect and address issues before they impact your applications, reducing downtime and ensuring optimal performance.
  • The combination of OpenTelemetry's data collection capabilities and OpenObserve's observability features provides a powerful solution for monitoring ZooKeeper in a variety of environments, from on-premises to cloud-based deployments.

To get started with monitoring your ZooKeeper instances using OpenTelemetry and OpenObserve, sign up for a free account with OpenObserve today.

Don't wait until it's too late – take control of your ZooKeeper monitoring and ensure the reliability and performance of your applications with OpenTelemetry and OpenObserve.

Author:

The OpenObserve Team comprises dedicated professionals committed to revolutionizing system observability through their innovative platform, OpenObserve. The team is dedicated to streamlining data observation and system monitoring, offering high-performance and cost-effective solutions for diverse use cases.

OpenObserve Inc. © 2024