Performance Monitoring

Performance monitoring helps you understand how fast your application loads and responds for real users. OpenObserve RUM tracks various performance metrics automatically and displays them in an easy-to-understand dashboard.
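
The data behind this dashboard comes from the RUM SDK running in your application. As a rough orientation only, initialization looks something like the sketch below; the package name and option names are assumptions based on the browser SDK and may differ in your version, so check the RUM setup guide for the exact values.

```ts
// Sketch only: package and option names below are assumptions and may differ
// from your SDK version; consult the OpenObserve RUM setup guide.
import { openobserveRum } from '@openobserve/browser-rum';

openobserveRum.init({
  applicationId: '<YOUR_APPLICATION_ID>',  // placeholder: RUM application ID
  clientToken: '<YOUR_CLIENT_TOKEN>',      // placeholder: ingestion token
  site: 'api.openobserve.ai',              // or your self-hosted endpoint
  organizationIdentifier: '<YOUR_ORG>',    // placeholder: organization name
  service: 'my-web-app',                   // shows up in the Service filter
  env: 'production',                       // shows up in the Env filter
  version: '1.2.3',                        // shows up in the Version filter
  trackResources: true,                    // resource/API timing data (assumption)
  trackLongTasks: true,                    // long-task data (assumption)
});
```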

Performance Summary Dashboard

The Performance Summary dashboard is divided into four main sections:

  1. Overview: High-level performance metrics and session statistics
  2. Web Vitals: Core Web Vitals and user experience metrics
  3. Errors: Error tracking and monitoring
  4. API: API call performance and monitoring

You can access these sections by clicking the respective tabs in the Performance view.

Overview Tab

The Overview tab provides a high-level view of your application's performance, including:

Web Vitals Summary

View the three Core Web Vitals at a glance:

  • Largest Contentful Paint (LCP)
  • First Input Delay (FID)
  • Cumulative Layout Shift (CLS)

These metrics help you understand the user experience quality of your application.

Error Statistics

  • Total Errors: Total number of errors across all sessions
  • Sessions with Errors: Number of sessions that encountered at least one error
  • Total Unhandled Errors: Errors that weren't caught by error handlers

Session Statistics

  • Total Sessions: Number of user sessions in the selected time range

Web Vitals Tab

Web Vitals are user-centric performance metrics that measure real-world user experience. OpenObserve tracks all Core Web Vitals as defined by Google.

Core Web Vitals

Largest Contentful Paint (LCP)

What it measures: The time it takes for the largest content element to become visible in the viewport.

Why it matters: LCP measures perceived load speed. A fast LCP helps reassure users that the page is useful.

Good LCP scores:

  • Good: ≤ 2.5 seconds
  • Needs Improvement: 2.5 - 4.0 seconds
  • Poor: > 4.0 seconds

What counts as LCP:

  • <img> elements
  • <image> elements inside <svg>
  • <video> elements with poster images
  • Background images loaded via url()
  • Block-level elements containing text
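
If you want to confirm which element the browser reports as the LCP candidate while debugging, the standard PerformanceObserver API exposes it (this is a browser API, independent of the OpenObserve SDK):

```ts
// Log each LCP candidate as the browser reports it; the last entry emitted
// before the first user input is the final LCP element.
const lcpObserver = new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    // `element` is not in all TypeScript DOM typings yet, hence the cast.
    const el = (entry as PerformanceEntry & { element?: Element }).element;
    console.log('LCP candidate at', Math.round(entry.startTime), 'ms:', el);
  }
});
lcpObserver.observe({ type: 'largest-contentful-paint', buffered: true });
```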

How to improve:

  • Optimize and compress images
  • Preload important resources
  • Reduce server response times
  • Use a CDN
  • Remove render-blocking JavaScript and CSS

First Input Delay (FID)

What it measures: The time from when a user first interacts with your page (clicks a link, taps a button) to when the browser responds to that interaction.

Why it matters: FID measures responsiveness. It quantifies the experience users feel when trying to interact with unresponsive pages.

Good FID scores:

  • Good: ≤ 100 milliseconds
  • Needs Improvement: 100 - 300 milliseconds
  • Poor: > 300 milliseconds

Common causes of poor FID:

  • Long-running JavaScript tasks
  • Large JavaScript bundles
  • Heavy parsing and execution of scripts
  • Third-party scripts

How to improve:

  • Break up long tasks
  • Optimize JavaScript execution
  • Use web workers for heavy computations
  • Reduce JavaScript bundle size
  • Implement code splitting
  • Defer unused JavaScript
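
For example, one common way to apply the first two points above is to split work into small chunks and yield back to the main thread between them, so pending user input can be handled promptly. This is a generic sketch; `processInChunks` and `processItem` are placeholders for your own workload:

```ts
// Yield control back to the event loop so queued input events can run.
const yieldToMain = () => new Promise<void>((resolve) => setTimeout(resolve, 0));

// `items` and `processItem` are placeholders for your own workload.
async function processInChunks<T>(items: T[], processItem: (item: T) => void) {
  const budget = 50; // keep each chunk under ~50 ms to avoid long tasks
  let chunkStart = performance.now();
  for (const item of items) {
    processItem(item);
    if (performance.now() - chunkStart > budget) {
      await yieldToMain(); // let clicks, taps, and scrolls be handled
      chunkStart = performance.now();
    }
  }
}
```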

Cumulative Layout Shift (CLS)

What it measures: The largest burst of unexpected layout shift scores that occurs during the entire lifespan of the page. Only shifts not caused by user interaction count toward CLS.

Why it matters: CLS measures visual stability. Unexpected layout shifts can be frustrating and lead to accidental clicks.

Good CLS scores:

  • Good: ≤ 0.1
  • Needs Improvement: 0.1 - 0.25
  • Poor: > 0.25

Common causes of CLS:

  • Images without dimensions
  • Ads, embeds, or iframes without dimensions
  • Dynamically injected content
  • Web fonts causing FOIT/FOUT (a flash of invisible or unstyled text)
  • Actions waiting for network response before updating DOM

How to improve:

  • Always include width and height attributes on images and video elements
  • Reserve space for ads and embeds
  • Avoid inserting content above existing content
  • Use font-display: optional for web fonts
  • Preload fonts
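
To track down which elements are shifting, the browser's layout-shift performance entries can be inspected directly (a standard browser API, separate from the dashboard). Note that the running sum in this sketch is a simplification; the reported CLS score is based on the largest burst of shifts rather than the total:

```ts
// Accumulate layout-shift scores, ignoring shifts caused by recent user input,
// and log the elements involved so you can find the offenders.
let runningScore = 0;
const clsObserver = new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    const shift = entry as PerformanceEntry & {
      value: number;
      hadRecentInput: boolean;
      sources?: Array<{ node?: Node }>;
    };
    if (!shift.hadRecentInput) {
      runningScore += shift.value;
      console.log('Layout shift', shift.value.toFixed(4), shift.sources?.map((s) => s.node));
    }
  }
  console.log('Running layout-shift total:', runningScore.toFixed(4));
});
clsObserver.observe({ type: 'layout-shift', buffered: true });
```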

Additional Web Vitals

First Contentful Paint (FCP)

What it measures: The time from page start to when any part of the page's content is rendered on the screen.

Good FCP scores:

  • Good: ≤ 1.8 seconds
  • Needs Improvement: 1.8 - 3.0 seconds
  • Poor: > 3.0 seconds
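
If you want to spot-check FCP on a page manually, it can be read from the browser's paint timing entries (standard API, independent of the OpenObserve SDK):

```ts
// 'paint' entries include first-paint and first-contentful-paint.
const paintObserver = new PerformanceObserver((entryList) => {
  const fcp = entryList.getEntriesByName('first-contentful-paint')[0];
  if (fcp) console.log('FCP:', Math.round(fcp.startTime), 'ms');
});
paintObserver.observe({ type: 'paint', buffered: true });
```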

Time to First Byte (TTFB)

What it measures: The time from the request start to when the first byte of the response is received.

Good TTFB scores:

  • Good: ≤ 800 milliseconds
  • Needs Improvement: 800 - 1800 milliseconds
  • Poor: > 1800 milliseconds
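
TTFB for the page itself can be verified locally from the Navigation Timing entry (standard browser API):

```ts
// responseStart is the time, in milliseconds relative to the start of
// navigation, at which the first byte of the HTML response arrived.
const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];
if (nav) {
  console.log('TTFB:', Math.round(nav.responseStart), 'ms');
}
```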

How to improve:

  • Optimize server processing
  • Use a CDN
  • Implement caching
  • Reduce database query times
  • Use HTTP/2 or HTTP/3

Time to Interactive (TTI)

What it measures: The time from page start to when the page becomes fully interactive.

Good TTI scores:

  • Good: ≤ 3.8 seconds
  • Needs Improvement: 3.8 - 7.3 seconds
  • Poor: > 7.3 seconds

Errors Tab

The Errors tab provides insights into frontend errors:

Errors by Time

A timeline chart showing error frequency over the selected time period. This helps you:

  • Identify error spikes
  • Correlate errors with deployments
  • Track error trends over time

Top Error Views

A table showing the most common errors, including:

  • View URL: The page where errors are occurring
  • Error Count: How many times each error occurred

Click on any error to:

  • See detailed error information
  • View affected sessions
  • Jump to session replay to debug

API Tab

The API tab monitors the performance of API calls made by your frontend:

Top Slowest Resources

Identifies the slowest API endpoints and resources:

  • Resource URL: The API endpoint or resource URL
  • Duration (ms): Average response time in milliseconds

This helps you:

  • Identify slow API endpoints
  • Find performance bottlenecks
  • Prioritize optimization efforts

Top Heaviest Resources

Shows resources that transfer the most data:

  • Resource URL: The API endpoint or resource URL
  • Size (KB): Average response size in kilobytes

This helps you:

  • Identify large payloads
  • Optimize data transfer
  • Reduce bandwidth usage
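
If you want to cross-check a slow or heavy request in the browser itself, the standard Resource Timing API exposes comparable per-request fields (a browser API, independent of the dashboard; transferSize can be 0 for cross-origin resources without a Timing-Allow-Origin header):

```ts
// Log duration and transfer size for fetch/XHR resources as they complete.
const resourceObserver = new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries() as PerformanceResourceTiming[]) {
    if (entry.initiatorType === 'fetch' || entry.initiatorType === 'xmlhttprequest') {
      console.log(
        entry.name,                                   // resource URL
        Math.round(entry.duration), 'ms',             // total time for the request
        (entry.transferSize / 1024).toFixed(1), 'KB'  // bytes over the network
      );
    }
  }
});
resourceObserver.observe({ type: 'resource', buffered: true });
```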

Top Error Resources

Lists API endpoints with the highest error rates:

  • Resource URL: The API endpoint URL
  • Error Count: Number of failed requests

This helps you:

  • Identify failing API calls
  • Track API reliability
  • Debug integration issues
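
On the frontend side, a failing API call usually surfaces either as an HTTP error status or as a rejected fetch promise. The sketch below shows a generic way to catch both cases while debugging; `fetchJson` and `handleApiError` are hypothetical helpers, not part of the OpenObserve SDK:

```ts
// Wrap fetch so both HTTP error statuses and network failures are surfaced.
async function fetchJson<T>(url: string): Promise<T> {
  let response: Response;
  try {
    response = await fetch(url);
  } catch (err) {
    handleApiError(url, err); // network-level failure (DNS, offline, CORS, ...)
    throw err;
  }
  if (!response.ok) {
    handleApiError(url, new Error(`HTTP ${response.status}`)); // error status from the server
    throw new Error(`Request to ${url} failed with ${response.status}`);
  }
  return response.json() as Promise<T>;
}

// Hypothetical reporting hook; replace with your own logging.
function handleApiError(url: string, err: unknown) {
  console.error('API call failed:', url, err);
}
```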

Filtering and Time Range

Time Range Selection

Change the time range to analyze different periods:

  • Past 15 Minutes: Real-time monitoring
  • Past Hour: Recent performance
  • Past 6 Hours: Short-term trends
  • Past 24 Hours: Daily patterns
  • Past 7 Days: Weekly trends
  • Past 30 Days: Monthly analysis
  • Custom Range: Specify exact date/time range

Auto-Refresh

Enable auto-refresh to monitor performance in real-time:

  • Off: Manual refresh only
  • 10s: Refresh every 10 seconds
  • 30s: Refresh every 30 seconds
  • 1m: Refresh every minute
  • 5m: Refresh every 5 minutes

Filters

Apply filters to narrow down your analysis:

  • Service: Filter by specific service name
  • Env: Filter by environment (production, staging, development)
  • Version: Filter by application version
  • Browser: Filter by browser type (Chrome, Firefox, Safari, etc.)
  • Device: Filter by device type (Desktop, Mobile, Tablet)
  • Country: Filter by geographic location

Performance Best Practices

Monitoring Strategy

  1. Set Baselines: Establish baseline metrics for your application
  2. Define Thresholds: Set up alerts for when metrics exceed acceptable thresholds
  3. Regular Review: Review performance metrics weekly
  4. Track Trends: Monitor trends over time, not just absolute values
  5. Correlate with Events: Compare metrics before and after deployments

Optimization Workflow

  1. Identify Issues: Use the Performance dashboard to find slow areas
  2. Prioritize: Focus on metrics that impact the most users
  3. Investigate: Use Session Replay to understand user context
  4. Optimize: Make targeted improvements
  5. Measure: Verify improvements in the dashboard
  6. Iterate: Continue monitoring and improving

Common Performance Issues

  • Slow page load (high LCP, FCP): optimize images, reduce bundle size, use a CDN
  • Janky interactions (high FID, long tasks): split JavaScript, use web workers, defer scripts
  • Layout shifts (high CLS): set image dimensions, reserve space for dynamic content
  • Slow API calls (high API duration): optimize endpoints, add caching, use pagination
  • Large payloads (high resource size): compress responses, optimize data structures

Interpreting Metrics

Percentiles

Performance metrics are typically shown as percentiles:

  • p50 (Median): 50% of users experience this or better
  • p75: 75% of users experience this or better
  • p95: 95% of users experience this or better
  • p99: 99% of users experience this or better

Focus on p75 and p95 to ensure a good experience for most users.
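
As a concrete illustration of how a percentile is read from raw measurements, here is the nearest-rank method with made-up sample values; the dashboard may use a different interpolation:

```ts
// Nearest-rank percentile: sort the samples, then pick the value at the
// position that covers p percent of them.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Made-up LCP samples in milliseconds, for illustration only.
const lcpSamples = [1200, 1500, 1700, 2100, 2400, 2600, 3100, 3900, 4200, 5600];
console.log('p50:', percentile(lcpSamples, 50)); // 2400
console.log('p75:', percentile(lcpSamples, 75)); // 3900
console.log('p95:', percentile(lcpSamples, 95)); // 5600
```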

Regional Differences

Performance can vary significantly by region due to:

  • Network latency
  • CDN coverage
  • Server location
  • Local infrastructure

Use geographic filters to understand regional performance.

Device and Browser Differences

Different devices and browsers have different capabilities:

  • Mobile devices are generally slower than desktop
  • Older browsers may have performance limitations
  • Different browsers implement features differently

Filter by device and browser to understand these differences.

Next Steps