Changelog

Release Version: v0.14.2-pb7-mpsc10 | Release Date: Feb 10, 2025

What's updated

Fixed distinct_values repeatedly updating the schema
Fixed a cache-miss issue when getting the schema
Fixed a CircuitBreaker recovery issue

Search without ingester

ZO_FEATURE_QUERY_SKIP_WAL=true

Slow request detail

Added more debug log detail, including the original JSON
Rolled back actix-web from 4.9 to 4.8
Slow-request logs are now printed only for requests that take longer than 5s

ZO_ACTIX_SLOW_LOG_THRESHOLD=5

This prints logs like the following. Bulk request detail:

total: {} ms, prepare: {} ms, flatten: {} ms, convert_to_uds: {} ms, json_parse: {} ms, format_stream_name: {} ms, get_uds_and_original: {} ms, handle_timestamp: {} ms, before_write: {} ms, write_to_channel: {} ms

In addition, if before_write takes more than 5s, the following is printed:

original_line: {}

Write to channel detail:

[write_logs] total time: {} ms, get_schema: {} ms, check_schema: {} ms, validate: {} ms, get_distinct: {} ms, write_distinct: {} ms, get_writer: {} ms
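
The threshold behavior described in this section can be sketched roughly as follows. This is an illustrative Python sketch, not the actual implementation: the function names, the default of 5 when the variable is unset, and the rule that the breakdown is logged only when total exceeds the threshold are all assumptions.

```python
import os


def slow_log_threshold_secs(env=os.environ):
    # Assumption: fall back to 5 seconds when the variable is not set.
    return float(env.get("ZO_ACTIX_SLOW_LOG_THRESHOLD", "5"))


def slow_request_lines(timings_ms, original_line, threshold_secs):
    """Return the log lines a slow bulk request would produce (hypothetical helper).

    timings_ms maps a step name (total, prepare, before_write, ...) to
    its duration in milliseconds.
    """
    threshold_ms = threshold_secs * 1000
    lines = []
    # Assumption: the per-step breakdown is logged only for slow requests.
    if timings_ms.get("total", 0) > threshold_ms:
        lines.append(", ".join(f"{k}: {v} ms" for k, v in timings_ms.items()))
    # The original JSON line is logged only when before_write itself is slow.
    if timings_ms.get("before_write", 0) > threshold_ms:
        lines.append(f"original_line: {original_line}")
    return lines
```

With a threshold of 5, a request whose total is 6200 ms and whose before_write is 5400 ms would yield both lines, while a 100 ms request would yield none.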

Circuit Breaker

ZO_ACTIX_SLOW_LOG_THRESHOLD=5 // seconds; a request slower than this counts as slow
ZO_CIRCUIT_BREAKER_ENABLED=true
ZO_CIRCUIT_BREAKER_WATCHING_WINDOW=60 // seconds
ZO_CIRCUIT_BREAKER_SLOW_REQUEST_THRESHOLD=100 // slow requests per watching window
ZO_CIRCUIT_BREAKER_RESET_WINDOW_NUM=3 // number of watching windows before the state resets

We also added a new CircuitBreaker that tells the load balancer when a node is overloaded so that it stops sending the node new traffic. We added it because some nodes consistently hold more connections than others, and the LB does not seem to handle its scheduling policy well on its own, so the node now reports its state explicitly. The feature analyzes HTTP requests: if there are more than 100 slow requests within 60 seconds, the node state is set to unSchedulable and the /schedulez API returns an error, which should make the LB stop sending new traffic. The state resets after the next 3 watching windows (3 * 60s), and the /schedulez API returns ok again if the number of slow requests no longer exceeds the limit.
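
The trip-and-reset logic described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual implementation: the class and method names are hypothetical, the watching window is modeled as a sliding window, and the 200/503 status codes are assumed (the notes only say that /schedulez returns ok or an error).

```python
import time
from collections import deque


class CircuitBreaker:
    """Hypothetical sketch of the slow-request circuit breaker."""

    def __init__(self, slow_log_threshold=5.0, watching_window=60.0,
                 slow_request_threshold=100, reset_window_num=3,
                 clock=time.monotonic):
        self.slow_log_threshold = slow_log_threshold          # seconds
        self.watching_window = watching_window                # seconds
        self.slow_request_threshold = slow_request_threshold  # slow requests
        self.reset_window_num = reset_window_num              # windows
        self.clock = clock
        self.slow_times = deque()  # timestamps of recent slow requests
        self.tripped_at = None     # time of the most recent trip, if any

    def _evict(self, now):
        # Drop slow requests that fell out of the watching window.
        while self.slow_times and now - self.slow_times[0] >= self.watching_window:
            self.slow_times.popleft()

    def record_request(self, duration_secs):
        now = self.clock()
        self._evict(now)
        if duration_secs > self.slow_log_threshold:
            self.slow_times.append(now)
            # Trip when MORE than the threshold is seen within the window.
            if len(self.slow_times) > self.slow_request_threshold:
                self.tripped_at = now

    def is_schedulable(self):
        now = self.clock()
        self._evict(now)
        if self.tripped_at is None:
            return True
        reset_after = self.reset_window_num * self.watching_window
        if now - self.tripped_at < reset_after:
            return False
        self.tripped_at = None  # reset period passed without a new trip
        return True

    def schedulez(self):
        # Assumed status codes: 200 when schedulable, 503 otherwise.
        return (200, "ok") if self.is_schedulable() else (503, "unSchedulable")
```

With the values from the configuration above, the 101st slow request inside a 60-second window makes the node unschedulable, and it becomes schedulable again 180 seconds after the last trip as long as no new trip occurs in between.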