Description
The detail page of an objective shows NaN for the short and long burn rates in the multi burn error alerts table.
This happens because all labels of the objective are used in the query against the recording rule, and the recorded series does not carry all labels of the original metric.
Burn Rate Query: http_request_duration_seconds:burnrate3m{handler="/api/v1/recommend",service="content-recommender",slo="content-recommender-latency",status=~"2xx"}
Recording Rule:
(sum(rate(http_request_duration_seconds_count{handler="/api/v1/recommend",service="content-recommender",status=~"2xx"}[3m])) - sum(rate(http_request_duration_seconds_bucket{handler="/api/v1/recommend",le="0.25",service="content-recommender",status=~"2xx"}[3m]))) / sum(rate(http_request_duration_seconds_count{handler="/api/v1/recommend",service="content-recommender",status=~"2xx"}[3m]))
Labels: handler="/api/v1/recommend", service="content-recommender", slo="content-recommender-latency". Note that the status label is NOT present.
Simple example for reproduction:
spec:
  alerting: {}
  description: 99% of all request in the last 2 weeks should be below 250 ms
  indicator:
    latency:
      grouping: null
      success:
        metric: http_request_duration_seconds_bucket{service="content-recommender",
          handler="/api/v1/recommend", status=~"2xx", le="0.25"}
      total:
        metric: http_request_duration_seconds_count{service="content-recommender",
          handler="/api/v1/recommend", status=~"2xx"}
The Demo works because it uses a negative regex matcher, code!~"5..", to filter for successful requests.
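A likely explanation for the difference, sketched in Python: Prometheus treats a label that is missing from a series as having an empty value, and regex matchers are fully anchored. A positive matcher such as status=~"2xx" therefore rejects the recorded series, while a negative matcher such as code!~"5.." still accepts it. The `matches` helper below is hypothetical and only imitates these semantics with Python's `re` module (Prometheus itself uses RE2); it is not code from the project.

```python
import re


def matches(series_labels, name, op, value):
    """Evaluate one PromQL-style label matcher against a series' label set."""
    # A label that is absent from the series behaves like an empty string.
    actual = series_labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        # Regex matchers are fully anchored, hence fullmatch.
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


# Labels carried by the recorded burn rate series, as listed in this issue.
recorded = {
    "handler": "/api/v1/recommend",
    "service": "content-recommender",
    "slo": "content-recommender-latency",
}

# Positive regex on a label the recorded series lacks: the series is rejected.
positive_ok = matches(recorded, "status", "=~", "2xx")

# Negative regex on a label the recorded series lacks: the series is accepted.
negative_ok = matches(recorded, "code", "!~", "5..")
```

Under these assumptions `positive_ok` is false and `negative_ok` is true, which is consistent with the burn rate query returning no data here but working in the Demo.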
Wanted behaviour:
The query for the burn rate should not contain labels that are not part of the recording rule.
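One way to picture the wanted behaviour is a filter that keeps only the matchers whose label name exists on the recording rule before the selector is rendered. The sketch below is illustrative only: `burn_rate_selector` and its arguments are hypothetical names, not the project's actual implementation, and label values are not escaped.

```python
def burn_rate_selector(metric, matchers, rule_labels):
    """Render a selector using only matchers whose label exists on the recording rule."""
    # Drop every matcher that refers to a label the recorded series does not carry.
    kept = [(name, op, value) for (name, op, value) in matchers if name in rule_labels]
    # Sort for a stable, readable rendering.
    body = ",".join(f'{name}{op}"{value}"' for name, op, value in sorted(kept))
    return f"{metric}{{{body}}}"


# Matchers taken from the failing burn rate query in this issue.
matchers = [
    ("handler", "=", "/api/v1/recommend"),
    ("service", "=", "content-recommender"),
    ("slo", "=", "content-recommender-latency"),
    ("status", "=~", "2xx"),
]

# Label names present on the recording rule, as listed in this issue.
rule_labels = {"handler", "service", "slo"}

selector = burn_rate_selector(
    "http_request_duration_seconds:burnrate3m", matchers, rule_labels
)
print(selector)
```

With the inputs from this issue, the status matcher is dropped and the remaining handler, service, and slo matchers are kept.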