git.apps.os.sepia.ceph.com Git - ceph.git/commit
monitoring: Fix "10% OSDs down" alert description 35211/head
author Benoît Knecht <bknecht@protonmail.ch>
Thu, 30 Apr 2020 08:50:07 +0000 (10:50 +0200)
committer Shyukri Shyukriev <shshyukriev@suse.com>
Sat, 23 May 2020 13:02:42 +0000 (16:02 +0300)
commit f901cb6e69372310cc0c6ef053910651fe4c1432
tree e5278b8a490af21e023feffdacc002cd5260b070
parent ccd9c04f88e53aef7e4f1068ce1221fa3b97450d
monitoring: Fix "10% OSDs down" alert description

The alert was triggered when fewer than 90% of OSDs were _up_, but the
description then took that value and presented it as the percentage of OSDs
that were _down_. So with 12% of OSDs down, the alert description would read:

```
88% or 88 of 100 OSDs are down (>=10%).
```

which can be panic-inducing.

This commit changes the alert expression to actually compute the ratio of OSDs
being down, which makes the correct value appear in the description.
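
The arithmetic behind the fix can be sketched in Python. This is only an
illustration, not the PromQL expression from the alert file: the helper names
`percent_up`, `percent_down` and `describe` are hypothetical, and the
description template is simplified so that both numbers are derived from the
single value the alert expression produces.

```python
# Illustrative sketch only; these helpers are hypothetical and do not
# reproduce the actual rule in ceph_default_alerts.yml.

def percent_up(osd_up):
    """Percentage of OSDs reporting up (1 = up, 0 = down)."""
    return 100.0 * sum(osd_up) / len(osd_up)


def percent_down(osd_up):
    """Percentage of OSDs reporting down."""
    return 100.0 * osd_up.count(0) / len(osd_up)


def describe(value, total):
    """Simplified description: always labels `value` as the percentage down."""
    count = value * total / 100
    return f"{value:.0f}% or {count:.0f} of {total} OSDs are down (>=10%)."


# 100 OSDs, 12 of them down, as in the example above.
osd_up = [1] * 88 + [0] * 12

# Before: the alert fires on the "up" ratio, and that value reaches the text.
old_value = percent_up(osd_up)
old_fires = old_value < 90

# After: the alert fires on the "down" ratio, so the text matches reality.
new_value = percent_down(osd_up)
new_fires = new_value >= 10

print(describe(old_value, len(osd_up)))
print(describe(new_value, len(osd_up)))
```

Both conditions fire for the same cluster state; only the value handed to the
description differs, which is why changing the expression alone is enough to
correct the message.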

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 653c3f66823179fc5b9cbb74ff932d61a6c4178c)
monitoring/prometheus/alerts/ceph_default_alerts.yml