git.apps.os.sepia.ceph.com Git - ceph.git/commit
monitoring: Fix "10% OSDs down" alert description (34854/head)
author: Benoît Knecht <bknecht@protonmail.ch>
        Thu, 30 Apr 2020 08:50:07 +0000 (10:50 +0200)
committer: Benoît Knecht <bknecht@protonmail.ch>
        Wed, 6 May 2020 16:49:26 +0000 (18:49 +0200)
commit: 653c3f66823179fc5b9cbb74ff932d61a6c4178c
tree: b236e4710bb09061c92ea686f1351653c5a642ab
parent: a96f9583f4f9d2552e0d326bdac31373faaa0b3e

The alert was triggered when less than 90% of OSDs were _up_, but then the
description took that value and described it as the percentage of OSDs being
_down_. So with 12% of OSDs down, the alert description would read:

```
88% or 88 of 100 OSDs are down (>=10%).
```

which can be panic-inducing.

This commit changes the alert expression to actually compute the ratio of OSDs
being down, which makes the correct value appear in the description.
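
The arithmetic behind the fix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual alert rule: the helper names (`up_percentage`, `down_percentage`, `describe`) and the `osd_up` list (1 = up, 0 = down) are hypothetical, and the message template is copied from the example above.

```python
def up_percentage(osd_up):
    """Percentage of OSDs that are up (the quantity the old expression measured)."""
    return 100 * sum(osd_up) / len(osd_up)


def down_percentage(osd_up):
    """Percentage of OSDs that are down (the quantity the description talks about)."""
    return 100 * osd_up.count(0) / len(osd_up)


def describe(value, count, total):
    """Render the description the same way for both variants (hypothetical helper)."""
    return f"{value:g}% or {count} of {total} OSDs are down (>=10%)."


# Hypothetical cluster: 100 OSDs, 12 of them down.
osd_up = [1] * 88 + [0] * 12

# Old behaviour: the alert fires on "up < 90%", and that same "up" value
# is then printed as if it were the share of OSDs that are down.
old_value = up_percentage(osd_up)
old_fires = old_value < 90
old_message = describe(old_value, sum(osd_up), len(osd_up))

# New behaviour: the expression itself computes the "down" ratio, so the
# value substituted into the description matches its wording.
new_value = down_percentage(osd_up)
new_fires = new_value >= 10
new_message = describe(new_value, osd_up.count(0), len(osd_up))

print(old_message)
print(new_message)
```

Both variants fire for this cluster; only the value handed to the description differs.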

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
monitoring/prometheus/alerts/ceph_default_alerts.yml