git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
monitoring: Fix "10% OSDs down" alert description 35151/head
author Benoît Knecht <bknecht@protonmail.ch>
Thu, 30 Apr 2020 08:50:07 +0000 (10:50 +0200)
committer Laura Paduano <lpaduano@suse.com>
Wed, 20 May 2020 09:03:09 +0000 (11:03 +0200)
commit 27b05fcbaab8894a60682a0f9f212aa7e404cf78
tree 69f4dff71fa7c44e32ee98c8a6825c942b89062e
parent d0f2ad7a2ed01ba7030f992febc65bb6c27dde8d
monitoring: Fix "10% OSDs down" alert description

The alert was triggered when fewer than 90% of OSDs were _up_, but the
description then took that value and presented it as the percentage of OSDs
that were _down_. So with 12% of OSDs down, the alert description would read:

```
88% or 88 of 100 OSDs are down (>=10%).
```

which can be panic-inducing.

This commit changes the alert expression to actually compute the ratio of OSDs
being down, which makes the correct value appear in the description.
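
The arithmetic behind the fix can be sketched in Python. This is an
illustration only, not the rule shipped in `ceph_default_alerts.yml`: the
function names are hypothetical, and the 10% threshold is assumed from the
`(>=10%)` text in the description quoted above.

```python
def describe_before(up: int, total: int) -> str:
    """Old behaviour: the alert value is the ratio of OSDs that are up,
    but the description labels that same value as 'down'."""
    value = 100 * up / total
    return f"{value:.0f}% or {up} of {total} OSDs are down (>=10%)."


def describe_after(up: int, total: int) -> str:
    """New behaviour: the alert value is the ratio of OSDs that are down,
    so the description reports the quantity it claims to report."""
    down = total - up
    value = 100 * down / total
    return f"{value:.0f}% or {down} of {total} OSDs are down (>=10%)."


def should_fire(up: int, total: int, threshold_pct: float = 10.0) -> bool:
    """Fire once the share of down OSDs reaches the (assumed) threshold."""
    return 100 * (total - up) / total >= threshold_pct


if __name__ == "__main__":
    print(describe_before(88, 100))
    print(describe_after(88, 100))
```

With 88 of 100 OSDs up, `describe_before` reproduces the misleading message
quoted above, while `describe_after` reports 12%.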

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 653c3f66823179fc5b9cbb74ff932d61a6c4178c)
monitoring/prometheus/alerts/ceph_default_alerts.yml