git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ansible.git/commitdiff
dashboard: Add new prometheus alert
author Boris Ranto <branto@redhat.com>
Tue, 8 Jun 2021 07:43:23 +0000 (09:43 +0200)
committer Guillaume Abrioux <gabrioux@redhat.com>
Thu, 24 Jun 2021 07:02:21 +0000 (09:02 +0200)
Update our alerting definitions to include a slow OSD ops health
check, as requested.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1951664
Signed-off-by: Boris Ranto <branto@redhat.com>
roles/ceph-prometheus/files/ceph_dashboard.yml

index 7a14744163e0a13d6c51b0c8f3cbd037cc566583..0c95d4dafff82013c2ed5a36aa6c5e81991195bc 100644
@@ -105,3 +105,11 @@ groups:
     annotations:
       summary: "OSD(s) with High PG Count"
       description: "This indicates there are some OSDs with high PG count (275+)."
+  - alert: Slow OSD Ops
+    expr: ceph_healthcheck_slow_ops > 0
+    for: 1m
+    labels:
+      severity: page
+    annotations:
+      summary: "Slow OSD Ops"
+      description: "OSD requests are taking too long to process (osd_op_complaint_time exceeded)"
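
One way to check the new rule is a promtool unit test. The file below is a minimal sketch and is not part of this commit. It assumes a hypothetical test file saved next to ceph_dashboard.yml and run with `promtool test rules`. The `instance` label and its value are made up for illustration. The test feeds the rule a synthetic `ceph_healthcheck_slow_ops` series and checks that the alert stays quiet while the `for: 1m` hold period is still running, then fires with the labels and annotations from the diff above.

```yaml
# Hypothetical file: ceph_dashboard_test.yml (not part of the commit).
# Assumed invocation: promtool test rules ceph_dashboard_test.yml
rule_files:
  - ceph_dashboard.yml        # assumed to sit in the same directory

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Samples at 0m, 1m, 2m, 3m, 4m. The health check first reports
      # slow ops at 2m. The instance label is illustrative only.
      - series: 'ceph_healthcheck_slow_ops{instance="mgr-a:9283"}'
        values: '0 0 1 1 1'
    alert_rule_test:
      # At 2m the expression has only just become true, so the alert is
      # still pending and nothing should be firing.
      - eval_time: 2m
        alertname: Slow OSD Ops
        exp_alerts: []
      # By 4m the condition has held for longer than "for: 1m".
      - eval_time: 4m
        alertname: Slow OSD Ops
        exp_alerts:
          - exp_labels:
              severity: page
              instance: "mgr-a:9283"
            exp_annotations:
              summary: "Slow OSD Ops"
              description: "OSD requests are taking too long to process (osd_op_complaint_time exceeded)"
```

The second check is placed at 4m rather than 3m so that it does not depend on how the boundary of the hold period is treated.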