From: Sage Weil
Date: Tue, 9 Oct 2018 12:21:27 +0000 (-0500)
Subject: mgr/devicehealth: warn on failing devices at 6 weeks
X-Git-Tag: v14.0.1~42^2~2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=c1a9d02c9bc1b7f42b5ae82c18dbe3ca33e7b3bc;p=ceph.git

mgr/devicehealth: warn on failing devices at 6 weeks

This gives us an interval where we warn before automatically marking an
OSD out. That way the operator has an opportunity to preemptively replace
the device, incurring only a single rebalance/recovery event (vs. two: one
to evacuate the failing device, another to refill the replacement).

Signed-off-by: Sage Weil
---

diff --git a/src/pybind/mgr/devicehealth/module.py b/src/pybind/mgr/devicehealth/module.py
index a7bea09854d9..6f069137881a 100644
--- a/src/pybind/mgr/devicehealth/module.py
+++ b/src/pybind/mgr/devicehealth/module.py
@@ -47,7 +47,7 @@ class Module(MgrModule):
         },
         {
             'name': 'warn_threshold',
-            'default': str(86400 * 14 * 2),
+            'default': str(86400 * 14 * 6),
         },
         {
             'name': 'self_heal',
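
The default above is a number of seconds stored as a string. As a rough check of what the old and new expressions evaluate to, here is a minimal standalone sketch. The helper names (threshold_default, to_weeks) are hypothetical and are not part of the devicehealth module. The arithmetic gives 12 weeks for the new default, not the 6 weeks named in the subject line; the sketch only verifies the arithmetic and does not say which value was intended.

```python
SECONDS_PER_DAY = 86400


def threshold_default(multiplier):
    """Mirror the `str(86400 * 14 * N)` expression from the diff.

    Hypothetical helper for illustration only.
    """
    return str(SECONDS_PER_DAY * 14 * multiplier)


def to_weeks(seconds_str):
    """Convert a seconds value stored as a string into weeks."""
    return int(seconds_str) / (SECONDS_PER_DAY * 7)


old_default = threshold_default(2)  # value before this commit
new_default = threshold_default(6)  # value after this commit

print("old:", old_default, "seconds =", to_weeks(old_default), "weeks")
print("new:", new_default, "seconds =", to_weeks(new_default), "weeks")
```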