From 2170bdf6a8678c9358eb19115ad4054e68b117bd Mon Sep 17 00:00:00 2001
From: Sage Weil
Date: Tue, 9 Apr 2019 16:45:47 -0500
Subject: [PATCH] doc/rados/operations/devices: document device failure prediction

Signed-off-by: Sage Weil
(cherry picked from commit 67fadc711aadde093b2ad81442ce5c9e65141af4)
---
 doc/rados/operations/devices.rst | 37 +++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/doc/rados/operations/devices.rst b/doc/rados/operations/devices.rst
index 2815abbd96f5..b3db1e3df850 100644
--- a/doc/rados/operations/devices.rst
+++ b/doc/rados/operations/devices.rst
@@ -72,7 +72,42 @@ for a specific timestamp) with::
 Failure prediction
 ------------------
 
-TBD
+Ceph can predict life expectancy and device failures based on the
+health metrics it collects. There are three modes:
+
+* *none*: disable device failure prediction.
+* *local*: use a pre-trained prediction model from the ceph-mgr daemon.
+* *cloud*: share device health and performance metrics with an external
+  cloud service run by ProphetStor, using either their free service or
+  a paid service with more accurate predictions.
+
+The prediction mode can be configured with::
+
+  ceph config set global device_failure_prediction_mode <mode>
+
+Prediction normally runs in the background on a periodic basis, so it
+may take some time before life expectancy values are populated. You
+can see the life expectancy of all devices in output from::
+
+  ceph device ls
+
+You can also query the metadata for a specific device with::
+
+  ceph device info <devid>
+
+You can explicitly force prediction of a device's life expectancy with::
+
+  ceph device predict-life-expectancy <devid>
+
+If you are not using Ceph's internal device failure prediction but
+have some external source of information about device failures, you
+can inform Ceph of a device's life expectancy with::
+
+  ceph device set-life-expectancy <devid> <from> [<to>]
+
+Life expectancies are expressed as a time interval so that
+uncertainty can be expressed in the form of a wide interval. The
+interval end can also be left unspecified.
 
 Health alerts
 -------------
-- 
2.47.3
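
Note for readers who script the commands this patch documents: the following is a minimal sketch, not part of the patch or of Ceph itself. It only assembles argument lists for two of the documented commands. The helper names (`prediction_mode_cmd`, `set_life_expectancy_cmd`) and the parameter names `devid`, `start`, and `end` are hypothetical, and the interval bounds are passed through as opaque strings because the patch does not say what format they take.

```python
from typing import List, Optional

# The three modes listed in the documentation added by this patch.
PREDICTION_MODES = ("none", "local", "cloud")


def prediction_mode_cmd(mode: str) -> List[str]:
    """Build the argv that selects a device failure prediction mode."""
    if mode not in PREDICTION_MODES:
        raise ValueError(f"unknown prediction mode: {mode!r}")
    return ["ceph", "config", "set", "global",
            "device_failure_prediction_mode", mode]


def set_life_expectancy_cmd(devid: str, start: str,
                            end: Optional[str] = None) -> List[str]:
    """Build the argv that reports an externally sourced life expectancy.

    start and end are NOT validated here (assumption: the caller supplies
    whatever format the ceph CLI accepts). Omitting end leaves the
    interval open-ended, which the documentation permits.
    """
    argv = ["ceph", "device", "set-life-expectancy", devid, start]
    if end is not None:
        argv.append(end)
    return argv
```

On a host with admin access to the cluster, the resulting list could be handed to `subprocess.run(..., check=True)`. Validating the mode before calling the CLI is a design choice made here so that a typo fails early on the client side.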