From: Sage Weil
Date: Tue, 31 Jul 2018 14:38:39 +0000 (-0500)
Subject: doc/mgr/devicehealth: document devicehealth module
X-Git-Tag: v14.0.1~723^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=f09a87f9025fca332697f66be4a3c889fe80ccc9;p=ceph.git

doc/mgr/devicehealth: document devicehealth module

Signed-off-by: Sage Weil
---

diff --git a/doc/mgr/devicehealth.rst b/doc/mgr/devicehealth.rst
new file mode 100644
index 000000000000..5e0d0012192b
--- /dev/null
+++ b/doc/mgr/devicehealth.rst
@@ -0,0 +1,52 @@
+Devicehealth plugin
+===================
+
+The *devicehealth* plugin includes code to manage physical devices
+that back Ceph daemons (e.g., OSDs). This includes scraping health
+metrics (e.g., SMART) and responding to health metrics by migrating
+data away from failing devices.
+
+Enabling
+--------
+
+The *devicehealth* module is enabled with::
+
+  ceph mgr module enable devicehealth
+
+(It is enabled by default.)
+
+Scraping
+--------
+
+Health metrics can be scraped from all devices with::
+
+  ceph device scrape-health-metrics
+
+A single device can be scraped with::
+
+  ceph device scrape-health-metrics
+
+Or a single daemon's devices can be scraped with::
+
+  ceph device scrape-daemon-health-metrics
+
+
+Health monitoring
+-----------------
+
+By default, the devicehealth module wakes up periodically and checks
+the health of all devices in the system. This will raise health
+alerts if devices are expected to fail soon. This can be disabled by
+turning off the ``mgr/devicehealth/enable_monitoring`` option.
+
+The ``mgr/devicehealth/warn_threshold`` controls how soon an expected
+device failure must be before we generate a health warning.
+
+If the ``mgr/devicehealth/self_heal`` option is enabled (it is by
+default), then for devices that are expected to fail soon the module
+will automatically migrate data away from them by marking the devices
+"out".
+
+The ``mgr/devicehealth/mark_out_threshold`` controls how soon an
+expected device failure must be before we automatically mark an osd
+"out".
diff --git a/doc/mgr/index.rst b/doc/mgr/index.rst
index ea8c9d48ca1b..e640292f40f4 100644
--- a/doc/mgr/index.rst
+++ b/doc/mgr/index.rst
@@ -39,3 +39,4 @@ sensible.
     Telemetry plugin
     Iostat plugin
     Crash plugin
+    Devicehealth plugin
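
Note on the options documented in devicehealth.rst: the patch names four ``mgr/devicehealth/*`` options but does not show how an operator would read or change them. The sketch below assumes they are managed through the cluster's centralized configuration commands (`ceph config get` / `ceph config set` with the `mgr` target); that mechanism is an assumption and is not part of this commit. The threshold values are only read, not set, because the patch does not state their units.

```shell
# Assumption: module options are handled via "ceph config ..." against the
# mgr target. This is not shown in the patch itself.

# Inspect the current thresholds (units are not stated in the patch).
ceph config get mgr mgr/devicehealth/warn_threshold
ceph config get mgr mgr/devicehealth/mark_out_threshold

# Keep periodic monitoring and health warnings, but stop the module from
# automatically marking devices "out".
ceph config set mgr mgr/devicehealth/self_heal false

# Turn off the periodic health check entirely.
ceph config set mgr mgr/devicehealth/enable_monitoring false
```

Disabling only ``self_heal`` may be the more conservative choice where automatic data migration is unwanted, since health alerts for devices expected to fail would still be raised.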