From: Sage Weil <sage@redhat.com>
Date: Wed, 31 Jul 2019 09:57:49 +0000 (-0500)
Subject: doc/rados/operations/health-checks: document MGR_DOWN
X-Git-Tag: v15.1.0~1877^2~7
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=078ef210d585a3e0ad60d7c84b637bb26a41cd20;p=ceph.git

doc/rados/operations/health-checks: document MGR_DOWN

Signed-off-by: Sage Weil <sage@redhat.com>
---

diff --git a/doc/rados/operations/health-checks.rst b/doc/rados/operations/health-checks.rst
index f6ca463bf0e1..0668aa41845e 100644
--- a/doc/rados/operations/health-checks.rst
+++ b/doc/rados/operations/health-checks.rst
@@ -75,6 +75,24 @@ If a monitor is configured to listen for v1 connections on a non-standard port (
 Manager
 -------
 
+MGR_DOWN
+________
+
+All manager daemons are currently down.  The cluster should normally
+have at least one running manager (``ceph-mgr``) daemon.  If no
+manager daemon is running, the cluster's ability to monitor itself will
+be compromised, and parts of the management API will become
+unavailable (for example, the dashboard will not work, and most CLI
+commands that report metrics or runtime state will block).  However,
+the cluster will still be able to perform all I/O operations and
+recover from failures.
+
+A down manager daemon should generally be restarted as soon as
+possible so that the cluster can be monitored (for example, so that
+the ``ceph -s`` information is up to date and metrics can be
+scraped by Prometheus).
+
+
 MGR_MODULE_DEPENDENCY
 _____________________