doc/cephfs: explain the various health messages

author John Spray <john.spray@redhat.com>

Thu, 7 Jul 2016 15:45:08 +0000 (16:45 +0100)

committer John Spray <john.spray@redhat.com>

Thu, 21 Jul 2016 11:32:05 +0000 (12:32 +0100)
diff --git a/doc/cephfs/health-messages.rst b/doc/cephfs/health-messages.rst
new file mode 100644
index 0000000..2a345bb
--- /dev/null
+++ b/doc/cephfs/health-messages.rst
@@ -0,0 +1,118 @@
+
+======================
+CephFS health messages
+======================
+
+Cluster health checks
+=====================
+
+The Ceph monitor daemons will generate health messages in response
+to certain states of the filesystem map structure (and the enclosed MDS maps).
+
+Message: mds rank(s) *ranks* have failed
+Description: One or more MDS ranks are not currently assigned to
+an MDS daemon; the cluster will not recover until a suitable replacement
+daemon starts.
+
+Message: mds rank(s) *ranks* are damaged
+Description: One or more MDS ranks have encountered severe damage to
+their stored metadata, and cannot start again until they are repaired.
+
+Message: mds cluster is degraded
+Description: One or more MDS ranks are not currently up and running; clients
+may pause metadata IO until this situation is resolved.  This includes
+ranks being failed or damaged, and additionally includes ranks
+which are running on an MDS but have not yet made it to the *active*
+state (e.g. ranks currently in *replay* state).
+
+Message: mds *names* are laggy
+Description: The named MDS daemons have failed to send beacon messages
+to the monitor for at least ``mds_beacon_grace`` (default 15s), while
+they are supposed to send beacon messages every ``mds_beacon_interval``
+(default 4s).  The daemons may have crashed.  The Ceph monitor will
+automatically replace laggy daemons with standbys if any are available.
+
+Daemon-reported health checks
+=============================
+
+MDS daemons can identify a variety of unwanted conditions, and
+indicate these to the operator in the output of ``ceph status``.
+These conditions have human-readable messages, and additionally
+a unique code starting with ``MDS_HEALTH``, which appears in JSON output.
+
+Message: "Behind on trimming..."
+Code: MDS_HEALTH_TRIM
+Description: CephFS maintains a metadata journal that is divided into
+*log segments*.  The length of the journal (in number of segments) is controlled
+by the setting ``mds_log_max_segments``, and when the number of segments
+exceeds that setting the MDS starts writing back metadata so that it
+can remove (trim) the oldest segments.  If this writeback is happening
+too slowly, or a software bug is preventing trimming, then this health
+message may appear.  The message appears once the number of segments
+reaches double ``mds_log_max_segments``.
+
+Message: "Client *name* failing to respond to capability release"
+Code: MDS_HEALTH_CLIENT_LATE_RELEASE, MDS_HEALTH_CLIENT_LATE_RELEASE_MANY
+Description: CephFS clients are issued *capabilities* by the MDS, which
+are like locks.  Sometimes, for example when another client needs access,
+the MDS will request that clients release their capabilities.  If the client
+is unresponsive or buggy, it might fail to do so promptly or fail to do
+so at all.  This message appears if a client has taken longer than
+``mds_revoke_cap_timeout`` (default 60s) to comply.
+
+Message: "Client *name* failing to respond to cache pressure"
+Code: MDS_HEALTH_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY
+Description: Clients maintain a metadata cache.  Items (such as inodes)
+in the client cache are also pinned in the MDS cache, so when the MDS
+needs to shrink its cache (to stay within ``mds_cache_size``), it
+sends messages to clients to shrink their caches too.  If the client
+is unresponsive or buggy, this can prevent the MDS from properly staying
+within its ``mds_cache_size`` and it may eventually run out of memory
+and crash.  This message appears if a client has taken more than
+``mds_recall_state_timeout`` (default 60s) to comply.
+
+Message: "Client *name* failing to advance its oldest client/flush tid"
+Code: MDS_HEALTH_CLIENT_OLDEST_TID, MDS_HEALTH_CLIENT_OLDEST_TID_MANY
+Description: The CephFS client-MDS protocol uses a field called the
+*oldest tid* to inform the MDS of which client requests are fully
+complete and may therefore be forgotten about by the MDS.  If a buggy
+client is failing to advance this field, then the MDS may be prevented
+from properly cleaning up resources used by client requests.  This message
+appears if a client appears to have more than ``max_completed_requests``
+(default 100000) requests that are complete on the MDS side but haven't
+yet been accounted for in the client's *oldest tid* value.
+
+Message: "Metadata damage detected"
+Code: MDS_HEALTH_DAMAGE
+Description: Corrupt or missing metadata was encountered when reading
+from the metadata pool.  This message indicates that the damage was
+sufficiently isolated for the MDS to continue operating, although
+client accesses to the damaged subtree will return IO errors.  Use
+the ``damage ls`` admin socket command to get more detail on the damage.
+This message appears as soon as any damage is encountered.
+
+Message: "MDS in read-only mode"
+Code: MDS_HEALTH_READ_ONLY
+Description: The MDS has gone into read-only mode and will return EROFS
+error codes to client operations that attempt to modify any metadata.  The
+MDS will go into read-only mode if it encounters a write error while
+writing to the metadata pool, or if forced to by an administrator using
+the ``force_readonly`` admin socket command.
+
+Message: "*N* slow requests are blocked"
+Code: MDS_HEALTH_SLOW_REQUEST
+Description: One or more client requests have not been completed promptly,
+indicating that the MDS is running very slowly, that the RADOS cluster
+is not acknowledging journal writes promptly, or that there is a bug.
+Use the ``ops`` admin socket command to list outstanding metadata operations.
+This message appears if any client requests have taken longer than
+``mds_op_complaint_time`` (default 30s).
+
+Message: "Too many inodes in cache"
+Code: MDS_HEALTH_CACHE_OVERSIZED
+Description: The MDS is not succeeding in trimming its cache to comply
+with the limit set by the administrator.  If the MDS cache becomes too large,
+the daemon may exhaust available memory and crash.
+This message appears if the actual cache size (in inodes) is at least 50%
+greater than ``mds_cache_size`` (default 100000).
+
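Reviewer note (not part of the patch): several of the checks documented in health-messages.rst reduce to a simple comparison against a configured threshold. The sketch below restates four of them in Python as a reading aid. The function names are hypothetical and do not exist in Ceph; the default values are the ones quoted in the text above. The text gives no default for ``mds_log_max_segments``, so it is a required argument here. How the exact boundary values are treated is an assumption based on the wording ("at least", "longer than", "double").

```python
def is_laggy(seconds_since_beacon, mds_beacon_grace=15.0):
    """No beacon seen for at least mds_beacon_grace seconds."""
    return seconds_since_beacon >= mds_beacon_grace


def behind_on_trimming(num_segments, mds_log_max_segments):
    """Journal length has reached double mds_log_max_segments.

    The doc only says "double"; using >= at the boundary is an assumption.
    """
    return num_segments >= 2 * mds_log_max_segments


def has_slow_request(request_ages, mds_op_complaint_time=30.0):
    """Any request has taken longer than mds_op_complaint_time seconds."""
    return any(age > mds_op_complaint_time for age in request_ages)


def cache_oversized(num_inodes, mds_cache_size=100000):
    """Cache holds at least 50% more inodes than mds_cache_size."""
    return num_inodes >= 1.5 * mds_cache_size


if __name__ == "__main__":
    # Illustrative inputs only; none of these numbers come from a real cluster.
    print(is_laggy(20.0))
    print(behind_on_trimming(70, mds_log_max_segments=30))
    print(has_slow_request([1.2, 45.0]))
    print(cache_oversized(120000))
```

The same pattern would cover the two client timeouts described above (``mds_revoke_cap_timeout`` and ``mds_recall_state_timeout``, both 60s by default), each being a "taken longer than" comparison.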
diff --git a/doc/cephfs/index.rst b/doc/cephfs/index.rst
index ece8fcbae5bdfa5fe8eaac7805f529beb95fdd35..76cf92a1cc1362cb0d59126cd971f107ee7682e7 100644
--- a/doc/cephfs/index.rst
+++ b/doc/cephfs/index.rst
@@ -90,6 +90,7 @@ authentication keyring.
         File layouts <file-layouts>
         Client eviction <eviction>
         Handling full filesystems <full>
+        Health messages <health-messages>
         Troubleshooting <troubleshooting>
         Disaster recovery <disaster-recovery>
         Client authentication <client-auth>