doc: Add documentation and release notes

author David Zafman <dzafman@redhat.com>

Thu, 11 Jul 2019 00:05:47 +0000 (00:05 +0000)

committer David Zafman <dzafman@redhat.com>

Fri, 18 Oct 2019 17:49:40 +0000 (10:49 -0700)
diff --git a/PendingReleaseNotes b/PendingReleaseNotes

index 0a0c6dc0c7d9d9d442722700eef01671e05b37e0..a20f36098f8b8bc140fdd5dda1b3e91aee106f51 100644 (file)
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -10,3 +10,18 @@
    objects and the other deletes them. Read the troubleshooting section
    of the dynamic resharding docs for details.
  
+* A health warning is now generated if the average OSD heartbeat ping
+  time exceeds a configurable threshold for any of the intervals
+  computed.  The OSD computes 1 minute, 5 minute and 15 minute
+  intervals with average, minimum and maximum values.  New configuration
+  option ``mon_warn_on_slow_ping_ratio`` specifies a percentage of
+  ``osd_heartbeat_grace`` to determine the threshold.  A value of zero
+  disables the warning.  New configuration option
+  ``mon_warn_on_slow_ping_time``, specified in microseconds, overrides the
+  computed value and causes a warning
+  when OSD heartbeat pings take longer than the specified amount.
+  New admin command ``ceph daemon mgr.# dump_osd_network [threshold]`` will
+  list all connections with a ping time longer than the specified threshold or
+  the value determined by the config options, for the average of any of the 3 intervals.
+  New admin command ``ceph daemon osd.# dump_osd_network [threshold]`` will
+  do the same but only include heartbeats initiated by the specified OSD.
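The filtering rule described for ``dump_osd_network`` (report a connection when its average ping time exceeds the threshold in any of the 3 intervals) can be sketched roughly as follows. This is only an illustration: the record layout, the interval keys and the `slow_connections` helper are hypothetical and do not reflect the actual output format of the command.

```python
# Hypothetical record layout; the real dump_osd_network output may differ.
INTERVALS = ("1min", "5min", "15min")


def slow_connections(entries, threshold):
    """Return entries whose average ping time exceeds threshold in any interval."""
    return [
        entry
        for entry in entries
        if any(entry["average"][name] > threshold for name in INTERVALS)
    ]


# Made-up sample data, in the same time unit as the threshold.
sample = [
    {"from": 0, "to": 1, "average": {"1min": 1200, "5min": 900, "15min": 800}},
    {"from": 0, "to": 2, "average": {"1min": 300, "5min": 350, "15min": 400}},
]

slow = slow_connections(sample, threshold=1000)
for entry in slow:
    print(entry["from"], "->", entry["to"])
```

With the sample data, only the first connection is reported, because its 1 minute average is above the threshold even though the 5 and 15 minute averages are not.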
diff --git a/doc/rados/configuration/mon-config-ref.rst b/doc/rados/configuration/mon-config-ref.rst

index 71e4e54e92ba77f7c983431764ab7d8a2f21e229..a84f459585f4e6e0f7e92b48ef02bee5d6f08e69 100644 (file)
--- a/doc/rados/configuration/mon-config-ref.rst
+++ b/doc/rados/configuration/mon-config-ref.rst
@@ -395,6 +395,25 @@ by setting it in the ``[mon]`` section of the configuration file.
  :Default: True
  
  
+``mon warn on slow ping ratio``
+
+:Description: Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping ratio``
+              of ``osd heartbeat grace``.  The default is 5%.
+:Type: Float
+:Default: ``0.05``
+
+
+``mon warn on slow ping time``
+
+:Description: Override ``mon warn on slow ping ratio`` with a specific value.
+              Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping time``
+              microseconds.  The default is 0 (disabled).
+:Type: Integer
+:Default: ``0``
+
+
  ``mon cache target full warn ratio``
  
  :Description: Position between pool's ``cache_target_full`` and
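
How the two new options combine into one warning threshold can be sketched as below. This is one reading of the descriptions in this patch, not the monitor's actual code: the function name is hypothetical, ``osd heartbeat grace`` is assumed to be given in seconds (20 is assumed as its default), and the override is taken in microseconds as the text states.

```python
def slow_ping_threshold_us(grace_seconds, ratio, override_us=0):
    """Return the warning threshold in microseconds, or None if disabled.

    Assumption: a non-zero ``mon warn on slow ping time`` wins outright;
    otherwise the threshold is ratio * grace, and a zero ratio disables it.
    """
    if override_us > 0:
        return float(override_us)
    if ratio <= 0:
        return None
    return grace_seconds * 1_000_000 * ratio


# Defaults from the text: ratio 0.05, override 0 (disabled).
default_threshold = slow_ping_threshold_us(20, 0.05)
# An explicit override replaces the computed value.
overridden = slow_ping_threshold_us(20, 0.05, override_us=250_000)
# Ratio of zero and no override: no warning is generated.
disabled = slow_ping_threshold_us(20, 0.0)
```

Under these assumptions the defaults give a threshold of about one second, which is 5% of a 20 second grace period.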