doc: Add documentation and release notes

author David Zafman <dzafman@redhat.com>

Thu, 11 Jul 2019 00:05:47 +0000 (00:05 +0000)

committer David Zafman <dzafman@redhat.com>

Mon, 4 Nov 2019 22:21:21 +0000 (14:21 -0800)
diff --git a/PendingReleaseNotes b/PendingReleaseNotes
index 131801d14de753699596c81382dce27ffa4e9f3e..aa3c36ef59802045897be9704d7997efe16dd870 100644
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -24,3 +24,19 @@
    bucket reshard in earlier versions of RGW. One subcommand lists such
    objects and the other deletes them. Read the troubleshooting section
    of the dynamic resharding docs for details.
+
+* A health warning is now generated if the average OSD heartbeat ping
+  time exceeds a configurable threshold for any of the intervals
+  computed.  The OSD computes 1 minute, 5 minute and 15 minute
+  intervals with average, minimum and maximum values.  New configuration
+  option ``mon_warn_on_slow_ping_ratio`` specifies a percentage of
+  ``osd_heartbeat_grace`` to determine the threshold.  A value of zero
+  disables the warning.  New configuration option
+  ``mon_warn_on_slow_ping_time``, specified in microseconds, overrides the
+  computed value and causes a warning
+  when OSD heartbeat pings take longer than the specified amount.
+  New admin command ``ceph daemon mgr.# dump_osd_network [threshold]`` will
+  list all connections with a ping time longer than the specified threshold or
+  the value determined by the config options, for the average of any of the 3 intervals.
+  New admin command ``ceph daemon osd.# dump_osd_network [threshold]`` will
+  do the same but only include heartbeats initiated by the specified OSD.
diff --git a/doc/rados/configuration/mon-config-ref.rst b/doc/rados/configuration/mon-config-ref.rst
index 9f1fb8fedbd711778c479c432610e203c7fb7a80..2b541d022d76834e3c3243d2070868fcb0a6f9ed 100644
--- a/doc/rados/configuration/mon-config-ref.rst
+++ b/doc/rados/configuration/mon-config-ref.rst
@@ -393,6 +393,25 @@ by setting it in the ``[mon]`` section of the configuration file.
  :Default: True
  
  
+``mon warn on slow ping ratio``
+
+:Description: Issue a ``HEALTH_WARN`` in cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping ratio``
+              of ``osd heartbeat grace``.  The default is 5%.
+:Type: Float
+:Default: ``0.05``
+
+
+``mon warn on slow ping time``
+
+:Description: Override ``mon warn on slow ping ratio`` with a specific value.
+              Issue a ``HEALTH_WARN`` in cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping time``
+              microseconds.  The default is 0 (disabled).
+:Type: Integer
+:Default: ``0``
+
+
  ``mon cache target full warn ratio``
  
  :Description: Position between pool's ``cache_target_full`` and
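
Reviewer's note: the interaction between the two new options can be sketched as below. This is a minimal illustration of the behaviour the added text describes, not the monitor's actual code. The function names, the dictionary keys, and the 20 second grace value used in the usage comment are assumptions made for the example; the microsecond unit follows the wording of the notes above.

```python
from typing import Mapping, Optional

US_PER_SECOND = 1_000_000


def slow_ping_threshold_us(grace_seconds: float,
                           ratio: float,
                           override_us: float = 0) -> Optional[float]:
    """Return the warning threshold in microseconds, or None if disabled.

    Hypothetical helper: a non-zero override (mon_warn_on_slow_ping_time)
    wins; otherwise the threshold is ratio * osd_heartbeat_grace.
    A ratio of zero with no override disables the warning.
    """
    if override_us > 0:
        return float(override_us)
    if ratio <= 0:
        return None
    return grace_seconds * ratio * US_PER_SECOND


def is_slow(avg_ping_us: Mapping[str, float],
            threshold_us: Optional[float]) -> bool:
    """True if the average ping of any interval exceeds the threshold."""
    if threshold_us is None:
        return False
    return any(avg > threshold_us for avg in avg_ping_us.values())


# Usage: with an assumed grace of 20 seconds and the default ratio of 0.05,
# the threshold works out to one second.
threshold = slow_ping_threshold_us(grace_seconds=20, ratio=0.05)
averages = {"1min": 900_000, "5min": 1_200_000, "15min": 800_000}
slow = is_slow(averages, threshold)
```

Here the 5 minute average is above the computed threshold, so the connection would be reported even though the 1 minute and 15 minute averages are below it.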