From 263ad059380c2d328559ac5f1fb42504ddb1c4ab Mon Sep 17 00:00:00 2001
From: David Zafman
Date: Thu, 11 Jul 2019 00:05:47 +0000
Subject: [PATCH] doc: Add documentation and release notes

Signed-off-by: David Zafman
(cherry picked from commit f4a0be2e8707f921d65bf22a6c1090e402905ad3)

Conflicts:
	PendingReleaseNotes (trivial)
---
 PendingReleaseNotes                        | 16 ++++++++++++++++
 doc/rados/configuration/mon-config-ref.rst | 19 +++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/PendingReleaseNotes b/PendingReleaseNotes
index 131801d14de75..aa3c36ef59802 100644
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -24,3 +24,19 @@
   bucket reshard in earlier versions of RGW. One subcommand lists such
   objects and the other deletes them. Read the troubleshooting section
   of the dynamic resharding docs for details.
+
+* A health warning is now generated if the average osd heartbeat ping
+  time exceeds a configurable threshold for any of the intervals
+  computed. The OSD computes 1 minute, 5 minute and 15 minute
+  intervals with average, minimum and maximum values. New configuration
+  option ``mon_warn_on_slow_ping_ratio`` specifies a percentage of
+  ``osd_heartbeat_grace`` to determine the threshold. A value of zero
+  disables the warning. New configuration option
+  ``mon_warn_on_slow_ping_time`` specified in milliseconds overrides the
+  computed value and causes a warning
+  when OSD heartbeat pings take longer than the specified amount.
+  New admin command ``ceph daemon mgr.# dump_osd_network [threshold]`` will
+  list all connections with a ping time longer than the specified threshold or
+  value determined by the config options, for the average for any of the 3 intervals.
+  New admin command ``ceph daemon osd.# dump_osd_network [threshold]`` will
+  do the same but include only heartbeats initiated by the specified OSD.
diff --git a/doc/rados/configuration/mon-config-ref.rst b/doc/rados/configuration/mon-config-ref.rst
index 9f1fb8fedbd71..2b541d022d768 100644
--- a/doc/rados/configuration/mon-config-ref.rst
+++ b/doc/rados/configuration/mon-config-ref.rst
@@ -393,6 +393,25 @@ by setting it in the ``[mon]`` section of the configuration file.
 :Default: True


+``mon warn on slow ping ratio``
+
+:Description: Issue a ``HEALTH_WARN`` in cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping ratio``
+              of ``osd heartbeat grace``. The default is 5%.
+:Type: Float
+:Default: ``0.05``
+
+
+``mon warn on slow ping time``
+
+:Description: Override ``mon warn on slow ping ratio`` with a specific value.
+              Issue a ``HEALTH_WARN`` in cluster log if any heartbeat
+              between OSDs exceeds ``mon warn on slow ping time``
+              milliseconds. The default is 0 (disabled).
+:Type: Integer
+:Default: ``0``
+
+
 ``mon cache target full warn ratio``

 :Description: Position between pool's ``cache_target_full`` and
--
2.39.5
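
Reviewer note: the way the two new options combine into one warning threshold can be sketched as below. This is a minimal illustration of the behaviour described in the release note, not the Ceph implementation. The function names and the sample averages are hypothetical, and the sketch assumes the caller passes every duration in one common unit. The grace value used in the demo assumes the stock ``osd_heartbeat_grace`` of 20 seconds.

```python
from typing import Dict, List, Optional


def slow_ping_threshold(grace: float, ratio: float,
                        override: float = 0.0) -> Optional[float]:
    """Return the warning threshold, or None when the warning is disabled.

    Hypothetical helper: `grace` stands for osd_heartbeat_grace, `ratio` for
    mon_warn_on_slow_ping_ratio and `override` for mon_warn_on_slow_ping_time.
    All durations are assumed to be in the same unit.
    """
    if override > 0:
        # A non-zero slow ping time replaces the computed value.
        return override
    if ratio > 0:
        # Otherwise the threshold is a fraction of the heartbeat grace.
        return grace * ratio
    # Both settings are zero, so no warning is raised.
    return None


def slow_intervals(averages: Dict[str, float],
                   threshold: Optional[float]) -> List[str]:
    """List the intervals whose average ping time exceeds the threshold."""
    if threshold is None:
        return []
    return [name for name, avg in averages.items() if avg > threshold]


if __name__ == "__main__":
    # Assumed grace of 20 seconds, written here as 20000 ms, with the 5% default.
    threshold = slow_ping_threshold(grace=20000.0, ratio=0.05)
    # Made-up averages for the 1, 5 and 15 minute intervals.
    sample = {"1min": 1200.0, "5min": 800.0, "15min": 400.0}
    print(threshold, slow_intervals(sample, threshold))
```

With the defaults the computed threshold is 5% of the grace period; setting the override to a non-zero value bypasses the ratio entirely, and setting both to zero turns the check off.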