From: David Zafman
Date: Wed, 26 Jun 2019 02:59:06 +0000 (+0000)
Subject: osd mon: Track heartbeat ping times and report health warning
X-Git-Tag: v14.2.5~117^2~20
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=f0a8c65b0aa9c7c672705f5f44a56d4db22c033d;p=ceph.git

osd mon: Track heartbeat ping times and report health warning

Fixes: http://tracker.ceph.com/issues/40640

Signed-off-by: David Zafman
(cherry picked from commit 66d44e7f911a57100d650ad7df9445f88ec70140)

Conflicts:
	src/common/options.cc (trivial)
	src/mon/PGMap.cc (trivial)
	src/osd/OSD.cc (trivial)
	src/osd/OSD.h (trivial)
	src/osd/osd_types.cc (trivial)

src/mon/PGMap.cc manually get rid of extra argument to checks->add
src/osd/OSD.cc rename ping_stamp to stamp for backport
---

diff --git a/doc/rados/configuration/mon-osd-interaction.rst b/doc/rados/configuration/mon-osd-interaction.rst
index e2c2477148148..42be922fec0b5 100644
--- a/doc/rados/configuration/mon-osd-interaction.rst
+++ b/doc/rados/configuration/mon-osd-interaction.rst
@@ -24,10 +24,8 @@ monitoring the Ceph Storage Cluster.
 OSDs Check Heartbeats
 =====================
 
-Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons every 6
-seconds. You can change the heartbeat interval by adding an ``osd heartbeat
-interval`` setting under the ``[osd]`` section of your Ceph configuration file,
-or by setting the value at runtime. If a neighboring Ceph OSD Daemon doesn't
+Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons at random
+intervals less than every 6 seconds. If a neighboring Ceph OSD Daemon doesn't
 show a heartbeat within a 20 second grace period, the Ceph OSD Daemon may
 consider the neighboring Ceph OSD Daemon ``down`` and report it back to a Ceph
 Monitor, which will update the Ceph Cluster Map. You may change this grace
diff --git a/src/common/options.cc b/src/common/options.cc
index f8e7a6b8e0c2b..ec52e4942d6bb 100644
--- a/src/common/options.cc
+++ b/src/common/options.cc
@@ -1732,6 +1732,11 @@ std::vector