From: Jason Dillaman <dillaman@redhat.com>
Date: Fri, 20 Mar 2020 16:59:14 +0000 (-0400)
Subject: rbd-mirror: leader watcher should not cancel get locker if locker is invalid
X-Git-Tag: v15.2.0~6^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F34040%2Fhead;p=ceph.git

rbd-mirror: leader watcher should not cancel get locker if locker is invalid

When a new leader acquires the lock, it sends out a lock acquired
notification along with periodic heartbeats. The get locker request is
scheduled to run immediately, but if a heartbeat arrives before it executes,
the heartbeat cancels the timer and reschedules it for the future. This
repeats for each periodic heartbeat, so the locker is never re-read from the
OSD. Only cancel and reschedule from the heartbeat handler once a valid
locker (non-empty cookie) is known.

This is an issue only for namespace replayers due to the delayed fashion in
which the leader instance id is retrieved.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
---
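
Note (not part of the commit message): the starvation described above can
be reproduced with a toy model. This is only a sketch under stated
assumptions, not the LeaderWatcher implementation: WatcherModel, Task,
simulate() and the tick-based delays are hypothetical names and values
chosen for illustration, and heartbeats are assumed to arrive once per tick,
before the timer is checked.

```cpp
#include <string>

// Hypothetical task kinds that can occupy the single timer slot.
enum class Task { None, GetLocker, AcquireLock };

// Toy model of a non-leader watcher with one pending timer task.
struct WatcherModel {
  std::string locker_cookie;   // empty means the locker is not known yet
  Task pending = Task::None;   // task currently scheduled on the timer
  int due = 0;                 // tick at which the pending task fires
  bool guard;                  // true models the behaviour after this patch
  int get_locker_runs = 0;     // how often the locker was actually read

  explicit WatcherModel(bool g) : guard(g) {}

  void schedule(Task t, int now, int delay) {
    pending = t;
    due = now + delay;
  }

  // Same shape as the else branch of handle_heartbeat() in the diff.
  void handle_heartbeat(int now) {
    if (guard && locker_cookie.empty()) {
      return;  // locker unknown: leave the pending get locker alone
    }
    pending = Task::None;                 // stands in for cancel_timer_task()
    schedule(Task::AcquireLock, now, 5);  // assumed delay of 5 ticks
  }

  // Fire the pending task once it is due.
  void tick(int now) {
    if (pending == Task::None || due > now) {
      return;
    }
    Task t = pending;
    pending = Task::None;
    if (t == Task::GetLocker) {
      locker_cookie = "cookie";  // pretend the locker was read from the OSD
      ++get_locker_runs;
    }
  }
};

// Schedule a get locker at tick 0 (due at tick 2), then deliver one
// heartbeat per tick. Returns how many times the locker was read.
int simulate(bool guard, int ticks) {
  WatcherModel w(guard);
  w.schedule(Task::GetLocker, 0, 2);
  for (int now = 1; now <= ticks; ++now) {
    w.handle_heartbeat(now);
    w.tick(now);
  }
  return w.get_locker_runs;
}
```

In this model, simulate(false, 100) never reads the locker because every
heartbeat replaces the pending get locker, while simulate(true, 100) reads
it once and only then lets heartbeats reschedule the timer.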

diff --git a/src/tools/rbd_mirror/LeaderWatcher.cc b/src/tools/rbd_mirror/LeaderWatcher.cc
index 1581319219d..ae705e3c5e2 100644
--- a/src/tools/rbd_mirror/LeaderWatcher.cc
+++ b/src/tools/rbd_mirror/LeaderWatcher.cc
@@ -946,7 +946,7 @@ void LeaderWatcher<I>::handle_heartbeat(Context *on_notify_ack) {
     std::scoped_lock locker{m_threads->timer_lock, m_lock};
     if (is_leader(m_lock)) {
       dout(5) << "got another leader heartbeat, ignoring" << dendl;
-    } else {
+    } else if (!m_locker.cookie.empty()) {
       cancel_timer_task();
       m_acquire_attempts = 0;
       schedule_acquire_leader_lock(1);