From 5b1c6ad4967196cb97afd8c04848b13ee5a198f0 Mon Sep 17 00:00:00 2001
From: Kefu Chai
Date: Sat, 27 Aug 2022 23:46:00 +0800
Subject: [PATCH] mon/MgrMonitor: do not propose again for "mgr fail"

in 23c3f76018b446fb77bbd71fdd33bddfbae9e06d, the change to fail the mgr
is proposed immediately. but `MgrMonitor::prepare_command()` still
returns `true` in this case. its indirect caller,
`PaxosService::dispatch()`, takes this as a sign that it needs to
propose the change with `propose_pending()`. but the pending change has
already been proposed by `MgrMonitor::prepare_command()`, and
`have_pending` has been cleared by that call.

as we don't allow consecutive paxos proposals, the second
`propose_pending()` call is delayed by a configured latency. but when
the timer fires, this postponed call finds that it has nothing to
propose, because the change to fail the mgr has already been proposed.
that's why we have `ceph_assert(have_pending)` assertion failures.

in this change, `prepare_command()` returns `false` when the change has
already been proposed immediately, so the caller does not propose a
second time. this should avoid the assertion failure.

this change should address the regression introduced by
23c3f76018b446fb77bbd71fdd33bddfbae9e06d.

Fixes: https://tracker.ceph.com/issues/56850
Signed-off-by: Kefu Chai
---
 src/mon/MgrMonitor.cc | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mon/MgrMonitor.cc b/src/mon/MgrMonitor.cc
index c394d3ee76921..55309e203d358 100644
--- a/src/mon/MgrMonitor.cc
+++ b/src/mon/MgrMonitor.cc
@@ -1272,13 +1272,17 @@ out:
   getline(ss, rs);
 
   if (r >= 0) {
+    bool do_update = false;
     if (prefix == "mgr fail" && is_writeable()) {
       propose_pending();
+      do_update = false;
+    } else {
+      do_update = true;
     }
     // success.. delay reply
     wait_for_finished_proposal(op, new Monitor::C_Command(mon, op, r, rs,
                                               get_last_committed() + 1));
-    return true;
+    return do_update;
   } else {
     // reply immediately
     mon.reply_command(op, r, rs, rdata, get_last_committed());
-- 
2.39.5