From: Kefu Chai
Date: Sat, 27 Aug 2022 15:46:00 +0000 (+0800)
Subject: mon/MgrMonitor: do not propose again for "mgr fail"
X-Git-Tag: v17.2.7~395^2~3
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=33914ca9465287fb9870c800cf26b49f3182189c;p=ceph.git

in 23c3f76018b446fb77bbd71fdd33bddfbae9e06d, the change to fail the mgr
is proposed immediately. but `MgrMonitor::prepare_command()` still
returns `true` in this case. its indirect caller,
`PaxosService::dispatch()`, takes this as a sign that it needs to
propose the change with `propose_pending()`. but the pending change has
already been proposed by `MgrMonitor::prepare_command()`, and
`have_pending` has been cleared by that call. as we don't allow
consecutive paxos proposals, the second `propose_pending()` call is
delayed by a configured latency. when the timer fires, this postponed
call finds itself trying to propose nothing, because the change to fail
the mgr has already been proposed. that's why we have
`ceph_assert(have_pending)` assertion failures.

in this change, `MgrMonitor::prepare_command()` returns `false` when it
has already proposed the change immediately, so the second proposal is
no longer scheduled. this should avoid the assertion failure.

this change should address the regression introduced by
23c3f76018b446fb77bbd71fdd33bddfbae9e06d.

Fixes: https://tracker.ceph.com/issues/56850
Signed-off-by: Kefu Chai
(cherry picked from commit 5b1c6ad4967196cb97afd8c04848b13ee5a198f0)
---

diff --git a/src/mon/MgrMonitor.cc b/src/mon/MgrMonitor.cc
index c394d3ee769..55309e203d3 100644
--- a/src/mon/MgrMonitor.cc
+++ b/src/mon/MgrMonitor.cc
@@ -1272,13 +1272,17 @@ out:
   getline(ss, rs);
 
   if (r >= 0) {
+    bool do_update = false;
     if (prefix == "mgr fail" && is_writeable()) {
       propose_pending();
+      do_update = false;
+    } else {
+      do_update = true;
     }
     // success.. delay reply
     wait_for_finished_proposal(op, new Monitor::C_Command(mon, op, r, rs,
                                               get_last_committed() + 1));
-    return true;
+    return do_update;
   } else {
     // reply immediately
     mon.reply_command(op, r, rs, rdata, get_last_committed());
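
The double-proposal sequence described in the commit message can be sketched with a small toy model. This is only an illustration under stated assumptions: `ToyService`, its `timers` queue, the `empty_proposals` counter and `empty_proposals_after()` are hypothetical names and do not exist in Ceph, and the real `ceph_assert(have_pending)` is replaced by a counter so that the failing case can be observed without aborting.

```cpp
// Toy model only: ToyService is a hypothetical stand-in, not Ceph's PaxosService.
#include <cassert>
#include <functional>
#include <vector>

struct ToyService {
  bool have_pending = true;   // a pending change exists until it is proposed
  int proposals = 0;          // proposals that actually carried a change
  int empty_proposals = 0;    // proposals attempted with nothing pending
  std::vector<std::function<void()>> timers;  // stand-in for the delayed-proposal timer

  void propose_pending() {
    if (!have_pending) {
      // The real code would hit ceph_assert(have_pending) here.
      ++empty_proposals;
      return;
    }
    have_pending = false;
    ++proposals;
  }

  // Models the "mgr fail" branch: the change is proposed immediately, and the
  // return value tells the caller whether it should propose again.
  bool prepare_command(bool return_true_after_propose) {
    propose_pending();
    return return_true_after_propose;
  }

  // Models the caller: a true result schedules another proposal. Back-to-back
  // proposals are not allowed, so the second one is deferred to a timer.
  void dispatch(bool old_behaviour) {
    if (prepare_command(old_behaviour)) {
      timers.push_back([this] { propose_pending(); });
    }
  }

  void fire_timers() {
    for (auto& t : timers) {
      t();
    }
    timers.clear();
  }
};

// Runs one "mgr fail" command through the toy model and reports how many
// proposals were attempted with nothing pending.
int empty_proposals_after(bool old_behaviour) {
  ToyService svc;
  svc.dispatch(old_behaviour);
  svc.fire_timers();
  return svc.empty_proposals;
}
```

Calling `empty_proposals_after(true)` models the behaviour before this change, where the deferred call finds nothing left to propose. Calling `empty_proposals_after(false)` models the behaviour after it, where no second proposal is scheduled.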