]> git.apps.os.sepia.ceph.com Git - ceph.git/commitdiff
src/mon/Monitor: Fix set_elector_disallowed_leaders 54004/head
authorKamoltat <ksirivad@redhat.com>
Wed, 11 Oct 2023 21:12:03 +0000 (21:12 +0000)
committerKamoltat <ksirivad@redhat.com>
Fri, 13 Oct 2023 16:10:17 +0000 (16:10 +0000)
Problem:

In the monitors we hold 2 copies of disallowed_leader ...
1. MonMap class 2. Elector class.
When computing the ConnectivityScore for the monitors during
the election, we use the `disallowed_leader` from Elector
class to determine which monitors we shouldn't allow to lead.

Now, we rely on the function `set_elector_disallowed_leaders`
to set the `disallowed_leader` of the Elector class, MonMap
class copy of the `disallowed_leader` contains the
`tiebreaker_monitor` so we inherit that plus we also add the
monitors that are dead due to a zone failure.

Hence, the `adding dead monitors` phase is only allowed if we can
enter stretch_mode. However, there is a problem when failing over a stretch cluster
zone and reviving the entire zone back up, the revived monitors
couldn't enter stretch_mode when they are at the state of "probing"
since PaxosServices like osdmon becomes unreadable (this is expected)

Solution:

We unconditionally add monitors that are in
`monmap->stretch_marked_down_mons` to the
`disallowed_leaders` list in
`Monitor::set_elector_disallowed_leaders` since
if the monitors are in `monmap->stretch_marked_down_mons`
we know that they probably belong in a marked down
zone and is not fit for lead.

This will fix the problem of newly revived monitors
having different disallowed_leaders set
and getting stuck in election.

Fixes: https://tracker.ceph.com/issues/63183
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 5d1b5da21591c57cb0cbbbc8775b6ea0ced953a4)

src/mon/Monitor.cc

index 08a36a2e8fd4347d24ccecc776584f26afcb0665..0f6edb6070191ce32132440c0fd8a70a6d0b20f0 100644 (file)
@@ -6637,14 +6637,16 @@ void Monitor::notify_new_monmap(bool can_change_external_state, bool remove_rank
 void Monitor::set_elector_disallowed_leaders(bool allow_election)
 {
   set<int> dl;
+  // inherit dl from monmap
   for (auto name : monmap->disallowed_leaders) {
     dl.insert(monmap->get_rank(name));
-  }
-  if (is_stretch_mode()) {
-    for (auto name : monmap->stretch_marked_down_mons) {
-      dl.insert(monmap->get_rank(name));
-    }
-    dl.insert(monmap->get_rank(monmap->tiebreaker_mon));
+  } // unconditionally add stretch_marked_down_mons to the new dl copy
+  for (auto name : monmap->stretch_marked_down_mons) {
+    dl.insert(monmap->get_rank(name));
+  } // add the tiebreaker_mon incase it is not in monmap->disallowed_leaders
+  if (!monmap->tiebreaker_mon.empty() &&
+      monmap->contains(monmap->tiebreaker_mon)) {
+      dl.insert(monmap->get_rank(monmap->tiebreaker_mon));
   }
 
   bool disallowed_changed = elector.set_disallowed_leaders(dl);