From: Sage Weil
Date: Thu, 14 Aug 2014 23:55:58 +0000 (-0700)
Subject: mon/Paxos: verify all new peons are still contiguous at end of round
X-Git-Tag: v0.80.8~32^2~3
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=6c5b9a666fcd94e175a8b9771368b55246957efe;p=ceph.git

mon/Paxos: verify all new peons are still contiguous at end of round

During the collect phase we verify that each peon has versions that
overlap or are contiguous with ours (and can therefore be caught up with
some series of transactions). However, we *also* assimilate any new
states we get from those peers, and that may move our own first_committed
forward in time. This means that an early responder might have originally
been contiguous, but a later one moved us forward, and when the round
finished they were not contiguous any more. This leads to a crash on the
peon when it gets our first begin message.

For example:

 - we have 10..20
 - first peon has 5..15
   - ok!
 - second peon has 18..30
   - we apply this state
 - we are now 18..30
 - we finish the round
   - send commit to first peon (empty.. we aren't contiguous)
   - send no commit to second peon (we match)
 - we send a begin for state 31
   - first peon crashes (its lc is still 15)

Prevent this by checking at the end of the round if we are still
contiguous. If not, bootstrap. This is similar to the check we do
above, but reversed, to make sure *we* aren't too far ahead of *them*.

Fixes: #9053
Signed-off-by: Sage Weil
(cherry picked from commit 3e5ce5f0dcec9bbe9ed4a6b41758ab7802614810)
---

diff --git a/src/mon/Paxos.cc b/src/mon/Paxos.cc
index dec64c9871fc9..0a5083dc712f7 100644
--- a/src/mon/Paxos.cc
+++ b/src/mon/Paxos.cc
@@ -524,10 +524,20 @@ void Paxos::handle_last(MMonPaxos *last)
       mon->timer.cancel_event(collect_timeout_event);
       collect_timeout_event = 0;
 
-      // share committed values?
+      // is everyone contiguous and up to date?
       for (map<int,version_t>::iterator p = peer_last_committed.begin();
            p != peer_last_committed.end();
            ++p) {
+        if (p->second < first_committed && first_committed > 1) {
+          dout(5) << __func__
+                  << " peon " << p->first
+                  << " last_committed (" << p->second
+                  << ") is too low for our first_committed (" << first_committed
+                  << ") -- bootstrap!" << dendl;
+          last->put();
+          mon->bootstrap();
+          return;
+        }
         if (p->second < last_committed) {
           // share committed values
           dout(10) << " sending commit to mon." << p->first << dendl;
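
The end-of-round check added above can be tried in isolation with a
minimal standalone sketch. This is an illustration under stated
assumptions, not code from the Ceph tree: the helper name
find_noncontiguous_peon is hypothetical, version_t is assumed to be an
unsigned 64-bit integer, and the monitor's logging and bootstrap calls
are replaced by a return value. It needs C++17.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Assumption: version_t stands in for the monitor's version type.
using version_t = std::uint64_t;

// Hypothetical helper (not part of Ceph). Applies the same condition as
// the patch: a peon is no longer contiguous with us when its
// last_committed is below our first_committed, provided our
// first_committed is greater than 1.
// Returns the rank of the first such peon, or std::nullopt if every peon
// can still be caught up.
std::optional<int> find_noncontiguous_peon(
    const std::map<int, version_t>& peer_last_committed,
    version_t first_committed)
{
  for (const auto& [rank, last_committed] : peer_last_committed) {
    if (last_committed < first_committed && first_committed > 1) {
      return rank;  // the patched code would bootstrap at this point
    }
  }
  return std::nullopt;
}
```

With the numbers from the commit message, peons at last_committed 15 and
30 pass the check while our first_committed is 10. Once we assimilate
the second peon's state and our first_committed becomes 18, the first
peon is reported, which is the case where the patch calls bootstrap
instead of sending a begin that the peon cannot apply.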