If we store any new state, we need to refresh the services, even if we
are still in the midst of Paxos recovery. This is because the
subscription path will share any committed state even when Paxos is
still recovering. Refreshing prevents a race like:
- we have maps 10..20
- we drop out of quorum
- we are elected leader, paxos recovery starts
- we get one LAST with committed states that trim maps 10..15
- we get a subscribe for map 10..20
- we crash because map 10 is no longer on disk: the PaxosService
  is out of sync with the on-disk state.
Fixes: #6045
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 981eda9f7787c83dc457f061452685f499e7dd27)
// leader
void Paxos::handle_last(MMonPaxos *last)
{
+ bool need_refresh = false;
+
dout(10) << "handle_last " << *last << dendl;
if (!mon->is_leader()) {
assert(g_conf->paxos_kill_at != 1);
// store any committed values if any are specified in the message
- store_state(last);
+ need_refresh = store_state(last);
assert(g_conf->paxos_kill_at != 2);
dout(10) << "that's everyone. active!" << dendl;
extend_lease();
+ need_refresh = false;
if (do_refresh()) {
finish_round();
dout(10) << "old pn, ignoring" << dendl;
}
+ if (need_refresh)
+ do_refresh();
+
last->put();
}
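
The race from the commit message, and the effect of refreshing whenever state was stored, can be modeled with a small standalone sketch. It is not Ceph code: `DiskStore`, `ServiceView`, `run_scenario`, and this simplified `store_state` are hypothetical stand-ins for the on-disk store, a PaxosService's cached view, and the real `Paxos::store_state()`. The sketch only assumes that a service caches the range of map epochs it believes are on disk and serves subscriptions from that cached range.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical stand-in for the on-disk store: one entry per map epoch.
struct DiskStore {
  std::map<int, std::string> maps;

  void put(int epoch) { maps[epoch] = "map-" + std::to_string(epoch); }

  // Remove every epoch below new_first, as a committed trim would.
  void trim_below(int new_first) {
    maps.erase(maps.begin(), maps.lower_bound(new_first));
  }

  bool has(int epoch) const { return maps.count(epoch) != 0; }
};

// Hypothetical stand-in for a PaxosService: it caches the epoch range it
// believes is on disk and only updates that range in refresh().
struct ServiceView {
  int first = 0;
  int last = 0;

  void refresh(const DiskStore& store) {
    if (store.maps.empty()) {
      first = last = 0;
      return;
    }
    first = store.maps.begin()->first;
    last = store.maps.rbegin()->first;
  }

  // Serve a subscription starting at `from`, using the cached range.
  // Returns false when a map the view believes exists is missing on disk;
  // in the scenario from the commit message this is where the crash occurs.
  bool serve_subscribe(const DiskStore& store, int from) const {
    for (int e = std::max(from, first); e <= last; ++e) {
      if (!store.has(e))
        return false;
    }
    return true;
  }
};

// Simplified model of storing committed state that carries a trim.
// Returns true if anything changed on disk, mirroring the diff's
// `need_refresh = store_state(last);`.
bool store_state(DiskStore& store, int trim_to) {
  if (store.maps.empty() || trim_to <= store.maps.begin()->first)
    return false;
  store.trim_below(trim_to);
  return true;
}

// Replay the scenario: maps 10..20, a LAST that trims 10..15, then a
// subscribe for 10..20. Returns whether the subscribe was served safely.
bool run_scenario(bool refresh_after_store) {
  DiskStore store;
  for (int e = 10; e <= 20; ++e)
    store.put(e);

  ServiceView svc;
  svc.refresh(store);  // cached view: 10..20

  bool need_refresh = store_state(store, 16);  // disk now holds 16..20
  if (need_refresh && refresh_after_store)
    svc.refresh(store);  // cached view catches up: 16..20

  return svc.serve_subscribe(store, 10);
}
```

In this model, `run_scenario(false)` fails because the stale view still starts at epoch 10, while `run_scenario(true)` succeeds because the view is refreshed as soon as new state is stored, regardless of whether recovery has finished.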