From: Joao Eduardo Luis
Date: Thu, 17 Jan 2013 18:11:23 +0000 (+0000)
Subject: mon: Monitor: drop messages from old timecheck epochs
X-Git-Tag: v0.57~181
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=c6f8010b1c8e4d54f9fb24b2e4e25ff8a2bde778;p=ceph.git

mon: Monitor: drop messages from old timecheck epochs

We were asserting when the message's timecheck epoch (which is mapped to the
election epoch) was older than the current epoch. However, if a monitor is
lagged just enough to not even notice an election happened, then it might
eventually answer to old timechecks, which would make the leader assert.

Instead, we just drop the message, while warning we did so.

Fixes: #3835

Signed-off-by: Joao Eduardo Luis
Reviewed-by: Sage Weil
---

diff --git a/src/mon/Monitor.cc b/src/mon/Monitor.cc
index 9dee4003ee60..4b6f3e155edb 100644
--- a/src/mon/Monitor.cc
+++ b/src/mon/Monitor.cc
@@ -2343,9 +2343,16 @@ void Monitor::handle_timecheck_leader(MTimeCheck *m)
   dout(10) << __func__ << " " << *m << dendl;
   /* handles PONG's */
   assert(m->op == MTimeCheck::OP_PONG);
-  assert(m->epoch == timecheck_epoch);
 
   entity_inst_t other = m->get_source_inst();
+  if (m->epoch < timecheck_epoch) {
+    dout(1) << __func__ << " got old timecheck epoch " << m->epoch
+            << " from " << other
+            << " curr " << timecheck_epoch
+            << " -- severely lagged? discard" << dendl;
+    return;
+  }
+  assert(m->epoch == timecheck_epoch);
 
   if (m->round < timecheck_round) {
     dout(1) << __func__ << " got old round " << m->round
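
Note on the change: the ordering of checks described in the commit message (discard a stale reply first, assert equality only afterwards) can be sketched in isolation. This is a minimal standalone illustration, not Ceph code: `Pong`, `Verdict` and `classify_pong` are hypothetical names introduced here, and `std::clog` stands in for the monitor's `dout` logging.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical stand-in for a timecheck reply; not a Ceph type.
struct Pong {
  std::uint64_t epoch;  // election epoch the reply was generated for
  int sender;           // illustrative rank of the replying monitor
};

// Hypothetical result type, used only to make the outcome testable.
enum class Verdict { DropStale, Accept };

// Same order of checks as the patch: a reply from an older epoch is
// logged and discarded; only then is the equality invariant asserted.
Verdict classify_pong(const Pong& p, std::uint64_t current_epoch) {
  if (p.epoch < current_epoch) {
    std::clog << "got old timecheck epoch " << p.epoch
              << " from " << p.sender
              << " curr " << current_epoch
              << " -- discard\n";
    return Verdict::DropStale;
  }
  // A reply claiming a newer epoch than ours is still treated as a bug.
  assert(p.epoch == current_epoch);
  return Verdict::Accept;
}
```

With a current epoch of 5, a reply carrying epoch 3 is dropped with a warning instead of aborting the process, while a reply carrying epoch 5 is accepted. A reply from a newer epoch would still trip the assertion, which matches the `assert` that the patch keeps after the new early return.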