From: Samuel Just
Date: Mon, 16 Jul 2012 20:11:24 +0000 (-0700)
Subject: PG::RecoveryState::Stray::react(LogEvt&): reset last_pg_scrub
X-Git-Tag: v0.49~10
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=c7fb964c077d369943bd5c066c5f99da6bd5f37c;p=ceph.git

PG::RecoveryState::Stray::react(LogEvt&): reset last_pg_scrub

We need to reset the last_pg_scrub data in the osd since we are
replacing the info.

Probably fixes #2453

In cases like 2453, we hit the following backtrace:

     0> 2012-05-19 17:24:09.113684 7fe66be3d700 -1 osd/OSD.h: In function 'void OSD::unreg_last_pg_scrub(pg_t, utime_t)' thread 7fe66be3d700 time 2012-05-19 17:24:09.095719
osd/OSD.h: 840: FAILED assert(last_scrub_pg.count(p))

 ceph version 0.46-313-g4277d4d (commit:4277d4d3378dde4264e2b8d211371569219c6e4b)
 1: (OSD::unreg_last_pg_scrub(pg_t, utime_t)+0x149) [0x641f49]
 2: (PG::proc_primary_info(ObjectStore::Transaction&, pg_info_t const&)+0x5e) [0x63383e]
 3: (PG::RecoveryState::ReplicaActive::react(PG::RecoveryState::MInfoRec const&)+0x4a) [0x633eda]
 4: (boost::statechart::detail::reaction_result boost::statechart::simple_state, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl, boost::statechart::custom_reaction, boost::statechart::custom_reaction >, boost::statechart::simple_state, (boost::statechart::history_mode)0> >(boost::statechart::simple_state, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0x130) [0x6466a0]
 5: (boost::statechart::simple_state, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x81) [0x646791]
 6: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x63dfcb]
 7: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x63e0f1]
 8: (PG::RecoveryState::handle_info(int, pg_info_t&, PG::RecoveryCtx*)+0x177) [0x616987]
 9: (OSD::handle_pg_info(std::tr1::shared_ptr)+0x665) [0x5d3d15]
 10: (OSD::dispatch_op(std::tr1::shared_ptr)+0x2a0) [0x5d7370]
 11: (OSD::_dispatch(Message*)+0x191) [0x5dd4a1]
 12: (OSD::ms_dispatch(Message*)+0x153) [0x5ddda3]
 13: (SimpleMessenger::dispatch_entry()+0x863) [0x77fbc3]
 14: (SimpleMessenger::DispatchThread::entry()+0xd) [0x746c5d]
 15: (()+0x7efc) [0x7fe679b1fefc]
 16: (clone()+0x6d) [0x7fe67815089d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Because we don't clear the scrub state before resetting info, the
last_scrub_stamp state in the info.history structure changes
without updating the osd state, resulting in the above assert failure.

Backport: stable
Signed-off-by: Samuel Just
---

diff --git a/src/osd/PG.cc b/src/osd/PG.cc
index 14d415b6e47bd..678b43519e22e 100644
--- a/src/osd/PG.cc
+++ b/src/osd/PG.cc
@@ -4475,7 +4475,11 @@ boost::statechart::result PG::RecoveryState::Stray::react(const MLogRec& logevt)
 
   if (msg->info.last_backfill == hobject_t()) {
     // restart backfill
+    pg->osd->unreg_last_pg_scrub(pg->info.pgid,
+                                 pg->info.history.last_scrub_stamp);
     pg->info = msg->info;
+    pg->osd->reg_last_pg_scrub(pg->info.pgid,
+                               pg->info.history.last_scrub_stamp);
     pg->log.claim_log(msg->log);
     pg->missing.clear();
   } else {
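
The invariant this patch restores can be illustrated with a small standalone model. This is only a sketch, not the Ceph code: `ToyOsd`, `PgInfo`, `PgId`, `Stamp`, `replace_info_unsafe` and `replace_info_safe` are hypothetical names, the registry is assumed to be a set keyed by (stamp, pgid), and `unreg_last_pg_scrub` returns `false` where the real function would hit `assert(last_scrub_pg.count(p))`.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-ins for pg_t and utime_t; the real types live in Ceph.
using PgId = int;
using Stamp = long;

// Hypothetical, flattened stand-in for pg_info_t (the real field is
// info.history.last_scrub_stamp).
struct PgInfo {
  PgId pgid;
  Stamp last_scrub_stamp;
};

// Toy model of the OSD-side scrub registry, assumed to be keyed by
// (stamp, pgid).
struct ToyOsd {
  std::set<std::pair<Stamp, PgId>> last_scrub_pg;

  void reg_last_pg_scrub(PgId pgid, Stamp t) {
    last_scrub_pg.insert(std::make_pair(t, pgid));
  }

  // Returns false in the situation where the real code would fail its
  // assert, i.e. when the (stamp, pgid) entry is not registered.
  bool unreg_last_pg_scrub(PgId pgid, Stamp t) {
    return last_scrub_pg.erase(std::make_pair(t, pgid)) == 1;
  }
};

// Pre-patch behaviour: info is replaced, the registry still holds the
// entry for the old stamp.
void replace_info_unsafe(PgInfo& info, const PgInfo& incoming) {
  info = incoming;
}

// Patched behaviour: unregister under the old stamp, replace info,
// then register under the new stamp.
bool replace_info_safe(ToyOsd& osd, PgInfo& info, const PgInfo& incoming) {
  bool found = osd.unreg_last_pg_scrub(info.pgid, info.last_scrub_stamp);
  info = incoming;
  osd.reg_last_pg_scrub(info.pgid, info.last_scrub_stamp);
  return found;
}
```

In this model, a later `unreg_last_pg_scrub(info.pgid, info.last_scrub_stamp)` succeeds after `replace_info_safe` but finds nothing after `replace_info_unsafe`, which corresponds to the failed assert in frame 1 of the backtrace.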