Consider a PG that is a stray and ends up in ReplicaActive (because it is
participating as a recovery source). If the OSD is wrongly marked down and
then comes back up, the PG will not reset, because there was no interval
change (the OSD is not part of the up or acting sets).
This can leave the PG in an odd state and lead to questionable behavior.
(For example, a stray in ReplicaActive might ignore some types of query
messages.) Fix this by also resetting when this OSD goes from down to up.
Signed-off-by: Sage Weil <sage@redhat.com>
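
For illustration only, the decision this change makes can be sketched outside of the Ceph tree. This is a minimal sketch under stated assumptions, not the actual PG code: MiniOSDMap and should_reset_peering are hypothetical names, and the interval_changed flag stands in for the existing up/acting set comparison.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for an OSD map: it only tracks which OSD ids are up.
struct MiniOSDMap {
  std::set<int> up_osds;
  bool is_up(int osd) const { return up_osds.count(osd) > 0; }
};

// Returns true if peering should restart for this map change.
// interval_changed is an assumed input representing the result of the
// existing up/acting set comparison.
bool should_reset_peering(const MiniOSDMap& lastmap, const MiniOSDMap& osdmap,
                          int whoami, bool interval_changed) {
  assert(whoami >= 0);
  if (interval_changed)
    return true;
  // Additional case: this OSD was down in the previous map and is up in
  // the new one, even though the up/acting sets did not change.
  if (!lastmap.is_up(whoami) && osdmap.is_up(whoami))
    return true;
  return false;
}
```

In this sketch, a stray OSD that is absent from lastmap's up set and present in osdmap's up set triggers a reset even when interval_changed is false, which is the case the message above describes.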
dout(20) << "new interval newup " << newup
<< " newacting " << newacting << dendl;
return true;
- } else {
- return false;
}
+ if (!lastmap->is_up(osd->whoami) && osdmap->is_up(osd->whoami)) {
+ dout(10) << __func__ << " osd transitioned from down -> up" << dendl;
+ return true;
+ }
+ return false;
}
bool PG::old_peering_msg(epoch_t reply_epoch, epoch_t query_epoch)