When a PG is remapped from OSD `a` to OSD `b`, the objects are
backfilled. When OSD `a` is restarted, objects become degraded
as `a` is no longer queried or considered as a backfill source.
As the PG is degraded, `PG::discover_all_missing` is not called
when a candidate OSD peers with the primary: the PG is already
active, thus `PG::activate` (and in turn missing object discovery)
is not called. Discovery is also not initiated from
`PG::RecoveryState::Active::react(const MNotifyRec& notevt)`
as there are no unfound objects.
This patch adds a call to `discover_all_missing` when
an OSD sends its `MNotifyRec` message and the PG is degraded.
Fixes: https://tracker.ceph.com/issues/37439
Signed-off-by: Jonas Jelten <jj@stusta.net>
(cherry picked from commit e152d092f7b7839bb27ac7a5cf1c95f4d3752b32)
<< dendl;
pg->proc_replica_info(
notevt.from, notevt.notify.info, notevt.notify.epoch_sent);
- if (pg->have_unfound()) {
+ if (pg->have_unfound() || (pg->is_degraded() && pg->might_have_unfound.count(notevt.from))) {
pg->discover_all_missing(*context< RecoveryMachine >().get_query_map());
}
}
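
The effect of the changed condition can be illustrated in isolation. The following is only a minimal sketch, not Ceph code: `PgModel`, `should_discover_old` and `should_discover_new` are hypothetical names, and OSDs are identified by plain `int` ids here rather than by the shard type used in the real `might_have_unfound` set.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the PG state the patched condition consults.
// Assumption: OSDs are modelled as int ids for brevity.
struct PgModel {
  bool have_unfound = false;         // models pg->have_unfound()
  bool degraded = false;             // models pg->is_degraded()
  std::set<int> might_have_unfound;  // models pg->might_have_unfound
};

// Condition before the patch: discovery only runs if objects are unfound.
inline bool should_discover_old(const PgModel& pg, int /*from*/) {
  return pg.have_unfound;
}

// Condition after the patch: discovery also runs when the PG is degraded
// and the notifying OSD is one that might hold missing objects.
inline bool should_discover_new(const PgModel& pg, int from) {
  return pg.have_unfound ||
         (pg.degraded && pg.might_have_unfound.count(from) > 0);
}

// Usage sketch: a degraded PG with no unfound objects, notified by OSD 1.
inline void demo_checks() {
  PgModel pg;
  pg.degraded = true;
  pg.might_have_unfound.insert(1);
  assert(!should_discover_old(pg, 1));  // old condition skips discovery
  assert(should_discover_new(pg, 1));   // new condition triggers it
  assert(!should_discover_new(pg, 2));  // OSD 2 is not a candidate source
}
```

In this model, a notify from an OSD outside `might_have_unfound` still does not trigger discovery unless objects are unfound, which matches the extra `count(notevt.from)` check in the hunk above.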