There is a race when we repair an object while the PG is in the clean state:
we queue a DoRecovery peering event, but before that peering event is
dequeued, a snaptrim event on the missing object's snap can be dequeued first.
The snaptrim then gets past context< SnapTrimmer >().can_trim()
and goes on to fetch the context of the missing object (snapdir).
We can avoid this by clearing the clean state when we find the object missing.
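
The ordering problem can be illustrated with a small toy model. This is only
a sketch under stated assumptions: ToyPG, Event, on_repair, process and
run_race are hypothetical names invented for illustration, not Ceph's real
types, and the single `clean` flag merely stands in for PG_STATE_CLEAN.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy model only: hypothetical stand-ins, not Ceph's real classes.
enum class Event { DoRecovery, SnapTrim };

struct ToyPG {
  bool clean = true;             // stands in for PG_STATE_CLEAN
  bool object_missing = false;   // the object under repair
  std::deque<Event> peering_q;   // queued peering events
  std::vector<std::string> log;  // record of what each event did

  // Stand-in for the can_trim() gate: trimming requires a clean PG.
  bool can_trim() const { return clean; }

  // Repair path: mark the object missing and queue recovery.
  // clear_clean == true models the behaviour with this fix applied.
  void on_repair(bool clear_clean) {
    object_missing = true;
    if (clear_clean) clean = false;
    peering_q.push_back(Event::DoRecovery);
  }

  void process(Event e) {
    if (e == Event::SnapTrim) {
      if (!can_trim()) { log.push_back("trim skipped"); return; }
      log.push_back(object_missing ? "trim hit missing object" : "trim ok");
    } else {  // DoRecovery
      object_missing = false;
      log.push_back("recovered");
    }
  }
};

// Replays the racy ordering: snaptrim is processed before DoRecovery.
std::vector<std::string> run_race(bool clear_clean) {
  ToyPG pg;
  pg.on_repair(clear_clean);
  pg.process(Event::SnapTrim);         // snaptrim wins the race
  pg.process(pg.peering_q.front());    // DoRecovery runs afterwards
  pg.peering_q.pop_front();
  return pg.log;
}
```

In this model, run_race(false) lets the trim reach the missing object, while
run_race(true) makes the can_trim() gate reject the trim until recovery has run.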
Fixes: https://tracker.ceph.com/issues/41348
Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
(cherry picked from commit 521f095c6505bbee7570fb3c01b32436bdbf65a4)
Conflicts:
    src/osd/PrimaryLogPG.cc
    - assert() instead of ceph_assert(), and feature PR #26942 ("Improvements to
      auto repair") is not being backported
       if (!eio_errors_to_process) {
         eio_errors_to_process = true;
         assert(is_clean());
+        state_clear(PG_STATE_CLEAN);
         queue_peering_event(
           CephPeeringEvtRef(
             std::make_shared<CephPeeringEvt>(