osd: Optimized EC missing list not updated on recovering shard

Shards that are recovering (last_complete != last_update) are using
pwlc to advance last_update for writes that did not affect the shard.
However, simply incrementing last_update means that the primary doesn't
send the shard the log entries that it missed, and consequently the
shard cannot update its missing list.

If the shard is already missing object X at version V1 and there was
a partial write at V2 that did not update the shard, it does not need
to retain the log entry, but it does need to update the missing list
to say it needs V2 rather than V1. This ensures all shards report
a need for an object at the same version and avoids an assert in
MissingLoc::add_active_missing when the primary is trying to
combine the missing lists from all the shards to work out what has
to be recovered.

The fix is to avoid applying pwlc when last_complete != last_update.
This forces the primary to send the log to the recovering shard,
which can then update its missing list and discard the log entries,
since they are partial writes.
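
For illustration only, the behaviour described above can be sketched as a
toy Python model. This is not the Ceph code: the Shard class, the
advance_past_partial_writes helper and the use of plain integers as
versions are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy stand-in for one EC shard's log state (hypothetical, not a Ceph type)."""
    last_update: int
    last_complete: int
    # Object name -> version the shard needs to recover.
    missing: dict = field(default_factory=dict)

    def is_recovering(self) -> bool:
        return self.last_complete != self.last_update


def advance_past_partial_writes(shard, entries):
    """Advance a shard past partial writes that did not touch it.

    entries is a list of (version, object_name) pairs in version order.
    """
    if not entries:
        return
    if not shard.is_recovering():
        # Clean shard: the pwlc-style shortcut is safe, no log is needed.
        # In this toy model a clean shard simply stays clean.
        shard.last_update = shard.last_complete = entries[-1][0]
        return
    # Recovering shard: skip the shortcut and walk the log entries instead.
    for version, obj in entries:
        if obj in shard.missing:
            # Already missing this object: it now needs the newer version.
            shard.missing[obj] = version
        # The entry itself is not retained, only last_update moves on.
        shard.last_update = version


shard = Shard(last_update=1, last_complete=0, missing={"X": 1})
advance_past_partial_writes(shard, [(2, "X")])
print(shard.missing)
```

In this model the recovering shard ends up needing X at version 2, which
matches what every other shard would report for the same object.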

Fixes: https://tracker.ceph.com/issues/73249
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>