osd/ECUtil: Fix erase_after_ro_offset length calculation and add tests 66817/head
author    Alex Ainscow <aainscow@uk.ibm.com>
          Fri, 2 Jan 2026 18:47:37 +0000 (18:47 +0000)
committer Alex Ainscow <aainscow@uk.ibm.com>
          Thu, 8 Jan 2026 14:01:03 +0000 (14:01 +0000)
commit    a60c86ae0a8db3f37832de779341d1df7510d9dd
tree      faa83acbe15cff0add566e3d66b2fb4c8228df93
parent    284ce91772f67195e8c2070148c6add27510e42d

 System test logs showed EC recovery failures with assertion errors when
 recovering small objects (smaller than stripe width) in EC pools.
 The recovery would fail with "shard_size >= tobj_size" assertions
 because shards that should be empty incorrectly contained data.

 The primary change in this commit fixes a bug in
 shard_extent_map_t::erase_after_ro_offset() where the length
 calculation was incorrect:

   sinfo->ro_range_to_shard_extent_set(ro_offset, ro_end - ro_start, ...)

 Should be:

   sinfo->ro_range_to_shard_extent_set(ro_offset, ro_end - ro_offset, ...)

 When ro_offset < ro_start, the incorrect length was too short by
 ro_start - ro_offset, so the erased range ended before ro_end. Data that
 should have been erased remained on shards, leading to recovery failures.
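
 The length arithmetic can be illustrated with a minimal standalone sketch.
 It does not use any Ceph types: RoRange and the three helper functions are
 hypothetical names introduced only for this illustration, and the
 precondition ro_offset <= ro_end is an assumption of the sketch, not a
 statement about the real function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a read-only (RO) byte range; not a Ceph type.
struct RoRange {
  uint64_t offset;
  uint64_t length;
  uint64_t end() const { return offset + length; }
};

// Range as computed before the fix: the length is measured from ro_start
// even though the range begins at ro_offset.
RoRange erase_range_before_fix(uint64_t ro_offset, uint64_t ro_start,
                               uint64_t ro_end) {
  return {ro_offset, ro_end - ro_start};
}

// Range as computed after the fix: the length is measured from ro_offset,
// so the range always ends exactly at ro_end.
RoRange erase_range_after_fix(uint64_t ro_offset, uint64_t ro_end) {
  assert(ro_offset <= ro_end);  // assumption of this sketch
  return {ro_offset, ro_end - ro_offset};
}

// Number of bytes before ro_end that the given range fails to cover.
uint64_t bytes_left_unerased(const RoRange& r, uint64_t ro_end) {
  return r.end() < ro_end ? ro_end - r.end() : 0;
}
```

 With ro_start = 4096, ro_end = 12288 and ro_offset = 1024, the pre-fix range
 ends at 9216 and leaves 3072 bytes (ro_start - ro_offset) unerased, while the
 post-fix range ends at 12288 and leaves none.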

 Additionally, this commit adds 13 unit tests to TestECUtil that exercise
 erase_after_ro_offset across a range of edge cases, including the critical
 scenario of objects smaller than the stripe width, where some shards
 should remain empty. These tests fail when the bug is re-introduced.
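
 To see why a small object leaves some shards empty, here is a minimal
 sketch that assumes a simple round-robin layout: data is written in
 chunk_size units to k data shards in turn. The function
 data_bytes_per_shard is hypothetical, and the real mapping in ECUtil may
 differ in detail (for example, shard ordering and parity shards are
 ignored here).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes of object data held by each of k data shards
// when an object of object_size bytes is striped in chunk_size units.
// Assumes chunk_size > 0 and k > 0.
std::vector<uint64_t> data_bytes_per_shard(uint64_t object_size,
                                           uint64_t chunk_size,
                                           unsigned k) {
  std::vector<uint64_t> bytes(k, 0);
  uint64_t offset = 0;  // always chunk-aligned at the top of the loop
  while (offset < object_size) {
    // Round-robin: chunk index modulo the number of data shards.
    unsigned shard = static_cast<unsigned>((offset / chunk_size) % k);
    // The final chunk may be partial.
    uint64_t n = std::min(chunk_size, object_size - offset);
    bytes[shard] += n;
    offset += n;
  }
  return bytes;
}
```

 For k = 4 and chunk_size = 4096 the stripe width is 16384 bytes. A
 6000-byte object then places 4096 bytes on shard 0 and 1904 bytes on
 shard 1, and shards 2 and 3 hold no data.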

 Note: The unit tests in this commit were generated with assistance from
 an LLM (Large Language Model) and subsequently validated and refined.

Fixes: https://tracker.ceph.com/issues/74329
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
src/osd/ECUtil.cc
src/test/osd/TestECUtil.cc