git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ci.git/commit
osd/ECUtil: Fix erase_after_ro_offset length calculation and add tests
author Alex Ainscow <aainscow@uk.ibm.com>
Fri, 2 Jan 2026 18:47:37 +0000 (18:47 +0000)
committer Alex Ainscow <aainscow@uk.ibm.com>
Tue, 3 Feb 2026 10:02:32 +0000 (10:02 +0000)
commit 09c8fbd2e8d1c92da0b5f9e4dfff1026785e3118
tree 08372daf612ab59bc4d93d2e530e7df9f0142d13
parent afb4610d03b7406df2564062de2cbcccae2dcf28

 System test logs showed EC recovery failures with assertion errors when
 recovering small objects (smaller than stripe width) in EC pools.
 The recovery would fail with "shard_size >= tobj_size" assertions
 because shards that should be empty incorrectly contained data.

 The primary change in this commit fixes a bug in
 shard_extent_map_t::erase_after_ro_offset() where the length
 calculation was incorrect:

   sinfo->ro_range_to_shard_extent_set(ro_offset, ro_end - ro_start, ...)

 Should be:

   sinfo->ro_range_to_shard_extent_set(ro_offset, ro_end - ro_offset, ...)

 When ro_offset < ro_start, the incorrect calculation produced a length
 that was too short (ro_end - ro_start is smaller than ro_end - ro_offset),
 so data that should have been erased remained on shards, leading to
 recovery failures.
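
 The arithmetic behind the two expressions can be illustrated with a small
 standalone sketch. It is not the Ceph implementation: RoRange,
 buggy_erase_range, fixed_erase_range and leftover_bytes are hypothetical
 names introduced here, and the sketch only models the (offset, length)
 pair passed to ro_range_to_shard_extent_set(), not the shard mapping.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the (offset, length) pair handed to
// ro_range_to_shard_extent_set(); this is not a Ceph type.
struct RoRange {
  uint64_t offset;
  uint64_t length;
  uint64_t end() const { return offset + length; }
};

// Length as written before the fix: the distance between ro_start and
// ro_end, regardless of where the erase actually begins.
inline RoRange buggy_erase_range(uint64_t ro_start, uint64_t ro_end,
                                 uint64_t ro_offset) {
  assert(ro_start <= ro_end);
  return RoRange{ro_offset, ro_end - ro_start};
}

// Length as written after the fix: everything from ro_offset up to ro_end.
inline RoRange fixed_erase_range(uint64_t ro_end, uint64_t ro_offset) {
  assert(ro_offset <= ro_end);
  return RoRange{ro_offset, ro_end - ro_offset};
}

// Bytes in [range.end(), ro_end) that the erase range fails to cover.
inline uint64_t leftover_bytes(const RoRange& range, uint64_t ro_end) {
  return range.end() < ro_end ? ro_end - range.end() : 0;
}
```

 With illustrative values ro_start = 4096, ro_end = 12288 and ro_offset = 0,
 the pre-fix expression yields a length of 8192, so the range stops at 8192
 and 4096 bytes before ro_end are left in place. The fixed expression yields
 a length of 12288 and leaves nothing behind.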

 Additionally, this commit adds 13 unit tests to TestECUtil that exercise
 erase_after_ro_offset across various edge cases, including the critical
 scenario of objects smaller than stripe width where some shards should
 remain empty. These tests fail when the bug is re-introduced.

 Note: The unit tests in this commit were generated with assistance from
 an LLM (Large Language Model) and subsequently validated and refined.

Fixes: https://tracker.ceph.com/issues/74329
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit a60c86ae0a8db3f37832de779341d1df7510d9dd)
src/osd/ECUtil.cc
src/test/osd/TestECUtil.cc