From: Shilpa Jagannath
Date: Mon, 8 Dec 2025 17:48:53 +0000 (-0500)
Subject: rgw/reshard: bucket reshard may race with bucket index update.
X-Git-Tag: v21.0.0~311^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=e34067f8ceab85d9eb97eecd7d6f753e82747930;p=ceph.git

rgw/reshard: bucket reshard may race with bucket index update.

guard_reshard() serves as a wrapper to protect bucket index operations
against concurrent resharding, but it only handles -ERR_BUSY_RESHARDING.
If a bucket index update is already in progress with a bucket shard
assigned, and a concurrent resharding operation deletes this old shard
object, there may be a window in which operations targeting the old
shards fail with -ENOENT. This error is not caught anywhere in the
calling functions. As a result, we could end up with objects in an
inconsistent state.

Signed-off-by: Shilpa Jagannath
---

diff --git a/src/rgw/driver/rados/rgw_rados.cc b/src/rgw/driver/rados/rgw_rados.cc
index 4e861f937767..a1dac761cd9d 100644
--- a/src/rgw/driver/rados/rgw_rados.cc
+++ b/src/rgw/driver/rados/rgw_rados.cc
@@ -7974,6 +7974,12 @@ int RGWRados::Bucket::UpdateIndex::guard_reshard(const DoutPrefixProvider *dpp,
     }
 
     r = call(bs);
+    if (r == -ENOENT) {
+      ldpp_dout(dpp, 10) << "ENOENT in guard_reshard(), likely bucket resharding, retrying" << dendl;
+      invalidate_bs();
+      continue;
+    }
+
     if (r != -ERR_BUSY_RESHARDING) {
       break;
     }
@@ -8678,6 +8684,10 @@ int RGWRados::guard_reshard(const DoutPrefixProvider *dpp,
     }
 
     r = call(bs);
+    if (r == -ENOENT) {
+      ldpp_dout(dpp, 10) << "ENOENT in guard_reshard(), likely bucket resharding, retrying" << dendl;
+      continue;
+    }
     if (r != -ERR_BUSY_RESHARDING) {
       break;
     }
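
The retry behaviour described in the commit message can be illustrated in isolation. What follows is a minimal sketch, not the RGW implementation: ShardRef, guard_reshard_sketch, the refresh callback, and both constant values are hypothetical stand-ins, and the sketch folds the -ENOENT and -ERR_BUSY_RESHARDING paths into a single "re-resolve the shard and try again" step, which the real code handles separately.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>

// Hypothetical stand-ins; the real definitions live in the RGW sources.
constexpr int ERR_BUSY_RESHARDING = 2300;  // placeholder value (assumption)
constexpr int NUM_RESHARD_RETRIES = 10;    // placeholder retry budget (assumption)

// Models a cached reference to a bucket index shard.
struct ShardRef {
  int generation = 0;  // index layout this reference was resolved against
};

// Run `call` against the shard. If it reports that the shard object is gone
// (-ENOENT) or that a reshard is in progress (-ERR_BUSY_RESHARDING), ask
// `refresh` to re-resolve the shard and try again, up to `max_attempts` times.
// Any other result, including success, ends the loop immediately.
int guard_reshard_sketch(ShardRef& bs,
                         const std::function<void(ShardRef&)>& refresh,
                         const std::function<int(const ShardRef&)>& call,
                         int max_attempts = NUM_RESHARD_RETRIES)
{
  assert(max_attempts > 0);
  int r = 0;
  for (int i = 0; i < max_attempts; ++i) {
    r = call(bs);
    if (r != -ENOENT && r != -ERR_BUSY_RESHARDING) {
      break;  // success or an unrelated error: stop retrying
    }
    refresh(bs);  // drop the stale shard reference before the next attempt
  }
  return r;  // still the retryable error if the budget ran out
}
```

With a call that returns -ENOENT for a stale generation and 0 once the reference has been refreshed, the helper is expected to succeed on the second attempt. Without the -ENOENT check, the first failure would be returned to the caller unchanged, which is the gap the patch closes.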