git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
rgw/dedup: This PR extends the RGW dedup split-head feature to support objects that... 68113/head
author benhanokh <gbenhano@redhat.com>
Mon, 30 Mar 2026 08:22:51 +0000 (11:22 +0300)
committer benhanokh <gbenhano@redhat.com>
Thu, 16 Apr 2026 08:24:10 +0000 (11:24 +0300)
commit 7b72f06c992beefe5cf5994a87250c317cc90155
tree 86cef819ff1d941449857230e21c6b8fd5b8a62f
parent a67aa9b0055a7af30bc74ac4b811137089d00579
rgw/dedup: This PR extends the RGW dedup split-head feature to support objects that already have tail RADOS objects (i.e. objects larger than the head chunk size).
Previously, split-head was restricted to objects whose entire data fit in the head (≤4 MiB).
It also migrates the split-head manifest representation from the legacy explicit-objs format to the prefix+index rules-based format.

Refactored should_split_head():
Now performs a larger set of eligibility checks:
 * d_split_head flag is set
 * single-part object only
 * non-empty head
 * not a legacy manifest
 * not an Alibaba Cloud OSS AppendObject

Explicit skips for unsupported manifest types:
 * old-style explicit-objs manifests
 * OSS AppendObject manifests (detected via non-empty override_prefix)
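
The eligibility checks and explicit skips listed above can be read as one predicate. The following is only a minimal sketch of that reading: the struct, its field names, and the function name should_split_head_sketch() are hypothetical stand-ins and do not reflect the actual RGW types or the real should_split_head() signature.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified view of what the checks inspect.
// Field names are illustrative and are NOT the actual RGW types.
struct ObjInfoSketch {
  bool is_multipart = false;     // true for multipart uploads
  uint64_t head_size = 0;        // bytes stored in the head RADOS object
  bool explicit_objs = false;    // legacy explicit-objs manifest
  std::string override_prefix;   // non-empty => OSS AppendObject manifest
};

// Returns true only when every check from the commit message passes.
// split_head_enabled stands in for the d_split_head flag.
bool should_split_head_sketch(bool split_head_enabled,
                              const ObjInfoSketch& o) {
  if (!split_head_enabled) return false;         // feature switched off
  if (o.is_multipart) return false;              // single-part objects only
  if (o.head_size == 0) return false;            // head must hold data
  if (o.explicit_objs) return false;             // legacy manifest: skip
  if (!o.override_prefix.empty()) return false;  // OSS AppendObject: skip
  return true;
}
```

Ordering the checks from cheapest to most specific keeps the early exits trivial; the real code may order or combine them differently.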

New config option rgw_dedup_split_obj_head:
  Default is true (split-head enabled).
  Setting to false disables split-head entirely.

Tail object lookup via manifest iterator:
  Replaces the old get_tail_ioctx() which manually constructed the tail OID via generate_split_head_tail_name().
  The new function simply calls manifest.obj_begin() and resolves the first tail object location through the standard manifest iterator.
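
To illustrate why a rules-based manifest makes the manual OID construction unnecessary, here is a toy model in which tail names are derived from a shared prefix plus an index rather than being stored one by one. Everything in it is an assumption made for illustration: the ToyManifest type, the numbering of tails from 1, and the prefix + index naming scheme are hypothetical and are not the RGW manifest layout or its iterator API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy rules-based manifest (hypothetical, not the RGW layout).
// Tail names come from a rule (prefix + index) instead of an explicit list.
struct ToyManifest {
  uint64_t obj_size = 0;     // total logical object size in bytes
  uint64_t head_size = 0;    // bytes held by the head object
  uint64_t stripe_size = 0;  // bytes per tail object
  std::string prefix;        // shared prefix for all tail names

  // Number of tail objects needed for the data that is not in the head.
  uint64_t tail_count() const {
    if (obj_size <= head_size || stripe_size == 0) return 0;
    const uint64_t tail_bytes = obj_size - head_size;
    return (tail_bytes + stripe_size - 1) / stripe_size;  // ceiling division
  }

  // Name of the tail object at the given index (toy naming scheme).
  std::string tail_name(uint64_t index) const {
    return prefix + std::to_string(index);
  }
};

// Resolve the first tail object from the manifest's own rule.
// Returns an empty string when the object has no tail.
std::string first_tail_name(const ToyManifest& m) {
  if (m.tail_count() == 0) return std::string();
  return m.tail_name(1);  // tails are numbered from 1 in this toy model
}
```

Because the name is derived from the manifest's rule, the caller never needs to know how tail names are generated, which is the property the iterator-based lookup relies on.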

Stats cleanup:
  Removed the "Potential Dedup" stats section (small_objs_stat, dup_head_bytes, dup_head_bytes_estimate, ingress_skip_too_small_64KB*), which tracked 64KB–4MB objects as potential-but-skipped candidates.
  Since split-head now covers all sizes, this distinction is no longer meaningful. calc_deduped_bytes() is simplified accordingly.

Signed-off-by: benhanokh <gbenhano@redhat.com>
13 files changed:
doc/radosgw/s3_objects_dedup.rst
src/common/options/rgw.yaml.in
src/rgw/driver/rados/rgw_dedup.cc
src/rgw/driver/rados/rgw_dedup.h
src/rgw/driver/rados/rgw_dedup_cluster.cc
src/rgw/driver/rados/rgw_dedup_table.cc
src/rgw/driver/rados/rgw_dedup_table.h
src/rgw/driver/rados/rgw_dedup_utils.cc
src/rgw/driver/rados/rgw_dedup_utils.h
src/rgw/driver/rados/rgw_obj_manifest.cc
src/rgw/driver/rados/rgw_obj_manifest.h
src/rgw/rgw_obj_manifest.cc
src/test/rgw/dedup/test_dedup.py