From: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Date: Wed, 16 Jul 2025 08:21:20 +0000 (+0700)
Subject: doc/radosgw: Small improvements in s3_objects_dedup.rst
X-Git-Tag: testing/wip-vshankar-testing-20250721.155223-debug~45^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=49c0fd0d136d1d37d08cfc1e96ddf48c74474ec1;p=ceph-ci.git

doc/radosgw: Small improvements in s3_objects_dedup.rst

Fix sentence that had "different same" to just "different" (verified the
right one from the original author).
Remove colon at the end of section titles.
Remove rendered horizontal lines between sections.
Use double backticks for command name.
Use regular apostrophe in one sentence to be consistent with the rest.
Add missing full stop at the end of several sentences.
Very small language improvements in a few sentences.
Use consistent indent in one line.
Remove hyphens from many word pairs and don't capitalize few terms. For
consistency with rest of the docs.
Fix typos "spliting" to "splitting", "underlined" to "underlying".
Spell out "thousands" instead of using an apostrophe after the number.
Reformat table to use row separators like rest of the docs instead of
empty columns.
Separate number and unit with a space.
Remove rendered underscores that seemed to be an attempt to imprecisely
align cell contents to right.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
---

diff --git a/doc/radosgw/s3_objects_dedup.rst b/doc/radosgw/s3_objects_dedup.rst
index ed54c61c78e..990806cedb7 100644
--- a/doc/radosgw/s3_objects_dedup.rst
+++ b/doc/radosgw/s3_objects_dedup.rst
@@ -1,79 +1,78 @@
-======================
-Full RGW Object Dedup:
-======================
-Add a radosgw-admin command to collect and report deduplication stats
+=====================
+Full RGW Object Dedup
+=====================
+Adds a ``radosgw-admin`` command to collect and report deduplication stats.
 
-.. note:: This utility doesn’t perform dedup and doesn’t make any
+.. note:: This utility doesn't perform dedup and doesn't make any
           change to the existing system and will only collect
           statistics and report them.
 
------
-
-***************
-Admin commands:
-***************
+**************
+Admin commands
+**************
 - ``radosgw-admin dedup stats``:
-   Collects & displays last dedup statistics
+   Collects & displays last dedup statistics.
 - ``radosgw-admin dedup pause``:
-   Pauses active dedup session (dedup resources are not released)
+   Pauses an active dedup session (dedup resources are not released).
 - ``radosgw-admin dedup resume``:
-   Resumes a paused dedup session
+   Resumes a paused dedup session.
 - ``radosgw-admin dedup abort``:
-   Aborts active dedup session and release all resources used by it
+   Aborts an active dedup session and release all resources used by it.
 - ``radosgw-admin dedup estimate``
-   Starts a new dedup estimate session (aborting first existing session if exists)
-
------
+   Starts a new dedup estimate session (aborting first existing session if exists).
 
-****************
-Skipped Objects:
-****************
-Dedup Estimates skips the following objects:
+***************
+Skipped Objects
+***************
+Dedup Estimate process skips the following objects:
 
-- Objects smaller than 4MB (unless they are multipart)
-- Objects with different placement rules
-- Objects with different pools
-- Objects with different same storage-classes
+- Objects smaller than 4MB (unless they are multipart).
+- Objects with different placement rules.
+- Objects with different pools.
+- Objects with different storage classes.
 
-The Dedup process itself (which will be released later) will also skip
+The dedup process itself (which will be released in a later Ceph release) will also skip
 **compressed** and **user-encrypted** objects, but the estimate
 process will accept them (since we don't have access to that
-information during the estimate process)
+information during the estimate process).
 
------
-
-********************
-Estimate Processing:
-********************
+*******************
+Estimate Processing
+*******************
 The Dedup Estimate process collects all the needed information directly from
-the bucket-indices reading one full bucket-index object with 1000's of
+the bucket indices reading one full bucket index object with thousands of
 entries at a time.
 
-The Bucket-Indices objects are sharded between the participating
-members so every bucket-index object is read exactly one time.
-The sharding allow processing to scale almost linearly spliting the
+The bucket indices objects are sharded between the participating
+members so every bucket index object is read exactly one time.
+The sharding allow processing to scale almost linearly splitting the
 load evenly between the participating members.
 
 The Dedup Estimate process does not access the objects themselves
 (data/metadata) which means its processing time won't be affected by
-the underlined media storing the objects (SSD/HDD) since the bucket-indices are
+the underlying media storing the objects (SSD/HDD) since the bucket indices are
 virtually always stored on a fast medium (SSD with heavy memory
-caching)
-
------
+caching).
 
-*************
-Memory Usage:
-*************
- +---------------++-----------+
- | RGW Obj Count |  Memory    |
- +===============++===========+
- | | ____1M      | | ___8MB   |
- | | ____4M      | | __16MB   |
- | | ___16M      | | __32MB   |
- | | ___64M      | | __64MB   |
- | | __256M      | | _128MB   |
- | | _1024M( 1G) | | _256MB   |
- | | _4096M( 4G) | | _512MB   |
- | | 16384M(16G) | | 1024MB   |
- +---------------+------------+
+************
+Memory Usage
+************
+ +---------------+----------+
+ | RGW Obj Count |  Memory  |
+ +===============+==========+
+ | 1M            | 8 MB     |
+ +---------------+----------+
+ | 4M            | 16 MB    |
+ +---------------+----------+
+ | 16M           | 32 MB    |
+ +---------------+----------+
+ | 64M           | 64 MB    |
+ +---------------+----------+
+ | 256M          | 128 MB   |
+ +---------------+----------+
+ | 1024M (1G)    | 256 MB   |
+ +---------------+----------+
+ | 4096M (4G)    | 512 MB   |
+ +---------------+----------+
+ | 16384M (16G)  | 1024 MB  |
+ +---------------+----------+
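
Reviewer note on the Memory Usage table: the eight rows follow a regular pattern, with memory doubling each time the object count quadruples. That is consistent with 8 MB multiplied by the square root of the object count in millions. The sketch below only checks that observation against the table rows. The formula and the helper name `estimated_memory_mb` are assumptions made for illustration; the patch does not state a formula, and nothing here is taken from the RGW source.

```python
import math

# Rows copied from the Memory Usage table in the patch:
# (RGW object count in millions, memory in MB).
TABLE = [
    (1, 8),
    (4, 16),
    (16, 32),
    (64, 64),
    (256, 128),
    (1024, 256),
    (4096, 512),
    (16384, 1024),
]


def estimated_memory_mb(objects_millions: float) -> float:
    """Hypothetical helper: 8 MB * sqrt(object count in millions).

    This is a curve fitted to the table rows, not a documented formula.
    """
    return 8.0 * math.sqrt(objects_millions)


# Every table row agrees with the fitted curve.
for count_millions, memory_mb in TABLE:
    assert math.isclose(estimated_memory_mb(count_millions), memory_mb)
```

Values between or beyond the listed rows would be an extrapolation of this fit and should be treated with caution.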