From: Abhishek Lekshmanan <abhishek@suse.com>
Date: Fri, 8 Mar 2019 15:57:28 +0000 (+0100)
Subject: doc: add troubleshooting notes on reshard admin clis
X-Git-Tag: v15.0.0~215^2~1
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=dee9ac22f19360ba6436e674c93baea9d97ca5da;p=ceph.git

doc: add troubleshooting notes on reshard admin clis

Adding a note on LC fixes and reshard stale instance fixes

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
---

diff --git a/doc/radosgw/dynamicresharding.rst b/doc/radosgw/dynamicresharding.rst
index 4d51cd76ebc..cd1ebdce808 100644
--- a/doc/radosgw/dynamicresharding.rst
+++ b/doc/radosgw/dynamicresharding.rst
@@ -95,3 +95,45 @@ Manual bucket resharding
 ::
 
    # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>
+
+
+Troubleshooting
+===============
+
+Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale bucket
+instance entries that were not cleaned up automatically. Resharding also affected
+lifecycle policies, which were no longer applied to resharded buckets. Both
+issues can be worked around with a couple of ``radosgw-admin`` commands.
+
+Stale Instance Management
+-------------------------
+
+::
+
+   # radosgw-admin reshard stale-instances list
+
+This lists the stale instances in a cluster that are ready to be cleaned up.
+Note that these instances should be cleaned up only on a single-site cluster.
+To perform the cleanup, run the following command:
+
+::
+
+   # radosgw-admin reshard stale-instances rm
+
+
+Lifecycle fixes
+---------------
+
+On clusters with resharded bucket instances, it is highly likely that the old
+lifecycle processes flagged and deleted lifecycle processing for the affected
+buckets, because the bucket instance changed during the reshard. This is fixed
+in newer releases (from 13.2.6 and 12.2.12), but older buckets that had lifecycle
+policies and were resharded must be fixed manually with the following command:
+
+::
+
+   # radosgw-admin lc reshard fix --bucket <bucket_name>
+
+
+As a convenience, if the ``--bucket`` argument is omitted, this command tries
+to fix lifecycle policies for all buckets in the cluster.