--- /dev/null
+:orphan:
+
+==========================================================
+ ceph-diff-sorted -- compare two sorted files line by line
+==========================================================
+
+.. program:: ceph-diff-sorted
+
+Synopsis
+========
+
+| **ceph-diff-sorted** *file1* *file2*
+
+Description
+===========
+
+:program:`ceph-diff-sorted` is a simplifed *diff* utility optimized
+for comparing two files with lines that are lexically sorted.
+
+The output is simplified in comparison to that of the standard `diff`
+tool available in POSIX systems. Angle brackets ('<' and '>') are used
+to show lines that appear in one file but not the other. The output is
+not compatible with the `patch` tool.
+
+This tool was created in order to perform diffs of large files (e.g.,
+containing billions of lines) that the standard `diff` tool cannot
+handle efficiently. Knowing that the lines are sorted allows this to
+be done efficiently with minimal memory overhead.
+
+The sorting of each file needs to be done lexcially. Most POSIX
+systems use the *LANG* environment variable to determine the `sort`
+tool's sorting order. To sort lexically we would need something such
+as:
+
+ $ LANG=C sort some-file.txt >some-file-sorted.txt
+
+Examples
+========
+
+Compare two files::
+
+ $ ceph-diff-sorted fileA.txt fileB.txt
+
+Exit Status
+===========
+
+When complete, the exit status will be set to one of the following:
+
+0
+ files same
+1
+ files different
+2
+ usage problem (e.g., wrong number of command-line arguments)
+3
+ problem opening input file
+4
+ bad file content (e.g., unsorted order or empty lines)
+
+
+Availability
+============
+
+:program:`ceph-diff-sorted` is part of Ceph, a massively scalable,
+open-source, distributed storage system. Please refer to the Ceph
+documentation at http://ceph.com/docs for more information.
+
+See also
+========
+
+:doc:`rgw-orphan-list <rgw-orphan-list>`\(8)
--- /dev/null
+:orphan:
+
+==================================================================
+ rgw-orphan-list -- list rados objects that are not indexed by rgw
+==================================================================
+
+.. program:: rgw-orphan-list
+
+Synopsis
+========
+
+| **rgw-orphan-list**
+
+Description
+===========
+
+:program:`rgw-orphan-list` is an *EXPERIMENTAL* RADOS gateway user
+administration utility. It produces a listing of rados objects that
+are not directly or indirectly referenced through the bucket indexes
+on a pool. It places the results and intermediate files on the local
+filesystem rather than on the ceph cluster itself, and therefore will
+not itself consume additional cluster storage.
+
+In theory orphans should not exist. However because ceph evolves
+rapidly, bugs do crop up, and they may result in orphans that are left
+behind.
+
+In its current form this utility does not take any command-line
+arguments or options. It will list the available pools and prompt the
+user to enter the pool they would like to list orphans for.
+
+Behind the scenes it runs `rados ls` and `radosgw-admin bucket
+radoslist ...` and produces a list of those entries that appear in the
+former but not the latter. Those entries are presumed to be the
+orphans.
+
+Warnings
+========
+
+This utility is currently considered *EXPERIMENTAL*.
+
+This utility will produce false orphan entries for unindexed buckets
+since such buckets have no bucket indices that can provide the
+starting point for tracing.
+
+Options
+=======
+
+At present there are no options.
+
+Examples
+========
+
+Launch the tool::
+
+ $ rgw-orphan-list
+
+Availability
+============
+
+:program:`radosgw-admin` is part of Ceph, a massively scalable, open-source,
+distributed storage system. Please refer to the Ceph documentation at
+http://ceph.com/docs for more information.
+
+See also
+========
+
+:doc:`radosgw-admin <radosgw-admin>`\(8)
+:doc:`ceph-diff-sorted <ceph-diff-sorted>`\(8)
--- /dev/null
+==================================
+Orphan List and Associated Tooling
+==================================
+
+.. version added:: Luminous
+
+.. contents::
+
+Orphans are RADOS objects that are left behind after their associated
+RGW objects are removed. Normally these RADOS objects are removed
+automatically, either immediately or through a process known as
+"garbage collection". Over the history of RGW, however, there may have
+been bugs that prevented these RADOS objects from being deleted, and
+these RADOS objects may be consuming space on the Ceph cluster without
+being of any use. From the perspective of RGW, we call such RADOS
+objects "orphans".
+
+Orphans Find -- DEPRECATED
+--------------------------
+
+The `radosgw-admin` tool has/had three subcommands to help manage
+orphans, however these subcommands are (or will soon be)
+deprecated. These subcommands are:
+
+::
+ # radosgw-admin orphans find ...
+ # radosgw-admin orphans finish ...
+ # radosgw-admin orphans list-jobs ...
+
+There are two key problems with these subcommands, however. First,
+these subcommands have not been actively maintained and therefore have
+not tracked RGW as it has evolved in terms of features and updates. As
+a result the confidence that these subcommands can accurately identify
+true orphans is presently low.
+
+Second, these subcommands store intermediate results on the cluster
+itself. This can be problematic when cluster administrators are
+confronting insufficient storage space and want to remove orphans as a
+means of addressing the issue. The intermediate results could strain
+the existing cluster storage capacity even further.
+
+For these reasons "orphans find" has been deprecated.
+
+Orphan List
+-----------
+
+Because "orphans find" has been deprecated, RGW now includes an
+additional tool -- 'rgw-orphan-list'. When run it will list the
+available pools and prompt the user to enter the name of the data
+pool. At that point the tool will, perhaps after an extended period of
+time, produce a local file containing the RADOS objects from the
+designated pool that appear to be orphans. The administrator is free
+to examine this file and the decide on a course of action, perhaps
+removing those RADOS objects from the designated pool.
+
+All intermediate results are stored on the local file system rather
+than the Ceph cluster. So running the 'rgw-orphan-list' tool should
+have no appreciable impact on the amount of cluster storage consumed.
+
+WARNING: Experimental Status
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The 'rgw-orphan-list' tool is new and therefore currently considered
+experimental. The list of orphans produced should be "sanity checked"
+before being used for a large delete operation.
+
+WARNING: Specifying a Data Pool
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If a pool other than an RGW data pool is specified, the results of the
+tool will be erroneous. All RADOS objects found on such a pool will
+falsely be designated as orphans.
+
+WARNING: Unindexed Buckets
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+RGW allows for unindexed buckets, that is buckets that do not maintain
+an index of their contents. This is not a typical configuration, but
+it is supported. Because the 'rgw-orphan-list' tool uses the bucket
+indices to determine what RADOS objects should exist, objects in the
+unindexed buckets will falsely be listed as orphans.
+
+
+RADOS List
+----------
+
+One of the sub-steps in computing a list of orphans is to map each RGW
+object into its corresponding set of RADOS objects. This is done using
+a subcommand of 'radosgw-admin'.
+
+::
+ # radosgw-admin bucket radoslist [--bucket={bucket-name}]
+
+The subcommand will produce a list of RADOS objects that support all
+of the RGW objects. If a bucket is specified then the subcommand will
+only produce a list of RADOS objects that correspond back the RGW
+objects in the specified bucket.
+
+Note: Shared Bucket Markers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some administrators will be aware of the coding schemes used to name
+the RADOS objects that correspond to RGW objects, which include a
+"marker" unique to a given bucket.
+
+RADOS objects that correspond with the contents of one RGW bucket,
+however, may contain a marker that specifies a different bucket. This
+behavior is a consequence of the "shallow copy" optimization used by
+RGW. When larger objects are copied from bucket to bucket, only the
+"head" objects are actually copied, and the tail objects are
+shared. Those shared objects will contain the marker of the original
+bucket.
+
+.. _Data Layout in RADOS : ../layout
+.. _Pool Placement and Storage Classes : ../placement