From: Sage Weil Date: Fri, 22 Feb 2019 17:54:15 +0000 (-0600) Subject: doc/releases/nautilus: draft notes X-Git-Tag: v14.1.1~151^2~2 X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=fdf75b2d2221185985a39e04a300104f29c7f338;p=ceph-ci.git doc/releases/nautilus: draft notes Signed-off-by: Sage Weil --- diff --git a/PendingReleaseNotes b/PendingReleaseNotes index 50bc4831ba8..e69de29bb2d 100644 --- a/PendingReleaseNotes +++ b/PendingReleaseNotes @@ -1,301 +0,0 @@ -14.0.1 ------- - -* ceph pg stat output has been modified in json - format to match ceph df output: - * "raw_bytes" field renamed to "total_bytes" - * "raw_bytes_avail" field renamed to "total_bytes_avail" - * "raw_bytes_avail" field renamed to "total_bytes_avail" - * "raw_bytes_used" field renamed to "total_bytes_raw_used" - * "total_bytes_used" field added to represent the space (accumulated over - all OSDs) allocated purely for data objects kept at block(slow) device - -* ceph df [detail] output (GLOBAL section) has been modified in plain - format: - * new 'USED' column shows the space (accumulated over all OSDs) allocated - purely for data objects kept at block(slow) device. - * 'RAW USED' is now a sum of 'USED' space and space allocated/reserved at - block device for Ceph purposes, e.g. BlueFS part for BlueStore. - -* ceph df [detail] output (GLOBAL section) has been modified in json - format: - * 'total_used_bytes' column now shows the space (accumulated over all OSDs) - allocated purely for data objects kept at block(slow) device - * new 'total_used_raw_bytes' column shows a sum of 'USED' space and space - allocated/reserved at block device for Ceph purposes, e.g. BlueFS part for - BlueStore. - -* ceph df [detail] output (POOLS section) has been modified in plain - format: - * 'BYTES USED' column renamed to 'STORED'. Represents amount of data - stored by the user. - * 'USED' column now represent amount of space allocated purely for data - by all OSD nodes in KB. - * 'QUOTA BYTES', 'QUOTA OBJECTS' aren't showed anumore in non-detailed mode. - * new column 'USED COMPR' - amount of space allocated for compressed - data. I.e. comrpessed data plus all the allocation, replication and erasure - coding overhead. - * new column 'UNDER COMPR' - amount of data passed through compression - (summed over all replicas) and beneficial enough to be stored in a - compressed form. - * Some columns reordering - -* ceph df [detail] output (POOLS section) has been modified in json - format: - * 'bytes used' column renamed to 'stored'. Represents amount of data - stored by the user. - * 'raw bytes used' column renamed to "stored_raw". Totals of user data - over all OSD excluding degraded. - * new 'bytes_used' column now represent amount of space allocated by - all OSD nodes. - * 'kb_used' column - the same as 'bytes_used' but in KB. - * new column 'compress_bytes_used' - amount of space allocated for compressed - data. I.e. comrpessed data plus all the allocation, replication and erasure - coding overhead. - * new column 'compress_under_bytes' amount of data passed through compression - (summed over all replicas) and beneficial enough to be stored in a - compressed form. - -* rados df [detail] output (POOLS section) has been modified in plain - format: - * 'USED' column now shows the space (accumulated over all OSDs) allocated - purely for data objects kept at block(slow) device. - * new column 'USED COMPR' - amount of space allocated for compressed - data. I.e. 
comrpessed data plus all the allocation, replication and erasure - coding overhead. - * new column 'UNDER COMPR' - amount of data passed through compression - (summed over all replicas) and beneficial enough to be stored in a - compressed form. - -* rados df [detail] output (POOLS section) has been modified in json - format: - * 'size_bytes' and 'size_kb' columns now show the space (accumulated - over all OSDs) allocated purely for data objects kept at block - device. - * new column 'compress_bytes_used' - amount of space allocated for compressed - data. I.e. comrpessed data plus all the allocation, replication and erasure - coding overhead. - * new column 'compress_under_bytes' amount of data passed through compression - (summed over all replicas) and beneficial enough to be stored in a - compressed form. - -* ceph pg dump output (totals section) has been modified in json - format: - * new 'USED' column shows the space (accumulated over all OSDs) allocated - purely for data objects kept at block(slow) device. - * 'USED_RAW' is now a sum of 'USED' space and space allocated/reserved at - block device for Ceph purposes, e.g. BlueFS part for BlueStore. - -* The 'ceph osd rm' command has been deprecated. Users should use - 'ceph osd destroy' or 'ceph osd purge' (but after first confirming it is - safe to do so via the 'ceph osd safe-to-destroy' command). - -* The MDS now supports dropping its cache for the purposes of benchmarking. - - ceph tell mds.* cache drop - - Note that the MDS cache is cooperatively managed by the clients. It is - necessary for clients to give up capabilities in order for the MDS to fully - drop its cache. This is accomplished by asking all clients to trim as many - caps as possible. The timeout argument to the `cache drop` command controls - how long the MDS waits for clients to complete trimming caps. This is optional - and is 0 by default (no timeout). Keep in mind that clients may still retain - caps to open files which will prevent the metadata for those files from being - dropped by both the client and the MDS. (This is an equivalent scenario to - dropping the Linux page/buffer/inode/dentry caches with some processes pinning - some inodes/dentries/pages in cache.) - -* The mon_health_preluminous_compat and mon_health_preluminous_compat_warning - config options are removed, as the related functionality is more - than two versions old. Any legacy monitoring system expecting Jewel-style - health output will need to be updated to work with Nautilus. - -* Nautilus is not supported on any distros still running upstart so upstart - specific files and references have been removed. - -* The 'ceph pg list_missing' command has been renamed to - 'ceph pg list_unfound' to better match its behaviour. - -* The 'rbd-mirror' daemon can now retrieve remote peer cluster configuration - secrets from the monitor. To use this feature, the 'rbd-mirror' daemon - CephX user for the local cluster must use the 'profile rbd-mirror' mon cap. - The secrets can be set using the 'rbd mirror pool peer add' and - 'rbd mirror pool peer set' actions. - -* The `ceph mds deactivate` is fully obsolete and references to it in the docs - have been removed or clarified. - -* The libcephfs bindings added the ceph_select_filesystem function - for use with multiple filesystems. - -* The cephfs python bindings now include mount_root and filesystem_name - options in the mount() function. - -* erasure-code: add experimental *Coupled LAYer (CLAY)* erasure codes - support. 
It features less network traffic and disk I/O when performing - recovery. - -* The 'cache drop' OSD command has been added to drop an OSD's caches: - - - ``ceph tell osd.x cache drop`` - -* The 'cache status' OSD command has been added to get the cache stats of an - OSD: - - - ``ceph tell osd.x cache status' - -* The libcephfs added several functions that allow restarted client to destroy - or reclaim state held by a previous incarnation. These functions are for NFS - servers. - -* The `ceph` command line tool now accepts keyword arguments in - the format "--arg=value" or "--arg value". - -* librados::IoCtx::nobjects_begin() and librados::NObjectIterator now communicate - errors by throwing a std::system_error exception instead of std::runtime_error. - -* the callback function passed to LibRGWFS.readdir() now accepts a ``flags`` - parameter. it will be the last parameter passed to ``readdir()` method. - -* The 'cephfs-data-scan scan_links' now automatically repair inotables and - snaptable. - -* Configuration values mon_warn_not_scrubbed/mon_warn_not_deep_scrubbed have been - renamed. They are now mon_warn_pg_not_scrubbed_ratio/mon_warn_pg_not_deep_scrubbed_ratio - respectively. This is to clarify that these warnings are related to pg scrubbing - and are a ratio of the related interval. These options are now enabled by default. - -* The MDS cache trimming is now throttled. Dropping the MDS cache - via the `ceph tell mds. cache drop` command or large reductions in the - cache size will no longer cause service unavailability. - -* The CephFS MDS behavior with recalling caps has been significantly improved - to not attempt recalling too many caps at once, leading to instability. - MDS with a large cache (64GB+) should be more stable. - -* MDS now provides a config option "mds_max_caps_per_client" (default: 1M) to - limit the number of caps a client session may hold. Long running client - sessions with a large number of caps have been a source of instability in the - MDS when all of these caps need to be processed during certain session - events. It is recommended to not unnecessarily increase this value. - -* The MDS config mds_recall_state_timeout has been removed. Late client recall - warnings are now generated based on the number of caps the MDS has recalled - which have not been released. The new configs mds_recall_warning_threshold - (default: 32K) and mds_recall_warning_decay_rate (default: 60s) sets the - threshold for this warning. - ->=13.1.0 --------- - -* The Telegraf module for the Manager allows for sending statistics to - an Telegraf Agent over TCP, UDP or a UNIX Socket. Telegraf can then - send the statistics to databases like InfluxDB, ElasticSearch, Graphite - and many more. - -* The graylog fields naming the originator of a log event have - changed: the string-form name is now included (e.g., ``"name": - "mgr.foo"``), and the rank-form name is now in a nested section - (e.g., ``"rank": {"type": "mgr", "num": 43243}``). - -* If the cluster log is directed at syslog, the entries are now - prefixed by both the string-form name and the rank-form name (e.g., - ``mgr.x mgr.12345 ...`` instead of just ``mgr.12345 ...``). - -* The JSON output of the ``osd find`` command has replaced the ``ip`` - field with an ``addrs`` section to reflect that OSDs may bind to - multiple addresses. - -* CephFS clients without the 's' flag in their authentication capability - string will no longer be able to create/delete snapshots. 
To allow - ``client.foo`` to create/delete snapshots in the ``bar`` directory of - filesystem ``cephfs_a``, use command: - - - ``ceph auth caps client.foo mon 'allow r' osd 'allow rw tag cephfs data=cephfs_a' mds 'allow rw, allow rws path=/bar'`` - -* The ``osd_heartbeat_addr`` option has been removed as it served no - (good) purpose: the OSD should always check heartbeats on both the - public and cluster networks. - -* The ``rados`` tool's ``mkpool`` and ``rmpool`` commands have been - removed because they are redundant; please use the ``ceph osd pool - create`` and ``ceph osd pool rm`` commands instead. - -* The ``auid`` property for cephx users and RADOS pools has been - removed. This was an undocumented and partially implemented - capability that allowed cephx users to map capabilities to RADOS - pools that they "owned". Because there are no users we have removed - this support. If any cephx capabilities exist in the cluster that - restrict based on auid then they will no longer parse, and the - cluster will report a health warning like:: - - AUTH_BAD_CAPS 1 auth entities have invalid capabilities - client.bad osd capability parse failed, stopped at 'allow rwx auid 123' of 'allow rwx auid 123' - - The capability can be adjusted with the ``ceph auth caps`` command. For example,:: - - ceph auth caps client.bad osd 'allow rwx pool foo' - -* The ``ceph-kvstore-tool`` ``repair`` command has been renamed - ``destructive-repair`` since we have discovered it can corrupt an - otherwise healthy rocksdb database. It should be used only as a last-ditch - attempt to recover data from an otherwise corrupted store. - - -* The default memory utilization for the mons has been increased - somewhat. Rocksdb now uses 512 MB of RAM by default, which should - be sufficient for small to medium-sized clusters; large clusters - should tune this up. Also, the ``mon_osd_cache_size`` has been - increase from 10 OSDMaps to 500, which will translate to an - additional 500 MB to 1 GB of RAM for large clusters, and much less - for small clusters. - -* The ``mgr/balancer/max_misplaced`` option has been replaced by a new - global ``target_max_misplaced_ratio`` option that throttles both - balancer activity and automated adjustments to ``pgp_num`` (normally as a - result of ``pg_num`` changes). If you have customized the balancer module - option, you will need to adjust your config to set the new global option - or revert to the default of .05 (5%). - -* By default, Ceph no longer issues a health warning when there are - misplaced objects (objects that are fully replicated but not stored - on the intended OSDs). You can reenable the old warning by setting - ``mon_warn_on_misplaced`` to ``true``. - -* The ``ceph-create-keys`` tool is now obsolete. The monitors - automatically create these keys on their own. For now the script - prints a warning message and exits, but it will be removed in the - next release. Note that ``ceph-create-keys`` would also write the - admin and bootstrap keys to /etc/ceph and /var/lib/ceph, but this - script no longer does that. Any deployment tools that relied on - this behavior should instead make use of the ``ceph auth export - `` command for whichever key(s) they need. - -* The ``mon_osd_pool_ec_fast_read`` option has been renamed - ``osd_pool_default_ec_fast_read`` to be more consistent with other - ``osd_pool_default_*`` options that affect default values for newly - created RADOS pools. - -* The ``mon addr`` configuration option is now deprecated. 
It can - still be used to specify an address for each monitor in the - ``ceph.conf`` file, but it only affects cluster creation and - bootstrapping, and it does not support listing multiple addresses - (e.g., both a v2 and v1 protocol address). We strongly recommend - the option be removed and instead a single ``mon host`` option be - specified in the ``[global]`` section to allow daemons and clients - to discover the monitors. - -* New command `fs fail` has been added to quickly bring down a file - system. This is a single command that unsets the joinable flag on the file - system and brings down all of its ranks. - -* The `cache drop` admin socket command has been removed. The `ceph tell mds.X - cache drop` remains. - -Upgrading from Luminous ------------------------ - -* During the upgrade from luminous to nautilus, it will not be possible to create - a new OSD using a luminous ceph-osd daemon after the monitors have been - upgraded to nautilus. - diff --git a/doc/releases/index.rst b/doc/releases/index.rst index 9c0089a141e..3cedf838ef2 100644 --- a/doc/releases/index.rst +++ b/doc/releases/index.rst @@ -19,6 +19,7 @@ Active Releases .. toctree:: :maxdepth: 1 + Nautilus Mimic Luminous Jewel diff --git a/doc/releases/nautilus.rst b/doc/releases/nautilus.rst new file mode 100644 index 00000000000..967db396108 --- /dev/null +++ b/doc/releases/nautilus.rst @@ -0,0 +1,518 @@ +v14.1.0 Nautilus +================ + +.. note: These are draft notes for the first Nautilus release. + +Major Changes from Mimic +------------------------ + +- *Dashboard*: + + +- *RADOS*: + + * The new *msgr2* wire protocol brings support for encryption on the wire. + * The number of placement groups (PGs) per pool can now be decreased + at any time, and the cluster can automatically tune the PG count + based on cluster utilization or administrator hints. + * Physical storage devices consumed by OSD and Monitor daemons are + now tracked by the cluster along with health metrics (i.e., + SMART), and the cluster can apply a pre-trained prediction model + or a cloud-based prediction service to warn about expected + HDD or SSD failures. + +- *RGW*: + + +- *CephFS*: + + +- *RBD*: + + +- *Misc*: + + * Ceph has a new set of :ref:`orchestrator modules + ` to directly interact with external + orchestrators like ceph-ansible, DeepSea and Rook via a consistent + CLI (and, eventually, Dashboard) interface. It also contains an + ssh orchestrator to directly deploy services via ssh. + + +Upgrading from Mimic or Luminous +-------------------------------- + +Notes +~~~~~ + +* During the upgrade from Luminous to nautilus, it will not be + possible to create a new OSD using a Luminous ceph-osd daemon after + the monitors have been upgraded to Nautilus. We recommend you avoid adding + or replacing any OSDs while the upgrade is in process. + +* We recommend you avoid creating any RADOS pools while the upgrade is + in process. + +* You can monitor the progress of your upgrade at each stage with the + ``ceph versions`` command, which will tell you what ceph version(s) are + running for each type of daemon. + +Instructions +~~~~~~~~~~~~ + +#. If your cluster was originally installed with a version prior to + Luminous, ensure that it has completed at least one full scrub of + all PGs while running Luminous. Failure to do so will cause your + monitor daemons to refuse to join the quorum on start, leaving them + non-functional. 
+
+   If you are unsure whether or not your Luminous cluster has
+   completed a full scrub of all PGs, you can check your cluster's
+   state by running::
+
+     # ceph osd dump | grep ^flags
+
+   In order to be able to proceed to Nautilus, your OSD map must include
+   the ``recovery_deletes`` and ``purged_snapdirs`` flags.
+
+   If your OSD map does not contain both these flags, you can simply
+   wait for approximately 24-48 hours, which in a standard cluster
+   configuration should be ample time for all your placement groups to
+   be scrubbed at least once, and then repeat the above process to
+   recheck.
+
+   However, if you have just completed an upgrade to Luminous and want
+   to proceed to Nautilus in short order, you can force a scrub on all
+   placement groups with a one-line shell command, like::
+
+     # ceph pg dump pgs_brief | cut -d " " -f 1 | xargs -n1 ceph pg scrub
+
+   Keep in mind that this forced scrub may have a negative impact on
+   your Ceph clients' performance.
+
+#. Make sure your cluster is stable and healthy (no down or
+   recovering OSDs).  (Optional, but recommended.)
+
+#. Set the ``noout`` flag for the duration of the upgrade.  (Optional,
+   but recommended.)::
+
+     # ceph osd set noout
+
+#. Upgrade monitors by installing the new packages and restarting the
+   monitor daemons.  For example::
+
+     # systemctl restart ceph-mon.target
+
+   Once all monitors are up, verify that the monitor upgrade is
+   complete by looking for the ``nautilus`` string in the mon
+   map.  For example::
+
+     # ceph mon dump | grep min_mon_release
+
+   should report::
+
+     min_mon_release 14 (nautilus)
+
+   If it doesn't, that implies that one or more monitors haven't been
+   upgraded and restarted, and the quorum is not complete.
+
+#. Upgrade ``ceph-mgr`` daemons by installing the new packages and
+   restarting all manager daemons.  For example::
+
+     # systemctl restart ceph-mgr.target
+
+   Verify the ``ceph-mgr`` daemons are running by checking ``ceph -s``::
+
+     # ceph -s
+
+     ...
+       services:
+        mon: 3 daemons, quorum foo,bar,baz
+        mgr: foo(active), standbys: bar, baz
+     ...
+
+#. Upgrade all OSDs by installing the new packages and restarting the
+   ceph-osd daemons on all hosts::
+
+     # systemctl restart ceph-osd.target
+
+   You can monitor the progress of the OSD upgrades with the
+   ``ceph versions`` or ``ceph osd versions`` command::
+
+     # ceph osd versions
+     {
+        "ceph version 13.2.5 (...) mimic (stable)": 12,
+        "ceph version 14.2.0 (...) nautilus (stable)": 22,
+     }
+
+#. Upgrade all CephFS MDS daemons.  For each CephFS file system:
+
+   #. Reduce the number of ranks to 1.  (Make note of the original
+      number of MDS daemons first if you plan to restore it later.)::
+
+        # ceph status
+        # ceph fs set <fs_name> max_mds 1
+
+   #. Wait for the cluster to deactivate any non-zero ranks by
+      periodically checking the status::
+
+        # ceph status
+
+   #. Take all standby MDS daemons offline on the appropriate hosts with::
+
+        # systemctl stop ceph-mds@<daemon_name>
+
+   #. Confirm that only one MDS is online and is rank 0 for your FS::
+
+        # ceph status
+
+   #. Upgrade the last remaining MDS daemon by installing the new
+      packages and restarting the daemon::
+
+        # systemctl restart ceph-mds.target
+
+   #. Restart all standby MDS daemons that were taken offline::
+
+        # systemctl start ceph-mds.target
+
+   #. Restore the original value of ``max_mds`` for the volume::
+
+        # ceph fs set <fs_name> max_mds <original_max_mds>
+
+#. Upgrade all radosgw daemons by upgrading packages and restarting
+   daemons on all hosts::
+
+     # systemctl restart radosgw.target
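+
+   At this point every Ceph daemon should be running Nautilus.  One way to
+   double-check before the final steps is to parse the JSON printed by
+   ``ceph versions``, e.g. with a small helper along these lines (a sketch
+   only; it assumes the ``ceph`` CLI is in ``PATH`` and can reach the
+   cluster)::
+
+     #!/usr/bin/env python3
+     # Sketch: report any daemon type still running a pre-Nautilus release.
+     # "ceph versions" prints a JSON object keyed by daemon type, mapping
+     # version banners to daemon counts.
+     import json
+     import subprocess
+
+     versions = json.loads(subprocess.check_output(['ceph', 'versions']))
+
+     stale = {
+         daemon: counts
+         for daemon, counts in versions.items()
+         if daemon != 'overall'
+         and any('nautilus' not in banner for banner in counts)
+     }
+     if stale:
+         print('still waiting on:', stale)
+     else:
+         print('all daemons report Nautilus')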
+
+#. Complete the upgrade by disallowing pre-Nautilus OSDs and enabling
+   all new Nautilus-only functionality::
+
+     # ceph osd require-osd-release nautilus
+
+#. If you set ``noout`` at the beginning, be sure to clear it with::
+
+     # ceph osd unset noout
+
+#. Verify the cluster is healthy with ``ceph health``.
+
+
+Upgrading from pre-Luminous releases (like Jewel)
+-------------------------------------------------
+
+You *must* first upgrade to Luminous (12.2.z) before attempting an
+upgrade to Nautilus.  In addition, your cluster must have completed at
+least one scrub of all PGs while running Luminous, setting the
+``recovery_deletes`` and ``purged_snapdirs`` flags in the OSD map.
+
+
+Upgrade compatibility notes
+---------------------------
+
+These changes occurred between the Mimic and Nautilus releases.
+
+* ``ceph pg stat`` output has been modified in json
+  format to match ``ceph df`` output:
+
+  - "raw_bytes" field renamed to "total_bytes"
+  - "raw_bytes_avail" field renamed to "total_bytes_avail"
+  - "raw_bytes_used" field renamed to "total_bytes_raw_used"
+  - "total_bytes_used" field added to represent the space (accumulated over
+    all OSDs) allocated purely for data objects kept on the block (slow)
+    device
+
+* ``ceph df [detail]`` output (GLOBAL section) has been modified in plain
+  format:
+
+  - new 'USED' column shows the space (accumulated over all OSDs) allocated
+    purely for data objects kept on the block (slow) device.
+  - 'RAW USED' is now the sum of 'USED' space and space allocated/reserved
+    at the block device for Ceph purposes, e.g. the BlueFS part for
+    BlueStore.
+
+* ``ceph df [detail]`` output (GLOBAL section) has been modified in json
+  format:
+
+  - 'total_used_bytes' column now shows the space (accumulated over all
+    OSDs) allocated purely for data objects kept on the block (slow) device
+  - new 'total_used_raw_bytes' column shows the sum of 'USED' space and
+    space allocated/reserved at the block device for Ceph purposes, e.g.
+    the BlueFS part for BlueStore.
+
+* ``ceph df [detail]`` output (POOLS section) has been modified in plain
+  format:
+
+  - 'BYTES USED' column renamed to 'STORED'.  Represents the amount of data
+    stored by the user.
+  - 'USED' column now represents the amount of space allocated purely for
+    data by all OSD nodes, in KB.
+  - 'QUOTA BYTES' and 'QUOTA OBJECTS' are no longer shown in non-detailed
+    mode.
+  - new column 'USED COMPR' - the amount of space allocated for compressed
+    data, i.e. compressed data plus all the allocation, replication and
+    erasure coding overhead.
+  - new column 'UNDER COMPR' - the amount of data passed through compression
+    (summed over all replicas) and beneficial enough to be stored in a
+    compressed form.
+  - Some columns have been reordered.
+
+* ``ceph df [detail]`` output (POOLS section) has been modified in json
+  format:
+
+  - 'bytes used' column renamed to 'stored'.  Represents the amount of data
+    stored by the user.
+  - 'raw bytes used' column renamed to 'stored_raw'.  Totals of user data
+    over all OSDs, excluding degraded.
+  - new 'bytes_used' column now represents the amount of space allocated by
+    all OSD nodes.
+  - 'kb_used' column - the same as 'bytes_used' but in KB.
+  - new column 'compress_bytes_used' - the amount of space allocated for
+    compressed data, i.e. compressed data plus all the allocation,
+    replication and erasure coding overhead.
+  - new column 'compress_under_bytes' - the amount of data passed through
+    compression (summed over all replicas) and beneficial enough to be
+    stored in a compressed form.
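+
+  As an illustration, the per-pool values can be pulled out of the JSON
+  output with a sketch like the one below (it assumes the ``ceph`` CLI is
+  in ``PATH`` and that the per-pool stats live under the usual
+  ``pools``/``stats`` keys; adjust as needed)::
+
+    #!/usr/bin/env python3
+    # Sketch: print user-stored vs. allocated vs. compression figures per
+    # pool, using the renamed/new fields described above.
+    import json
+    import subprocess
+
+    df = json.loads(subprocess.check_output(
+        ['ceph', 'df', 'detail', '--format', 'json']))
+
+    for pool in df.get('pools', []):
+        stats = pool.get('stats', {})
+        print(pool.get('name'),
+              'stored=%s' % stats.get('stored'),
+              'allocated=%s' % stats.get('bytes_used'),
+              'compr_allocated=%s' % stats.get('compress_bytes_used'),
+              'compr_input=%s' % stats.get('compress_under_bytes'))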
+
+* ``rados df [detail]`` output (POOLS section) has been modified in plain
+  format:
+
+  - 'USED' column now shows the space (accumulated over all OSDs) allocated
+    purely for data objects kept on the block (slow) device.
+  - new column 'USED COMPR' - the amount of space allocated for compressed
+    data, i.e. compressed data plus all the allocation, replication and
+    erasure coding overhead.
+  - new column 'UNDER COMPR' - the amount of data passed through compression
+    (summed over all replicas) and beneficial enough to be stored in a
+    compressed form.
+
+* ``rados df [detail]`` output (POOLS section) has been modified in json
+  format:
+
+  - 'size_bytes' and 'size_kb' columns now show the space (accumulated
+    over all OSDs) allocated purely for data objects kept on the block
+    device.
+  - new column 'compress_bytes_used' - the amount of space allocated for
+    compressed data, i.e. compressed data plus all the allocation,
+    replication and erasure coding overhead.
+  - new column 'compress_under_bytes' - the amount of data passed through
+    compression (summed over all replicas) and beneficial enough to be
+    stored in a compressed form.
+
+* ``ceph pg dump`` output (totals section) has been modified in json
+  format:
+
+  - new 'USED' column shows the space (accumulated over all OSDs) allocated
+    purely for data objects kept on the block (slow) device.
+  - 'USED_RAW' is now the sum of 'USED' space and space allocated/reserved
+    at the block device for Ceph purposes, e.g. the BlueFS part for
+    BlueStore.
+
+* The ``ceph osd rm`` command has been deprecated.  Users should use
+  ``ceph osd destroy`` or ``ceph osd purge`` (but only after first
+  confirming it is safe to do so via the ``ceph osd safe-to-destroy``
+  command).
+
+* The MDS now supports dropping its cache for the purposes of
+  benchmarking::
+
+    ceph tell mds.* cache drop
+
+  Note that the MDS cache is cooperatively managed by the clients.  It is
+  necessary for clients to give up capabilities in order for the MDS to
+  fully drop its cache.  This is accomplished by asking all clients to trim
+  as many caps as possible.  The timeout argument to the ``cache drop``
+  command controls how long the MDS waits for clients to complete trimming
+  caps.  It is optional and defaults to 0 (no timeout).  Keep in mind that
+  clients may still retain caps for open files, which will prevent the
+  metadata for those files from being dropped by both the client and the
+  MDS.  (This is equivalent to dropping the Linux page/buffer/inode/dentry
+  caches while some processes pin some inodes/dentries/pages in cache.)
+
+* The ``mon_health_preluminous_compat`` and
+  ``mon_health_preluminous_compat_warning`` config options have been
+  removed, as the related functionality is more than two versions old.
+  Any legacy monitoring system expecting Jewel-style health output
+  will need to be updated to work with Nautilus.
+
+* Nautilus is not supported on any distros still running upstart, so
+  upstart-specific files and references have been removed.
+
+* The ``ceph pg list_missing`` command has been renamed to
+  ``ceph pg list_unfound`` to better match its behaviour.
+
+* The *rbd-mirror* daemon can now retrieve remote peer cluster configuration
+  secrets from the monitor.  To use this feature, the rbd-mirror daemon
+  CephX user for the local cluster must use the ``profile rbd-mirror`` mon
+  cap.  The secrets can be set using the ``rbd mirror pool peer add`` and
+  ``rbd mirror pool peer set`` actions.
+
+* The ``ceph mds deactivate`` command is fully obsolete, and references to
+  it in the docs have been removed or clarified.
+
+* The libcephfs bindings have added the ``ceph_select_filesystem`` function
+  for use with multiple filesystems.
+
+* The cephfs python bindings now include ``mount_root`` and
+  ``filesystem_name`` options in the ``mount()`` function.
+
+* erasure-code: experimental *Coupled LAYer (CLAY)* erasure code support
+  has been added.  It features less network traffic and disk I/O when
+  performing recovery.
+
+* The ``cache drop`` OSD command has been added to drop an OSD's caches:
+
+  - ``ceph tell osd.x cache drop``
+
+* The ``cache status`` OSD command has been added to get the cache stats of
+  an OSD:
+
+  - ``ceph tell osd.x cache status``
+
+* libcephfs has added several functions that allow a restarted client to
+  destroy or reclaim state held by a previous incarnation.  These functions
+  are intended for NFS servers.
+
+* The ``ceph`` command line tool now accepts keyword arguments in
+  the format ``--arg=value`` or ``--arg value``.
+
+* ``librados::IoCtx::nobjects_begin()`` and
+  ``librados::NObjectIterator`` now communicate errors by throwing a
+  ``std::system_error`` exception instead of ``std::runtime_error``.
+
+* The callback function passed to ``LibRGWFS.readdir()`` now accepts a
+  ``flags`` parameter.  It is the last parameter passed to the
+  ``readdir()`` method.
+
+* The ``cephfs-data-scan scan_links`` command now automatically repairs
+  inotables and the snaptable.
+
+* The configuration values ``mon_warn_not_scrubbed`` and
+  ``mon_warn_not_deep_scrubbed`` have been renamed.  They are now
+  ``mon_warn_pg_not_scrubbed_ratio`` and
+  ``mon_warn_pg_not_deep_scrubbed_ratio`` respectively.  This is to clarify
+  that these warnings are related to PG scrubbing and are a ratio of the
+  related interval.  These options are now enabled by default.
+
+* MDS cache trimming is now throttled.  Dropping the MDS cache
+  via the ``ceph tell mds.X cache drop`` command or large reductions in the
+  cache size will no longer cause service unavailability.
+
+* The CephFS MDS behavior with recalling caps has been significantly
+  improved so that it no longer attempts to recall too many caps at once,
+  which previously led to instability.  MDS with a large cache (64GB+)
+  should be more stable.
+
+* The MDS now provides a config option ``mds_max_caps_per_client``
+  (default: 1M) to limit the number of caps a client session may hold.
+  Long-running client sessions with a large number of caps have been a
+  source of instability in the MDS when all of these caps need to be
+  processed during certain session events.  It is recommended not to
+  unnecessarily increase this value.
+
+* The MDS config ``mds_recall_state_timeout`` has been removed.  Late
+  client recall warnings are now generated based on the number of caps
+  the MDS has recalled which have not been released.  The new configs
+  ``mds_recall_warning_threshold`` (default: 32K) and
+  ``mds_recall_warning_decay_rate`` (default: 60s) set the threshold
+  for this warning.
+
+* The Telegraf module for the Manager allows sending statistics to
+  a Telegraf agent over TCP, UDP or a UNIX socket.  Telegraf can then
+  send the statistics to databases like InfluxDB, Elasticsearch, Graphite
+  and many more.
+
+* The graylog fields naming the originator of a log event have
+  changed: the string-form name is now included (e.g., ``"name":
+  "mgr.foo"``), and the rank-form name is now in a nested section
+  (e.g., ``"rank": {"type": "mgr", "num": 43243}``).
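+
+  For scripts that consume these log events, the two forms can be recovered
+  as in the following sketch (the sample record is fabricated; only the
+  ``name`` and ``rank`` fields follow the format described above)::
+
+    #!/usr/bin/env python3
+    # Sketch: extract the originator from a Nautilus-style log event.
+    import json
+
+    record = json.loads(
+        '{"name": "mgr.foo", "rank": {"type": "mgr", "num": 43243}}')
+
+    string_name = record['name']                         # "mgr.foo"
+    rank_name = '{type}.{num}'.format(**record['rank'])  # "mgr.43243"
+    print(string_name, rank_name)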
+
+* If the cluster log is directed at syslog, the entries are now
+  prefixed by both the string-form name and the rank-form name (e.g.,
+  ``mgr.x mgr.12345 ...`` instead of just ``mgr.12345 ...``).
+
+* The JSON output of the ``ceph osd find`` command has replaced the ``ip``
+  field with an ``addrs`` section to reflect that OSDs may bind to
+  multiple addresses.
+
+* CephFS clients without the 's' flag in their authentication capability
+  string will no longer be able to create/delete snapshots.  To allow
+  ``client.foo`` to create/delete snapshots in the ``bar`` directory of
+  filesystem ``cephfs_a``, use the following command:
+
+  - ``ceph auth caps client.foo mon 'allow r' osd 'allow rw tag cephfs data=cephfs_a' mds 'allow rw, allow rws path=/bar'``
+
+* The ``osd_heartbeat_addr`` option has been removed as it served no
+  (good) purpose: the OSD should always check heartbeats on both the
+  public and cluster networks.
+
+* The ``rados`` tool's ``mkpool`` and ``rmpool`` commands have been
+  removed because they are redundant; please use the ``ceph osd pool
+  create`` and ``ceph osd pool rm`` commands instead.
+
+* The ``auid`` property for cephx users and RADOS pools has been
+  removed.  This was an undocumented and partially implemented
+  capability that allowed cephx users to map capabilities to RADOS
+  pools that they "owned".  Because it had no users, this support has
+  been removed.  If any cephx capabilities exist in the cluster that
+  restrict based on ``auid``, they will no longer parse, and the
+  cluster will report a health warning like::
+
+    AUTH_BAD_CAPS 1 auth entities have invalid capabilities
+        client.bad osd capability parse failed, stopped at 'allow rwx auid 123' of 'allow rwx auid 123'
+
+  The capability can be adjusted with the ``ceph auth caps`` command.
+  For example::
+
+    ceph auth caps client.bad osd 'allow rwx pool foo'
+
+* The ``ceph-kvstore-tool`` ``repair`` command has been renamed
+  ``destructive-repair`` since we have discovered it can corrupt an
+  otherwise healthy RocksDB database.  It should be used only as a
+  last-ditch attempt to recover data from an otherwise corrupted store.
+
+* The default memory utilization for the mons has been increased
+  somewhat.  RocksDB now uses 512 MB of RAM by default, which should
+  be sufficient for small to medium-sized clusters; large clusters
+  should tune this up.  Also, ``mon_osd_cache_size`` has been
+  increased from 10 OSDMaps to 500, which will translate to an
+  additional 500 MB to 1 GB of RAM for large clusters, and much less
+  for small clusters.
+
+* The ``mgr/balancer/max_misplaced`` option has been replaced by a new
+  global ``target_max_misplaced_ratio`` option that throttles both
+  balancer activity and automated adjustments to ``pgp_num`` (normally as a
+  result of ``pg_num`` changes).  If you have customized the balancer module
+  option, you will need to adjust your config to set the new global option
+  or revert to the default of 0.05 (5%).
+
+* By default, Ceph no longer issues a health warning when there are
+  misplaced objects (objects that are fully replicated but not stored
+  on the intended OSDs).  You can re-enable the old warning by setting
+  ``mon_warn_on_misplaced`` to ``true``.
+
+* The ``ceph-create-keys`` tool is now obsolete.  The monitors
+  automatically create these keys on their own.  For now the script
+  prints a warning message and exits, but it will be removed in the
+  next release.  Note that ``ceph-create-keys`` would also write the
+  admin and bootstrap keys to /etc/ceph and /var/lib/ceph, but this
+  script no longer does that.  Any deployment tools that relied on
+  this behavior should instead make use of the ``ceph auth export
+  <entity>`` command for whichever key(s) they need.
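+
+  For example, a deployment tool could capture a key into a keyring file
+  with something along these lines (a sketch only; the ``client.admin``
+  entity and the destination path are illustrative, not prescriptive)::
+
+    #!/usr/bin/env python3
+    # Sketch: fetch a keyring via "ceph auth export" instead of relying on
+    # ceph-create-keys to write files.  Entity and path are examples only.
+    import subprocess
+
+    keyring = subprocess.check_output(
+        ['ceph', 'auth', 'export', 'client.admin'])
+    with open('/etc/ceph/ceph.client.admin.keyring', 'wb') as f:
+        f.write(keyring)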
+
+* The ``mon_osd_pool_ec_fast_read`` option has been renamed
+  ``osd_pool_default_ec_fast_read`` to be more consistent with other
+  ``osd_pool_default_*`` options that affect default values for newly
+  created RADOS pools.
+
+* The ``mon addr`` configuration option is now deprecated.  It can
+  still be used to specify an address for each monitor in the
+  ``ceph.conf`` file, but it only affects cluster creation and
+  bootstrapping, and it does not support listing multiple addresses
+  (e.g., both a v2 and v1 protocol address).  We strongly recommend
+  the option be removed and instead a single ``mon host`` option be
+  specified in the ``[global]`` section to allow daemons and clients
+  to discover the monitors.
+
+* A new command, ``ceph fs fail``, has been added to quickly bring down a
+  file system.  It is a single command that unsets the joinable flag on
+  the file system and brings down all of its ranks.
+
+* The ``cache drop`` admin socket command has been removed.  The ``ceph
+  tell mds.X cache drop`` command remains.
+
+
+Detailed Changelog
+------------------