From 77a69405f92f9912f75b8e947fc9df8e1e92099c Mon Sep 17 00:00:00 2001
From: Patrick Donnelly <pdonnell@redhat.com>
Date: Thu, 21 Feb 2019 20:23:13 -0800
Subject: [PATCH] doc: update documentation for standby-replay

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
---
 doc/cephfs/standby.rst    | 160 +++++---------------------------------
 doc/releases/nautilus.rst |   6 ++
 2 files changed, 25 insertions(+), 141 deletions(-)

diff --git a/doc/cephfs/standby.rst b/doc/cephfs/standby.rst
index 0aaab9b76c7..d4be67d0a3b 100644
--- a/doc/cephfs/standby.rst
+++ b/doc/cephfs/standby.rst
@@ -58,9 +58,10 @@ forms of the 'fail' command:
 Managing failover
 -----------------
 
-If an MDS daemon stops communicating with the monitor, the monitor will
-wait ``mds_beacon_grace`` seconds (default 15 seconds) before marking
-the daemon as *laggy*.
+If an MDS daemon stops communicating with the monitor, the monitor will wait
+``mds_beacon_grace`` seconds (default 15 seconds) before marking the daemon as
+*laggy*. If a standby is available, the monitor will immediately replace the
+laggy daemon.
 
 Each file system may specify a number of standby daemons to be considered
 healthy. This number includes daemons in standby-replay waiting for a rank to
@@ -76,148 +77,25 @@ Each file system may set the number of standby daemons wanted using:
 Setting ``count`` to 0 will disable the health check.
 
 
-Configuring standby daemons
----------------------------
+Configuring standby-replay
+--------------------------
 
-There are four configuration settings that control how a daemon
-will behave while in standby:
+Each CephFS file system may be configured to add standby-replay daemons.  These
+standby daemons follow the active MDS's metadata journal to reduce failover
+time in the event the active MDS becomes unavailable. Each active MDS may have
+only one standby-replay daemon following it.
 
-::
-
-    mds_standby_replay
-    mds_standby_for_name
-    mds_standby_for_rank
-    mds_standby_for_fscid
-
-These may be set in the ceph.conf on the host where the MDS daemon
-runs (as opposed to on the monitor).  The daemon loads these settings
-when it starts, and sends them to the monitor.
-
-By default, if none of these settings are used, all MDS daemons
-which do not hold a rank will be used as standbys for any rank.
-
-The settings which associate a standby daemon with a particular
-name or rank do not guarantee that the daemon will *only* be used
-for that rank.  They mean that when several standbys are available,
-the associated standby daemon will be used.  If a rank is failed,
-and a standby is available, it will be used even if it is associated
-with a different rank or named daemon.
-
-mds_standby_replay
-~~~~~~~~~~~~~~~~~~
-
-If this is set to true, then the standby daemon will continuously read
-the metadata journal of an up rank.  This will give it
-a warm metadata cache, and speed up the process of failing over
-if the daemon serving the rank fails.
-
-An up rank may only have one standby replay daemon assigned to it,
-if two daemons are both set to be standby replay then one of them
-will arbitrarily win, and the other will become a normal non-replay
-standby.
-
-Once a daemon has entered the standby replay state, it will only be
-used as a standby for the rank that it is following.  If another rank
-fails, this standby replay daemon will not be used as a replacement,
-even if no other standbys are available.
-
-*Historical note:* In Ceph prior to v10.2.1, this setting (when ``false``) is
-always true when ``mds_standby_for_*`` is also set.
-
-mds_standby_for_name
-~~~~~~~~~~~~~~~~~~~~
-
-Set this to make the standby daemon only take over a failed rank
-if the last daemon to hold it matches this name.
-
-mds_standby_for_rank
-~~~~~~~~~~~~~~~~~~~~
-
-Set this to make the standby daemon only take over the specified
-rank.  If another rank fails, this daemon will not be used to
-replace it.
-
-Use in conjunction with ``mds_standby_for_fscid`` to be specific
-about which filesystem's rank you are targeting, if you have
-multiple filesystems.
-
-mds_standby_for_fscid
-~~~~~~~~~~~~~~~~~~~~~
-
-If ``mds_standby_for_rank`` is set, this is simply a qualifier to
-say which filesystem's rank is referred to.
-
-If ``mds_standby_for_rank`` is not set, then setting FSCID will
-cause this daemon to target any rank in the specified FSCID.  Use
-this if you have a daemon that you want to use for any rank, but
-only within a particular filesystem.
-
-mon_force_standby_active
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-This setting is used on monitor hosts.  It defaults to true.
-
-If it is false, then daemons configured with mds_standby_replay=true
-will **only** become active if the rank/name that they have
-been configured to follow fails.  On the other hand, if this
-setting is true, then a daemon configured with mds_standby_replay=true
-may be assigned some other rank.
-
-Examples
---------
-
-These are example ceph.conf snippets.  In practice you can either
-copy a ceph.conf with all daemons' configuration to all your servers,
-or you can have a different file on each server that contains just
-that server's daemons' configuration.
-
-Simple pair
-~~~~~~~~~~~
-
-Two MDS daemons 'a' and 'b' acting as a pair, where whichever one is not
-currently assigned a rank will be the standby replay follower
-of the other.
-
-::
-
-    [mds.a]
-    mds standby replay = true
-    mds standby for rank = 0
-
-    [mds.b]
-    mds standby replay = true
-    mds standby for rank = 0
-
-Floating standby
-~~~~~~~~~~~~~~~~
-
-Three MDS daemons 'a', 'b' and 'c', in a filesystem that has
-``max_mds`` set to 2.
+To configure standby-replay on a file system, use:
 
 ::
-    
-    # No explicit configuration required: whichever daemon is
-    # not assigned a rank will go into 'standby' and take over
-    # for whichever other daemon fails.
-
-Two MDS clusters
-~~~~~~~~~~~~~~~~
-
-With two filesystems, I have four MDS daemons, and I want two
-to act as a pair for one filesystem and two to act as a pair
-for the other filesystem.
-
-::
-
-    [mds.a]
-    mds standby for fscid = 1
-
-    [mds.b]
-    mds standby for fscid = 1
 
-    [mds.c]
-    mds standby for fscid = 2
+    ceph fs set <fs name> allow_standby_replay <bool>
 
-    [mds.d]
-    mds standby for fscid = 2
+When this flag is set to true, the monitors will assign available standby
+daemons to follow the active MDSs in that file system.
 
+Once an MDS has entered the standby-replay state, it will only be used as a
+standby for the rank that it is following. If another rank fails, this
+standby-replay daemon will not be used as a replacement, even if no other
+standbys are available. For this reason, if standby-replay is used, every
+active MDS should have a standby-replay daemon.
diff --git a/doc/releases/nautilus.rst b/doc/releases/nautilus.rst
index e55315517da..b77830c73e1 100644
--- a/doc/releases/nautilus.rst
+++ b/doc/releases/nautilus.rst
@@ -428,6 +428,12 @@ These changes occurred between the Mimic and Nautilus releases.
   ``mds_recall_warning_decay_rate`` (default: 60s) sets the threshold
   for this warning.
 
+* The MDS ``mds_standby_for_*``, ``mon_force_standby_active``, and
+  ``mds_standby_replay`` configuration options have been obsoleted. Instead,
+  the operator may now set the new ``allow_standby_replay`` flag on the CephFS
+  file system. This setting causes standbys to become standby-replay for any
+  available rank in the file system.
+
 * The Telegraf module for the Manager allows for sending statistics to
   an Telegraf Agent over TCP, UDP or a UNIX Socket. Telegraf can then
   send the statistics to databases like InfluxDB, ElasticSearch, Graphite
-- 
2.39.5