From ee69f52193996d528410a71c2c67fc27b47dee31 Mon Sep 17 00:00:00 2001
From: Anthony D'Atri
Date: Tue, 18 Mar 2025 15:35:34 -0400
Subject: [PATCH] doc/cephadm: Add PG autoscaler advice to upgrade.rst

Signed-off-by: Anthony D'Atri
---
 doc/cephadm/upgrade.rst | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/doc/cephadm/upgrade.rst b/doc/cephadm/upgrade.rst
index 209fa15bd05..5ab0012cca9 100644
--- a/doc/cephadm/upgrade.rst
+++ b/doc/cephadm/upgrade.rst
@@ -19,7 +19,27 @@ The automated upgrade process follows Ceph best practices. For example:
 
 .. note::
 
-   In case a host of the cluster is offline, the upgrade is paused.
+   If a cluster host is or becomes unavailable, the upgrade will be paused
+   until the host is restored.
+
+.. note::
+
+   When the PG autoscaler mode for **any** pool is set to ``on``, we recommend
+   disabling the autoscaler for the duration of the upgrade, so that PG
+   splitting or merging in the middle of the upgrade does not unduly delay
+   its progress. In a very large cluster this can easily add a day or more to
+   the time needed to complete the upgrade, especially if the new release
+   changes PG autoscaler behavior, for example by changing the default value
+   of :confval:`mon_target_pg_per_osd`.
+
+   * ``ceph osd pool set noautoscale``
+   * Perform the upgrade
+   * ``ceph osd pool unset noautoscale``
+
+   When autoscaler activity is paused in this fashion, each pool's existing
+   mode (``off``, ``on``, or ``warn``) is expected to be retained. If the new
+   release changes the above target value, PGs may split or merge when the
+   ``noautoscale`` flag is unset after the upgrade.
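+
+   To spot-check that each pool's mode is retained, you can compare the
+   ``AUTOSCALE`` column of the autoscaler status report before and after
+   the upgrade (the exact columns shown may vary by release):
+
+   .. prompt:: bash #
+
+      ceph osd pool autoscale-status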
 
 Starting the upgrade
 ====================
@@ -27,28 +47,28 @@ Starting the upgrade
 .. note::
 
    .. note::
 
-      `Staggered Upgrade`_ of the mons/mgrs may be necessary to have access
-      to this new feature.
+      `Staggered Upgrade`_ of the Monitors and Managers may be necessary to
+      use the CephFS upgrade feature described below.
 
-   Cephadm by default reduces `max_mds` to `1`. This can be disruptive for large
+   Cephadm by default reduces ``max_mds`` to ``1``. This can be disruptive for large
    scale CephFS deployments because the cluster cannot quickly reduce active MDS(s)
    to `1` and a single active MDS cannot easily handle the load of all clients
-   even for a short time. Therefore, to upgrade MDS(s) without reducing `max_mds`,
-   the `fail_fs` option can to be set to `true` (default value is `false`) prior
+   even for a short time. Therefore, to upgrade MDS(s) without reducing ``max_mds``,
+   the ``fail_fs`` option can be set to ``true`` (default value is ``false``) prior
    to initiating the upgrade:
 
    .. prompt:: bash #
 
-      ceph config set mgr mgr/orchestrator/fail_fs true
+      ceph config set mgr mgr/orchestrator/fail_fs true
 
    This would:
 
   #. Fail CephFS filesystems, bringing active MDS daemon(s) to
-      `up:standby` state.
+      ``up:standby`` state.
   #. Upgrade MDS daemons safely.
   #. Bring CephFS filesystems back up, bringing the state of active
-      MDS daemon(s) from `up:standby` to `up:active`.
+      MDS daemon(s) from ``up:standby`` to ``up:active``.
 
 Before you use cephadm to upgrade Ceph, verify that all hosts are currently online
 and that your cluster is healthy by running the following command:
@@ -145,7 +165,9 @@ The message ``Error ENOENT: Module not found`` appears in response to the comman
 
    Error ENOENT: Module not found
 
-This is possibly caused by invalid JSON in a mgr config-key. See `Redmine tracker Issue #67329 <https://tracker.ceph.com/issues/67329>`_ and `the discussion on the [ceph-users] mailing list `_.
+This is possibly caused by invalid JSON in a mgr config-key.
+See `Redmine tracker Issue #67329 <https://tracker.ceph.com/issues/67329>`_
+and `this discussion on the ceph-users mailing list `_.
 
 UPGRADE_NO_STANDBY_MGR
 ----------------------
-- 
2.39.5