From 7cdb7edad4a45cb8761233a883a2b7f0e00f4aae Mon Sep 17 00:00:00 2001
From: "Matthew N. Heler" <matthew.heler@hotmail.com>
Date: Fri, 15 May 2026 06:11:35 -0500
Subject: [PATCH] doc/rados/configuration: recommend wpq for EC clusters seeing
 slow ops

Under mClock, EC sub-reads issued during recovery are currently routed
through the immediate queue and bypass throttling. On large EC
clusters, when many OSDs read from one source OSD during recovery, that
source's high-priority queue saturates and starves client work,
producing slow ops. Recommend falling back to wpq in the mClock config
reference as an interim measure until the scheduler treats those reads
as background.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
---
 doc/rados/configuration/mclock-config-ref.rst | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/doc/rados/configuration/mclock-config-ref.rst b/doc/rados/configuration/mclock-config-ref.rst
index c205ec14aff..95d6e52c91e 100644
--- a/doc/rados/configuration/mclock-config-ref.rst
+++ b/doc/rados/configuration/mclock-config-ref.rst
@@ -4,6 +4,22 @@
 
 .. index:: mclock; configuration
 
+.. warning:: On large clusters with erasure-coded pools, operators may
+   observe slow ops during recovery or backfill (for example, when an
+   OSD is being drained). Under mClock, EC sub-operation reads issued
+   during recovery are currently routed through the ``immediate``
+   high-priority queue and bypass mClock throttling. When many OSDs
+   read concurrently from a single source OSD, this can saturate that
+   OSD's high-priority queue and starve client and background work.
+   As an interim measure, such deployments are advised to switch to
+   the ``WeightedPriorityQueue`` (``wpq``) scheduler. The change can
+   be applied cluster-wide and takes effect after each OSD is
+   restarted:
+
+   .. prompt:: bash #
+
+     ceph config set osd osd_op_queue wpq
+
 QoS support in Ceph is implemented using a queuing scheduler based on `the
 dmClock algorithm`_. See :ref:`dmclock-qos` section for more details.
 
-- 
2.47.3