From: Jos Collin
Date: Fri, 12 Dec 2025 09:40:44 +0000 (+0530)
Subject: doc: update documentation for dispatch queue throttling
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=92b65ade7dbe2951320faf88619b92137cfeb186;p=ceph-ci.git

doc: update documentation for dispatch queue throttling

Fixes: https://tracker.ceph.com/issues/46226
Signed-off-by: Jos Collin
---

diff --git a/PendingReleaseNotes b/PendingReleaseNotes
index fd171f73f65..9b0e9e294c4 100644
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -13,6 +13,8 @@
 * DASHBOARD: RGW Service form updated to take input regarding QAT compression.
   - QAT compression is an optional field which can be set to 'Hardware' or 'Software' by selecting options from provided dropdwon. If 'None' is selected, compression is removed altogether.
 * CephFS: The `peer_add` command is deprecated in favor of the `peer_bootstrap` command.
+* CephFS: Dispatch queue throttling produces a cluster warning every 30s by default and
+  updates 'ceph status' with a health warning.
 * RADOS: When objects are read during deep scrubs, the data is read in strides,
   and the scrubbing process is delayed between each read in order to avoid
   monopolizing the I/O capacity of the OSD.
diff --git a/doc/rados/configuration/network-config-ref.rst b/doc/rados/configuration/network-config-ref.rst
index 5cefa27c4bb..4827cb5a60f 100644
--- a/doc/rados/configuration/network-config-ref.rst
+++ b/doc/rados/configuration/network-config-ref.rst
@@ -341,6 +341,7 @@ General Settings
 .. confval:: ms_max_backoff
 .. confval:: ms_die_on_bad_msg
 .. confval:: ms_dispatch_throttle_bytes
+.. confval:: ms_dispatch_throttle_log_interval
 .. confval:: ms_inject_socket_failures
diff --git a/doc/rados/operations/health-checks.rst b/doc/rados/operations/health-checks.rst
index cf7649d6120..afd31db50a4 100644
--- a/doc/rados/operations/health-checks.rst
+++ b/doc/rados/operations/health-checks.rst
@@ -960,6 +960,29 @@ this may be done for specific OSDs or a given mask, for example:
 
    ceph config set class:ssd bluestore_slow_ops_warn_lifetime 300
    ceph config set class:ssd bluestore_slow_ops_warn_threshold 5
 
+DISPATCH_QUEUE_THROTTLE
+_______________________
+
+The messenger has reached its dispatch queue throttle limit. Hitting this
+limit prevents fast dispatch of critical messages. The warning is reported
+in ``ceph health detail``, and the warning state is cleared when the
+condition clears.
+
+:confval:`ms_dispatch_throttle_bytes` limits the total size of messages
+waiting to be dispatched: messages that have been read off the network but
+are still being processed.
+
+:confval:`ms_dispatch_throttle_log_interval` is the interval, in seconds, at
+which the cluster warning and health warning are raised. Setting it to 0
+disables both warnings.
+
+To change the default values, run commands of the following form:
+
+.. prompt:: bash $
+
+   ceph config set global ms_dispatch_throttle_bytes 50
+   ceph config set global ms_dispatch_throttle_log_interval 5
+
 Device health
 -------------
@@ -1938,4 +1961,4 @@ Then to check the ``w`` value for a particular profile use a command of the following form:
 
    ceph osd erasure-code-profile get
 
 The intended solution is to create a new pool with a correct ``w`` value and copy all the objects.
-Then delete the old pool before the data corruption can happen.
\ No newline at end of file
+Then delete the old pool before the data corruption can happen.