From: Jos Collin
Date: Fri, 12 Dec 2025 09:40:44 +0000 (+0530)
Subject: doc: update documentation for dispatch queue throttling
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=92b65ade7dbe2951320faf88619b92137cfeb186;p=ceph-ci.git

doc: update documentation for dispatch queue throttling

Fixes: https://tracker.ceph.com/issues/46226
Signed-off-by: Jos Collin
---

diff --git a/PendingReleaseNotes b/PendingReleaseNotes
index fd171f73f65..9b0e9e294c4 100644
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -13,6 +13,8 @@
 * DASHBOARD: RGW Service form updated to take input regarding QAT compression.
   - QAT compression is an optional field which can be set to 'Hardware' or 'Software' by selecting options from provided dropdwon. If 'None' is selected, compression is removed altogether.
 * CephFS: The `peer_add` command is deprecated in favor of the `peer_bootstrap` command.
+* CephFS: Dispatch queue throttling produces a cluster warning every 30s by default and
+  updates 'ceph status' with a health warning.
 * RADOS: When objects are read during deep scrubs, the data is read in strides,
   and the scrubbing process is delayed between each read in order to avoid
   monopolizing the I/O capacity of the OSD.
diff --git a/doc/rados/configuration/network-config-ref.rst b/doc/rados/configuration/network-config-ref.rst
index 5cefa27c4bb..4827cb5a60f 100644
--- a/doc/rados/configuration/network-config-ref.rst
+++ b/doc/rados/configuration/network-config-ref.rst
@@ -341,6 +341,7 @@ General Settings
 .. confval:: ms_max_backoff
 .. confval:: ms_die_on_bad_msg
 .. confval:: ms_dispatch_throttle_bytes
+.. confval:: ms_dispatch_throttle_log_interval
 .. confval:: ms_inject_socket_failures
diff --git a/doc/rados/operations/health-checks.rst b/doc/rados/operations/health-checks.rst
index cf7649d6120..afd31db50a4 100644
--- a/doc/rados/operations/health-checks.rst
+++ b/doc/rados/operations/health-checks.rst
@@ -960,6 +960,29 @@ this may be done for specific OSDs or a given mask, for example:
 
    ceph config set class:ssd bluestore_slow_ops_warn_lifetime 300
    ceph config set class:ssd bluestore_slow_ops_warn_threshold 5
 
+DISPATCH_QUEUE_THROTTLE
+_______________________
+
+The messenger has reached its dispatch queue throttle limit. Hitting this
+limit prevents fast dispatch of critical messages. The warning is reported
+in ``ceph health detail``, and the warning state is cleared when the
+condition clears.
+
+:confval:`ms_dispatch_throttle_bytes` limits the total size of messages
+waiting to be dispatched: messages that have been read off the network but
+are still being processed.
+
+:confval:`ms_dispatch_throttle_log_interval` is the interval, in seconds, at
+which the cluster warning and health warning are raised. Setting it to 0
+disables both warnings.
+
+To change the default values, run commands of the following form:
+
+.. prompt:: bash $
+
+   ceph config set global ms_dispatch_throttle_bytes 50
+   ceph config set global ms_dispatch_throttle_log_interval 5
+
 Device health
 -------------
@@ -1938,4 +1961,4 @@ Then to check the ``w`` value for a particular profile use a command of the following form:
 
    ceph osd erasure-code-profile get
 
 The intended solution is to create a new pool with a correct ``w`` value and copy all the objects.
-Then delete the old pool before the data corruption can happen.
\ No newline at end of file
+Then delete the old pool before the data corruption can happen.