From: Zac Dover
Date: Thu, 7 Aug 2025 06:10:49 +0000 (+1000)
Subject: doc/cephfs: edit troubleshooting.rst
X-Git-Tag: testing/wip-vshankar-testing-20250820.113147-reef-debug~27^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=a2f95a9c2ac43859099582d7d48709bf21129735;p=ceph-ci.git

doc/cephfs: edit troubleshooting.rst

Edit the section "Slow/Stuck Operations" in doc/cephfs/troubleshooting.rst.

Signed-off-by: Zac Dover
(cherry picked from commit 57e7be73d8c121a3a06155217bb6f850faa4293f)
---

diff --git a/doc/cephfs/troubleshooting.rst b/doc/cephfs/troubleshooting.rst
index 0c2795df69b..ecb1e01e242 100644
--- a/doc/cephfs/troubleshooting.rst
+++ b/doc/cephfs/troubleshooting.rst
@@ -5,21 +5,33 @@
 Slow/stuck operations
 =====================
 
-If you are experiencing apparent hung operations, the first task is to identify
-where the problem is occurring: in the client, the MDS, or the network connecting
-them. Start by looking to see if either side has stuck operations
-(:ref:`slow_requests`, below), and narrow it down from there.
+Sometimes CephFS operations hang. The first step in troubleshooting them is to
+locate the problem causing the operations to hang. Problems present in three
+places:
 
-We can get hints about what's going on by dumping the MDS cache ::
+#. in the client
+#. in the MDS
+#. in the network that connects the client to the MDS
 
-  ceph daemon mds. dump cache /tmp/dump.txt
+First, use the procedure in :ref:`slow_requests` to determine if the client has
+stuck operations or the MDS has stuck operations.
 
-.. note:: The file `dump.txt` is on the machine executing the MDS and for systemd
-   controlled MDS services, this is in a tmpfs in the MDS container.
-   Use `nsenter(1)` to locate `dump.txt` or specify another system-wide path.
+Dump the MDS cache. The contents of the MDS cache will be used to diagnose the
+nature of the problem. Run the following command to dump the MDS cache:
 
-If high logging levels are set on the MDS, that will almost certainly hold the
-information we need to diagnose and solve the issue.
+.. prompt:: bash #
+
+   ceph daemon mds. dump cache /tmp/dump.txt
+
+.. note:: MDS services that are not controlled by systemd dump the file
+   ``dump.txt`` to the machine that runs the MDS. MDS services that are
+   controlled by systemd dump the file ``dump.txt`` to a tmpfs in the MDS
+   container. Use `nsenter(1)` to locate ``dump.txt`` or specify another
+   system-wide path.
+
+If high logging levels have been set on the MDS, ``dump.txt`` can be expected
+to hold the information needed to diagnose and solve the issue causing the
+CephFS operations to hang.
 
 Stuck during recovery
 =====================