From b4e2b3991774fc15f677752a344181920fc1216c Mon Sep 17 00:00:00 2001 From: John Mulligan Date: Mon, 11 Sep 2023 17:55:36 -0400 Subject: [PATCH] doc/cephadm: extend container debugging to cover live processes Signed-off-by: John Mulligan --- doc/cephadm/troubleshooting.rst | 75 +++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) diff --git a/doc/cephadm/troubleshooting.rst b/doc/cephadm/troubleshooting.rst index 1ea1534244615..3d4965deda034 100644 --- a/doc/cephadm/troubleshooting.rst +++ b/doc/cephadm/troubleshooting.rst @@ -390,3 +390,78 @@ You can then run debugger commands at gdb's prompt. #1 0x00007fa910d7f8f0 in std::condition_variable::wait(std::unique_lock&) () from /lib64/libstdc++.so.6 #2 0x00007fa913d3f48f in AsyncMessenger::wait() () from /usr/lib64/ceph/libceph-common.so.2 #3 0x0000563085ca3d7e in main () + + +Running repeated debugging sessions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When using ``cephadm shell``, like in the example above, the changes made to +the container the shell command spawned are ephemeral. Once the shell session +exits all of the files that were downloaded and installed are no longer +available. One can simply re-run the same commands every time ``cephadm shell`` +is invoked, but in order to save time and resources one can create a new +container image and use it for repeated debugging sessions. + +In the following example we create a simple file for constructing the +container image. The command below uses podman but it should work correctly +if ``podman`` is replaced with ``docker``. + +.. prompt:: bash + + cat >Containerfile < to customize the base image + +The result should be a new local image named ``ceph:debugging``. This image can +be used on the same machine that built it. Later, the image could be pushed to +a container repository, or saved and copied to a node runing other ceph +containers. Please consult the documentation for ``podman`` or ``docker`` for +more details on the general container workflow. + +Once the image has been built it can be used to initiate repeat debugging +sessions without having to re-install the debug tools and debuginfo packages. +To debug a core file using this image, in the same way as previously described, +run: + +.. prompt:: bash # + + cephadm --image ceph:debugging shell --mount /var/lib/system/coredump + + +Debugging live processes +~~~~~~~~~~~~~~~~~~~~~~~~ + +The gdb debugger has the ability to attach to running processes to debug them. +For a containerized process this can be accomplished by using the debug image +and attaching it to the same PID namespace as the process to be debugged. + +This requires running a container command with some custom arguments. We can generate a script that can debug a process in a running container. + +.. prompt:: bash # + + cephadm --image ceph:debugging shell --dry-run > /tmp/debug.sh + +This creates a script with the container command cephadm would use to create a +shell. Now, modify the script by removing the ``--init`` argument and replace +that with the argument to join to the namespace used for a running running +container. For example, let's assume we want to debug the MGR, and have +determnined that the MGR is running in a container named +``ceph-bc615290-685b-11ee-84a6-525400220000-mgr-ceph0-sluwsk``. The new +argument +``--pid=container:ceph-bc615290-685b-11ee-84a6-525400220000-mgr-ceph0-sluwsk`` +should be used. + +Now, we can run our debugging container with ``sh /tmp/debug.sh``. Within the shell +we can run commands such as ``ps`` to get the PID of the MGR process. In the following +example this will be ``2``. Running gdb, we can now attach to the running process: + +.. prompt:: bash (gdb) + + attach 2 + info threads + bt -- 2.39.5