From 469fa23f25193bc5862a4572087ffeb58a2b23d0 Mon Sep 17 00:00:00 2001
From: Sebastian Wagner
Date: Fri, 21 Feb 2020 14:39:16 +0100
Subject: [PATCH] doc/cephadm: Add Troubleshooting

Signed-off-by: Sebastian Wagner
---
 doc/cephadm/administration.rst | 64 ++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/doc/cephadm/administration.rst b/doc/cephadm/administration.rst
index 4e9ed4c174f..30340d0fbb6 100644
--- a/doc/cephadm/administration.rst
+++ b/doc/cephadm/administration.rst
@@ -180,3 +180,67 @@ Adoption Process
 #. Check the ``ceph health detail`` output for cephadm warnings about
    stray cluster daemons or hosts that are not yet managed.
+
+Troubleshooting
+===============
+
+Sometimes there is a need to investigate why a cephadm command failed or why
+a specific service no longer runs properly.
+
+As cephadm deploys daemons as containers, troubleshooting daemons is slightly
+different. Here are a few tools and commands to help investigate issues.
+
+Gathering log files
+-------------------
+
+Use journalctl to gather the log files of all daemons:
+
+.. note:: By default cephadm now stores logs in journald. This means
+   that you will no longer find daemon logs in ``/var/log/ceph/``.
+
+To read the log file of one specific daemon, run::
+
+    cephadm logs --name
+
+To fetch all log files of all daemons on a given host, run::
+
+    for name in $(cephadm ls | jq -r '.[].name') ; do
+      cephadm logs --name "$name" > "$name";
+    done
+
+Collecting systemd status
+-------------------------
+
+To print the state of a systemd unit, run::
+
+    systemctl status "ceph-$(cephadm shell ceph fsid)@.service";
+
+
+To fetch the state of all daemons on a given host, run::
+
+    fsid="$(cephadm shell ceph fsid)"
+    for name in $(cephadm ls | jq -r '.[].name') ; do
+      systemctl status "ceph-$fsid@$name.service" > "$name";
+    done
+
+
+List all downloaded container images
+------------------------------------
+
+To list all container images that are downloaded on a host:
+
+.. note:: ``Image`` might also be called ``ImageID``
+
+::
+
+    podman ps -a --format json | jq '.[].Image'
+    "docker.io/library/centos:8"
+    "registry.opensuse.org/opensuse/leap:15.2"
+
+
+Manually running containers
+---------------------------
+
+cephadm writes small wrappers that run containers. Refer to
+``/var/lib/ceph///unit.run`` for the command used
+to execute a container.
-- 
2.39.5