From: Ilya Dryomov
Date: Sun, 1 Mar 2026 21:55:52 +0000 (+0100)
Subject: qa/workunits/rbd: short-circuit status() if "ceph -s" fails
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=82717e43a08a1262987f5e271fd72d4433c4fb3b;p=ceph.git

qa/workunits/rbd: short-circuit status() if "ceph -s" fails

In mirror-thrash tests, status() can be invoked after one of the
clusters is effectively stopped due to a watchdog bark:

    2026-03-01T22:27:38.633 INFO:tasks.daemonwatchdog.daemon_watchdog:thrasher.rbd_mirror.[cluster2] failed
    2026-03-01T22:27:38.633 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons
    ...
    2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ status
    2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ local cluster daemon image_pool image_ns image
    2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ for cluster in ${CLUSTER1} ${CLUSTER2}

In this scenario all commands that are invoked from the loop body are
going to time out anyway.

Signed-off-by: Ilya Dryomov
---

diff --git a/qa/workunits/rbd/rbd_mirror_helpers.sh b/qa/workunits/rbd/rbd_mirror_helpers.sh
index a069853fb71..f5d7fe92624 100755
--- a/qa/workunits/rbd/rbd_mirror_helpers.sh
+++ b/qa/workunits/rbd/rbd_mirror_helpers.sh
@@ -514,7 +514,11 @@ status()
     for cluster in ${CLUSTER1} ${CLUSTER2}
     do
         echo "${cluster} status"
-        CEPH_ARGS='' ceph --cluster ${cluster} -s
+        # if "ceph -s" fails, assume that the cluster is broken or
+        # unavailable and skip gathering details for it
+        CEPH_ARGS='' ceph --cluster ${cluster} -s || continue
+
+        echo "${cluster} service status"
         CEPH_ARGS='' ceph --cluster ${cluster} service dump
         CEPH_ARGS='' ceph --cluster ${cluster} service status
         echo