From: Sage Weil <sage@redhat.com>
Date: Wed, 26 Feb 2020 19:45:25 +0000 (-0600)
Subject: qa/tasks/ceph_manager: increase CLI command timeout
X-Git-Tag: v14.2.10~193^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F33560%2Fhead;p=ceph.git

qa/tasks/ceph_manager: increase CLI command timeout

There is a problem with mimic releases where pg_creates can stall for a
long time in build_pg_history while holding osd_lock.  That lock is also
used by the tell command processing queue, which means that commands
like 'flush_pg_stats' can block for long periods and then time out.

This is currently happening with mimic->nautilus upgrades.  Note that
the problem is mostly fixed in nautilus and totally fixed in octopus, so
this is just a matter of tolerating slow behavior in old releases for the
purposes of the upgrade tests.

Work around this by increasing the CLI command timeout from 120s to 900s.
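
For illustration only, here is a minimal sketch of how the CLI argument
list is wrapped in timeout(1).  The helper name build_ceph_cli_args, the
constant CLI_TIMEOUT_SECONDS, and the example paths are hypothetical and
are not part of ceph_manager.py; only the argument order is taken from
the patch.

```python
# Hypothetical helper for illustration; not part of qa/tasks/ceph_manager.py.
CLI_TIMEOUT_SECONDS = 900  # the value this change moves to (previously 120)


def build_ceph_cli_args(testdir, cluster, ceph_args,
                        timeout=CLI_TIMEOUT_SECONDS):
    """Return an argument list for a ceph CLI call wrapped in timeout(1)."""
    return [
        'ceph-coverage',
        '{tdir}/archive/coverage'.format(tdir=testdir),
        'timeout',
        str(timeout),  # seconds before timeout(1) terminates the command
        'ceph',
        '--cluster',
        cluster,
    ] + list(ceph_args)


if __name__ == '__main__':
    # Example: the kind of tell command that was timing out on upgrades.
    print(build_ceph_cli_args('/tmp/test', 'ceph',
                              ['tell', 'osd.0', 'flush_pg_stats']))
```

A slow 'flush_pg_stats' then has up to 900 seconds to return before
timeout(1) kills it and the test fails.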

Fixes: https://tracker.ceph.com/issues/43914
Signed-off-by: Sage Weil <sage@redhat.com>
---

diff --git a/qa/tasks/ceph_manager.py b/qa/tasks/ceph_manager.py
index f4cbbf6e08d2..ae3bc56b4ecc 100644
--- a/qa/tasks/ceph_manager.py
+++ b/qa/tasks/ceph_manager.py
@@ -1146,7 +1146,7 @@ class CephManager:
             'ceph-coverage',
             '{tdir}/archive/coverage'.format(tdir=testdir),
             'timeout',
-            '120',
+            '900',
             'ceph',
             '--cluster',
             self.cluster,
@@ -1169,7 +1169,7 @@ class CephManager:
             'ceph-coverage',
             '{tdir}/archive/coverage'.format(tdir=testdir),
             'timeout',
-            '120',
+            '900',
             'ceph',
             '--cluster',
             self.cluster,