From: Nathan Cutler
Date: Fri, 21 Apr 2017 09:05:05 +0000 (+0200)
Subject: tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
X-Git-Tag: v10.2.8~1^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=f46ccf2cb4701cd93cd9b15a4e57b5b97798b947;p=ceph.git

tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart

Even though we wait for HEALTH_OK after restarting the daemons, they are
not ready to respond to flush_pg_stats.

The osd is not ready for the "tell" command even after "ceph health"
reports HEALTH_OK because the monitor is not notified within
"heartbeat_interval" that the osd in question is not up. Because
infernalis does not have osd_fast_fail_on_connection_refused support,
the monitor needs longer to detect that an osd is down, and
osd_heartbeat_grace is used to determine if an osd is down.

References: http://tracker.ceph.com/issues/16239
Signed-off-by: Nathan Cutler
Signed-off-by: Kefu Chai
---

diff --git a/qa/suites/rados/singleton/all/ec-lost-unfound-upgrade.yaml b/qa/suites/rados/singleton/all/ec-lost-unfound-upgrade.yaml
index 9228b30568b3..c9add9ff4e78 100644
--- a/qa/suites/rados/singleton/all/ec-lost-unfound-upgrade.yaml
+++ b/qa/suites/rados/singleton/all/ec-lost-unfound-upgrade.yaml
@@ -26,5 +26,7 @@ tasks:
 - print: "upgraded mon.a and friends"
 - ceph.restart:
     daemons: [mon.a, mon.b, mon.c, osd.0, osd.1, osd.2]
+- sleep:
+    duration: 20 # http://tracker.ceph.com/issues/16239
 - ec_lost_unfound:
     parallel_bench: false