From 2c8732fffe5f206c9618a137113bc195bfefd0a7 Mon Sep 17 00:00:00 2001
From: Kefu Chai
Date: Wed, 1 Jul 2020 19:33:35 +0800
Subject: [PATCH] mon/PGMap: do not consider changing pg stuck

There is a chance that a newly created PG has not yet peered when a mgr
module retrieves the health report from the mgr. In that case the
"last_peered" field is not set, because the PG has never peered.
Normally, though, a newly created PG becomes active+clean within a
couple of seconds, which is well under the default setting of
mon_pg_stuck_threshold (60 seconds).

So in this change, if "last_whatever" is not set, we also use
"last_change" as a reference to decide whether the PG is healthy, and
only consider the PG stuck if last_change is also too old.

Fixes: https://tracker.ceph.com/issues/45717
Signed-off-by: Kefu Chai
(cherry picked from commit 34e1df66cdf9ac4aeea338a8f3d5b9a10fa5983a)
---
 src/mon/PGMap.cc | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mon/PGMap.cc b/src/mon/PGMap.cc
index a340fd0562a17..57796651c6e3d 100644
--- a/src/mon/PGMap.cc
+++ b/src/mon/PGMap.cc
@@ -2525,7 +2525,11 @@ void PGMap::get_health_checks(
       if (pg_response.stuck_since) {
         // Delayed response, check for stuckness
         utime_t last_whatever = pg_response.stuck_since(pg_info);
-        if (last_whatever >= cutoff) {
+        if (last_whatever.is_zero() &&
+            pg_info.last_change >= cutoff) {
+          // still moving, ignore
+          continue;
+        } else if (last_whatever >= cutoff) {
           // Not stuck enough, ignore.
           continue;
         } else {
-- 
2.39.5
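
The branch order introduced by the hunk above can be illustrated with a small standalone sketch. It is not Ceph code: `Seconds`, `Verdict`, `classify` and `check_examples` are hypothetical names, timestamps are plain integers standing in for `utime_t` (0 meaning "not set"), and the cutoff is assumed to be the current time minus mon_pg_stuck_threshold.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for utime_t: seconds since the epoch, 0 = "not set".
using Seconds = std::int64_t;

// Hypothetical labels for the three outcomes of the patched branch.
enum class Verdict { StillMoving, NotStuckEnough, Stuck };

// Mirrors the branch order of the patched code:
//   last_whatever: what stuck_since(pg_info) returned (0 if never set)
//   last_change:   pg_info.last_change
//   cutoff:        assumed to be now - mon_pg_stuck_threshold
constexpr Verdict classify(Seconds last_whatever, Seconds last_change,
                           Seconds cutoff) {
  if (last_whatever == 0 && last_change >= cutoff) {
    return Verdict::StillMoving;     // "still moving, ignore"
  } else if (last_whatever >= cutoff) {
    return Verdict::NotStuckEnough;  // "Not stuck enough, ignore."
  }
  return Verdict::Stuck;             // falls through to the health report
}

inline void check_examples() {
  const Seconds now = 1000, threshold = 60, cutoff = now - threshold;
  // Never peered, but changed 5 s ago: ignored after the patch.
  assert(classify(0, 995, cutoff) == Verdict::StillMoving);
  // Never peered and unchanged for 100 s: reported as stuck.
  assert(classify(0, 900, cutoff) == Verdict::Stuck);
  // Peered 10 s ago: not stuck enough, same as before the patch.
  assert(classify(990, 990, cutoff) == Verdict::NotStuckEnough);
  // Timestamp set but too old: stuck, last_change is not consulted.
  assert(classify(800, 995, cutoff) == Verdict::Stuck);
}
```

Note that the new check only applies when the timestamp is unset. A PG whose timestamp is set but older than the cutoff is still reported as stuck, even if last_change is recent.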