From: Joao Eduardo Luis
Date: Thu, 7 Jan 2016 11:20:36 +0000 (+0000)
Subject: mon: Monitor: get rid of weighted clock skew reports
X-Git-Tag: v10.1.0~209^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=17d8ff429c7dca8fc1ada6e7cc8a7c4924a22e28;p=ceph.git

mon: Monitor: get rid of weighted clock skew reports

By weighting the reports we were making it really hard to get rid of a
clock skew warning once the cause had been fixed.

Instead, as soon as we get a clean bill of health, let's run a new round
as soon as possible and ascertain whether that was a transient fix or
for realsies. That should be better than the alternative of waiting for
an hour or something (for a large enough skew) for the warning to go
away - and with it, the admin's sanity ("WHAT AM I DOING WRONG???").

Fixes: #14175

Signed-off-by: Joao Eduardo Luis
---

diff --git a/src/mon/Monitor.cc b/src/mon/Monitor.cc
index 174f35b7635b..2a56e2fbae68 100644
--- a/src/mon/Monitor.cc
+++ b/src/mon/Monitor.cc
@@ -3927,6 +3927,10 @@ void Monitor::timecheck_check_skews()
     dout(1) << __func__ << " no clock skews found after "
             << timecheck_rounds_since_clean
             << " rounds" << dendl;
+    // make sure the skews are really gone and not just a transient success
+    // this will run just once if not in the presence of skews again.
+    timecheck_rounds_since_clean = 1;
+    timecheck_reset_event();
     timecheck_rounds_since_clean = 0;
   }
@@ -4130,11 +4134,7 @@ void Monitor::handle_timecheck_leader(MonOpRequestRef op)
            << " delta " << delta << " skew_bound " << skew_bound
            << " latency " << latency << dendl;
-  if (timecheck_skews.count(other) == 0) {
-    timecheck_skews[other] = skew_bound;
-  } else {
-    timecheck_skews[other] = (timecheck_skews[other]*0.8)+(skew_bound*0.2);
-  }
+  timecheck_skews[other] = skew_bound;
   timecheck_acks++;
   if (timecheck_acks == quorum.size()) {
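
To illustrate why the weighted update made warnings linger, here is a minimal standalone sketch, not part of the patch. It counts how many timecheck rounds a stored skew needs to drop to or below a warning threshold once every new sample is clean, using either the removed 0.8/0.2 weighting or the direct assignment this commit introduces. The helper name `rounds_until_clear` and the sample values (1.0 s stored skew, 0.05 s threshold) are assumptions chosen for illustration only.

```cpp
#include <cassert>

// Hypothetical helper, for illustration only (not Ceph code).
// Returns the number of rounds until `stored` is <= `threshold`,
// given that every new sample reports a skew bound of `clean`.
//   weighted == true  : stored = stored*0.8 + clean*0.2  (removed behaviour)
//   weighted == false : stored = clean                   (behaviour after this patch)
int rounds_until_clear(double stored, double clean, double threshold,
                       bool weighted) {
  // The clean sample must itself be strictly below the threshold,
  // otherwise the weighted loop might never terminate.
  assert(clean < threshold);

  int rounds = 0;
  while (stored > threshold) {
    stored = weighted ? (stored * 0.8) + (clean * 0.2) : clean;
    ++rounds;
  }
  return rounds;
}
```

Under these assumed values, `rounds_until_clear(1.0, 0.0, 0.05, true)` returns 14, since 0.8^13 is still about 0.055, while `rounds_until_clear(1.0, 0.0, 0.05, false)` returns 1. With rounds spaced minutes apart, fourteen of them is consistent with the "waiting for an hour or something" the commit message describes.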