From: Ilya Dryomov
Date: Mon, 17 May 2021 19:16:16 +0000 (+0200)
Subject: mon/MonClient: tolerate a rotating key that is slightly out of date
X-Git-Tag: v14.2.22~33^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=c6a5c059a2a340b876f8a982d88d09b605e442bf;p=ceph.git

mon/MonClient: tolerate a rotating key that is slightly out of date

Commit 918c12c2ab5d ("monclient: avoid key renew storm on clock skew")
made wait_auth_rotating() wait for a key set with a valid "current" key
(instead of any key set, including with all keys expired if the clocks
are skewed). While a good idea in general, this is a bit too stringent
because the monitors will hand out key sets with "current" key that is
_just_ about to expire. There is nothing wrong with that as "next" key
is also there, valid for the entire auth_service_ticket_ttl.

So even if the daemon is talking to the leader, it is possible to get
a key set with an expired "current" key. If the daemon is talking to
a peon, it is pretty easy to run into in practice. This, coupled with
the fact that _check_auth_rotating() explicitly allows the keys to go
slightly out of date, can lead to wait_auth_rotating() stalling the
boot for up to 30 seconds:

  15:41:11.824+0000 1 ... ==== auth_reply(proto 2 0 (0) Success)
  15:41:41.824+0000 0 monclient: wait_auth_rotating timed out after 30
  15:41:41.824+0000 -1 mds.b unable to obtain rotating service keys; retrying

Apply the same 30 second or less tolerance in wait_auth_rotating().

Fixes: https://tracker.ceph.com/issues/50390
Signed-off-by: Ilya Dryomov
(cherry picked from commit 6160ed75fcc2a648da4b696fd0ec20b95c4a0a61)

Conflicts:
	src/mon/MonClient.cc [ commit 85157d5aae3d ("mon: s/Mutex/ceph::mutex/")
	  not in nautilus ]
---

diff --git a/src/mon/MonClient.cc b/src/mon/MonClient.cc
index 317917cf6e91..5f245721b067 100644
--- a/src/mon/MonClient.cc
+++ b/src/mon/MonClient.cc
@@ -971,6 +971,8 @@ int MonClient::wait_auth_rotating(double timeout)
 {
   std::lock_guard l(monc_lock);
   utime_t now = ceph_clock_now();
+  utime_t cutoff = now;
+  cutoff -= std::min(30.0, cct->_conf->auth_service_ticket_ttl / 4.0);
   utime_t until = now;
   until += timeout;
 
@@ -984,7 +986,7 @@ int MonClient::wait_auth_rotating(double timeout)
     return 0;
 
   while (auth_principal_needs_rotating_keys(entity_name) &&
-         rotating_secrets->need_new_secrets(now)) {
+         rotating_secrets->need_new_secrets(cutoff)) {
     if (now >= until) {
       ldout(cct, 0) << __func__ << " timed out after " << timeout << dendl;
       return -ETIMEDOUT;
@@ -992,6 +994,8 @@ int MonClient::wait_auth_rotating(double timeout)
     ldout(cct, 10) << __func__ << " waiting (until " << until << ")" << dendl;
     auth_cond.WaitUntil(monc_lock, until);
     now = ceph_clock_now();
+    cutoff = now;
+    cutoff -= std::min(30.0, cct->_conf->auth_service_ticket_ttl / 4.0);
   }
   ldout(cct, 10) << __func__ << " done" << dendl;
   return 0;
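
The tolerance this patch applies can be looked at in isolation. The following is only a rough sketch, not code from the Ceph tree: times are plain seconds in a double rather than utime_t, and both helper names (rotating_key_tolerance, needs_new_keys) are hypothetical. It also assumes the underlying check amounts to "the current key's expiration is at or before the reference time", which is a simplification of need_new_secrets().

```cpp
#include <algorithm>

// Hypothetical helper (not part of MonClient): the grace period, in seconds,
// that the patch subtracts from "now". It is capped at 30 seconds and shrinks
// to a quarter of auth_service_ticket_ttl when the TTL is short.
double rotating_key_tolerance(double auth_service_ticket_ttl) {
  return std::min(30.0, auth_service_ticket_ttl / 4.0);
}

// Hypothetical stand-in for the loop condition, under the ASSUMPTION that a
// key set counts as stale once its "current" key expired at or before the
// reference time. Before the patch the reference time was "now"; after the
// patch it is "now" minus the tolerance.
bool needs_new_keys(double current_key_expiry, double now,
                    double auth_service_ticket_ttl) {
  const double cutoff = now - rotating_key_tolerance(auth_service_ticket_ttl);
  return current_key_expiry <= cutoff;
}
```

Under these assumptions, with a TTL of 3600 seconds a "current" key that expired 10 seconds ago is still accepted (the cutoff lies 30 seconds in the past), while one that expired 40 seconds ago is not. With a TTL of 60 seconds the tolerance drops to 15 seconds.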