From: Sage Weil
Date: Fri, 2 May 2014 21:48:35 +0000 (-0700)
Subject: mon/MonClient: remove stray _finish_hunting() calls
X-Git-Tag: v0.67.9~3
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=3b8ab41e1ec86f2ab5c6b4bee3fb4030077e2c21;p=ceph.git

mon/MonClient: remove stray _finish_hunting() calls

Calling _finish_hunting() clears the bool hunting flag, which means we
no longer periodically retry by connecting to another mon. Instead, we
send keepalives every 10s. But since we aren't yet in state HAVE_SESSION,
we don't check that the keepalives are getting responses.

This means that an ill-timed connection reset (say, after we get a
MonMap, but before we finish authenticating) can drop the monc into a
black hole that does not retry.

Instead, we should *only* call _finish_hunting() when we complete the
authentication handshake.

Fixes: #8278
Backport: firefly, dumpling
Signed-off-by: Sage Weil
Reviewed-by: Joao Eduardo Luis
(cherry picked from commit 77a6f0aefebebf057f02bfb95c088a30ed93c53f)
---

diff --git a/src/mon/MonClient.cc b/src/mon/MonClient.cc
index d726f88dc6fd..245feef2241d 100644
--- a/src/mon/MonClient.cc
+++ b/src/mon/MonClient.cc
@@ -255,8 +255,6 @@ void MonClient::handle_monmap(MMonMap *m)
   if (!monmap.get_addr_name(cur_con->get_peer_addr(), cur_mon)) {
     ldout(cct, 10) << "mon." << cur_mon << " went away" << dendl;
     _reopen_session();  // can't find the mon we were talking to (above)
-  } else {
-    _finish_hunting();
   }
 
   map_cond.Signal();
@@ -686,8 +684,6 @@ void MonClient::_renew_subs()
 
 void MonClient::handle_subscribe_ack(MMonSubscribeAck *m)
 {
-  _finish_hunting();
-
   if (sub_renew_sent != utime_t()) {
     sub_renew_after = sub_renew_sent;
     sub_renew_after += m->interval / 2.0;
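
Illustration (not part of the patch): the failure mode described in the commit message can be sketched with a toy state machine. Everything below is a hypothetical simplification written for this note. ToyMonClient, its members, and recovers_after_reset() are assumed names that only loosely mirror MonClient, and the tick() logic follows the commit message's description rather than the real source.

```cpp
#include <cassert>

// Hypothetical, simplified model of the behaviour described in the commit
// message. None of these names are real MonClient members.
enum class State { NONE, NEGOTIATING, AUTHENTICATING, HAVE_SESSION };

struct ToyMonClient {
  bool stray_finish;          // true models the pre-patch behaviour
  State state = State::NONE;
  bool hunting = false;       // while set, tick() retries another mon
  bool connected = false;
  int reopen_count = 0;

  explicit ToyMonClient(bool stray) : stray_finish(stray) { reopen_session(); }

  void reopen_session() {
    state = State::NEGOTIATING;
    hunting = true;
    connected = true;
    ++reopen_count;
  }

  void finish_hunting() { hunting = false; }

  // Pre-patch, receiving a MonMap cleared the hunting flag too early.
  void handle_monmap() {
    if (stray_finish)
      finish_hunting();       // the stray call this patch removes
  }

  // The only place the flag should be cleared: authentication completed.
  void handle_auth_done() {
    state = State::HAVE_SESSION;
    finish_hunting();
  }

  void connection_reset() { connected = false; }

  // Periodic timer, simplified from the commit message's description.
  void tick() {
    if (hunting) {
      reopen_session();       // still hunting: try another mon
    } else if (state == State::HAVE_SESSION) {
      if (!connected)
        reopen_session();     // keepalive replies are only checked here
    }
    // Otherwise a keepalive is sent but never checked: the black hole.
  }
};

// Returns true if the client reconnects after a reset that lands between
// receiving the MonMap and finishing authentication.
bool recovers_after_reset(bool stray_finish) {
  ToyMonClient c(stray_finish);
  c.handle_monmap();
  c.connection_reset();
  c.tick();
  return c.connected;
}
```

Under these assumptions, recovers_after_reset(true) models the pre-patch code and leaves the client disconnected after the tick, while recovers_after_reset(false) models the patched code and reconnects, because the hunting flag is still set when the timer fires.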