From: Sage Weil
Date: Fri, 2 May 2014 21:48:35 +0000 (-0700)
Subject: mon/MonClient: remove stray _finish_hunting() calls
X-Git-Tag: v0.81~85
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=77a6f0aefebebf057f02bfb95c088a30ed93c53f;p=ceph.git

mon/MonClient: remove stray _finish_hunting() calls

Calling _finish_hunting() clears out the bool hunting flag, which means
we don't retry by connecting to another mon periodically.  Instead, we
send keepalives every 10s.  But, since we aren't yet in state
HAVE_SESSION, we don't check that the keepalives are getting responses.
This means that an ill-timed connection reset (say, after we get a
MonMap, but before we finish authenticating) can drop the monc into a
black hole that does not retry.

Instead, we should *only* call _finish_hunting() when we complete the
authentication handshake.

Fixes: #8278
Backport: firefly, dumpling
Signed-off-by: Sage Weil
Reviewed-by: Joao Eduardo Luis
---

diff --git a/src/mon/MonClient.cc b/src/mon/MonClient.cc
index f30be1b05f55..3a6dda46a923 100644
--- a/src/mon/MonClient.cc
+++ b/src/mon/MonClient.cc
@@ -327,8 +327,6 @@ void MonClient::handle_monmap(MMonMap *m)
   if (!monmap.get_addr_name(cur_con->get_peer_addr(), cur_mon)) {
     ldout(cct, 10) << "mon." << cur_mon << " went away" << dendl;
     _reopen_session();  // can't find the mon we were talking to (above)
-  } else {
-    _finish_hunting();
   }
 
   map_cond.Signal();
@@ -756,8 +754,6 @@ void MonClient::_renew_subs()
 
 void MonClient::handle_subscribe_ack(MMonSubscribeAck *m)
 {
-  _finish_hunting();
-
   if (sub_renew_sent != utime_t()) {
     sub_renew_after = sub_renew_sent;
     sub_renew_after += m->interval / 2.0;
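
The failure mode described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Ceph code: ToyMonClient, its members, and reopens_after_midhandshake_reset() are hypothetical names invented for illustration, and the real MonClient has more states, timers, and locking than shown here. The model assumes that tick() reopens the session whenever the hunting flag is set, and that a dead connection is only noticed once the client is in a HaveSession-like state.

```cpp
#include <cassert>

// Toy model only: every name here is hypothetical, not Ceph's MonClient API.
enum class State { Negotiating, HaveSession };

struct ToyMonClient {
  bool stray_finish_hunting;  // true models the pre-patch behavior
  State state = State::Negotiating;
  bool hunting = true;        // while set, tick() tries another monitor
  bool link_up = true;        // whether the current connection is alive
  int reopen_count = 0;       // number of sessions reopened by tick()

  explicit ToyMonClient(bool stray) : stray_finish_hunting(stray) {}

  void reopen_session() {
    state = State::Negotiating;
    hunting = true;
    link_up = true;
    ++reopen_count;
  }

  void handle_monmap() {
    // Models the stray call that the patch removes from the MonMap handler.
    if (stray_finish_hunting) hunting = false;
  }

  void handle_auth_done() {
    state = State::HaveSession;
    hunting = false;  // after the patch, the only place hunting ends
  }

  void tick() {
    if (hunting) {
      reopen_session();  // still hunting: retry periodically
    } else if (state == State::HaveSession && !link_up) {
      reopen_session();  // keepalive replies are checked only with a session
    }
    // Otherwise a keepalive would be sent, but no reply is ever checked.
  }
};

// Counts session reopens after a reset that lands between receiving the
// MonMap and finishing authentication.
int reopens_after_midhandshake_reset(bool stray) {
  ToyMonClient c(stray);
  c.handle_monmap();
  c.link_up = false;  // ill-timed connection reset
  for (int i = 0; i < 3; ++i) c.tick();
  return c.reopen_count;
}
```

In this model, reopens_after_midhandshake_reset(true) returns 0 (the client never retries, which is the "black hole"), while reopens_after_midhandshake_reset(false) returns 3 because the hunting flag stays set until handle_auth_done() runs.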