git.apps.os.sepia.ceph.com Git - ceph.git/commit
msg/async: add timeout for connections which are not yet ready 27337/head
author xie xingguo <xie.xingguo@zte.com.cn>
Sat, 2 Mar 2019 08:23:12 +0000 (16:23 +0800)
committer xie xingguo <xie.xingguo@zte.com.cn>
Mon, 8 Apr 2019 01:19:59 +0000 (09:19 +0800)
commit 7209cc6aa3263d1dfb5cb19d57dd1f6b56aa2804
tree f2ccb388d30c309471a6d55f68b93a8dbd823762
parent 1d464221d998376561c102c83d3127e646815bbb
msg/async: add timeout for connections which are not yet ready

Various corner cases can leave an async connection stuck in the
connecting stage (e.g., by manually creating some loopback connections
on the switches of our test cluster, we can reproduce
http://tracker.ceph.com/issues/37499 almost 100% of the time).

In 61b9432ef9a3847eceb96f8d5a854567c49bbf61 I tried to use the
existing keep_alive mechanism to get those stuck connections out of the
trap, but it does not work if the corresponding connection
is not yet ready, since we always require the underlying connection to be
**ready** in order to send out a keep_alive message.

Fix by introducing a more general connecting timeout strategy.
If a connection attempt cannot finish within a specific interval,
we simply cut it off and retry.
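
The idea can be illustrated with a minimal, self-contained C++ sketch.
It is not the code from this commit: SketchConnection, send_keepalive,
mark_ready, tick and the timeout value are hypothetical names chosen for
illustration, and the real change lives in the AsyncConnection and
ProtocolV1/ProtocolV2 files listed below. The sketch only shows why
keep_alive cannot rescue a connection that never became ready, and how a
deadline checked on a periodic tick can.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical names throughout; this is NOT the Ceph AsyncConnection API.
using Clock = std::chrono::steady_clock;

enum class ConnState { Connecting, Ready };

struct SketchConnection {
  ConnState state = ConnState::Connecting;
  Clock::time_point connect_started;          // when the current attempt began
  std::chrono::milliseconds connect_timeout;  // assumed configurable interval
  std::uint32_t retries = 0;                  // number of cut-off attempts

  SketchConnection(Clock::time_point now, std::chrono::milliseconds timeout)
    : connect_started(now), connect_timeout(timeout) {}

  // keep_alive needs a ready connection, so it is useless for a
  // connection that is stuck in the connecting stage.
  bool send_keepalive() const { return state == ConnState::Ready; }

  // Called when the handshake completes.
  void mark_ready() { state = ConnState::Ready; }

  // Called periodically. Returns true if the current connect attempt
  // exceeded the timeout and was cut off and restarted.
  bool tick(Clock::time_point now) {
    if (state == ConnState::Ready)
      return false;  // the connecting timeout no longer applies
    if (now - connect_started < connect_timeout)
      return false;  // still within the allowed interval
    // Timed out: drop the attempt and start over.
    state = ConnState::Connecting;
    connect_started = now;
    ++retries;
    return true;
  }
};
```

In this sketch a caller would invoke tick() from whatever timer or event
loop it already has. Passing the current time in as a parameter, rather
than reading the clock inside tick(), keeps the logic easy to test.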

Fixes: http://tracker.ceph.com/issues/37499
Fixes: http://tracker.ceph.com/issues/38493
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
src/common/legacy_config_opts.h
src/common/options.cc
src/msg/async/AsyncConnection.cc
src/msg/async/AsyncConnection.h
src/msg/async/ProtocolV1.cc
src/msg/async/ProtocolV2.cc