We want to avoid a situation like:
- host.A consists of osds 0 through 10
- host.A's network is cut off from the rest of the cluster
- osd.1 is marked down when enough votes have been
collected by mon
- osd.1 re-selects osd.0,2,3,..., and two extra
osds from two different hosts as heartbeat peers
- osd.1 still has more than 1/3 of its heartbeat peers
  pingable, e.g., because they belong to the same host.A,
  and will therefore try to mark itself up again
which as a result may cause longer client op latency.
Fix by always trying to select heartbeat peers from as many
different subtrees as possible instead.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
     // subtree level (e.g., hosts) for fast failure detection.
     auto min_down = cct->_conf.get_val<uint64_t>("mon_osd_min_down_reporters");
     auto subtree = cct->_conf.get_val<string>("mon_osd_reporter_subtree_level");
+    auto limit = std::max(min_down, (uint64_t)cct->_conf->osd_heartbeat_min_peers);
     osdmap->get_random_up_osds_by_subtree(
-      whoami, subtree, min_down, want, &want);
+      whoami, subtree, limit, want, &want);
     for (set<int>::iterator p = want.begin(); p != want.end(); ++p) {
       dout(10) << " adding neighbor peer osd." << *p << dendl;
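
For illustration only, below is a minimal standalone sketch of the
subtree-spreading idea: candidates are grouped by subtree (e.g. host)
and picked round-robin, at most one per subtree per pass, until the
limit is reached. This is not Ceph's OSDMap::get_random_up_osds_by_subtree;
the function name, its signature, and the bucket map are hypothetical.

    #include <algorithm>
    #include <map>
    #include <random>
    #include <set>
    #include <string>
    #include <vector>

    // Hypothetical sketch: spread up to `limit` random picks across
    // subtrees (e.g. hosts) so that no single failure domain dominates
    // the heartbeat peer set.
    std::set<int> pick_peers_by_subtree(
        std::map<std::string, std::vector<int>> buckets,  // subtree -> up osds
        int self,
        std::size_t limit)
    {
      std::set<int> picked;
      std::mt19937 rng{std::random_device{}()};

      // Randomize the order inside each subtree so picks are not biased.
      for (auto& [subtree, osds] : buckets)
        std::shuffle(osds.begin(), osds.end(), rng);

      // Round-robin across subtrees: take at most one osd per subtree per
      // pass, so peers land in as many different failure domains as possible.
      bool progressed = true;
      while (picked.size() < limit && progressed) {
        progressed = false;
        for (auto& [subtree, osds] : buckets) {
          while (!osds.empty()) {
            int cand = osds.back();
            osds.pop_back();
            if (cand == self || picked.count(cand))
              continue;  // skip ourselves and duplicates
            picked.insert(cand);
            progressed = true;
            break;
          }
          if (picked.size() >= limit)
            break;
        }
      }
      return picked;
    }

In this sketch, taking at most one osd per subtree per pass means that
whenever at least `limit` subtrees still have up osds, no two chosen
peers share a failure domain, which is the property the fix is after.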