git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
osd/OSD.cc: choose heartbeat peers by failure domain
author xie xingguo <xie.xingguo@zte.com.cn>
Wed, 8 Aug 2018 09:52:29 +0000 (17:52 +0800)
committer xie xingguo <xie.xingguo@zte.com.cn>
Thu, 9 Aug 2018 00:44:58 +0000 (08:44 +0800)
commit bcc11541b86762d5821b24ef220b6ef079da2515
tree 16778028b9c5df13f97a88b926c5bd0814515059
parent 108583c6bed0fa53e489fe71537e3be875635f3c

By default, the monitor requires at least two valid failure reports from
different hosts before it marks an OSD down. This sometimes turns out to be
impossible for a replicated pool of size 2 in clusters made up of hosts
with contiguously labeled OSDs.
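
The reporter rule described above can be pictured with the minimal sketch
below. It is not the monitor's actual code: enough_reporters and the
osd_to_host map are hypothetical names used only for illustration, and the
default of 2 mirrors the "at least two" in the text (in Ceph this threshold
is, as far as I know, controlled by mon_osd_min_down_reporters and
mon_osd_reporter_subtree_level).

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not Ceph's monitor code: returns true when the
// failure reports for one target OSD come from at least `min_reporters`
// distinct hosts.
bool enough_reporters(const std::vector<int>& reporters,
                      const std::map<int, std::string>& osd_to_host,
                      std::size_t min_reporters = 2)
{
  std::set<std::string> hosts;  // distinct hosts that sent a report
  for (int osd : reporters) {
    auto it = osd_to_host.find(osd);
    if (it != osd_to_host.end())
      hosts.insert(it->second);
  }
  return hosts.size() >= min_reporters;
}
```

With two reporters on the same host the check fails, which is the situation
the paragraph above describes: every heartbeat peer of the failed OSD may
live on one host.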

This patch instead does a breadth-first search based on the highest-level
failure domain in the cluster, so that heartbeat peers cover all failure
domains whenever possible. This should help accelerate OSD failure detection
in the above case.
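
As a rough illustration of the idea (not the code in this patch), the sketch
below spreads heartbeat peers across failure domains by taking one up OSD
from each domain per pass before taking a second OSD from any domain. The
function name pick_heartbeat_peers and the up_osds_by_domain map are
assumptions made for the example; the real change walks the CRUSH hierarchy
through CrushWrapper and OSDMap.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: choose up to `want` heartbeat peers for OSD `self`,
// visiting every failure domain once per pass so that the peers cover as
// many domains as possible.
std::set<int> pick_heartbeat_peers(
    const std::map<std::string, std::vector<int>>& up_osds_by_domain,
    int self, std::size_t want)
{
  std::set<int> peers;
  // `round` is the depth into each domain's OSD list.
  for (std::size_t round = 0; peers.size() < want; ++round) {
    bool progressed = false;
    for (const auto& kv : up_osds_by_domain) {
      const std::vector<int>& osds = kv.second;
      if (round >= osds.size())
        continue;           // this domain is exhausted
      progressed = true;    // some domain still had a candidate at this depth
      int osd = osds[round];
      if (osd != self)
        peers.insert(osd);
      if (peers.size() >= want)
        break;
    }
    if (!progressed)
      break;                // every domain is exhausted
  }
  return peers;
}
```

For example, with hosts holding OSDs {0,1,2}, {3,4,5} and {6,7}, asking for
two peers for osd.0 yields one OSD on each of the other two hosts rather than
its same-host neighbours osd.1 and osd.2.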

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
src/crush/CrushWrapper.cc
src/crush/CrushWrapper.h
src/osd/OSD.cc
src/osd/OSDMap.cc
src/osd/OSDMap.h