From: Sage Weil
Date: Wed, 17 Jan 2018 16:38:29 +0000 (-0600)
Subject: osd: only exit if *latest* map(s) say we are destroyed
X-Git-Tag: v13.0.2~458^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F19988%2Fhead;p=ceph.git

osd: only exit if *latest* map(s) say we are destroyed

It's possible our current map is older, we were destroyed then, but in
newer maps our osd was recreated.  This happens when the oldest map after
a recreated osd happens to land on an epoch where the osd was marked
destroyed.

Fix by only exiting if one of the newest maps says we are (still)
destroyed.

Fixes: http://tracker.ceph.com/issues/22673
Signed-off-by: Sage Weil
---

diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 1bc8822c7739..a413204cb2ff 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -5402,8 +5402,12 @@ void OSD::_preboot(epoch_t oldest, epoch_t newest)
   if (osdmap->get_epoch() == 0) {
     derr << "waiting for initial osdmap" << dendl;
   } else if (osdmap->is_destroyed(whoami)) {
-    derr << "osdmap says I am destroyed, exiting" << dendl;
-    exit(0);
+    derr << "osdmap says I am destroyed" << dendl;
+    // provide a small margin so we don't livelock seeing if we
+    // un-destroyed ourselves.
+    if (osdmap->get_epoch() > newest - 1) {
+      exit(0);
+    }
   } else if (osdmap->test_flag(CEPH_OSDMAP_NOUP) || osdmap->is_noup(whoami)) {
     derr << "osdmap NOUP flag is set, waiting for it to clear" << dendl;
   } else if (!osdmap->test_flag(CEPH_OSDMAP_SORTBITWISE)) {
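
As a rough illustration of the new check, the standalone sketch below isolates the exit decision from `OSD::_preboot`. The function name `should_exit_when_destroyed`, its parameters, and the `epoch_t` alias are assumptions made for this example and are not part of the Ceph source; only the comparison `get_epoch() > newest - 1` is taken from the patch.

```cpp
#include <cstdint>

// Assumption for this sketch: epochs are unsigned 32-bit counters.
using epoch_t = std::uint32_t;

// Hypothetical helper, not part of Ceph: returns true when the OSD should
// exit because a sufficiently recent map still marks it as destroyed.
//
//   destroyed  - whether the map we currently hold marks this OSD destroyed
//   map_epoch  - epoch of the map we currently hold
//   newest     - newest epoch the monitor reported to _preboot()
//
// The comparison mirrors the patch: `map_epoch > newest - 1`, which for
// newest >= 1 is the same as `map_epoch >= newest`. An older map that says
// "destroyed" is not trusted, because a newer map may have recreated the OSD.
constexpr bool should_exit_when_destroyed(bool destroyed,
                                          epoch_t map_epoch,
                                          epoch_t newest) {
  return destroyed && (map_epoch > newest - 1);
}
```

With `newest == 10`, an OSD holding a destroyed-state map at epoch 9 keeps waiting for newer maps, while one holding epoch 10 exits. Before the patch, both cases would have exited.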