Our current map may be an old one in which we were marked destroyed,
even though our osd was recreated in newer maps. This happens when the
oldest map still available after the osd was recreated lands on an epoch
where the osd was marked destroyed.
Fix by only exiting if one of the newest maps says we are (still)
destroyed.
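The decision rule can be sketched in isolation as follows. This is a minimal illustration, not the OSD code itself: `MapView` and `should_exit` are hypothetical names, `epoch_t` is assumed to be a 32-bit unsigned integer, and `newest` is assumed to be the newest map epoch the OSD knows about.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = std::uint32_t;  // assumption: mirrors the OSD's epoch type

// Hypothetical stand-in for the two facts the check reads from the osdmap.
struct MapView {
  epoch_t epoch;   // epoch of the map the OSD currently holds
  bool destroyed;  // whether that map marks this OSD as destroyed
};

// Returns true only when the map that says "destroyed" is recent enough
// to be trusted. Uses the same comparison as the patch (epoch > newest - 1).
inline bool should_exit(const MapView& current, epoch_t newest) {
  if (!current.destroyed) {
    return false;  // not destroyed in this map: nothing to decide
  }
  return current.epoch > newest - 1;
}

// Usage sketch: a stale "destroyed" map must not cause an exit.
inline void example() {
  assert(!should_exit({10, true}, 25));   // old map: keep catching up
  assert(should_exit({25, true}, 25));    // newest map agrees: exit
  assert(!should_exit({25, false}, 25));  // recreated in newest map: stay
}
```

With a stale map the function returns false, so the OSD keeps fetching newer maps rather than exiting on outdated information.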
Fixes: http://tracker.ceph.com/issues/22673
Signed-off-by: Sage Weil <sage@redhat.com>
if (osdmap->get_epoch() == 0) {
derr << "waiting for initial osdmap" << dendl;
} else if (osdmap->is_destroyed(whoami)) {
- derr << "osdmap says I am destroyed, exiting" << dendl;
- exit(0);
+ derr << "osdmap says I am destroyed" << dendl;
+ // provide a small margin so we don't livelock seeing if we
+ // un-destroyed ourselves.
+ if (osdmap->get_epoch() > newest - 1) {
+ exit(0);
+ }
} else if (osdmap->test_flag(CEPH_OSDMAP_NOUP) || osdmap->is_noup(whoami)) {
derr << "osdmap NOUP flag is set, waiting for it to clear" << dendl;
} else if (!osdmap->test_flag(CEPH_OSDMAP_SORTBITWISE)) {