If we are sitting around waiting until we are able to ping our "up" peers,
we need to be sure that our notion of "up" is still correct and we're not
just stuck on an old, stale OSDMap.

Fixes: http://tracker.ceph.com/issues/21121
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit fbafa659dae94faba435ca449ee5e77b51108b4b)
dout(1) << "start_waiting_for_healthy" << dendl;
set_state(STATE_WAITING_FOR_HEALTHY);
last_heartbeat_resample = utime_t();
+
+ // subscribe to osdmap updates, in case our peers really are known to be dead
+ osdmap_subscribe(osdmap->get_epoch() + 1, false);
}
bool OSD::_is_healthy()