From: Kefu Chai Date: Wed, 13 May 2026 11:09:37 +0000 (+0800) Subject: crimson/osd: fix crash in committed_osd_maps when an OSD is removed X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F68884%2Fhead;p=ceph.git crimson/osd: fix crash in committed_osd_maps when an OSD is removed OSDMap::is_down(osd) is defined as !is_up(osd), and is_up() gates on exists(osd). This means is_down() returns true for OSDs that have been *removed* from the map (EXISTS flag cleared), not just marked down. committed_osd_maps() iterates over epochs [first, last], and for each epoch over all OSDs in old_map, calling get_cluster_addrs() for any OSD that was up in old_map and is_down() in the current epoch. get_cluster_addrs() asserts exists(osd), so when that OSD has been removed the assertion fires. Reproducer (3 crimson OSDs running: osd.0, osd.1, osd.2): # Two rapid OSDMap changes; the monitor batches them into one message. ceph osd down 2 ceph osd purge 2 --yes-i-really-mean-it # osd.0 and osd.1 call committed_osd_maps(N, N+1). Before this fix # old_map is set once before the loop and never updated, so in # iteration N+1 the comparison is still old_map(N-1) vs osdmap(N+1): # # old_map->is_up(2)=true (osd.2 was up at N-1) # osdmap->exists(2)=false (purged in N+1) # osdmap->is_down(2)=true (!is_up, since !exists -> true) # -> get_cluster_addrs(2) asserts -> crash # # OSDMap.h: ceph_assert(exists(osd)) [in get_cluster_addrs()] # Signal 6 (SIGABRT) Note: 'ceph osd destroy' does NOT clear the EXISTS flag; it only sets CEPH_OSD_DESTROYED. The EXISTS flag is cleared by 'osd rm', which 'osd purge' calls internally after 'osd destroy'. Fix: advance old_map at the end of each iteration so the comparison is pairwise (N-1 vs N, then N vs N+1, ...), matching classic OSD::advance_map at src/osd/OSD.cc:8615. In the reproducer, iteration N marks osd.2 down using osdmap(N) (where osd.2 still exists), then sets old_map = osdmap(N). Iteration N+1 starts with old_map(N)->is_up(2)=false (osd.2 was DOWN in N), so the condition short-circuits and get_cluster_addrs() is never called on the new map. No explicit !exists branch is needed. The monitor produces a separate epoch for each of 'osd down' / 'osd destroy' / 'osd rm', so an OSD can only transition UP -> REMOVED through at least one intermediate DOWN epoch in any batched MOSDMap, and the pairwise comparison short-circuits before the assert can fire. Signed-off-by: Kefu Chai --- diff --git a/src/crimson/osd/osd.cc b/src/crimson/osd/osd.cc index 9ad33274b10..90e6809f7b5 100644 --- a/src/crimson/osd/osd.cc +++ b/src/crimson/osd/osd.cc @@ -1297,15 +1297,14 @@ seastar::future<> OSD::committed_osd_maps( old_map->get_all_osds(old_osds); co_await seastar::coroutine::parallel_for_each(old_osds, [this, FNAME, old_map](auto &osd_id) -> seastar::future<> { - DEBUG("osd.{}: whoami ? {}, old up ? {} , now down ? {}", - osd_id, osd_id != whoami, - old_map->is_up(osd_id), osdmap->is_down(osd_id)); - if (osd_id != whoami && - old_map->is_up(osd_id) && - osdmap->is_down(osd_id)) { - DEBUG("osd.{}: mark osd.{} down", whoami, osd_id); - co_await cluster_msgr->mark_down(osdmap->get_cluster_addrs(osd_id).front()); + if (osd_id == whoami || + !old_map->is_up(osd_id) || + !osdmap->is_down(osd_id)) { + co_return; } + DEBUG("osd.{}: mark osd.{} down", whoami, osd_id); + co_await cluster_msgr->mark_down( + osdmap->get_cluster_addrs(osd_id).front()); }); co_await pg_shard_manager.update_map(std::move(o)); @@ -1317,6 +1316,7 @@ seastar::future<> OSD::committed_osd_maps( boot_epoch = osdmap->get_epoch(); } } + old_map = osdmap; } if (osdmap->is_up(whoami)) {