During shutdown, the MDS sends a `MSG_MDS_BEACON` with
`MDSMap::STATE_DNE` (in `MDSDaemon::suicide()`) and then waits for a
`MSG_MDS_BEACON` reply from the MON.
The MON, however, suppresses replies to `STATE_DNE`; in
`MDSMonitor::preprocess_beacon()`, it returns early on `STATE_DNE` and
`MDSMonitor::prepare_beacon()` silently evicts the dying MDS without
any reply.
This delays the MDS shutdown until its wait for the reply times out.
Since `MDSDaemon::suicide()` has code to wait for a beacon reply, I
figure that the MON reply was suppressed accidentally; therefore I
suggest adding it.
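
To illustrate why the missing reply stalls shutdown, here is a minimal standalone sketch of a send-and-wait handshake with a timeout. It is not Ceph code: `BeaconWaiter`, `handle_reply()`, `wait_for_reply()` and `simulate_shutdown()` are hypothetical names made up for this example, and the timeout values are arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the beacon handshake; not the Ceph classes.
class BeaconWaiter {
public:
  // Called when the monitor's reply arrives.
  void handle_reply() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      acked_ = true;
    }
    cond_.notify_all();
  }

  // Blocks until a reply arrives or the timeout expires.
  // Returns true if a reply was seen, false on timeout.
  bool wait_for_reply(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, timeout, [this] { return acked_; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool acked_ = false;
};

// Simulates one shutdown: a "monitor" thread either replies or stays
// silent while the caller waits. Returns true if a reply was seen.
bool simulate_shutdown(bool monitor_replies,
                       std::chrono::milliseconds timeout) {
  BeaconWaiter waiter;
  std::thread monitor([&] {
    if (monitor_replies)
      waiter.handle_reply();
  });
  const bool acked = waiter.wait_for_reply(timeout);
  monitor.join();
  return acked;
}
```

With `monitor_replies == true` the waiter returns as soon as the reply is delivered; with `false` it blocks for the full timeout, which corresponds to the delay described above.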
Fixes: https://tracker.ceph.com/issues/68761
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
     if (state == MDSMap::STATE_DNE) {
       dout(1) << __func__ << ": DNE from " << info << dendl;
+
+      /* send a beacon reply so MDSDaemon::suicide() finishes the
+         Beacon::send_and_wait() call */
+      auto beacon = make_message<MMDSBeacon>(mon.monmap->fsid,
+          m->get_global_id(), m->get_name(), get_fsmap().get_epoch(),
+          m->get_state(), m->get_seq(), CEPH_FEATURES_SUPPORTED_DEFAULT);
+      mon.send_reply(op, beacon.detach());
+
       goto evict;
     }