AsyncMessenger::shutdown() called WorkerProcessor::stop() first,
killing the worker threads, and only then queued a C_drain callback via
stack->drain(). A worker that had already exited its event loop never
processed the callback, so drain.wait() blocked indefinitely and monitor
shutdown hung for minutes.

Move stack->drain() ahead of the loop that calls p->stop() on each
processor. With the new order, the workers are still running when the
drain callback is queued, so they can acknowledge it.
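
The ordering hazard can be illustrated with a small standalone sketch.
This is not Ceph code: ToyWorker, shutdown_old_order() and
shutdown_new_order() are hypothetical names, and drain() takes a timeout
(an assumption made only so the broken order can be observed without
hanging). The toy loop exits without running events queued after stop(),
which models a worker that has already left its event loop.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a messenger worker: one thread, one event loop.
class ToyWorker {
 public:
  ToyWorker() : thread_([this] { run(); }) {}
  ~ToyWorker() { stop(); }

  // Ask the loop to exit and join the thread. Events queued afterwards
  // are never executed.
  void stop() {
    {
      std::lock_guard<std::mutex> g(mutex_);
      done_ = true;
    }
    cond_.notify_all();
    if (thread_.joinable()) thread_.join();
  }

  // Queue a callback and wait until the loop has run it.
  // Returns false if the callback was not run within `timeout`.
  bool drain(std::chrono::milliseconds timeout) {
    auto state = std::make_shared<DrainState>();
    {
      std::lock_guard<std::mutex> g(mutex_);
      events_.push_back([state] {
        std::lock_guard<std::mutex> sg(state->m);
        state->acked = true;
        state->cv.notify_all();
      });
    }
    cond_.notify_all();
    std::unique_lock<std::mutex> l(state->m);
    return state->cv.wait_for(l, timeout, [&] { return state->acked; });
  }

 private:
  struct DrainState {
    std::mutex m;
    std::condition_variable cv;
    bool acked = false;
  };

  void run() {
    std::unique_lock<std::mutex> l(mutex_);
    while (true) {
      cond_.wait(l, [this] { return done_ || !events_.empty(); });
      if (done_) return;  // leave the loop; pending events are dropped
      auto ev = std::move(events_.front());
      events_.pop_front();
      l.unlock();
      ev();
      l.lock();
    }
  }

  std::mutex mutex_;
  std::condition_variable cond_;
  std::deque<std::function<void()>> events_;
  bool done_ = false;
  std::thread thread_;  // declared last so the members above exist first
};

// Old order: stop the worker, then drain. Nobody is left to run the callback.
bool shutdown_old_order(std::chrono::milliseconds timeout) {
  ToyWorker w;
  w.stop();
  return w.drain(timeout);
}

// New order: drain while the worker is still running, then stop it.
bool shutdown_new_order(std::chrono::milliseconds timeout) {
  ToyWorker w;
  bool drained = w.drain(timeout);
  w.stop();
  return drained;
}
```

In this sketch shutdown_old_order() only returns because of the timeout;
with an unbounded wait, as in the drain.wait() described above, it would
block. shutdown_new_order() is expected to return true as soon as the
worker runs the queued callback.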

Fixes: https://tracker.ceph.com/issues/71303
Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>
(cherry picked from commit 5fbb9c5e464e3a2227f0c4729b2e6a1bc2f6f9d6)
 {
   ldout(cct,10) << __func__ << " " << get_myaddrs() << dendl;
+  stack->drain();
   // done! clean up.
   for (auto &&p : processors)
     p->stop();
   stop_cond.notify_all();
   stopped = true;
   lock.unlock();
-  stack->drain();
+
   return 0;
 }