rbd-mirror: resume pending shutdown on error in snapshot replayer
If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.
Fixes: https://tracker.ceph.com/issues/56154
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit
fc4cc575bc53f62f88ee3faf0daba8906bc1c6c1)