]> git.apps.os.sepia.ceph.com Git - ceph.git/commit
librbd: fix a deadlock on image_lock caused by Mirror::image_disable() 62127/head
authorIlya Dryomov <idryomov@gmail.com>
Sun, 2 Mar 2025 08:24:52 +0000 (09:24 +0100)
committerIlya Dryomov <idryomov@gmail.com>
Wed, 5 Mar 2025 12:28:55 +0000 (13:28 +0100)
commitbea89977612c725318f320dd1015fbe62c29de7f
treef454229ba052e6b76d05d620b4a8fb59a9d3bc58
parent86dddfdc3f99ecefe54aee62c4ddb929b753f0d1
librbd: fix a deadlock on image_lock caused by Mirror::image_disable()

With Mirror::image_disable() taking image_lock for write and calling
list_children() under it, the following deadlock is possible:

1. Mirror::image_disable() takes image_lock for write and calls
   list_children()
2. AbstractWriteLog::periodic_stats() timer fires (it runs every
   5 seconds) and ImageCacheState::write_image_cache_state() is called
   under a global timer_lock
3. ImageCacheState::write_image_cache_state() successfully takes
   owner_lock and blocks attempting to take image_lock for read because
   it's already held for write by Mirror::image_disable()
4. list_children() blocks inside of a call to ImageState::close() on
   a descendant image
5. The descendant image close can't proceed because TokenBucketThrottle
   requires a global timer_lock to complete QosImageDispatch shutdown
6. safe_timer thread which is holding timer_lock can't proceed because
   ImageCacheState::write_image_cache_state() is effectively blocked on
   the descendant image close through Mirror::image_disable()

Until commit 281a64acf920 ("librbd: remove snapshot mirror image-meta
when disabling"), Mirror::image_disable() was taking image_lock only for
read meaning that this deadlock wasn't possible.  The only other change
that commit 281a64acf920 made to the code block protected by image_lock
was using child_mirror_image_internal for cls_client::mirror_image_get()
call on descendant images instead of mirror_image_internal to preserve
the value of mirror_image_internal for later.  Both are local variables
that have nothing to do with image_lock, so I'm going back and making
Mirror::image_disable() take image_lock only for read again.

Fixes: https://tracker.ceph.com/issues/70190
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ff9aa20bc358775bf372052787322b1452886c13)
src/librbd/api/Mirror.cc