mgr/rbd_support: avoid losing a schedule on load vs add race

If load_schedules() (i.e. periodic refresh) races with add_schedule()
invoked by the user for a fresh image, that image's schedule may get
lost until the next rebuild (not refresh!) of the queue:

1. periodic refresh invokes load_schedules()
2. load_schedules() creates a new Schedules instance and loads
   schedules from rbd_mirror_snapshot_schedule object
3. add_schedule() is invoked for a new image (an image that isn't
   present in self.images) by the user
4. before load_schedules() can grab self.lock, add_schedule() commits
   the new schedule to rbd_mirror_snapshot_schedule object and adds it
   to self.schedules
5. load_schedules() grabs self.lock and replaces self.schedules with
   its own Schedules instance, which is now stale
6. periodic refresh invokes load_pool_images() which discovers the new
   image; eventually it is added to self.images
7. periodic refresh invokes refresh_queue() which attempts to enqueue()
   the new image; this fails because a matching schedule isn't present

The next periodic refresh recovers the discarded schedule from
rbd_mirror_snapshot_schedule object but no attempt to enqueue() that
image is made since it is already "known" at that point.  Despite the
schedule being in place, no snapshots are created until the queue is
rebuilt from scratch or rbd_support module is reloaded.
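
The reason the second refresh doesn't help can be illustrated with a
toy model.  This is only a sketch, not the rbd_support code: the
ToyQueue class and its method bodies are hypothetical, and it assumes
that a refresh enqueues only images it hasn't seen before while a
rebuild re-enqueues every known image.

```python
class ToyQueue:
    """Hypothetical stand-in for the schedule handler's queue logic."""

    def __init__(self):
        self.images = set()    # images already "known"
        self.schedules = {}    # image -> interval
        self.queue = []

    def enqueue(self, image):
        # Fails (returns False) when no matching schedule is present.
        if image not in self.schedules:
            return False
        self.queue.append(image)
        return True

    def refresh_queue(self, discovered):
        # Assumption: only newly discovered images are enqueued.
        for image in sorted(discovered - self.images):
            self.enqueue(image)
        self.images = set(discovered)

    def rebuild_queue(self):
        # Assumption: a rebuild starts over with every known image.
        self.queue = []
        for image in sorted(self.images):
            self.enqueue(image)


q = ToyQueue()
q.refresh_queue({"pool/image"})      # steps 6-7: schedule missing
first = list(q.queue)
q.schedules["pool/image"] = "1h"     # next refresh recovers the schedule
q.refresh_queue({"pool/image"})      # image is "known", nothing enqueued
second = list(q.queue)
q.rebuild_queue()                    # only a rebuild picks it up
third = list(q.queue)
```

In this model the queue stays empty after both refreshes and contains
the image only after rebuild_queue().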

To fix that, extend self.lock critical sections so that add_schedule()
and remove_schedule() can't get stepped on by load_schedules().
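
The effect of widening the critical sections can be sketched as
follows.  This is a simplified model under stated assumptions, not the
actual patch: ToyHandler, its store dict (standing in for the
rbd_mirror_snapshot_schedule object), the extended_lock switch and the
in_window hook are all hypothetical and exist only to force the
interleaving from steps 1-5.

```python
import threading


class ToyHandler:
    """Hypothetical model of load_schedules() vs add_schedule()."""

    def __init__(self, extended_lock):
        self.lock = threading.Lock()
        self.extended_lock = extended_lock
        self.store = {}        # persisted schedules
        self.schedules = {}    # in-memory schedules

    def _load(self, in_window):
        loaded = dict(self.store)    # steps 1-2: read persisted state
        in_window()                  # steps 3-4 are attempted here
        return loaded

    def load_schedules(self, in_window):
        if self.extended_lock:
            # Load and reassignment happen under the lock.
            with self.lock:
                self.schedules = self._load(in_window)
        else:
            # Lock covers only the reassignment.
            loaded = self._load(in_window)
            with self.lock:
                self.schedules = loaded    # step 5: stale copy wins

    def add_schedule(self, image, interval):
        if self.extended_lock:
            with self.lock:
                self.store[image] = interval
                self.schedules[image] = interval
        else:
            self.store[image] = interval
            with self.lock:
                self.schedules[image] = interval


def run_race(extended_lock):
    handler = ToyHandler(extended_lock)
    adder = threading.Thread(target=handler.add_schedule,
                             args=("pool/image", "1h"))

    def in_window():
        adder.start()
        # With the narrow lock the adder finishes inside the window;
        # with the extended lock it blocks and the timeout expires.
        adder.join(timeout=0.5)

    handler.load_schedules(in_window)
    adder.join()
    return ("pool/image" in handler.store,
            "pool/image" in handler.schedules)


narrow = run_race(extended_lock=False)
extended = run_race(extended_lock=True)
```

With the narrow lock the schedule ends up persisted but missing from
self.schedules; with the extended lock add_schedule() simply waits for
load_schedules() to finish and the schedule is present in both places.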

Fixes: https://tracker.ceph.com/issues/56090
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 95a0ec7b428c87294ca4a96ff6afcf613bc67144)

Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py [ commit
  e4a16e261370 ("mgr/rbd_support: add type annotation") not in
  pacific ]
src/pybind/mgr/rbd_support/trash_purge_schedule.py [ ditto ]

author	Ilya Dryomov <idryomov@gmail.com>
	Fri, 17 Jun 2022 12:03:20 +0000 (14:03 +0200)
committer	Ilya Dryomov <idryomov@gmail.com>
	Tue, 21 Jun 2022 16:35:50 +0000 (18:35 +0200)
commit	32763307451ca58dfeebdc9475aaad5b24766e71
tree	0d0d540c7291bfa9d8e338895e08a7711183626a
parent	196a3cdbc9164fa58cc91e5a11800065997fa87f

src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py