
author	Rishabh Dave <ridave@redhat.com>
	Mon, 17 Jun 2024 19:03:28 +0000 (00:33 +0530)
committer	Rishabh Dave <ridave@redhat.com>
	Fri, 12 Jul 2024 14:43:00 +0000 (20:13 +0530)
commit	c1f9090ee6422d3be746c7d827aae391629930d1
tree	ca9bbd3e57ac7a10fdf5bdd5e916b231b258ffdf
parent	5a72f2d50b8feea5e5709f463a79899bb53fcc61

mgr/vol: handle case where clone index entry goes missing

In `async_cloner.py`, the clone index entry is fetched to get the next
clone job to execute. The clone job might be cancelled just as it is
about to be picked for execution (in other words, as it is about to
move from the pending state to the in-progress state).

Currently, the MGR hangs in such a case because the exception
`ObjectNotFound` is raised by the CephFS Python bindings and left
uncaught. To prevent this, catch the exception, log it, and return None
to tell `get_job()` of `async_job.py` to look for the next job in the
queue.
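
The pattern can be sketched as follows. This is a minimal illustration,
not the actual patch: the exception classes are stand-ins for the ones in
the CephFS Python bindings, and `get_next_clone()` and the simplified
`get_job()` loop are hypothetical names used only for this example.

```python
import logging

log = logging.getLogger(__name__)


class Error(Exception):
    """Stand-in for cephfs.Error (assumption: real bindings not imported)."""


class ObjectNotFound(Error):
    """Stand-in for cephfs.ObjectNotFound."""


def get_next_clone(fetch_entry):
    """Return the next clone job, or None if its index entry has vanished.

    `fetch_entry` is a hypothetical callable that reads one clone index
    entry and may raise ObjectNotFound if the clone was just cancelled.
    """
    try:
        return fetch_entry()
    except ObjectNotFound as e:
        # Log and signal "nothing here" instead of letting the exception
        # propagate and stall the caller.
        log.info("clone index entry went missing, skipping: %s", e)
        return None


def get_job(queue):
    """Hypothetical scheduler loop: skip entries that resolve to None."""
    for fetch_entry in queue:
        job = get_next_clone(fetch_entry)
        if job is not None:
            return job
    return None
```

With this shape, a cancelled clone at the head of the queue is skipped and
the next pending entry is returned instead.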

Increase the scope of the try-except in method
`get_oldest_clone_entry()` of `async_cloner.py` so that `cephfs.Error`,
or any exception derived from it, thrown by `self.fs.lstat()` is not
left uncaught.

The FS object is also passed to the method `list_one_entry_at_a_time()`,
so increasing the scope of the try-except is useful here as well: it
prevents exceptions raised by other calls to CephFS Python binding
methods from being left uncaught.
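
A rough sketch of the widened try-except, under stated assumptions:
`CephFSError` stands in for `cephfs.Error`, `oldest_entry()` and the
`list_entries` parameter are hypothetical names, and returning None on
error is a simplification of whatever the real method returns.

```python
import logging

log = logging.getLogger(__name__)


class CephFSError(Exception):
    """Stand-in for cephfs.Error; the real bindings are not imported here."""


def oldest_entry(fs, index_path, list_entries):
    """Return the name of the entry with the smallest ctime, or None.

    `list_entries` is a hypothetical callable playing the role of
    list_one_entry_at_a_time(); `fs` only needs an lstat() method.
    """
    try:
        # Both the listing and every lstat() call sit inside the try block,
        # so an error from either one is handled in the same place.
        oldest = None
        for name in list_entries(fs, index_path):
            ctime = fs.lstat(f"{index_path}/{name}").st_ctime
            if oldest is None or ctime < oldest[0]:
                oldest = (ctime, name)
        return oldest[1] if oldest else None
    except CephFSError as e:
        log.debug("error while scanning clone index %s: %s", index_path, e)
        return None
```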

Fixes: https://tracker.ceph.com/issues/66560
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 3cff7251c86a4670768721f924b11b3de33f807b)