From: Kotresh HR <khiremat@redhat.com>
Date: Fri, 16 Jul 2021 10:28:37 +0000 (+0530)
Subject: mgr/volumes: Fix a race during clone cancel
X-Git-Tag: v17.1.0~1039^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=e1ccf367dff0b9d57bd5c50477cfa60c76c64762;p=ceph-ci.git

mgr/volumes: Fix a race during clone cancel

Issue:
The race is that the cancelled clone can still go ahead and sync the data to cloned subvolume.

Here is the sequence how this can happen.
1. Subvolume clone is created and it's in PENDING state (it's the initial state for a clone)
2. The clone job is picked up by the cloner thread, started the state machine (i.e.start_clone_sm)
   and queried the clone state, which is PENDING. So this has local copy of the state at this point.
3. The 'clone cancel' is called which just removes the tracker from index as the state is still PENDING.
   This moves the clone state from PENDING to CANCEL.
4. The cloner thread proceeds further from PENDING (local copy of the state) to IN-PROGRESS.

Fix:
Along with checking for PENDING state, also check whether the job is picked by thread with in the
lock. This guarantees that none of the cloner threads has picked it up for processing and plain
removal of index is sufficient.

Fixes: https://tracker.ceph.com/issues/51805
Signed-off-by: Kotresh HR <khiremat@redhat.com>
---

diff --git a/src/pybind/mgr/volumes/fs/async_cloner.py b/src/pybind/mgr/volumes/fs/async_cloner.py
index 06b73dab657..6bdb5fd72be 100644
--- a/src/pybind/mgr/volumes/fs/async_cloner.py
+++ b/src/pybind/mgr/volumes/fs/async_cloner.py
@@ -329,10 +329,14 @@ class Cloner(AsyncJobs):
                         if not track_idx:
                             log.warning("cannot lookup clone tracking index for {0}".format(clone_subvolume.base_path))
                             raise VolumeException(-errno.EINVAL, "error canceling clone")
-                        if SubvolumeOpSm.is_init_state(SubvolumeTypes.TYPE_CLONE, clone_state):
-                            # clone has not started yet -- cancel right away.
-                            self._cancel_pending_clone(fs_handle, clone_subvolume, clonename, groupname, status, track_idx)
-                            return
+                        clone_job = (track_idx, clone_subvolume.base_path)
+                        jobs = [j[0] for j in self.jobs[volname]]
+                        with lock_timeout_log(self.lock):
+                            if SubvolumeOpSm.is_init_state(SubvolumeTypes.TYPE_CLONE, clone_state) and not clone_job in jobs:
+                                logging.debug("Cancelling pending job {0}".format(clone_job))
+                                # clone has not started yet -- cancel right away.
+                                self._cancel_pending_clone(fs_handle, clone_subvolume, clonename, groupname, status, track_idx)
+                                return
             # cancelling an on-going clone would persist "canceled" state in subvolume metadata.
             # to persist the new state, async cloner accesses the volume in exclusive mode.
             # accessing the volume in exclusive mode here would lead to deadlock.
diff --git a/src/pybind/mgr/volumes/fs/async_job.py b/src/pybind/mgr/volumes/fs/async_job.py
index b79a3b52544..dca95ff2286 100644
--- a/src/pybind/mgr/volumes/fs/async_job.py
+++ b/src/pybind/mgr/volumes/fs/async_job.py
@@ -251,7 +251,8 @@ class AsyncJobs(threading.Thread):
         canceled = False
         log.info("canceling job {0} for volume {1}".format(job, volname))
         try:
-            if not volname in self.q and not volname in self.jobs and not job in self.jobs[volname]:
+            vol_jobs = [j[0] for j in self.jobs.get(volname, [])]
+            if not volname in self.q and not job in vol_jobs:
                 return canceled
             for j in self.jobs[volname]:
                 if j[0] == job: