From 1b075dd44030f17f95dfc1ade6259f049cc7bdc7 Mon Sep 17 00:00:00 2001
From: "J. Eric Ivancich"
Date: Fri, 3 Nov 2017 09:15:13 -0400
Subject: [PATCH] rgw: fix BZ 1500904, Stale bucket index entry remains after
 object deletion

We have a race condition:

 1. RGW client #1: requests an object be deleted.
 2. RGW client #1: sends a prepare op to bucket index OSD #1.
 3. OSD #1: prepares the op, adding pending ops to the bucket dir entry.
 4. RGW client #2: sends a list bucket to OSD #1.
 5. RGW client #2: sees that there are pending operations on the bucket
    dir entry, and calls check_disk_state.
 6. RGW client #2: check_disk_state sees that the object still exists,
    so it sends CEPH_RGW_UPDATE to the bucket index OSD (#1).
 7. RGW client #1: sends a delete object to the object OSD (#2).
 8. OSD #2: deletes the object.
 9. RGW client #1: sends a complete op to the bucket index OSD (#1).
10. OSD #1: completes the op.
11. OSD #1: receives the CEPH_RGW_UPDATE and updates the bucket index
    entry, thereby **RECREATING** it.

Solution implemented: at step #5 the object's dir entry exists. If we
get to the beginning of step #11 and the object's dir entry no longer
exists, we know that the dir entry was just actively being modified,
so we ignore the CEPH_RGW_UPDATE operation, thereby NOT recreating it.

Signed-off-by: J. Eric Ivancich
(cherry picked from commit b33f529e79b74314a2030231e1308ee225717743)

Conflicts: (backported substantial changes only; omitted cleanups)
	src/cls/rgw/cls_rgw.cc
	src/rgw/rgw_rados.cc
---
 src/cls/rgw/cls_rgw.cc | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/cls/rgw/cls_rgw.cc b/src/cls/rgw/cls_rgw.cc
index 03314a17ac406..6bd01b06af30a 100644
--- a/src/cls/rgw/cls_rgw.cc
+++ b/src/cls/rgw/cls_rgw.cc
@@ -1982,6 +1982,18 @@ int rgw_dir_suggest_changes(cls_method_context_t hctx, bufferlist *in, bufferlis
       }
       break;
     case CEPH_RGW_UPDATE:
+      if (!cur_disk.exists) {
+        // this update would only have been sent by the rgw client
+        // if the rgw_bucket_dir_entry existed; however, between that
+        // check and now the entry has disappeared, so we were likely
+        // in the midst of a delete op, and we will not recreate the
+        // entry
+        CLS_LOG(10,
+                "CEPH_RGW_UPDATE not applied because rgw_bucket_dir_entry"
+                " no longer exists\n");
+        break;
+      }
+
       CLS_LOG(10, "CEPH_RGW_UPDATE name=%s instance=%s total_entries: %" PRId64 " -> %" PRId64 "\n",
               cur_change.key.name.c_str(), cur_change.key.instance.c_str(), stats.num_entries, stats.num_entries + 1);
       stats.num_entries++;
-- 
2.39.5
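
Note for reviewers: the sketch below is illustrative only and is not
code from this patch. It is a minimal standalone C++ model of the guard
added above: an update suggestion that arrives after the entry has been
deleted is dropped instead of recreating the entry. Every name in it
(DirEntry, Suggestion, apply_suggestion) is hypothetical.

#include <cstdio>
#include <map>
#include <string>

struct DirEntry {
  bool exists;   // stands in for the rgw_bucket_dir_entry state on disk
};

enum class SuggestOp { Update, Remove };

struct Suggestion {
  std::string key;
  SuggestOp op;
};

// index: the bucket index as the OSD sees it when the suggestion lands
void apply_suggestion(std::map<std::string, DirEntry>& index,
                      const Suggestion& s) {
  auto it = index.find(s.key);
  switch (s.op) {
  case SuggestOp::Update:
    // The guard from the patch: the client only sent this update
    // because the entry existed when it looked; if the entry is gone
    // now, a delete raced with us, so do NOT recreate it.
    if (it == index.end() || !it->second.exists) {
      std::printf("update for %s skipped: entry no longer exists\n",
                  s.key.c_str());
      return;
    }
    std::printf("update applied to %s\n", s.key.c_str());
    break;
  case SuggestOp::Remove:
    if (it != index.end())
      index.erase(it);
    break;
  }
}

int main() {
  std::map<std::string, DirEntry> index;
  index["obj1"].exists = true;

  // The race from the commit message, compressed: client #2 decides to
  // send an update while the entry still exists (steps 5-6) ...
  Suggestion late_update{"obj1", SuggestOp::Update};

  // ... but client #1's delete completes first (steps 7-10).
  index.erase("obj1");

  // Step 11: the update arrives last; the guard keeps the stale
  // entry from being recreated.
  apply_suggestion(index, late_update);
  return 0;
}

Compressing the eleven-step race into one thread loses the timing
detail, but the ordering that matters survives: the update decision is
made while the entry exists, and the update lands after the delete has
completed. The guard turns step #11 from "recreate" into "skip".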