os/bluestore: preserve source collection cache during split

OSD split transactions look something like

 mkcoll new
 split old
 ...
 omap_rmkey_range old
 omap_setkeys old
 omap_setkeys new

The last part splits the pg log into two pieces. The
problem is that omap_rmkey_range needs to wait for earlier
omap transactions to flush, and those are linked to the
old onode, but split clears the cache. The result is
that we don't wait, omap_rmkey_range leaves some recent
pg log keys behind, and on OSD restart we get an error
because the object doesn't belong to the (old) collection.

Fix this by preserving cached objects in the old
collection and clearing out only the objects that are
moving to the newly split collections. The preserved
set includes the pgmeta object that we care about.

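As a rough illustration of the idea (a toy model, not the
BlueStore code): ToyOnode, ToyCollection, clear_all,
drop_moved and pending_after_split are hypothetical names,
pending_omap_txns stands in for whatever flush state hangs
off the cached onode, and the "hash & 0x2" test is an
assumed stand-in for the real split predicate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a cached onode. pending_omap_txns models
// omap writes that were queued against this onode but not yet flushed.
struct ToyOnode {
  std::string name;
  uint32_t hash;
  int pending_omap_txns;
};
using OnodeRef = std::shared_ptr<ToyOnode>;

// Hypothetical stand-in for a collection and its onode cache.
struct ToyCollection {
  std::map<std::string, OnodeRef> cache;

  // Number of in-flight omap transactions a range delete would know to
  // wait for. An onode missing from the cache is treated as freshly
  // loaded, with no record of anything in flight.
  int pending_for(const std::string& name) const {
    auto it = cache.find(name);
    return it == cache.end() ? 0 : it->second->pending_omap_txns;
  }
};

// Split drops every cached onode (the problematic behaviour in this model).
void clear_all(ToyCollection& c) { c.cache.clear(); }

// Split drops only the onodes that move to the new collection and
// returns how many were dropped.
std::size_t drop_moved(
    ToyCollection& c,
    const std::function<bool(const ToyOnode&)>& moves_to_new) {
  std::size_t dropped = 0;
  for (auto it = c.cache.begin(); it != c.cache.end();) {
    if (moves_to_new(*it->second)) {
      it = c.cache.erase(it);
      ++dropped;
    } else {
      ++it;
    }
  }
  return dropped;
}

// Build an old collection whose pgmeta onode has two unflushed omap
// writes, split it, and report what a later range delete on pgmeta
// would still know it has to wait for.
int pending_after_split(bool selective) {
  ToyCollection old_coll;
  old_coll.cache["pgmeta"] =
      std::make_shared<ToyOnode>(ToyOnode{"pgmeta", 0x0, 2});
  old_coll.cache["obj_a"] =
      std::make_shared<ToyOnode>(ToyOnode{"obj_a", 0x2, 0});
  old_coll.cache["obj_b"] =
      std::make_shared<ToyOnode>(ToyOnode{"obj_b", 0x1, 0});

  // Assumed split rule for this sketch: objects with bit 1 set move.
  auto moves = [](const ToyOnode& o) { return (o.hash & 0x2u) != 0; };

  if (selective) {
    drop_moved(old_coll, moves);
  } else {
    clear_all(old_coll);
  }
  return old_coll.pending_for("pgmeta");
}
```

In this model, clearing everything makes the range delete
see nothing to wait for, while dropping only the moved
onodes keeps the pgmeta onode and its pending writes
visible.
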
(Note that this brings us one step closer to preserving
the cache contents across the split, but we are not quite
there yet: at this point we don't have all of the
destination collections. A change in the ObjectStore
interface is probably needed to avoid making that
extremely awkward.)