os/bluestore: avoid race between split_cache and get/put pin/unpin
The Onode get() and put() methods may call pin() and unpin() when we
transition between 1 and >1 ref. To do this they make use of the
OSDShard *s pointer without taking any additional locks. This runs afoul
of Collection::split_cache(), which moves an onode between shards.
It would be very complicated to address this race head-on: we ultimately
need to be under the protection of the OSDShard lock to do the pin/unpin,
but if OSDShard *s is changing, we don't know which lock to take. And if
it is null, what do we do? It might be null when we test but then get
set by split_cache. And what if there is a put() followed by a get(),
and they managed to acquire the appropriate lock(s), but the get() thread
gets it first? And so on.
We can avoid this whole mess by preventing a put() or get() from making
this transition (and looking at OSDShard *s) at all.
The only reason nref was *ever* < 2 is that the sequence was:
- remove from old collection onode_map
- move onode to new shard
- add to new collection onode_map
The fix is simply to:
- remove from old collection onode_map
- add to new collection onode_map
- adjust onode shard
That ensures that the onode's nref is >= 2 at all times.
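
For illustration only, here is a toy model of the two orderings. It is
not the BlueStore code: Shard, Onode, Collection and the two move_*
helpers are hypothetical stand-ins, and std::shared_ptr::use_count()
stands in for nref. It only checks the ref count at the moment the
shard pointer is reassigned, and does not model pin/unpin, locking, or
any other references the real code may hold.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins, not the real BlueStore types.
struct Shard { int id; };
struct Onode { std::string oid; Shard* s = nullptr; };
using OnodeRef = std::shared_ptr<Onode>;
struct Collection {
  Shard* shard = nullptr;
  std::unordered_map<std::string, OnodeRef> onode_map;
};

// Old order: remove, move to new shard, add.
// Returns the ref count seen when the shard pointer changes.
long move_old_order(Collection& src, Collection& dest,
                    const std::string& oid) {
  OnodeRef o = src.onode_map.at(oid);  // local handle + src map
  src.onode_map.erase(oid);            // only the local handle is left
  long refs = o.use_count();
  o->s = dest.shard;                   // shard changes with one ref
  dest.onode_map[oid] = o;
  return refs;
}

// New order: remove, add, then adjust the shard.
long move_new_order(Collection& src, Collection& dest,
                    const std::string& oid) {
  OnodeRef o = src.onode_map.at(oid);  // local handle + src map
  src.onode_map.erase(oid);
  dest.onode_map[oid] = o;             // local handle + dest map
  long refs = o.use_count();
  o->s = dest.shard;                   // shard changes with two refs
  return refs;
}
```

In this model the old order reassigns the shard pointer while a single
reference exists, and the new order does so while two exist.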
At the same time, improve this code so that we don't _rm and _add when
the src and dest shard are the same.

Fixes: https://tracker.ceph.com/issues/43147
Fixes: https://tracker.ceph.com/issues/43131
Signed-off-by: Sage Weil <sage@redhat.com>