mds: handle fragment notify race
In the normal case, mds does not trim a dir inode whose child dirfrags
are likely being fragmented (see trim_inode()). But when fragmenting
subtree roots, the following race can happen:
- mds.a (auth mds of dirfrag) sends fragment_notify message to
mds.c and drops wrlock on dirfragtreelock.
- mds.b (auth mds of dir inode) changes dirfragtreelock state to
SYNC and sends lock message to mds.c
- mds.c receives the lock message and changes dirfragtreelock state
to SYNC
- mds.c trims dirfrag and dir inode from its cache
- mds.c receives the fragment_notify message, but the dirfrag and
dir inode it refers to are no longer in its cache
The fix is to have replicas ack the fragment_notify message and to
unlock dirfragtreelock only after the mds gets all acks.
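The ack-then-unlock ordering can be illustrated with a minimal,
standalone sketch. This is not the code in MDCache.cc:
FragmentNotifyWaiter, handle_ack() and the mds_rank_t alias below are
hypothetical names used only to show the idea of deferring the unlock
until every replica has acked.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>

// Hypothetical stand-in for an mds rank; not Ceph's actual type.
using mds_rank_t = int;

// Hypothetical helper: tracks which replicas still owe an ack for one
// fragment_notify round and runs a callback (for example, dropping the
// wrlock on dirfragtreelock) once the last ack has arrived.
class FragmentNotifyWaiter {
public:
  FragmentNotifyWaiter(std::set<mds_rank_t> replicas,
                       std::function<void()> on_all_acked)
    : pending_(std::move(replicas)),
      on_all_acked_(std::move(on_all_acked)) {
    maybe_finish();  // no replicas means there is nothing to wait for
  }

  // Record an ack from one replica. Returns true once all replicas
  // have acked. Duplicate or unexpected acks are ignored.
  bool handle_ack(mds_rank_t from) {
    pending_.erase(from);
    maybe_finish();
    return done_;
  }

  bool done() const { return done_; }

private:
  // Fire the callback exactly once, when no acks are outstanding.
  void maybe_finish() {
    if (!done_ && pending_.empty()) {
      done_ = true;
      if (on_all_acked_)
        on_all_acked_();
    }
  }

  std::set<mds_rank_t> pending_;
  std::function<void()> on_all_acked_;
  bool done_ = false;
};
```

With replicas {2, 3}, the callback does not run after the ack from
rank 2 and runs once after the ack from rank 3. In the race above,
mds.b could therefore not move dirfragtreelock to SYNC until mds.c had
already processed the fragment_notify message.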
Fixes: http://tracker.ceph.com/issues/36035
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 72887980a7d6517740a2841d6014a92a2d1c3063)
Conflicts:
src/mds/MDCache.cc
src/mds/MDCache.h
src/mds/Locker.cc
src/messages/MMDSFragmentNotify.h
src/msg/Message.h
src/msg/Message.cc