From 3ea5c50829d0ca80016af8c6f26119cc42d8c22f Mon Sep 17 00:00:00 2001
From: Xiubo Li
Date: Thu, 11 Apr 2024 09:53:04 +0800
Subject: [PATCH] mds: do remove the cap when seqs equal or larger than last
 issue

There is a race in the following case:

           MDS                                 rw Client

 - Issues the 'Asx' caps to the rw client.
                                       - Adds the cap, then removes it later
                                         by queuing it to the cap release
                                         list. But the cap->seq may have been
                                         updated by previous cap grant
                                         requests, and a cap grant request
                                         won't increase the 'last_issue' seq
                                         in the MDS.
 - The ro client's lookup request
   arrives and the MDS sends an 'Ax'
   caps revoke request to the rw
   client, increasing the 'seq'.
                                       - The revoke request finds that the
                                         cap doesn't exist, so it queues a
                                         new cap release immediately with the
                                         new 'seq', then triggers a flush of
                                         the pending cap releases to the MDS.
 - Receives the cap release request,
   but 'seq' > the cap's 'last_issue',
   so the MDS skips removing the cap.
   _do_cap_release() then issues the
   'Ax' caps back to the rw client and
   wakes up the ro client's lookup
   request, which tries to revoke the
   'Ax' caps from the rw client again.

This makes the MDS loop forever, revoking and re-issuing the same caps.

Fixes: https://tracker.ceph.com/issues/64977
Signed-off-by: Xiubo Li
(cherry picked from commit 345978e7607e227854ccc78b066b274b97940391)
---
 src/mds/Locker.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
index 7b0c2eabac53c..055cdbecd956e 100644
--- a/src/mds/Locker.cc
+++ b/src/mds/Locker.cc
@@ -4035,8 +4035,8 @@ void Locker::_do_cap_release(client_t client, inodeno_t ino, uint64_t cap_id,
                             new C_Locker_RetryCapRelease(this, client, ino, cap_id, mseq, seq));
     return;
   }
-  if (seq != cap->get_last_issue()) {
-    dout(7) << " issue_seq " << seq << " != " << cap->get_last_issue() << dendl;
+  if (seq < cap->get_last_issue()) {
+    dout(7) << " issue_seq " << seq << " < " << cap->get_last_issue() << dendl;
     // clean out any old revoke history
     cap->clean_revoke_from(seq);
     eval_cap_gather(in);
-- 
2.39.5
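
Reviewer note: the effect of the changed comparison can be sketched in
isolation. The snippet below is a minimal model, not the Locker code:
CapModel, skips_removal_before(), skips_removal_after() and race_is_fixed()
are hypothetical names, and the seq values 1 and 2 are illustrative
assumptions chosen to match the "'seq' > cap's 'last_issue'" step in the
commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the MDS-side capability state; only the field
// that the changed check reads is modelled.
struct CapModel {
  uint64_t last_issue;  // seq recorded at the last issue, per the commit message
};

// Check before the patch: any mismatch between the release seq and
// last_issue makes the MDS skip removing the cap.
bool skips_removal_before(uint64_t seq, const CapModel& cap) {
  return seq != cap.last_issue;
}

// Check after the patch: only a release older than the last issue makes
// the MDS skip removing the cap.
bool skips_removal_after(uint64_t seq, const CapModel& cap) {
  return seq < cap.last_issue;
}

// Replays the racy step with illustrative numbers.
bool race_is_fixed() {
  CapModel cap{1};           // assumed: the cap was issued with last_issue == 1
  uint64_t release_seq = 2;  // assumed: the revoke bumped seq before the release
  assert(release_seq > cap.last_issue);  // the precondition named in the message

  bool kept_before = skips_removal_before(release_seq, cap);  // cap kept, caps re-issued
  bool kept_after = skips_removal_after(release_seq, cap);    // cap removed
  return kept_before && !kept_after;
}
```

With these assumed values the old check keeps the cap, which is what lets
the revoke and re-issue cycle repeat, while the new check lets the release
go through. A release that is genuinely older than the last issue is still
skipped by both versions.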