From: Max Kellermann Date: Mon, 30 Mar 2026 08:43:19 +0000 (+0200) Subject: ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails X-Git-Tag: ceph-for-7.1-rc1~6 X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=cc5643095419d45927a1dee9cb3da7c2f9e779f6;p=ceph-client.git ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails This MDS reconnect error path leaves s_cap_reconnect set. send_mds_reconnect() sets the bit at the beginning of the reconnect, but the first failing operation after that, ceph_pagelist_encode_32(), can jump to `fail:` without clearing it. __ceph_remove_cap() consults that flag to decide whether cap releases should be queued. A reconnect-preparation failure therefore leaves the session in reconnect mode from the cap-release path's point of view and can strand release work until some later state transition repairs it. Signed-off-by: Max Kellermann Reviewed-by: Viacheslav Dubeyko Signed-off-by: Ilya Dryomov --- diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index b1746273f186..4fa471d9b3b2 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4956,7 +4956,7 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, /* placeholder for nr_caps */ err = ceph_pagelist_encode_32(recon_state.pagelist, 0); if (err) - goto fail; + goto fail_clear_cap_reconnect; if (test_bit(CEPHFS_FEATURE_MULTI_RECONNECT, &session->s_features)) { recon_state.msg_version = 3; @@ -5046,6 +5046,10 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, ceph_pagelist_release(recon_state.pagelist); return; +fail_clear_cap_reconnect: + spin_lock(&session->s_cap_lock); + session->s_cap_reconnect = 0; + spin_unlock(&session->s_cap_lock); fail: ceph_msg_put(reply); up_read(&mdsc->snap_rwsem);