From: Alex Markuze
Date: Thu, 7 May 2026 10:05:41 +0000 (+0000)
Subject: selftests: ceph: wire up Ceph reset kselftests and documentation
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=2cf924db7f198404669ec589d6127b892b1ad8f2;p=ceph-client.git

selftests: ceph: wire up Ceph reset kselftests and documentation

Wire the CephFS reset test suite into the kselftest build:

- Add filesystems/ceph to the top-level selftests Makefile.
- Add the per-suite Makefile with run_validation.sh as TEST_PROGS.
- Add the settings file (kselftest timeout).
- Add the MAINTAINERS entry for the test directory.
- Add README with prerequisites, usage, and troubleshooting.

Signed-off-by: Alex Markuze
Reviewed-by: Viacheslav Dubeyko
Signed-off-by: Viacheslav Dubeyko
---

diff --git a/MAINTAINERS b/MAINTAINERS
index c2c6d79275c6..441b556b9671 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5909,6 +5909,7 @@ B:	https://tracker.ceph.com/
 T:	git https://github.com/ceph/ceph-client.git
 F:	Documentation/filesystems/ceph.rst
 F:	fs/ceph/
+F:	tools/testing/selftests/filesystems/ceph/
 
 CERTIFICATE HANDLING
 M:	David Howells
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index a516e93a9db1..c423be330395 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2359,6 +2359,7 @@ struct flush_dump_entry {
 static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
 {
 	struct ceph_client *cl = mdsc->fsc->client;
+	int i;
 	struct flush_dump_entry entries[CEPH_CAP_FLUSH_MAX_DUMP_ENTRIES];
 	struct ceph_cap_flush *cf;
 	int n = 0, remaining = 0;
@@ -2388,7 +2389,7 @@ static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
 
 	pr_info_client(cl, "still waiting for cap flushes through %llu:\n",
 		       want_tid);
-	for (int i = 0; i < n; i++) {
+	for (i = 0; i < n; i++) {
 		struct flush_dump_entry *e = &entries[i];
 
 		if (e->ci_null)
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index b1a0621cd37e..731d6ad04956 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -121,6 +121,7 @@ static inline bool ceph_reset_is_idle(struct ceph_client_reset_state *st)
 {
 	return READ_ONCE(st->phase) == CEPH_CLIENT_RESET_IDLE;
 }
+
 struct ceph_mds_cap_match {
 	s64 uid; /* default to MDS_AUTH_UID_ANY */
 	u32 num_gids;
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 6e59b8f63e41..ab254ae793a9 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -32,6 +32,7 @@ TARGETS += exec
 TARGETS += fchmodat2
 TARGETS += filesystems
 TARGETS += filesystems/binderfs
+TARGETS += filesystems/ceph
 TARGETS += filesystems/epoll
 TARGETS += filesystems/fat
 TARGETS += filesystems/overlayfs
diff --git a/tools/testing/selftests/filesystems/ceph/Makefile b/tools/testing/selftests/filesystems/ceph/Makefile
new file mode 100644
index 000000000000..4ad3e8d40d90
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+TEST_PROGS := run_validation.sh
+TEST_FILES := reset_stress.sh reset_corner_cases.sh \
+	validate_consistency.py README settings
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/filesystems/ceph/README b/tools/testing/selftests/filesystems/ceph/README
new file mode 100644
index 000000000000..eb0092b38f80
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/README
@@ -0,0 +1,84 @@
+# CephFS Client Reset Test Suite
+
+Test suite for the CephFS kernel client manual session reset feature.
+This trimmed set contains the single-client stress test, the targeted
+corner-case test, and the one-shot validation harness used during
+feature bring-up.
+
+## Prerequisites
+
+- Linux kernel with the CephFS client reset feature (this branch)
+- A running Ceph cluster with at least one MDS
+- Root access (debugfs requires it)
+- Python 3 (for validators)
+- flock utility (for lock tests, usually in util-linux)
+
+## Test inventory
+
+| Test | Script(s) | What it covers |
+|------|-----------|----------------|
+| Single-client stress | `reset_stress.sh` | I/O + resets + data integrity on one mount |
+| Corner cases | `reset_corner_cases.sh` | EBUSY, dirty caps, flock reclaim, unmount-during-reset |
+| Validation harness | `run_validation.sh` | baseline + corner cases + moderate/aggressive stress + final status check |
+
+## Quick start
+
+Stress run:
+
+    sudo ./reset_stress.sh --mount-point /mnt/cephfs --profile moderate
+
+Corner cases:
+
+    sudo ./reset_corner_cases.sh --mount-point /mnt/cephfs
+
+End-to-end validation:
+
+    sudo ./run_validation.sh --mount-point /mnt/cephfs
+
+## Stress profiles
+
+    baseline   - no resets, 1 IO + 1 rename, 600s
+    moderate   - reset every 5-15s, 2 IO + 1 rename, 900s
+    aggressive - reset every 1-5s, 4 IO + 2 rename, 900s
+    soak       - reset every 5-15s, 2 IO + 1 rename, 3600s
+
+## Key options (all scripts)
+
+    --mount-point PATH   CephFS mount point (required)
+    --client-id ID       Debugfs client id (auto-detected if one)
+
+reset_stress.sh additionally accepts:
+
+    --profile NAME       baseline|moderate|aggressive|soak
+    --duration-sec N     Override profile runtime
+    --no-reset           Disable reset injection
+    --out-dir PATH       Artifact directory
+
+## Corner case tests
+
+    [1/4] ebusy_rejection       Second reset rejected while first in-flight
+    [2/4] dirty_caps_at_reset   Reset with unflushed dirty caps
+    [3/4] flock_after_reset     Stale lock EIO + fresh lock after holder exit
+    [4/4] unmount_during_reset  umount during active reset (destroy-path wakeup)
+
+Test 4 requires creating a second CephFS mount instance and SKIPs if
+the host cannot do so. See `--help` output for details.
+
+## Troubleshooting
+
+**No writable Ceph reset interface found:**
+Kernel lacks the reset feature, debugfs not mounted, or not root.
+Check: `ls /sys/kernel/debug/ceph/*/reset/`
+
+**Multiple Ceph clients found:**
+Use `--client-id` to select one.
+List: `ls /sys/kernel/debug/ceph/`
+
+## Files
+
+| File | Role |
+|------|------|
+| `reset_stress.sh` | Single-client stress test runner |
+| `validate_consistency.py` | Single-client post-run validator |
+| `reset_corner_cases.sh` | Corner case harness (4 sequential tests) |
+| `run_validation.sh` | One-shot validation harness |
diff --git a/tools/testing/selftests/filesystems/ceph/settings b/tools/testing/selftests/filesystems/ceph/settings
new file mode 100644
index 000000000000..79b65bdf05db
--- /dev/null
+++ b/tools/testing/selftests/filesystems/ceph/settings
@@ -0,0 +1 @@
+timeout=1200