Darrick J. Wong [Fri, 30 Dec 2022 22:19:39 +0000 (14:19 -0800)]
fuzzy: don't fuzz xattr namespace flags and values
Extended attribute namespace flags are controlled by userspace, and
there is no validation imposed on the values. Don't bother fuzzing
either of these things.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:39 +0000 (14:19 -0800)]
fuzzy: don't fuzz the log sequence number
Don't bother filtering log sequence numbers since xfs_db doesn't have
the ability to tell us the range of LSNs that would actually cause
validation failures.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:36 +0000 (14:19 -0800)]
populate: take a snapshot of the filesystem if creation fails
There have been a few bug reports filed about people not being able to
use the filesystem metadata population code to create filesystems with
all types of metadata on them. Right now this is super-annoying to
debug because we don't capture a metadump for easy debugging. Fix that.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:18 +0000 (14:19 -0800)]
xfs: race fsstress with online repair for special file metadata
For each XFS_SCRUB_TYPE_* that looks at symbolic link and special file
metadata, create a test that runs that repairer in the foreground and
fsstress in the background.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:18 +0000 (14:19 -0800)]
xfs: ensure that online file data fork repairs don't hit EDQUOT
Add a test to ensure that the sysadmin doesn't get EDQUOT if they try to
repair file data fork metadata when we've already exceeded a quota limit
somewhere.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:18 +0000 (14:19 -0800)]
xfs: race fsstress with online repair for inode and fork metadata
For each XFS_SCRUB_TYPE_* that looks at inode and data/attr/cow fork
metadata, create a test that runs that repairer in the foreground and
fsstress in the background.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:18 +0000 (14:19 -0800)]
xfs: test rebuilding xattrs when the data fork is btree format
Make sure we handle the case of rebuilding extended attributes properly
when the data fork is in btree format and we therefore cannot zap the
attr fork.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:09 +0000 (14:19 -0800)]
fuzzy: use FORCE_REBUILD over injecting force_repair
For stress testing online repair, try to use the FORCE_REBUILD ioctl
flag over the error injection knobs whenever possible because the knobs
are very noisy and are not always available.
[zlang: do not export the SCRUBSTRESS_USE_FORCE_REBUILD]
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Thu, 16 Feb 2023 16:21:50 +0000 (16:21 +0000)]
generic/604: fix test to actually create dirty inodes
The test case generic/604 aims to test a scenario where at unmount time we
have many dirty inodes. However, the test does not actually create any
files because it calls xfs_io without the -f argument, so xfs_io fails,
and the error is ignored because stderr is redirected to /dev/null.
Fix this by passing -f to xfs_io and also stop redirecting stderr to
/dev/null, so that in case of any unexpected failure creating files, the
test fails.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Fri, 17 Feb 2023 23:57:58 +0000 (07:57 +0800)]
fstests: btrfs/249: add _wants_kernel_commit and _fixed_by_git_commit
Add the _wants_kernel_commit tag for the kernel and the _fixed_by_git_commit
tag for btrfs-progs, to help when testing on older kernels and older
btrfs-progs.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Wed, 15 Feb 2023 07:51:22 +0000 (15:51 +0800)]
fstests: btrfs/185, 198 and 219 add _fixed_by_kernel_commit
Recently, these test cases were added to the auto group. To ensure we have
some clues if they fail in older kernels, add "_fixed_by_kernel_commit"
for the fix and update the test summary.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
xfs/080 is not dangerous, isn't a known fail and runs very quickly.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
generic/251 isn't dangerous, doesn't take overly long to run and doesn't
produce spurious failures, so add it to the auto group.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
This test is not dangerous and passes reliably. Add it to the auto
group.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
generic/042 was removed from the auto group in 2015 by commit 7721b8501608 ("generic/042: remove from the 'auto' group") because it
always failed on XFS and wasn't run for other file systems back then.
Since then XFS fixed the problem it reproduces, and ext4 and f2fs
have grown shutdown support and also pass it reliably.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 7 Feb 2023 17:00:12 +0000 (09:00 -0800)]
common/xfs: use whole-word matching for _require_xfsrestore_xflag
On my system, the path to the xfsrestore binary is:
/code/xfsdump/build-x86_64/restore/xfsrestore
The grep command in _require_xfsrestore_xflag matches on the "build-x86"
part, even though my version of xfsrestore does not actually have a -x
flag. Fix the string parsing to match entire words so that we only look
for -x in the help output.
(Maybe someone should patch xfsrestore -h to report basename(argv[0])
instead of argv[0]...)
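The difference between substring and whole-word matching can be sketched
as follows. This is an illustration only: the help line is made up around
the binary path quoted above, the function names are hypothetical, and the
regular expression approximates grep-style whole-word matching rather than
reproducing the actual shell fix.

```python
import re

# Hypothetical help line; only the binary path comes from the commit message.
HELP_LINE = ("Usage: /code/xfsdump/build-x86_64/restore/xfsrestore "
             "[ -a <dir> ] [ -v <level> ]")


def has_flag_substring(help_text, flag):
    """Naive check: true if the flag text appears anywhere in the help."""
    return flag in help_text


def has_flag_whole_word(help_text, flag):
    """Whole-word check: the flag must not touch a word character
    (letter, digit, underscore) on either side."""
    pattern = r"(?<!\w)" + re.escape(flag) + r"(?!\w)"
    return re.search(pattern, help_text) is not None


# The substring check is fooled by "build-x86_64" in the path.
print(has_flag_substring(HELP_LINE, "-x"))
# The whole-word check only reacts to a real, standalone "-x".
print(has_flag_whole_word(HELP_LINE, "-x"))
print(has_flag_whole_word(HELP_LINE + " [ -x ]", "-x"))
```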
Fixes: 1ffa16c573 ("xfs: test for fixing wrong root inode number in dump") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Gabriel Niebler [Fri, 3 Feb 2023 17:54:37 +0000 (18:54 +0100)]
common: Chown mount even if already idmapped to account for remounts
This is a logical consequence of introducing the chown check in _idmapped_mount,
since now a read-only mount can be made idmapped successfully. But if the mount
is then remounted rw the chown never happens, as _idmapped_mount sees that it's
already idmapped and bows out early.
This patch fixes that by simply moving the chown ahead of the idmapped check,
so it will be performed in any case, even on already idmapped mounts.
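A minimal sketch of the resulting control flow, assuming the read-only
check mentioned above stays in place. The function and step names are
hypothetical; the real helper is a shell function in the common code.

```python
def idmapped_mount_steps(read_only, already_idmapped):
    """Return the steps taken, in order, for a given mount state."""
    steps = []
    # The chown now happens first, unless the mount is read-only.
    if not read_only:
        steps.append("chown mountpoint")
    # Only after that do we bow out early for existing idmapped mounts.
    if already_idmapped:
        return steps
    steps.append("create idmapped mount")
    return steps


# A mount that was idmapped while read-only and then remounted rw
# still gets its chown.
print(idmapped_mount_steps(read_only=False, already_idmapped=True))
```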
Signed-off-by: Gabriel Niebler <gniebler@suse.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Gabriel Niebler [Fri, 3 Feb 2023 14:35:45 +0000 (15:35 +0100)]
common: Do not chown ro mountpoint when creating idmapped mount
The function _idmapped_mount tries to change the ownership of the mountpoint
for which it aims to create an idmapped mount, to ensure that the mapped UID
and GID can actually create objects within it. Some tests set up a read-only
mount, however, which makes the chown call fail. This patch fixes the
function to check whether the mount is read-only and, if so, skip the chown.
Signed-off-by: Gabriel Niebler <gniebler@suse.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Thu, 2 Feb 2023 15:58:09 +0000 (15:58 +0000)]
btrfs: add a stress test for send v2 streams
Currently we don't have any test case in fstests that does randomized
stress testing of the send stream v2, which was added in kernel 6.0 and is
supported by btrfs-progs v5.19. For the send v2 stream, we only have btrfs/281
that exercises a specific scenario which used to trigger a bug.
So add a test that uses fsstress to generate a filesystem and exercise
both full and incremental send operations using the v2 send stream with
compressed extents, and then receive the streams without and with
decompression, to verify they work and produce the same results as in
the original filesystem. This is the same base idea as btrfs/007, but
for the send v2 stream with compressed data.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
generic/676: Unstable d_type handling for NFS READDIR
The NFS client may send READDIR or READDIRPLUS to populate the dentry
cache, and switch between them to minimize the number of RPC calls based on
the process's behavior. When using READDIR, dentries will have d_type =
DT_UNKNOWN but with READDIRPLUS d_type will be set from the mode.
This heuristic will cause generic/676 to fail when comparing dentries
cached from one or the other call, since we compare d_type directly. Fix
this by bypassing the comparison of d_type if any entry is loaded with
DT_UNKNOWN.
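One way to read the relaxed comparison, sketched per entry. The DT_*
values match the Linux dirent definitions; the function name and the
(name, d_type) tuple layout are assumptions made for illustration, and the
test itself may apply the bypass differently.

```python
DT_UNKNOWN = 0
DT_DIR = 4
DT_REG = 8


def entries_match(a, b):
    """Compare two (name, d_type) entries, ignoring d_type when either
    side was loaded without type information."""
    if a[0] != b[0]:
        return False
    if DT_UNKNOWN in (a[1], b[1]):
        return True
    return a[1] == b[1]


# Same name, one side cached via READDIR (no type): still a match.
print(entries_match(("file1", DT_UNKNOWN), ("file1", DT_REG)))
# Same name, both sides typed but different: a real mismatch.
print(entries_match(("file1", DT_DIR), ("file1", DT_REG)))
```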
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Jan Kara [Tue, 31 Jan 2023 12:39:59 +0000 (13:39 +0100)]
common: Improve blocksize support for udf
Add better support for blocksize selection in _scratch_mkfs_sized
(accept another variant of mkfs options, select correct default block
size if not specified). Also add blocksize selection support to
_scratch_mkfs_blocksized.
For _check_udf_filesystem to keep working when blocksize is not
specified in MKFS_OPTIONS, add detection of blocksize from a mounted
filesystem.
Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Jan Kara [Tue, 31 Jan 2023 12:39:57 +0000 (13:39 +0100)]
common: Provide blocksize and ecclength to udf fsck
The udf_test program used to verify the filesystem is not able to determine
the filesystem blocksize. Provide it in the options, and also disable
ecclength as it is not used on hard drives.
Reviewed-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Zorro Lang <zlang@kernel.org>
Make it so that xfs_scrub stress tests can select what kind of fsstress
operations they want to run. This will make it easier for, say,
directory scrubbers to configure fsstress to exercise directory tree
changes while skipping file data updates, because those are irrelevant.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 7 Feb 2023 17:01:41 +0000 (09:01 -0800)]
fuzzy: add a custom xfs find utility for scrub stress tests
Create a new find(1) like utility that doesn't crash on directory tree
changes (like find does due to bugs in its loop detector) and actually
implements the custom xfs attribute predicates that we need for scrub
stress tests. This program will be needed for a future patch where we
add stress tests for scrub and repair of file metadata.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:19:06 +0000 (14:19 -0800)]
xfs: race fsstress with online scrubbers for AG and fs metadata
For each XFS_SCRUB_TYPE_* that looks at AG or filesystem metadata,
create a test that runs that scrubber in the foreground and fsstress in
the background.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
xfs/191: fix xattr leaf block emptying on 64k blocksized fses
This test fails on filesystems with a 64k blocksize because the leaf
hdr.firstused field is 16 bits wide, so trying to reset it to $dbsize
overflows and is rejected by xfs_db. The leaf is never properly reset
and the discrepancy is picked up by xfs_repair, thus failing the test.
Fix it by setting it to XFS_ATTR3_LEAF_NULLOFF (0) as this is the proper
on-disk value to indicate an empty leaf on 64k blocksized fses.
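The overflow is easy to check with a little arithmetic. This sketch only
illustrates the 16-bit limit; the helper name is hypothetical and the 4k
comparison value is an assumption for contrast.

```python
UINT16_MAX = 0xFFFF  # largest value a 16-bit field can hold


def fits_in_u16(value):
    """True if value can be stored in an unsigned 16-bit field."""
    return 0 <= value <= UINT16_MAX


dbsize_64k = 64 * 1024
dbsize_4k = 4 * 1024

print(fits_in_u16(dbsize_4k))    # a 4k block size fits
print(fits_in_u16(dbsize_64k))   # 65536 is one past the 16-bit maximum
print(dbsize_64k & UINT16_MAX)   # truncating 65536 to 16 bits leaves 0
```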
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Sun, 29 Jan 2023 02:42:31 +0000 (10:42 +0800)]
fstests: fstest.c, fix compile warnings replace sprintf with snprintf
Fix the buffer overflow warnings by using snprintf instead of
sprintf.
fstest.c:95:20: warning: '/file' directive writing 5 bytes into a region of size between 1 and 1024 [-Wformat-overflow=]
sprintf(fname, "%s/file%d", dir, fnum);
^~~~~
fstest.c:166:20: warning: '/file' directive writing 5 bytes into a region of size between 1 and 1024 [-Wformat-overflow=]
sprintf(fname, "%s/file%d", dir, fnum);
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 1 Feb 2023 00:51:40 +0000 (16:51 -0800)]
generic/500: skip this test if formatting fails
This testcase exercises what happens when we race a filesystem
performing discard operations against a thin provisioning device that has
run out of space. To constrain runtime, it creates a 128M thinp volume
and formats it.
However, if that initial format fails because (say) the 128M volume is
too small, then the test fails. This is really a case of test
preconditions not being satisfied, so let's make the test _notrun when
this happens.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 1 Feb 2023 00:51:35 +0000 (16:51 -0800)]
generic/038: set a maximum runtime on this test
This test races multiple FITRIM calls against multiple programs creating
200k small files to ensure that there are no concurrency problems with
the allocator and the FITRIM code. This is not necessarily quick, and
the test itself does not contain any upper bound on the runtime. On my
system that simulates storage with DRAM this takes ~5 minutes to run; on
my cloud system with newly provided discard support, I killed the test
after 27 hours.
Constrain the runtime to about the customary 30s * TIME_FACTOR.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Jan Kara [Tue, 31 Jan 2023 12:54:16 +0000 (13:54 +0100)]
generic/707: Test moving directory while being grown
Test how the filesystem can handle moving a directory to a different
directory (so that parent pointer gets updated) while it is grown. Ext4
and UDF had a bug where if the directory got converted to a different
type due to growth while rename is running, the filesystem got
corrupted.
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Hironori Shiina [Mon, 30 Jan 2023 22:56:43 +0000 (17:56 -0500)]
xfs: test xfsrestore on multi-level dumpfiles with wrong root
While developing `xfsrestore -x`, we hit an issue when restoring a
renamed file in cumulative mode (multi-level dumps):
https://lore.kernel.org/linux-xfs/e61ae295-a331-d36a-cae1-646022dc2a6e@gmail.com/
This patch adds test cases, modeled on existing tests, that use the '-x'
flag in cumulative mode with various file operations.
Signed-off-by: Hironori Shiina <shiina.hironori@fujitsu.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Hironori Shiina [Mon, 30 Jan 2023 22:56:42 +0000 (17:56 -0500)]
xfs: add helper to create fake root inode
xfsdump used to cause a problem when there is an inode whose number is
lower than the root inode number. This patch adds a helper function to
reproduce such a situation for regression tests.
Signed-off-by: Hironori Shiina <shiina.hironori@fujitsu.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Wed, 25 Jan 2023 11:07:54 +0000 (11:07 +0000)]
btrfs: test send optimal cloning behaviour
Test that send operations make the best cloning decisions when we have
extents that are shared but some files refer to the full extent while
others refer to only a section of the extent.
This exercises an optimization that was added to kernel 6.2, by the
following commit:
c7499a64dcf6 ("btrfs: send: optimize clone detection to increase extent sharing")
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 18 Jan 2023 00:44:18 +0000 (16:44 -0800)]
populate: improve attr creation runtime
Replace the file creation loops with a python script that does
everything we want from a single process. This reduces the runtime of
_scratch_xfs_populate substantially by avoiding the overhead of thousands
of execve calls. This patch builds on the previous one by reducing the
runtime of xfs/349 from ~45s to ~15s.
For people who don't have python3, use setfattr's "restore" mode to bulk
create xattrs. This reduces runtime to about 25s.
[zlang: add popattr.py into src/Makefile install list]
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 18 Jan 2023 00:44:02 +0000 (16:44 -0800)]
populate: remove file creation loops that take forever
Replace the file creation loops with a perl script that does everything
we want from a single process. This reduces the runtime of
_scratch_xfs_populate substantially by avoiding the overhead of thousands
of execve calls. On my system, this reduces the runtime of xfs/349 (with
scrub enabled) from ~140s to ~45s.
[zlang: add popdir.pl into src/Makefile install list]
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Dave Chinner [Wed, 18 Jan 2023 00:43:47 +0000 (16:43 -0800)]
populate: ensure btree directories are created reliably
The population function creates an XFS btree format directory by
polling the extent count of the inode and creating new dirents until
the extent count goes over the limit that pushes it into btree
format.
It then removes every second dirent to create empty space in the
directory data to ensure that operations like metadump with
obfuscation can check that they don't leak stale data from deleted
dirents.
Whilst this does not result in directory data blocks being freed, it
does not take into account the fact that the dabtree index has half
the entries removed from it and that can result in btree nodes
merging and extents being freed. This causes the extent count to go
down, and the inode is converted back into extent form. The
population checks then fail because it should be in btree form.
Fix this by counting the number of directory data extents rather than
the total number of extents in the data fork. We can do this simply
by using xfs_bmap and counting the number of extents returned as it
does not report extents beyond EOF (which is where the dabtree is
located). As the number of data blocks does not change with the
dirent removal algorithm used, this will ensure that the inode data
fork remains in btree format.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: make this patch first in line] Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 18 Jan 2023 00:43:16 +0000 (16:43 -0800)]
xfs/{080,329,434,436}: add missing check for fallocate support
Don't run this test if the filesystem doesn't support fallocate. This
is only ever the case if always_cow is enabled.
The same logic applies to xfs/329, though it's more subtle because the
test itself does not explicitly invoke fallocate; rather, it is xfs_fsr
that requires fallocate.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 18 Jan 2023 00:43:00 +0000 (16:43 -0800)]
xfs: skip fragmentation tests when alwayscow mode is enabled
If the always_cow debugging flag is enabled, all file writes turn into
copy writes. This dramatically ramps up fragmentation in the filesystem
(intentionally!) so there's no point in complaining about fragmentation.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 18 Jan 2023 00:42:44 +0000 (16:42 -0800)]
xfs/182: fix spurious direct write failure
This test has some weird behavior that causes regressions when fsdax and
reflink are enabled. The goal of this test is to set a cow extent size
hint, perform some random directio writes, perform a directio rewrite of
the entire file, and make sure that the file content (and extent count)
are sane afterwards.
Most of the time, the random directio writes will never touch the
8388609th byte, though if they do randomly select that EOF block, they'd
end up extending the file by $real_blksz bytes and causing spurious test
failures.
Then, the rewrite does this:
pwrite -S 0x63 -b $real_blksz 0 $((filesize + 1))
Note that we previously set filesize=8388608, which means that we're
asking for a series of direct writes that fill the first 8388608 bytes
with 'c'. The last write in the series becomes a single byte direct
write. For regular file access mode, this last write will fail with
EINVAL, since block devices do not support byte granularity writes and
XFS does not fall back to the pagecache for unaligned direct writes.
Hence we never wrote the 8388609th byte of the file.
However, fsdax *does* allow byte-granularity direct writes, which means
that the single-byte write succeeds. There is no EINVAL return code,
and the 8388609th byte of the file is now 'c' instead of 'a'. As a
result, the md5 of file2 is different.
Since fsdax+reflink is the newcomer, amend the direct writes in this
test so that they always end at the 8388608th byte, since we were never
really testing that last byte anyway. This makes the test behavior
consistent across both access modes.
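The write pattern can be checked with a short calculation. This is a
sketch, not the test itself: the 4096-byte block size is an assumed value
for $real_blksz, the helper name is hypothetical, and the "new" pattern
assumes the amended rewrite covers exactly filesize bytes.

```python
def direct_write_chunks(offset, length, bufsize):
    """Split [offset, offset + length) into (offset, size) writes of at
    most bufsize bytes each, the way a buffered write loop would."""
    chunks = []
    end = offset + length
    while offset < end:
        size = min(bufsize, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


filesize = 8388608   # from the commit message
real_blksz = 4096    # assumed block size for illustration

old = direct_write_chunks(0, filesize + 1, real_blksz)
new = direct_write_chunks(0, filesize, real_blksz)

# The old pattern ends with a lone one-byte write at offset 8388608.
print(len(old), old[-1])
# The new pattern ends with a full, block-aligned write.
print(len(new), new[-1])
```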
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Ojaswin Mujoo [Wed, 11 Jan 2023 20:58:28 +0000 (12:58 -0800)]
generic/692: generalize the test for non-4K Merkle tree block sizes
Due to the assumption of the Merkle tree block size being 4K, the file
size calculated for the second test case was taking way too long to hit
EFBIG in case of larger block sizes like 64K. Fix this by generalizing
the calculation to any Merkle tree block size >= 1K.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Over multiple kernel releases we have reworked setgid inheritance
significantly due to long-standing security issues, security issues that
were reintroduced after they were fixed, and the subtle and difficult
inheritance rules that plagued individual filesystems. We have lifted
setgid inheritance into the VFS proper in earlier patches. Starting with
kernel v6.2 we have made setgid inheritance consistent between the write
and setattr (ch{mod,own}) paths.
The gist of the requirement is that in order to inherit the setgid bit
the user needs to be in the group of the file or have CAP_FSETID in
their user namespace. Otherwise the setgid bit will be dropped
regardless of the file's executability. Remove the obsolete tests as
they're not a security issue and will cause spurious warnings on older
distro kernels.
Note that setgid inheritance works correctly for overlayfs in the write
path only as of v6.2. Before this the setgid bit was always retained.
Darrick J. Wong [Fri, 30 Dec 2022 22:12:58 +0000 (14:12 -0800)]
xfs: race fsmap with readonly remounts to detect crash or livelock
Add a new test that races the GETFSMAP ioctl with ro/rw remounting to
make sure we don't livelock on the empty transaction that fsmap uses to
avoid deadlocking on rmap btree cycles.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:55 +0000 (14:12 -0800)]
fuzzy: delay the start of the scrub loop when stress-testing scrub
By default, online fsck stress testing kicks off the loops for fsstress
and online fsck at the same time. However, in certain debugging
scenarios it can help if we let fsstress get a head-start in filling up
the filesystem. Plumb in a means to delay the start of the scrub loop.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:55 +0000 (14:12 -0800)]
fuzzy: allow substitution of AG numbers when configuring scrub stress test
Allow the test program to use the metavariable '%agno%' when passing
scrub commands to the scrub stress loop. This makes it easier for tests
to scrub or repair every AG in the filesystem without a lot of work.
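A minimal sketch of the substitution, assuming commands are plain strings
and the AG count is already known. The function name is hypothetical and
the scrub commands are only illustrative inputs.

```python
def expand_agno(commands, agcount):
    """Expand each command containing '%agno%' once per AG; commands
    without the metavariable are passed through unchanged."""
    expanded = []
    for cmd in commands:
        if "%agno%" in cmd:
            for agno in range(agcount):
                expanded.append(cmd.replace("%agno%", str(agno)))
        else:
            expanded.append(cmd)
    return expanded


# Illustrative input: one per-AG command and one whole-fs command.
print(expand_agno(["scrub bnobt %agno%", "scrub fscounters"], 2))
```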
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: clean up frozen fses after scrub stress testing
Some of our scrub stress tests involve racing scrub, fsstress, and a
program that repeatedly freezes and thaws the scratch filesystem. The
current cleanup code suffers from the deficiency that it doesn't
actually wait for the child processes to exit. First, change it to do
that.
However, that exposes a second problem: there's a race condition with a
freezer process that leads to the stress test exiting with a frozen fs.
If the freezer process is blocked trying to acquire the unmount or
sb_write locks, the receipt of a signal (even a fatal one) doesn't cause
it to abort the freeze. This causes further problems with fstests,
since ./check doesn't expect to regain control with the scratch fs
frozen.
Fix both problems by making the cleanup function smarter.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: increase operation count for each fsstress invocation
For online fsck stress testing, increase the number of filesystem
operations per fsstress run to 2 million, now that we have the ability
to kill fsstress if the user should push ^C to abort the test early.
This should guarantee a couple of hours of continuous stress testing in
between clearing the scratch filesystem.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: clear out the scratch filesystem if it's too full
If the online fsck stress tests run for long enough, they'll fill up the
scratch filesystem completely. While it is interesting to test repair
functionality on a *nearly* full filesystem undergoing a heavy workload,
a totally full filesystem is really only exercising the ENOSPC handlers
in the kernel. That's not what we came here to test, so change the
fsstress loop to detect a nearly full filesystem and erase everything
before starting fsstress again.
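The fullness check could look roughly like this. The 90% threshold and
both function names are assumptions for illustration; the commit message
does not state what counts as "nearly full".

```python
import shutil


def is_nearly_full(used_bytes, total_bytes, threshold=0.9):
    """True if the used fraction has reached the (assumed) threshold."""
    return total_bytes > 0 and used_bytes / total_bytes >= threshold


def mount_is_nearly_full(mountpoint, threshold=0.9):
    """Apply the check to a mounted filesystem via statvfs-style usage."""
    usage = shutil.disk_usage(mountpoint)
    return is_nearly_full(usage.used, usage.total, threshold)


print(is_nearly_full(95, 100))   # past the threshold: time to erase
print(is_nearly_full(50, 100))   # plenty of room: keep stressing
```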
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: abort scrub stress testing if the scratch fs went down
There's no point in continuing a stress test of online fsck if the
filesystem goes down. We can't query that kind of state directly, so as
a proxy we try to stat the mountpoint and interpret any error return as
a sign that the fs is down.
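The proxy check amounts to something like the following sketch. The
function name and the nonexistent path are hypothetical; the real check
lives in the shell helpers.

```python
import os


def fs_looks_alive(mountpoint):
    """Treat any stat error on the mountpoint as 'the fs went down'."""
    try:
        os.stat(mountpoint)
    except OSError:
        return False
    return True


print(fs_looks_alive(os.sep))
print(fs_looks_alive(os.path.join(os.sep, "no-such-fstests-mountpoint-example")))
```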
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: make scrub stress loop control more robust
Currently, each of the scrub stress testing background threads
open-codes logic to decide if it should exit the loop. This decision is
based entirely on TIME_FACTOR*30 seconds having gone by, which means
that we ignore external factors, such as the user pressing ^C, which (in
theory) will invoke cleanup functions to tear everything down.
This is not a great user experience, so refactor the loop exit test into
a helper function and establish a sentinel file that must be present to
continue looping. If the user presses ^C, the cleanup function will
remove the sentinel file and kill the background thread children, which
should be enough to stop everything more or less immediately.
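The loop control described above can be sketched as follows. All names
are hypothetical, the 30 second runtime assumes TIME_FACTOR=1, and the
work callback stands in for one fsstress or scrub iteration.

```python
import os
import tempfile
import time


def should_keep_looping(sentinel_path, deadline):
    """Continue only while the sentinel exists and time remains."""
    return os.path.exists(sentinel_path) and time.monotonic() < deadline


def run_stress_loop(sentinel_path, runtime_seconds, work):
    """Call work() until the sentinel vanishes or the runtime expires."""
    deadline = time.monotonic() + runtime_seconds
    iterations = 0
    while should_keep_looping(sentinel_path, deadline):
        work()
        iterations += 1
    return iterations


with tempfile.TemporaryDirectory() as tmpdir:
    sentinel = os.path.join(tmpdir, "runfile")
    open(sentinel, "w").close()
    calls = []

    def work():
        calls.append(1)
        if len(calls) == 3:
            # Stands in for the cleanup function reacting to ^C.
            os.remove(sentinel)

    iterations = run_stress_loop(sentinel, 30, work)

print(iterations)
```

Removing the sentinel stops the loop at the next check, long before the
time limit would.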
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:54 +0000 (14:12 -0800)]
fuzzy: test the scrub stress subcommands before looping
Before we commit to running fsstress and scrub commands in a loop for
some time, we should check that the provided commands actually work on
the scratch filesystem. The _require_xfs_io_command predicate only
detects the presence of the scrub ioctl, not any particular subcommand.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:53 +0000 (14:12 -0800)]
fuzzy: rework scrub stress output filtering
Rework the output filtering functions for scrub stress tests: first, we
should use _filter_scratch to avoid leaking the scratch fs details to
the output. Second, for scrub and repair, change the filter elements to
reflect outputs that don't indicate failure (such as busy resources,
preening requests, and insufficient space to do anything). Finally,
change the _require function to check that filter functions have been
sourced.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:53 +0000 (14:12 -0800)]
fuzzy: clean up scrub stress programs quietly
In the cleanup function for online fsck stress test common code, send
SIGINT instead of SIGTERM to the fsstress and xfs_io processes to kill
them. bash prints 'Terminated' to the golden output when children die
with SIGTERM, which can make a test fail, and we don't want a regular
cleanup function being the thing that prevents the test from passing.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 30 Dec 2022 22:12:53 +0000 (14:12 -0800)]
xfs/422: move the fsstress/freeze/scrub racing logic to common/fuzzy
Hoist all this code to common/fuzzy in preparation for making this code
more generic so that we can implement a variety of tests that check the
concurrency correctness of online fsck. Do just enough renaming so that
we don't pollute the test program's namespace; we'll fix the other warts
in subsequent patches.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Disseldorp [Tue, 10 Jan 2023 22:22:43 +0000 (23:22 +0100)]
common/rc: drop SGI DMF specific _mount_ops_filter
The _mount() helper function is the only caller of _mount_ops_filter(),
which appears to have been used in the past to replace the SGI DMF
specific mtpt= mount option setting.
_mount() invocations could now be replaced with $MOUNT_PROG calls
directly, but I've retained the helper function for readability.
Link: https://irix7.com/techpubs/007-3683-007.pdf Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
btrfs/012 is requiring ext4 support to test the conversion, but the test
case is only checking if mkfs.ext4 is available, not if the filesystem
driver is actually available on the test host.
Check if the driver is available as well, before trying to run the test.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Qu Wenruo [Thu, 5 Jan 2023 07:18:19 +0000 (15:18 +0800)]
btrfs: add a test case to verify scrub speed throttle works
We introduced scrub speed throttle in commit eb3b50536642 ("btrfs: scrub:
per-device bandwidth control"), but it is not that well documented
(e.g. what's the unit of the sysfs interface), nor tested by any test
case.
This patch will add a test case for this functionality.
The test case itself is pretty straightforward:
- Fill the fs with 2G file as scrub workload
- Scrub without any throttle to grab the initial speed
- Set the throttle to half of the initial speed
- Scrub again and check the speed against the throttle
The test case assumes that we have exclusive use of all the performance
of the underlying disk.
In cloud environments this is not guaranteed, so the test case is
not included in the auto group to avoid false alerts.
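The arithmetic behind the last two steps might look like this. The 10%
tolerance and both function names are assumptions; the commit message
does not say how strictly the second scrub speed is compared against the
throttle, and speeds here are in whatever unit the measurement uses.

```python
def throttle_limit(initial_speed):
    """Half of the unthrottled scrub speed."""
    return initial_speed // 2


def respects_throttle(measured_speed, limit, tolerance=0.1):
    """True if the throttled speed stays within limit plus an assumed
    tolerance for measurement noise."""
    return measured_speed <= limit * (1 + tolerance)


limit = throttle_limit(200)
print(limit)
print(respects_throttle(105, limit))   # slightly over, within tolerance
print(respects_throttle(150, limit))   # throttle clearly not applied
```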
Baokun Li [Thu, 29 Dec 2022 13:44:34 +0000 (21:44 +0800)]
overlay: avoid to use NULL OVL_BASE_FSTYP for mounting
Generally, FSTYP is used to specify OVL_BASE_FSTYP. When we specify FSTYP
through an environment variable, it is not converted to OVL_BASE_FSTYP.
In addition, sometimes we do not even specify the filesystem type. For example,
we only use `./check -n -overlay -g auto` to list overlay-related cases.
If OVL_BASE_FSTYP is NULL, mounting fails and the test fails.
To solve this problem, try to assign a value to OVL_BASE_FSTYP when
specifying -overlay. In addition, in the _overlay_base_mount function,
the basic file system type of the overlay is specified only when
OVL_BASE_FSTYP is not NULL.
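The second half of the fix, passing the base filesystem type only when it
is set, can be sketched like this. The function name, device and
mountpoint are hypothetical; the real helper is the shell function
_overlay_base_mount.

```python
def overlay_base_mount_cmd(device, mountpoint, base_fstyp=""):
    """Build a mount command, adding '-t <type>' only when the base
    filesystem type is non-empty."""
    cmd = ["mount"]
    if base_fstyp:
        cmd += ["-t", base_fstyp]
    cmd += [device, mountpoint]
    return cmd


# With an empty type, mount is left to detect the filesystem itself.
print(overlay_base_mount_cmd("/dev/example", "/mnt/base"))
print(overlay_base_mount_cmd("/dev/example", "/mnt/base", "xfs"))
```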
Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>