Qu Wenruo [Tue, 27 Aug 2024 00:13:54 +0000 (09:43 +0930)]
fstests: btrfs: test reading data with a corrupted checksum tree leaf
[BUG]
There is a bug report that KASAN gets triggered when:
- A read bio needs to be split
This can happen for profiles with stripes, including
RAID0/RAID10/RAID5/RAID6.
- An error happens before submitting the new split bio
This includes:
* chunk map lookup failure
* data csum lookup failure
Then during the error path of btrfs_submit_chunk(), the original bio is
fully freed before the submitted range has a chance to call its endio
function, resulting in a use-after-free bug.
[NEW TEST CASE]
Introduce a new test case to verify the specific behavior by:
- Create a btrfs with enough csum leaves and the data RAID0 profile
To bump the csum tree level, use the minimal nodesize possible (4K).
Writing 32M of data needs at least 8 leaves for the data checksums.
The RAID0 profile ensures the data read bios will get split.
- Find the last csum tree leaf and corrupt it
- Read the data many times until we trigger the bug or exit gracefully
With an x86_64 VM with KASAN enabled, it can trigger the KASAN report in
just 4 iterations (the default iteration number is 32).
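The "at least 8 leaves" figure can be checked with a little arithmetic. The sketch below assumes a 4K sector size and a 4-byte crc32c checksum per sector, neither of which is stated in this commit message, and it ignores leaf header and item overhead, which is why the result is only a lower bound.

```python
# Lower bound on the number of csum tree leaves needed for 32M of data.
# Assumptions (not from the commit message): 4K sectors, 4-byte crc32c.
DATA_BYTES = 32 * 1024 * 1024
SECTOR_SIZE = 4 * 1024   # assumed
CSUM_SIZE = 4            # assumed (crc32c)
NODE_SIZE = 4 * 1024     # minimal nodesize used by the test


def min_csum_leaves(data_bytes, sector_size, csum_size, node_size):
    """Return the minimum leaf count, ignoring per-leaf header overhead."""
    csum_bytes = (data_bytes // sector_size) * csum_size
    # Ceiling division: a partially filled leaf still counts as a leaf.
    return -(-csum_bytes // node_size)


leaves = min_csum_leaves(DATA_BYTES, SECTOR_SIZE, CSUM_SIZE, NODE_SIZE)
print(leaves)
```

Real leaves hold less than a full node of checksums, so the actual count would be somewhat higher.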
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Qu Wenruo [Mon, 26 Aug 2024 22:47:08 +0000 (08:17 +0930)]
fstests: btrfs/125: do not use raid5 for metadata
[BUG]
There have been several recent bug reports of btrfs/125 failures, showing
either a balance failure (-EIO) or even a kernel crash.
The balance failure looks like this:
Mount normal and balance
+ERROR: error during balancing '/mnt/scratch': Input/output error
+There may be more info in syslog - try dmesg | tail
+md5sum: /mnt/scratch/tf2: Input/output error
The test case btrfs/125 has not been reliable in the past, and has been
discussed several times:
[CAUSE]
There are several different factors involved.
1. RMW mixes the old and new metadata, causing unrepairable corruption
E.g. with the following layout:
data 1 |<- Stale metadata ->| (from the out-of-date device)
data 2 | Unused |
parity |PPPPPPPPPPPPPPPPPPPP|
In the above case, although metadata on data 1 is out-of-date, we can
still rebuild the correct data from parity and data 2.
But if we have new metadata writes into the data 2 stripe, an RMW
will screw up the whole situation:
data 1 |<- Stale metadata ->| (from the out-of-date device)
data 2 |<- New metadata ->|
parity |XXXXXXXXXXXXXXXXXXXX|
The RMW will use the stale metadata and new metadata to calculate new
parity.
The resulting new parity will no longer be able to recover the old
data 1.
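The effect can be reproduced with plain XOR parity. This is only a toy sketch, assuming a two-data-stripe RAID5 layout like the one drawn above; the byte strings are made-up placeholders, not real metadata.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (RAID5-style parity)."""
    return bytes(x ^ y for x, y in zip(a, b))


good_d1 = b"GOODMETA"    # what data 1 should contain (placeholder)
stale_d1 = b"STALEMET"   # what the out-of-date device actually holds
unused_d2 = bytes(8)     # data 2 is unused (all zeroes)

# Parity as written while data 1 was still good.
parity = xor_bytes(good_d1, unused_d2)

# Before any RMW, data 1 can be rebuilt from parity and data 2.
rebuilt_before = xor_bytes(parity, unused_d2)

# RMW: new metadata lands in data 2, and the new parity is computed
# from the stale data 1 that is read back from the device.
new_d2 = b"NEWMETA!"
new_parity = xor_bytes(stale_d1, new_d2)

# After the RMW, a rebuild can only ever return the stale contents.
rebuilt_after = xor_bytes(new_parity, new_d2)

print(rebuilt_before == good_d1, rebuilt_after == good_d1)
```

Once the parity has been recomputed from the stale block, no copy of the good data 1 is left anywhere in the stripe.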
This is a known bug, thus our documentation already recommends
avoiding RAID56 for metadata usage.
> Metadata
> Do not use raid5 nor raid6 for metadata. Use raid1 or raid1c3
> respectively.
And this is very hard to fix: unlike data, where we can fetch the
data csum and verify it during RMW, we cannot do that for metadata during RMW.
At the time of RMW, we're holding the rbio lock for the full
stripe.
If the extent tree search requires a read-recover, it will generate
another rbio, which may cover the same full stripe we're working on,
leading to a deadlock.
Furthermore, the current RAID56 repair code is all based on vertical
sectors, but metadata can cross several horizontal sectors.
This will require trying multiple combinations to repair a metadata block.
2. Crash caused by double freeing a bio
If the above RMW happens to corrupt the csum tree, then during
btrfs_submit_chunk() we will hit an error path that leads to double
freeing of a bio, resulting in a crash or a KASAN report.
[WORKAROUND]
Since it's very hard to fix the RAID56 metadata problem without a
deadlock or a huge code rework, for now just use RAID1 for the metadata
of this particular test case.
There may be a chance to fix the situation by properly marking the
missing-then-reappeared device as out-of-date, so that nothing is read
directly from that device.
But that would also be a huge new feature, not something that can be done
immediately.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Brian Foster [Wed, 28 Aug 2024 18:15:34 +0000 (14:15 -0400)]
generic: test to run fsx eof pollution
Filesystem regressions related to partial page zeroing can go
unnoticed for a decent amount of time. A recent example is the issue
of iomap zero range not handling dirty pagecache over unwritten
extents, which leads to wrong behavior on certain file extending
operations (e.g. truncate, write extension, etc.).
fsx does occasionally uncover these sorts of problems, but failures
can be rare and/or require longer running tests outside what is
typically run via full fstests regression runs. fsx now supports a
mode that injects post-eof data in order to explicitly test partial
eof zeroing behavior. This uncovers certain problems more quickly
and applies coverage more broadly across size changing operations.
Add a new test that runs an fsx instance (modeled after generic/127)
with eof pollution mode enabled. While the test is generic, it is
currently limited to XFS as that is the only known major
fs that does enough zeroing to satisfy the strict semantics expected
by fsx. The long term goal is to uncover and fix issues so more
filesystems can enable this test.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Brian Foster [Wed, 28 Aug 2024 18:15:33 +0000 (14:15 -0400)]
fsx: support eof page pollution for eof zeroing test coverage
File ranges that are newly exposed via size changing operations are
expected to return zeroes until written to. This behavior tends to
be difficult to regression test as failures can be racy and
transient. fsx is probably the best tool for this type of test
coverage, but uncovering issues can require running for a
significantly longer period of time than is typically invoked
through fstests tests. As a result, these types of regressions tend
to go unnoticed for an unfortunate amount of time.
To facilitate uncovering these problems more quickly, implement an
eof pollution mode in fsx that opportunistically injects post-eof
data prior to operations that change file size. Since data injection
occurs immediately before the size changing operation, it can be
used to detect problems in partial eof page/block zeroing associated
with each relevant operation.
The implementation takes advantage of the fact that mapped writes
can place data beyond eof so long as the page starts within eof. The
main reason for the isolated per-operation approach (vs. something
like allowing mapped writes to write beyond eof, for example) is to
accommodate the fact that writeback zeroes post-eof data on the eof
page. The current approach is therefore not necessarily guaranteed
to detect all problems, but provides more generic and broad test
coverage than the alternative of testing explicit command sequences
and doesn't require significant changes to how fsx works. If this
proves useful long term, further enhancements can be considered that
might facilitate the presence of post-eof data across operations.
Enable the feature with the -e command line option. It is disabled
by default because zeroing behavior is inconsistent across
filesystems. This can also be revisited in the future if zeroing
behavior is refined for the major filesystems that rely on fstests
for regression testing.
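The idea behind the check can be modeled without any real filesystem. The sketch below is a toy model, not fsx code: ToyFile, pollute_post_eof and truncate_up are hypothetical names, and the single in-memory page stands in for the eof page that a mapped write can dirty beyond eof.

```python
PAGE_SIZE = 4096


class ToyFile:
    """One-page file model: bytes past `size` are post-eof space in the eof page."""

    def __init__(self, data: bytes):
        self.size = len(data)
        self.page = bytearray(PAGE_SIZE)
        self.page[:self.size] = data

    def pollute_post_eof(self, fill: int = 0xAA) -> None:
        # Stand-in for a mapped write that lands past eof within the eof page.
        self.page[self.size:] = bytes([fill]) * (PAGE_SIZE - self.size)

    def truncate_up(self, new_size: int, zero_new_range: bool) -> None:
        # A correct filesystem zeroes the newly exposed range; a buggy one does not.
        if zero_new_range:
            self.page[self.size:new_size] = bytes(new_size - self.size)
        self.size = new_size

    def read(self) -> bytes:
        return bytes(self.page[:self.size])


def extended_contents(zero_new_range: bool) -> bytes:
    f = ToyFile(b"x" * 100)
    f.pollute_post_eof()              # inject data right before the size change
    f.truncate_up(200, zero_new_range)
    return f.read()


expected = b"x" * 100 + bytes(100)    # what the good buffer would hold
correct_fs_ok = extended_contents(True) == expected
buggy_fs_ok = extended_contents(False) == expected
print(correct_fs_ok, buggy_fs_ok)
```

Without the pollution step, the page would already contain zeroes past eof and the missing zeroing would go undetected, which is the gap this mode is meant to close.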
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Brian Foster [Wed, 28 Aug 2024 18:15:32 +0000 (14:15 -0400)]
fsx: factor out a file size update helper
In preparation for support for eof page pollution, factor out a file
size update helper. This updates the internally tracked file size
based on the upcoming operation and zeroes the appropriate range in
the good buffer for extending operations.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Brian Foster [Wed, 28 Aug 2024 18:15:31 +0000 (14:15 -0400)]
fsx: don't skip file size and buf updates on simulated ops
fsx supports the ability to skip through a certain number of
operations of a given command sequence before beginning full
operation. The way this works is by tracking the operation count,
simulating minimal side effects of skipped operations in-memory, and
then finally writing out the in-memory state to the target file when
full operation begins.
Several fallocate() related operations don't correctly track
in-memory state when simulated, however. For example, consider an
ops file with the following two operations:
zero_range 0x0 0x1000 0x0
read 0x0 0x1000 0x0
... and an fsx run like so:
fsx -d -b 2 --replay-ops=<opsfile> <file>
This simulates the zero_range operation, but fails to track the file
extension that occurs as a side effect such that the subsequent read
doesn't occur as expected:
Will begin at operation 2
skipping zero size read
The read is skipped in this case because the file size is zero. The
proper behavior, and what is consistent with other size changing
operations, is to make the appropriate in-core changes before
checking whether an operation is simulated so the end result of
those changes can be reflected on-disk for eventual non-simulated
operations. This results in expected behavior with the same ops file
and test command:
Will begin at operation 2
2 read 0x0 thru 0xfff (0x1000 bytes)
Update zero, copy and clone range to do the file size and EOF change
related zeroing before checking against the simulated ops count.
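The ordering problem can be shown with a small model. This is a hedged sketch, not the fsx source: replay() and its parameters are hypothetical, only the offset and length fields of each op are modeled, and the log strings merely imitate the fsx output quoted above.

```python
def replay(ops, begin_at, update_size_before_skip_check):
    """Replay (name, offset, length) ops; ops numbered below begin_at are simulated."""
    file_size = 0
    log = []
    for opnum, (name, offset, length) in enumerate(ops, start=1):
        simulated = opnum < begin_at
        if name == "zero_range":
            # Old behavior: the size update was skipped for simulated ops.
            # New behavior: update in-memory state first, then check the op count.
            if update_size_before_skip_check or not simulated:
                file_size = max(file_size, offset + length)
        elif name == "read" and not simulated:
            if file_size == 0:
                log.append("skipping zero size read")
            else:
                end = offset + length - 1
                log.append(f"{opnum} read {offset:#x} thru {end:#x} ({length:#x} bytes)")
    return log


ops = [("zero_range", 0x0, 0x1000), ("read", 0x0, 0x1000)]
old_behavior = replay(ops, begin_at=2, update_size_before_skip_check=False)
new_behavior = replay(ops, begin_at=2, update_size_before_skip_check=True)
print(old_behavior)
print(new_behavior)
```

With the update moved ahead of the simulation check, the read sees the extended file size and proceeds instead of being skipped.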
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
f2fs/003: add missing _fixed_by_kernel_commit line
The bug related to this regression testcase has been fixed by commit b40a2b003709 ("f2fs: use meta inode for GC of atomic file"), so let's
add the missing _fixed_by_kernel_commit line for this testcase.
Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Amir Goldstein [Fri, 30 Aug 2024 19:45:46 +0000 (21:45 +0200)]
overlay: deprecate test t_truncate_self
Since kernel commit 2a010c412853 ("fs: don't block i_writecount during
exec"), truncating an executable file while it is being executed is
allowed. Therefore, the test t_truncate_self now fails, so remove it.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
André Almeida [Wed, 21 Aug 2024 15:57:46 +0000 (12:57 -0300)]
common/config: Correctly ignore {TEST|SCRATCH}_DEV for tmpfs
As per commit 264e5358e2c2 ("tmpfs: don't require {TEST|SCRATCH}_DEV"),
tmpfs doesn't need TEST or SCRATCH devices due to being a RAM-based
filesystem.
Fix the check by comparing the content of the variable TEST_DEV, instead
of comparing with the string TEST_DEV. Same for SCRATCH_DEV.
Fixes: 264e5358e2c2 ("tmpfs: don't require {TEST|SCRATCH}_DEV") Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
replace _min_dio_alignment with calls to src/min_dio_alignment
Use the min_dio_alignment C tool to check the minimum alignment.
This allows using the values obtained from statx instead of just guessing
based on the sector size and page size.
For tests using the scratch device this sometimes required moving code
around a bit to ensure the scratch device is actually mounted before
querying the alignment.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
generic: don't use _min_dio_alignment without a device argument
Replace calls to _min_dio_alignment that do not provide a device to
check with calls to the feature utility to query the page size, as that
is what these calls actually do.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Add a new C program to find the minimum direct I/O alignment. This
uses the statx stx_dio_offset_align field if provided, then falls
back to the BLKSSZGET ioctl for block backed file systems and finally
the page size. It is intended as a more capable replacement for the
_min_dio_alignment bash helper.
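The fallback order described here can be sketched as a pure function. The real tool is a C program that queries statx and the block device itself; in this illustration the function name and parameters are hypothetical, and the three values are passed in directly rather than obtained from the kernel.

```python
from typing import Optional


def pick_min_dio_alignment(stx_dio_offset_align: Optional[int],
                           logical_sector_size: Optional[int],
                           page_size: int) -> int:
    """Model the fallback chain: statx value, then sector size, then page size."""
    if stx_dio_offset_align:      # statx reported a usable (nonzero) value
        return stx_dio_offset_align
    if logical_sector_size:       # block-backed fs: value a BLKSSZGET query would give
        return logical_sector_size
    return page_size              # last resort


with_statx = pick_min_dio_alignment(512, 4096, 4096)
without_statx = pick_min_dio_alignment(None, 4096, 16384)
no_block_device = pick_min_dio_alignment(None, None, 4096)
print(with_statx, without_statx, no_block_device)
```

Treating a zero or missing statx value as "not provided" is an assumption of this sketch.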
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Mon, 12 Aug 2024 13:51:09 +0000 (14:51 +0100)]
btrfs: test send clones extents with unaligned end offset ending at i_size
Test that a send operation will issue a clone operation for a shared
extent of a file if the extent ends at the i_size of the file and the
i_size is not sector size aligned.
This verifies an improvement to the btrfs send feature implemented by
the following kernel patch:
"btrfs: send: allow cloning non-aligned extent if it ends at i_size"
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
If we fail to create a file system with specific passed-in options, that
means that these options conflict with other options in $MKFS_OPTIONS. Don't
fail the test case for that, but instead _notrun it and display the options
that caused it to fail.
Add a lower-level _try_scratch_mkfs_xfs helper for those places that want
to check the return value.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
If we fail to create a file system of a specific size that means it can't
work with some of the options in $MKFS_OPTIONS like the log size. Don't
fail the test case for that, but instead _notrun it and display the options
that caused it to fail.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 6 Aug 2024 22:55:23 +0000 (15:55 -0700)]
xfs: test online repair when xfiles consists of THPs
Fork xfs/286 so that we can ensure that the xfile and xmbuf code in
fsck can handle THPs and large folios. This actually caused a
regression in the mm code during 6.10.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 6 Aug 2024 22:56:06 +0000 (15:56 -0700)]
xfs: remove all traces of xfs_check
xfsprogs stopped shipping xfs_check (the wrapper script) in May 2014.
It's now been over a decade since it went away, and its replacements
(xfs_repair and xfs_scrub) now detect a superset of the problems that
xfs_check can find.
There is no longer any point in invoking xfs_check, so let's remove it
from fstests completely.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Boris Burkov [Fri, 12 Jul 2024 18:52:36 +0000 (11:52 -0700)]
btrfs: add test for btrfstune squota enable/disable
btrfstune supports enabling simple quotas on a fleshed out filesystem
(without adding owner refs) and clearing squotas entirely from a
filesystem that ran under squotas (clearing the incompat bit).
Test that these operations work on a relatively complicated filesystem
populated by fsstress while preserving fssum.
Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Nothing in xfs/233 requires an rmap, so it can run on any file system.
And it is a very useful test because it starts out with a very small
file system (or RT subvolume), which exercises some code paths no other
test does.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Wed, 31 Jul 2024 12:41:24 +0000 (13:41 +0100)]
btrfs/287: wait for subvolume deletion to complete
The test deletes the subvolume and then immediately calls the logical
resolve ioctl to confirm the extent is not referenced by the subvolume
anymore. This however may often fail because the subvolume delete only
makes the subvolume not accessible to user space anymore, but the actual
deletion of the subvolume tree, and all its data references, happens in
the background in the cleaner kthread running in kernel space.
So if by the time we do the query the cleaner kthread has not yet deleted
the subvolume tree, the test fails like this:
Filipe Manana [Wed, 31 Jul 2024 15:59:24 +0000 (16:59 +0100)]
generic/019: redirect fsstress output to log file instead
Currently we are redirecting stdout and stderr of fsstress to /dev/null.
In case we have a test failure, it's useful to know the seed that
fsstress was using because it might be helpful to run it again with that
same seed in the hope of being able to reproduce the failure.
So redirect stdout/stderr to the log file $seqres.full instead, so
that we'll see a line like this after running the test:
Darrick J. Wong [Fri, 26 Jul 2024 16:51:07 +0000 (09:51 -0700)]
generic/754: fix _fixed_by tags
This test requires an xfs_repair patch, so note that in the test. Also
update the kernel git hash since we now have one.
Reported-by: maxj.fnst@fujitsu.com Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
xfs/011: support byte-based grant heads
New kernels where reservation grant heads track the actual reservation space
consumed in bytes instead of LSNs in cycle/block tuples export different
sysfs files for this information.
Adapt the test to detect which version is exported, and simply check
for a near-zero reservation space consumption for the byte based version.
Based on work from Dave Chinner.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
f2fs: test for race condition in between atomic_write and dio
This test simulates the sqlite atomic write logic with the steps below:
1. create a regular file, and initialize it w/ 0xff data
2. start transaction (via F2FS_IOC_START_ATOMIC_WRITE) on it
3. write transaction data
4. trigger direct read/write IO to check whether it fails or not
5. commit and end the transaction (via F2FS_IOC_COMMIT_ATOMIC_WRITE)
This is a regression test to check handling of race condition in
between atomic_write and direct IO.
Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
f2fs: test for race condition in between atomic_write and gc
This test simulates the sqlite atomic write logic with the steps below:
1. create a regular file, and initialize it w/ 0xff data
2. start transaction (via F2FS_IOC_START_ATOMIC_WRITE) on it
3. write transaction data
4. trigger foreground GC to migrate data block of the file
5. commit and end the transaction
6. check consistency of transaction
This is a regression test to check handling of race condition in
between atomic_write and GC.
Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Wed, 24 Jul 2024 13:58:52 +0000 (14:58 +0100)]
btrfs: properly shutdown subvolume stress worker to avoid umount failures
When killing a test that is using the subvolume stress worker, we may end
up leaving a subvolume mounted, which makes
the shutdown sequence fail. Example when killing a script that keeps
running fstests in a loop:
SCRATCH_DEV=/dev/sdc is mounted but not on SCRATCH_MNT=/home/fdmanana/btrfs-tests/scratch_1 - aborting
Already mounted result:
/dev/sdc /home/fdmanana/btrfs-tests/scratch_1
/dev/sdc /home/fdmanana/btrfs-tests/dev/065.mnt
grep: results/btrfs/065.out.bad: No such file or directory
Error iteration 134, total errors 2, leaks 0
'results/btrfs/065.full' -> '/home/fdmanana/failures/btrfs_065/134/065.full'
Running 'mount' to see what's going on:
$ mount
(...)
/dev/sdb on /home/fdmanana/btrfs-tests/dev type btrfs (rw,relatime,discard=async,space_cache=v2,subvolid=5,subvol=/)
/dev/sdc on /home/fdmanana/btrfs-tests/scratch_1 type btrfs (rw,relatime,discard=async,space_cache=v2,subvolid=5,subvol=/)
/dev/sdc on /home/fdmanana/btrfs-tests/dev/065.mnt type btrfs (rw,relatime,discard=async,space_cache=v2,subvolid=2627,subvol=/subvol_3395330)
Then this makes the next attempt to run a test (./check) always fail due
to the extra mount of the subvolume, requiring one to manually umount the
subvolume before running fstests again.
So update _btrfs_kill_stress_subvolume_pid() to also unmount the subvolume.
fstests: btrfs/012: fix a false alert due to socket/pipe files
[BUG]
On my Archlinux VM, the test btrfs/012 always fails with the following
output diff:
QA output created by 012
+File /etc/pacman.d/gnupg/S.dirmngr is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.dirmngr is a socket
+File /etc/pacman.d/gnupg/S.gpg-agent is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.gpg-agent is a socket
+File /etc/pacman.d/gnupg/S.gpg-agent.browser is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.gpg-agent.browser is a socket
+File /etc/pacman.d/gnupg/S.gpg-agent.extra is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.gpg-agent.extra is a socket
+File /etc/pacman.d/gnupg/S.gpg-agent.ssh is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.gpg-agent.ssh is a socket
+File /etc/pacman.d/gnupg/S.keyboxd is a socket while file /mnt/scratch/etc/pacman.d/gnupg/S.keyboxd is a socket
...
[CAUSE]
It's a false alert.
When diff hits two files which are not directory/softlink/regular files
(i.e. socket/pipe/char/block files), they are all treated as
non-comparable.
In that case, diff just prints the above message.
And with Archlinux, pacman (the package manager) maintains its gpg
directory inside "/etc/pacman.d/gnupg", and the test case uses
"/etc" as the source directory to populate the target ext4 fs.
And the socket files inside "/etc/pacman.d/gnupg" are causing the false
alerts.
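Which file types fall into the non-comparable group can be checked from the inode mode. The sketch below is only an illustration of that classification, not how diff itself is implemented; diff_can_compare is a hypothetical helper, and a named pipe stands in for the sockets seen under "/etc" (os.mkfifo requires a Unix-like system).

```python
import os
import stat
import tempfile


def diff_can_compare(path: str) -> bool:
    """True for regular files, directories and symlinks; False for special files."""
    mode = os.lstat(path).st_mode
    return stat.S_ISREG(mode) or stat.S_ISDIR(mode) or stat.S_ISLNK(mode)


with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "regular")
    pipe = os.path.join(tmp, "pipe")
    with open(regular, "w") as f:
        f.write("hello\n")
    os.mkfifo(pipe)  # special file, like the gnupg sockets in the report
    results = {
        "regular": diff_can_compare(regular),
        "pipe": diff_can_compare(pipe),
    }

print(results)
```

Any source tree that happens to contain such special files would produce the same kind of false alert, independent of the filesystem under test.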
[FIX]
- Use fsstress to populate the fs
That covers all kinds of operations, including creating special files.
And fsstress is very reproducible, with the seed saved to the full
log, it's much easier to reproduce than using the distro dependent
"/etc/" directory.
- Use fssum to save the digest and later verify the contents
It verifies not only the contents but also other things like
timestamps/xattrs/uid/gid/mode/etc.
That makes it more comprehensive than the content-oriented diff tool.
Filipe Manana [Fri, 28 Jun 2024 17:04:49 +0000 (18:04 +0100)]
btrfs/081: wait for reader process to exit before cycle mounting
We send a kill signal to the reader process, check the md5sum of the
files and then cycle mount the scratch device. Most of the time the
reader process has already terminated before we attempt the cycle mount,
but sometimes it may still be alive in which case the cat command
executed by the reader process may fail because the scratch fs was
unmounted and the target file doesn't exist. This makes the cat command
print an error message and the test fail like this:
Verifying file digests after cloning
14968c092c68e32fa35e776392d14523 SCRATCH_MNT/foo
14968c092c68e32fa35e776392d14523 SCRATCH_MNT/bar
+cat: /opt/scratch/bar: No such file or directory
+cat: /opt/scratch/bar: No such file or directory
+cat: /opt/scratch/bar: No such file or directory
+cat: /opt/scratch/bar: No such file or directory
...
(Run diff -u /opt/xfstests/tests/btrfs/081.out
Fix this by making the test wait for the reader to terminate after
sending it the kill signal.
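The kill-then-wait pattern looks like this in a minimal Python stand-in. The test itself is a shell script; the child process here is just a sleeping interpreter used as a placeholder for the reader loop.

```python
import subprocess
import sys

# Placeholder "reader": a child process that would keep running on its own.
reader = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

reader.kill()   # sending the signal does not mean the process is gone yet
reader.wait()   # block until the process has actually exited

# Only now is it safe to do something the reader could trip over,
# such as unmounting the filesystem it was reading from.
reader_exited = reader.returncode is not None
print(reader_exited)
```

Skipping the wait leaves a window in which the child is still running after the signal was sent, which is the race the fix closes.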
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
fstests: btrfs/029: add fixes for the kernel behavior change
Since fstests commit 866948e00073 ("btrfs/029: change the cross vfsmount
reflink test"), the test case will fail for older kernels (e.g. 5.14
kernels from SLE).
The failure is a false alert, but it would still take some time to
figure it out.
So add the fixes tag to make it more clear that it's a kernel behavior
change.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Fri, 12 Jul 2024 09:54:24 +0000 (10:54 +0100)]
btrfs: fix _require_btrfs_send_version to detect btrfs-progs support
Commit 199d0a992536df3702a0c4843d2a449d54f399c2 ("common/btrfs: introduce
_require_btrfs_send_version") turned _require_btrfs_send_v2 into a generic
helper to detect support for any send stream version, however it's only
working for detecting kernel support, it misses detecting the support from
btrfs-progs - it always checks only that it supports v2 (the send command
supports the --compressed-data option).
Fix that by verifying that btrfs-progs supports the requested version.
Boris Burkov [Tue, 9 Jul 2024 17:51:04 +0000 (10:51 -0700)]
btrfs: add test for subvolid reuse with squota
Squotas are likely to leave qgroups that outlive deleted subvolids.
Before the kernel patch 2b8aa78cf127 ("btrfs: qgroup: fix qgroup id collision across mounts")
this would lead to a repeated subvolid which would collide on an
existing qgroup id and error out with EEXIST. In snapshot creation, this
would lead to a read only fs.
Add a test which exercises the path that could create duplicate
subvolids but with squotas enabled, which should avoid the trap.
Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
[zlang: Remove _supported_fs line]
Starting with kernel patch "btrfs: remove raid-stripe-tree
encoding field from stripe_extent" and btrfs-progs commit 7c549b5f7cc0 ("btrfs-progs: remove raid stripe encoding"), the on-disk
format of the raid stripe tree got changed.
As the feature is still experimental and not to be used in production, it
is OK to do an on-disk format change.
Update the golden output of the RAID stripe tree test cases after the
on-disk format and print format changes.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Remove _supported_fs calls for generic in the generic directory
or for $FSTYP in the $FSTYP directory.
This leaves us with the negative checks, and the overloaded ext4
directory where some tests can also be run for ext2 and ext3.
While at it also remove the pointless "real QA test starts here"
usually placed right next to it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Use a single case statement for fs-specific options and to check if
this test is supported at all.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
To prepare for deprecating _supported_fs use a case statement and
_notrun.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Instead of limiting this test to a few file systems, opt out the
file systems supported in common/rc that don't support overwrite
checking at all, and those like extN that support it, but only when
run interactively.
Also remove support for really old mkfs.btrfs versions that lack
the overwrite check.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Commit a633d252e3c4 ("shared/032: add options for jffs2") added a
check to skip checking the overwrite of jffs2, but only after
adding specific mkfs options for it and zeroing part of the device.
Switch to skipping it earlier in a more obvious place.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Without --quick mkfs.ntfs will zero the entire device, which can take
a very long time.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Use a single case statement instead of lots of conditionals.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
ext4dev is a long deprecated alias for ext4 that was used during early
ext4 development. Drop support for it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Long Li [Fri, 12 Jul 2024 06:47:16 +0000 (14:47 +0800)]
xfs/242: fix test failure due to incorrect filtering in _filter_bmap
I got the following failure in xfs/242; it can be easily reproduced
when I run xfs/242 as a cyclic test.
13. data -> unwritten -> data
0: [0..127]: data
-1: [128..511]: unwritten
-2: [512..639]: data
+1: [128..639]: unwritten
The root cause, as Dave pointed out in previous email [1], is that
_filter_bmap may incorrectly match the AG-OFFSET in column 5 for datadev
files. In addition, _filter_bmap is missing a "next" to jump out when
it matches "data" in the 5th column, so it might print the result
twice. The issue was introduced by commit 7d5d3f77154e ("xfs/242: fix
_filter_bmap for xfs_io bmap that does rt file properly"). The failure
disappeared when I retested xfs/242 after reverting commit 7d5d3f77154e.
Fix it by matching the 7th column first and then the 5th column in
_filter_bmap, because the rtdev file only has 5 columns in the `bmap -vp`
output.
Long Li [Fri, 12 Jul 2024 06:47:15 +0000 (14:47 +0800)]
xfs/016: fix test fail when head equal to near_end_min
xfs/016 checks for corruption in the log when it wraps. It looks for a log
head that is at or above the minimum log size. If the final position of
the log head equals near_end_min, the test will fail. Under these
conditions, we should let the test continue.
Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:36:05 +0000 (14:36 -0700)]
xfs/444: fix agfl reset warning detection for small log buffers
Collectively, the ten subtests in xfs/444 can generate a lot of kernel
log data. If the amount of log data is enough to overflow the kernel
log buffers, the AGFL reset warning generated by fix_start and fix_wrap
might have been overwritten by subsequent log data. Fix this by
checking for the reset warning after each test and only complaining
at the end if we have /never/ seen the warning.
Found by running on a kernel configured with CONFIG_LOG_BUF_SHIFT=14
(16K). This happened to be a Raspberry Pi, but in principle this can
happen to anyone. I'd never noticed this before because x86 helpfully
sets it to 17 (128K) by default.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
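The xfs/444 change above can be pictured with a toy model of a bounded kernel log. This is a sketch only: the KernelLog class, the placeholder warning string, the "later_subtest" name and the amount of noise per subtest are all invented for illustration, and the real test reads the actual kernel log instead.

```python
from collections import deque

WARNING = "agfl-reset-warning\n"  # placeholder, not the real kernel message

class KernelLog:
    """Toy ring buffer holding at most 2**shift bytes of log lines."""
    def __init__(self, shift):
        self.capacity = 1 << shift  # shift=14 -> 16K, shift=17 -> 128K
        self.lines = deque()
        self.size = 0

    def write(self, line):
        self.lines.append(line)
        self.size += len(line)
        # Drop the oldest lines once the buffer overflows.
        while self.size > self.capacity:
            self.size -= len(self.lines.popleft())

    def contains(self, needle):
        return any(needle in line for line in self.lines)

# (subtest name, number of noise lines it logs); values are made up.
SUBTESTS = [("fix_start", 10), ("fix_wrap", 10), ("later_subtest", 2000)]

def saw_warning(log, check_after_each):
    """Return True if the warning was observed under the given strategy."""
    seen = False
    for name, noise_lines in SUBTESTS:
        if name in ("fix_start", "fix_wrap"):
            log.write(WARNING)
        for i in range(noise_lines):
            log.write(f"{name}: noise {i:06d}\n")
        # New behaviour: look for the warning right after each subtest.
        if check_after_each and log.contains(WARNING):
            seen = True
    # Old behaviour: look only once, after everything has run.
    return seen if check_after_each else log.contains(WARNING)
```

In this model a 16K buffer loses the warning when it is checked only at the end, while checking after each subtest still sees it. A 128K buffer keeps the warning either way, which matches the observation that the problem went unnoticed on x86.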
Darrick J. Wong [Thu, 20 Jun 2024 21:01:06 +0000 (14:01 -0700)]
generic: test creating and removing symlink xattrs
This began as a regression test for the issues identified in "xfs: allow
symlinks with short remote targets". To summarize, the kernel XFS code
does not convert a remote symlink back to a shortform symlink after
deleting the attr fork. Recent attempts to tighten validation have
flagged this incorrectly, so we need a regression test to focus on this
dusty corner of the codebase.
However, there's nothing in here that's xfs-specific so it's a generic
test.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Dave Chinner [Thu, 20 Jun 2024 21:00:50 +0000 (14:00 -0700)]
xfs/348: partially revert dbcc549317 ("xfs/348: golden output is not correct")
In kernel commit 1eb70f54c445f ("xfs: validate inode fork size against
fork format"), we incorrectly started flagging as corrupt symlinks with
short targets that would fit in the inode core but are remote. The
kernel has historically written out symlinks this way and read them back
in, so we're fixing that.
The 1eb70 change came with change dbcc to fstests to adjust the golden
output; since we're adjusting the kernel back to old behavior, we need
to adjust the test too.
Fixes: dbcc549317 ("xfs/348: golden output is not correct") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Add a test to verify parent pointers after an error injection and log
replay.
Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Add a test to verify parent pointers while multiple links to a file are
created and removed.
Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Add a test to verify basic parent pointers operations (create, move, link,
unlink, rename, overwrite).
Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: test the xfs_io parent -p argument too] Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Add helper functions in common/parent to parse and verify parent
pointers. Also add functions to check that mkfs, kernel, and xfs_io
support parent pointers.
Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: add license and copyright, dont _fail tests immediately, make
sure the pptr-generated paths match the dir-generated paths] Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:58:45 +0000 (13:58 -0700)]
xfs/306: fix formatting failures with parent pointers
The parent pointers feature isn't supported on tiny 20MB filesystems
because the larger directory transactions result in larger minimum log
sizes, particularly with nrext64 enabled:
** mkfs failed with extra mkfs options added to " -m rmapbt=0, -i nrext64=1, -n parent=1," by test 306 **
** attempting to mkfs using only test 306 options: -d size=20m -n size=64k **
max log size 5108 smaller than min log size 5310, filesystem is too small
We don't support 20M filesystems anymore, so bump the filesystem size up
to 100M and skip this test if we can't actually format the filesystem.
Convert the open-coded punch-alternating logic into a call to that
program to reduce execve overhead, which more than makes up for having to
write 5x as much data to fragment the free space.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:58:13 +0000 (13:58 -0700)]
xfs/021: adapt golden output files for parent pointers
Parent pointers change the xattr structure dramatically, so fix this
test to handle them. For the most part we can get away with filtering
out the parent pointer fields (which xfs_db decodes for us), but the
namelen/valuelen/attr_filter fields still show through.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:56:55 +0000 (13:56 -0700)]
exchangerange: make sure that we don't swap unwritten extents unless they're part of a rt extent
By default, the FILE1_WRITTEN flag for the EXCHANGERANGE ioctl isn't
supposed to touch anything except for written extents. In other words,
it shouldn't exchange delalloc reservations, unwritten preallocations,
or holes. The XFS implementation flushes dirty pagecache to disk so
there should never be delalloc reservations running through the
exchangerange machinery, but there can be unwritten extents.
Hence, write a test to make sure that unwritten extents don't get moved
around. This test opts itself out for realtime filesystems where the
allocation unit is larger than 1 fsblock because xfs has to move full
allocation units, and that requires exchanging of partially written rt
extents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
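The expectation checked by the exchangerange test above can be sketched as a toy model. This is not the kernel implementation: files are modelled as equal-length lists of (state, payload) blocks, the function name is hypothetical, and the realtime case with allocation units larger than one fsblock is deliberately not modelled.

```python
def exchange_file1_written(file1, file2):
    """Swap blocks only where file1's block is in the 'written' state.

    Each file is a list of (state, payload) tuples, where state is one of
    'written', 'unwritten' or 'hole'. Both lists must have equal length.
    """
    if len(file1) != len(file2):
        raise ValueError("toy model needs equal-length files")
    out1, out2 = list(file1), list(file2)
    for i, (state, _payload) in enumerate(file1):
        if state == "written":
            out1[i], out2[i] = file2[i], file1[i]
        # Unwritten preallocations and holes in file1 are left alone.
    return out1, out2
```

For example, if file1 is written, unwritten, hole, written and file2 is fully written, only the first and last blocks change places; the unwritten extent and the hole stay where they were.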
Darrick J. Wong [Thu, 20 Jun 2024 20:55:53 +0000 (13:55 -0700)]
misc: flip HAVE_XFS_IOC_EXCHANGE_RANGE logic
We only need to include src/fiexchange.h if the system's xfslibs package
either doesn't define it or is so old that we want a newer definition.
Invert the logic so that we only use src/fiexchange if we need the
override.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:55:21 +0000 (13:55 -0700)]
generic/717: remove obsolete check
The final version of the EXCHANGERANGE ioctl has dropped the flag that
enforced that the two files being operated upon were exactly the same
length as was specified in the ioctl parameters. Remove this check
since it's now defunct.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:55:06 +0000 (13:55 -0700)]
generic/711,xfs/537: actually fork these tests for exchange-range
Fork g/711 to g/752, x/537 to x/615. These tests check the same
things with exchange-range as they do for swapext, since the code
porting swapext to commit-range has been dropped.
I was going to fork xfs/789 as well, but it turns out that generic/714
covers this sufficiently so for that one, we just strike fiexchange
from the group tag.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 20 Jun 2024 20:54:50 +0000 (13:54 -0700)]
generic/709,710: rework these for exchangerange vs. quota testing
The exchange-range implementation is now completely separate from the
old swapext ioctl, so let's migrate these quota tests to exchangerange.
There's no point in maintaining these tests for the legacy swapext code
because it returns EINVAL if any quota is enabled and the two files have
different user/group/project ids. Originally I had forward ported the
old swapext ioctl to use commitrange as its backend, but that will be
dropped in favor of porting xfs_fsr to use commitrange directly.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Eric Biggers [Wed, 12 Jun 2024 03:53:34 +0000 (20:53 -0700)]
generic/574: test corruption at more offsets
Expand generic/574 to test for corruption in more different parts of the
file to try to exercise any hashing optimizations that might be used.
There is no existing bug that this finds. This is just to prevent
future bugs, considering optimizations along the lines of
https://lore.kernel.org/fsverity/20240611034822.36603-7-ebiggers@kernel.org/
Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Luis Chamberlain [Sat, 15 Jun 2024 00:29:34 +0000 (17:29 -0700)]
fstests: add stress truncation + writeback test
Stress test folio splits by using the new debugfs interface to target
a new smaller folio order while triggering writeback at the same time.
This is known to only create a crash with min order enabled, for example
with a 16k block sized XFS test profile. An xarray fix for that is merged
already: the issue is fixed by kernel commit 2a0774c2886d ("XArray: set the
marks correctly when splitting an entry").
If inspecting more closely, you'll want to enable on your kernel boot:
dyndbg='file mm/huge_memory.c +p'
Since we want to race large folio splits we also augment the full test
output log $seqres.full with the test specific number of successful
splits from vmstat thp_split_page and thp_split_page_failed. The larger
the vmstat thp_split_page the more we stress test this path.
This test immediately reproduces a crash that is otherwise really hard to
reproduce.
[zlang: add _require_debugfs into _require_split_huge_pages_knob]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
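The split accounting described above can be sketched as a before/after comparison of /proc/vmstat. The helper names are hypothetical, and the test itself is a shell script that may collect the counters differently; only the counter names thp_split_page and thp_split_page_failed come from the commit message.

```python
SPLIT_KEYS = ("thp_split_page", "thp_split_page_failed")

def parse_vmstat(text):
    """Parse 'name value' lines, as found in /proc/vmstat, into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats

def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())

def split_delta(before, after, keys=SPLIT_KEYS):
    """Return how much each split counter grew between two snapshots."""
    return {k: after.get(k, 0) - before.get(k, 0) for k in keys}
```

A caller would take one read_vmstat() snapshot before the stress loop and one after it, then log split_delta(before, after). A larger thp_split_page delta means the split path was exercised more.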
Luis Chamberlain [Sat, 15 Jun 2024 00:29:33 +0000 (17:29 -0700)]
_require_debugfs(): simplify and fix for debian
Using findmnt with the -S debugfs arguments does not really output
anything on Debian, and is not needed. Fix that.
Fixes: 8e8fb3da709e ("fstests: fix _require_debugfs and call it properly") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Luis Chamberlain [Sat, 15 Jun 2024 00:29:32 +0000 (17:29 -0700)]
fstests: add fsstress + compaction test
Running compaction while we run fsstress can crash older kernels as per
korg#218227 [0]. The fix for that has been posted [1]; it was merged in
v6.9-rc6 as commit d99e3140a4d3 ("mm: turn folio_test_hugetlb into a
PageType"). This test reproduces that crash right away.
But we have more work to do ...
Even on v6.10-rc2, where this kernel commit is already merged, we can
still deadlock when running fsstress and at the same time triggering
compaction. This is a new issue being reported now through this patch,
and the patch also serves as a high-confidence reproducer. At
least for XFS, running this test ~44 times will deadlock.
If you enable CONFIG_PROVE_LOCKING with the defaults you will end up
with a complaint about increasing MAX_LOCKDEP_CHAIN_HLOCKS [1]. If
you adjust that, you then end up with a few soft lockup complaints and
some possible deadlock candidates to evaluate [2].
Provide a simple reproducer and pave the way so we keep on testing this.
Luis Chamberlain [Sat, 15 Jun 2024 00:29:31 +0000 (17:29 -0700)]
fstests: add mmap page boundary tests
mmap() POSIX compliance says we should zero fill data beyond a file
size up to page boundary, and issue a SIGBUS if we go beyond. While fsx
helps us test zero-fill sometimes, fsstress also lets us sometimes test
for SIGBUS; however, that is based on a random value and it's not likely we
always test it. Dedicate a specific test for this to make testing for
this specific situation easier and to easily expand on other corner cases.
The only filesystem currently known to fail is tmpfs with huge pages on
a 4k base page size system, on 64k base page size it does not fail.
The pending upstream patch "filemap: cap PTE range to be created to
allowed zero fill in folio_map_range()" fixes this issue for tmpfs on
4k base page size with huge pages and it also fixes it for LBS support.
Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
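The boundary rule stated above comes down to a small piece of arithmetic, sketched here. The function name is hypothetical, and the sketch assumes a file-backed mapping that is longer than the file; it only predicts the expected outcome and does not perform any mmap() itself.

```python
def classify_access(offset, file_size, page_size=4096):
    """Predict what a read at `offset` in a file mapping should observe.

    Returns 'data' inside the file, 'zero' between EOF and the end of the
    last page that holds file data, and 'sigbus' beyond that page.
    """
    if offset < 0 or file_size < 0:
        raise ValueError("offset and file_size must be non-negative")
    # Round the file size up to the next page boundary.
    last_page_end = -(-file_size // page_size) * page_size
    if offset < file_size:
        return "data"
    if offset < last_page_end:
        return "zero"
    return "sigbus"
```

For a 100-byte file on a 4k page size system, offsets 100 to 4095 are expected to read as zero and offset 4096 is expected to raise SIGBUS. With a 64k page size the zero-filled range extends to offset 65535.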
Luis Chamberlain [Sat, 15 Jun 2024 00:29:30 +0000 (17:29 -0700)]
common: move mread() to generic helper _mread()
We want a shared way to use mmap so that we can test for SIGBUS.
Provide a shared routine which other tests can leverage.
Suggested-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Theodore Ts'o [Tue, 11 Jun 2024 22:26:58 +0000 (18:26 -0400)]
ext4/045: use the large_dir feature to fix test failures with a 1k block size
If the file system has a 1k blocksize, this test will fail without the
large_dir file system feature, because the depth of the dir_index tree
needs to be greater than 2. So enable large_dir unconditionally, which
also gives us better test coverage of the large_dir code paths.
As a result of requiring large_dir, this test will get skipped if the
kernel is older than 4.13 --- which was released in 2017; and that
seems to be reasonable at this point.
Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Theodore Ts'o [Wed, 23 Aug 2023 14:56:21 +0000 (10:56 -0400)]
ext4/059: disable block_validity checks when mounting a corrupted file system
Kernels with the commit "ext4: add correct group descriptors and
reserved GDT blocks to system zone" will refuse to mount the corrupted
file system constructed by this test. So in order to perform the
test, we need to disable the block_validity checks.
Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-and-tested-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Mon, 17 Jun 2024 16:52:14 +0000 (17:52 +0100)]
generic/74[3,8]: add git commit ID for the fixes
The corresponding fixes landed in kernels 6.10-rc1 and 6.10-rc3, so update
the tests to point to the commit IDs.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>