git.apps.os.sepia.ceph.com Git - xfstests-dev.git/log

report: derive an xml schema for the xunit report

The "xunit" report format emits an XML document that more or less
follows the junit xml schema.  However, there are two major exceptions:

1. fstests does not emit an @errors attribute on the testsuite element
because we don't have the concept of unanticipated errors such as
"unchecked throwables".

2. The system-out/system-err elements sound like they belong under the
testcase element, though the schema itself imprecisely says "while the
test was executed".  The schema puts them under the top-level testsuite
element, but we put them under the testcase element.

Define an xml schema for the xunit report format, and update the xml
headers to link to the schema file.  This enables consumers of the
reports to check mechanically that the incoming document follows the
format.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: generate section reports between tests

Generate the section report between tests so that the summary report
always reflects the outcome of the most recent test.  Two use cases are
envisioned here.  First, if a cluster-based test runner anticipates that
the test run could crash the VM, it can set REPORT_DIR to (say) an NFS
mount to preserve the intermediate results.  If the VM does indeed
crash, the scheduler can examine the state of the crashed VM and move
the tests to another VM.  The second use case is a reporting agent that
runs in the VM to upload live results to a test dashboard.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Leah Rumancik <leah.rumancik@gmail.com>
Tested-by: Leah Rumancik <leah.rumancik@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fscrypt-crypt-util: fix XTS self-test with latest OpenSSL

In OpenSSL 3.0, XTS encryption fails if the message is zero-length.
Therefore, update test_aes_256_xts() to not test this case.

This only affects the algorithm self-tests within fscrypt-crypt-util,
which are not compiled by default.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fscrypt-crypt-util: use OpenSSL EVP API for AES self-tests

OpenSSL 3.0 has deprecated the easy-to-use AES block cipher API in favor
of EVP. EVP is also available in earlier OpenSSL versions. Therefore,
update test_aes_keysize() to use the non-deprecated API to avoid
deprecation warnings when building the algorithm self-tests.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fscrypt-crypt-util: fix HKDF self-test with latest OpenSSL

In OpenSSL 3.0, EVP_PKEY_derive() fails if the output is zero-length.
Therefore, update test_hkdf_sha512() to not test this case.

This only affects the algorithm self-tests within fscrypt-crypt-util,
which are not compiled by default.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

m4: Check for FTW_ACTIONRETVAL along with nftw

FTW_ACTIONRETVAL is a glibc-specific extension that is used to implement
xfsfind, but it may not be available in other C library implementations on
Linux, e.g. musl. Therefore, ensure that these defines are available before
declaring nftw() to be usable.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Zorro Lang <zlang@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>

btrfs/246: skip the test if the tested btrfs doesn't support inline extents creation

[FALSE ALERT]
If test case btrfs/246 is executed with a 16K page sized system (like
some aarch64 SoCs) using 4K sector size (would be the new default), the
test case would fail with output mismatch:

  btrfs/246 1s ... - output mismatch (see ~/xfstests-dev/results//btrfs/246.out.bad)
      --- tests/btrfs/246.out 2022-11-24 19:53:53.158470844 +0800
      +++ ~/xfstests-dev/results//btrfs/246.out.bad 2023-03-22 13:27:34.975796048 +0800
      @@ -3,3 +3,5 @@
       0ca3bfdeda1ef5036bfa5dad078a9f15724e79cf296bd4388cf786bfaf4195d0  SCRATCH_MNT/foobar
       sha256sum after mount cycle
       0ca3bfdeda1ef5036bfa5dad078a9f15724e79cf296bd4388cf786bfaf4195d0  SCRATCH_MNT/foobar
      +no inline extent found
      +no compressed extent found
      ...
      (Run 'diff -u ~/xfstests-dev/tests/btrfs/246.out ~/xfstests-dev/results//btrfs/246.out.bad'  to see the entire diff)

[CAUSE]
For current btrfs subpage support, there are still some limitations:

- No compressed write if the range is not fully page aligned
- No inline extents creation
  Reading inline extents is still supported

Thus we won't create such inlined compressed extent at all.

[FIX]
Just skip the test case if we can not even create a regular inline
extent.

This is done by a new require helper,
_require_btrfs_inline_extent_creation(), which would detect if btrfs can
even create an uncompressed inlined extent.

Reported-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/020: fix another really long attr test failure for ceph

If CONFIG_CEPH_FS_SECURITY_LABEL is disabled in the kernel ceph client,
'selinux_size' will be empty, and then:
max_attrval_size=$((65536 - $size - $selinux_size - $max_attrval_namelen))
will be:
max_attrval_size=$((65536 - $size - - $max_attrval_namelen))
which is equivalent to:
max_attrval_size=$((65536 - $size + $max_attrval_namelen))
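
The pitfall can be reproduced outside the test. The following is a minimal
sketch, assuming made-up values for size and max_attrval_namelen; defaulting
the empty variable with ${selinux_size:-0} is one possible guard and is not
necessarily the change this commit makes:

```shell
# Illustrative values only; the real test derives these at run time.
size=100
max_attrval_namelen=28
selinux_size=""   # empty when CONFIG_CEPH_FS_SECURITY_LABEL is disabled

# Unguarded form: the empty expansion leaves "- -" in the expression, so the
# name length is subtracted as a negated value, i.e. added.
buggy=$((65536 - $size - $selinux_size - $max_attrval_namelen))

# Guarded form: fall back to 0 when selinux_size is unset or empty.
fixed=$((65536 - $size - ${selinux_size:-0} - $max_attrval_namelen))

echo "buggy=$buggy fixed=$fixed"
```

The unguarded result is larger than the guarded one by twice the name length
plus nothing for the label, which is why the computed maximum overshoots.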

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: stress test cycling parent pointers with online repair

Add a couple of new tests to exercise directory and parent pointer
repair against rename() calls moving child subdirectories from one
parent to another. This is a useful test because it turns out that the
VFS doesn't lock the child subdirectory (it does lock the parents), so
repair must be more careful.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/286: add missing calls to _scratch_dev_pool_put and _spare_dev_put

The test is doing a _scratch_dev_pool_get, which shrinks the list of
devices in SCRATCH_DEV_POOL, but it's not calling _scratch_dev_pool_put
before it finishes. This will result in subsequent tests (none at the
moment however) getting a reduced list of devices in SCRATCH_DEV_POOL.

The same goes for the spare device, the test calls _spare_dev_get but
it never calls _spare_dev_put.

So add the missing calls.
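
The effect of a missing "put" can be illustrated with a toy model. This is
only a sketch: demo_pool_get, demo_pool_put, pool_count, and the device paths
are hypothetical stand-ins, not the real fstests helpers.

```shell
# Toy model of a device pool; names and paths are hypothetical.
POOL="/dev/sdb /dev/sdc /dev/sdd /dev/sde"
POOL_SAVED=""

# Keep only the first $1 devices, remembering the full list.
demo_pool_get() {
	POOL_SAVED="$POOL"
	POOL=$(echo $POOL | cut -d' ' -f1-"$1")
}

# Restore the full list saved by demo_pool_get.
demo_pool_put() {
	POOL="$POOL_SAVED"
	POOL_SAVED=""
}

# Number of devices currently visible in the pool.
pool_count() {
	echo $POOL | wc -w | tr -d ' '
}

demo_pool_get 2
echo "during the test: $(pool_count) devices"
# Without this call, every later user would still see the reduced pool.
demo_pool_put
echo "after the test: $(pool_count) devices"
```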

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/284: list a couple btrfs-progs git commits

This test may often fail when running with btrfs-progs versions that are
not very recent. The corresponding git commits in btrfs-progs that fix
issues uncovered by this test are:

1) 6f4a51886b37 ("btrfs-progs: receive: fix silent data loss after fall back from encoded write")
Introduced in btrfs-progs v6.0.2;

2) e3209f8792f4 ("btrfs-progs: receive: fix a corruption when decompressing zstd extents")
Introduced in btrfs-progs v6.2.

So add the corresponding _fixed_by_git_commit calls to the test.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Add tmpfs tests for idmap mounts

This patch calls all tests in the suite s_idmapped_mounts, but with a
tmpfs directory mounted inside a userns. This directory is set up as the
mount point for the test that runs nested.

This exercises tmpfs mounted inside a userns and checks that it works as
expected with idmapped mounts.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Export test_setup() and test_cleanup()

Future patches will call existing tests inside another test, so we need
to be able to set up the test environment properly from there.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Make idmapped core tests public

Tests in the suite s_idmapped_mounts are made public because future
patches for tmpfs will call them.

While making them public, we add a "tcore_" prefix to avoid exporting
overly generic names.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Prepare tests in &s_idmapped_mounts to be reused inside a userns

Future patches will call these tests within a userns. So, let's change
the makedev major/minor to something that works inside a userns.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Make switch_userns set PR_SET_DUMPABLE

We need PR_SET_DUMPABLE in order to write the mapping files when
creating a userns. According to prctl(2), the dumpable attribute is
reset when the process's effective user or group ID is changed.

As we are changing the EUID here, we also set PR_SET_DUMPABLE again to
allow creating nested userns with subsequent switch_userns() calls.

This was not causing any issues because we weren't using switch_userns()
to create nested userns. Nested userns were created with
userns_fd_cb()/create_userns_hierarchy(), which set PR_SET_DUMPABLE.

Future patches will rely on switch_userns() to create nested userns. So
this patch fixes that.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Fix race condition on get_userns_fd()

There is a race when we clone: we call a function that just returns
while at the same time we try to get the userns via /proc/pid/ns/user.
The thing is that when the function returns, in the kernel do_exit()
from kernel/exit.c is called, which calls exit_task_namespaces() to destroy
the namespaces.

So, let's wait indefinitely there and add an _exit() call to avoid
warnings. We are already sending a SIGKILL to this pid, so nothing else
is needed to avoid leaking the process.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Use tabs to indent, not spaces

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Fix documentation typo

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfs: Don't open-code safe_close()

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/020: fix really long attr test failure for ceph

If CONFIG_CEPH_FS_SECURITY_LABEL is enabled, the kernel ceph client
itself sets the security.selinux extended attribute on the MDS, and
that attribute also consumes part of the total size.

Fixes: https://tracker.ceph.com/issues/58742
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: don't clear superblock for zoned scratch pools

_require_scratch_dev_pool() zeros the first 100 sectors of each device to
clear any remains of older filesystems.

On zoned devices this won't work, as a plain dd causes unaligned write
errors that make all subsequent actions on the device fail.

For zoned devices it is enough to simply reset the first two zones of the
device to achieve the same result.
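
The choice between the two wipe methods can be sketched as follows. This is
an illustration under stated assumptions, not the code in common/rc: wipe_cmd
is a hypothetical helper, the zone model is passed in as read from
/sys/block/<dev>/queue/zoned, the dd block size is chosen to match "100
sectors", and the commands are printed rather than executed.

```shell
# Hypothetical helper: print the command that would clear old filesystem
# signatures from a device. $1 is the device, $2 its zone model ("none" for
# conventional devices, "host-aware" or "host-managed" for zoned ones).
wipe_cmd() {
	dev="$1"
	model="$2"
	if [ "$model" = "none" ]; then
		# Conventional device: overwrite the start with zeros.
		echo "dd if=/dev/zero of=$dev bs=512 count=100"
	else
		# Zoned device: reset the first two zones instead of writing.
		echo "blkzone reset --offset 0 --count 2 $dev"
	fi
}

wipe_cmd /dev/sdx none
wipe_cmd /dev/nullb0 host-managed
```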

Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test xfs_scrub dry run, preen, and repair mode

For each of the three operational modes of xfs_scrub, make sure that we
/only/ repair that which we're supposed to.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair of dirs and parent pointers

Create tests that race online repair of directories and directory parent
pointers against fsstress running in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: ensure that online directory repairs don't hit EDQUOT

Add a test to ensure that the sysadmin doesn't get EDQUOT if they try to
repair directory metadata when we've already exceeded a quota limit
somewhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair of extended attribute data

Create tests that race extended attribute repair against fsstress
running in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair of realtime summary files

Create tests that race rt summary file repair against fsstress running
in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsstress: update for FIEXCHANGE_RANGE

Teach this stress tool to be able to use the file content exchange
ioctl.

[zlang: reduce freq to 2]

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: support FIEXCHANGE_RANGE

Upgrade fsx to support exchanging file contents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: test that file privilege gets dropped with FIEXCHANGE_RANGE

Make sure that we clear the suid and sgid bits and capabilities during a
FIEXCHANGE_RANGE call just like we would for a regular file write.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic, xfs: test scatter-gather atomic file updates

Make sure that FILE_SWAP_RANGE_SKIP_FILE1_HOLES does what we want it to
do -- provide a means to implement scatter-gather atomic file writes.
That means we create a temporary file, write whatever sparse bits to it
that we want, and swap the non-hole parts of the temp file into the file
being updated.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: test new vfs swapext ioctl

Test the new vfs swapext ioctl.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: test old xfs extent swapping ioctl

Add some tests to check the operation of the old xfs swapext ioctl.
There aren't any xfs-specific pieces in here, so they're in generic/

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/122: fix for swapext log items

Add entries for the extent swapping log items.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add test case for NODATASUM dev-replace

During my development on dev-replace, I made a mistake in the RAID56
dev-replace code that could lead to NODATASUM corruption.
Thankfully the corruption didn't reach upstream.

Inspired by that incident, here is a new test case for btrfs
dev-replace.  The new case would:

- Populate the filesystem with nodatasum mount option

- Run fssum to record the contents
  Since the test case only cares about data contents, here we don't
  include metadata like uid/gid/timestamp.

- Wipe one device

- Mount the fs with the missing device

- Verify the contents are still correct

- Replace the missing device

- Verify again that the contents are still correct
  Before the verification, drop all caches to make sure the 2nd
  verification is reading from the disks.

For current kernels, the test case should pass as expected.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add test for direct io partial writes

btrfs recently had a bug where a direct io partial write resulted in a
hole in the file. Add a new generic test which creates a 2MiB file,
mmaps it, touches the first byte, then does an O_DIRECT write of the
mmapped buffer into a new file. This should result in the mapped pages
being a mix of in and out of page cache and thus a partial write, for
filesystems using iomap and IOMAP_DIO_PARTIAL.

Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

selftest: add tests for debugging testing setup

Many people have developed infrastructure around xfstests. In order to
test a setup, it would be helpful to have dummy tests that have
consistent test outcomes. Add a new test folder with the following
tests:

selftest/001 pass
selftest/002 fail from output mismatch
selftest/003 fail via _fail
selftest/004 skip
selftest/005 crash
selftest/006 hang

Also, create two new groups: 'selftest' which includes tests 001-004 and
'dangerous_selftest' which includes tests 005-006. The selftests will
not run unless explicitly specified.

Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

README: Add section to install required packages on (open)SUSE

The README already lists how to install the required packages on other popular
distributions. Since a lot of filesystem development happens on SUSE systems and
both the commands as well as the list of packages are slightly different,
it only makes sense to include these as well.

Signed-off-by: Gabriel Niebler <gniebler@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: fuzz test key/pointers of inode btrees

Test what happens when we fuzz the key/pointer blocks (aka the interior
nodes) of the inode btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: fuzz test both repair strategies

Add more fuzz tests to examine the effectiveness of online and then
offline repair.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: test fuzzing realtime free space metadata

Fuzz the contents of the realtime bitmap and summary files to see what
happens.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: test fuzzing xattr block mappings

Fuzz the block mappings of extended attributes to see what happens.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: test fuzzing directory block mappings

Fuzz the block mappings of directories to see what happens.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: for fuzzing ag btrees, find the path to the AG header

The fs population code creates various btrees in /some/ allocation group
with at least two levels. These btrees aren't necessarily created in
agno 0, so we need to find it programmatically. While we're at it, fix
a few of the comments that failed to mention when we're fuzzing interior
nodes and not leaves.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: improve metadata array field handling when fuzzing

Currently, we use some gnarly regular expressions to try to constrain
the amount of time we spend fuzzing each element of a metadata array.
This is pretty inflexible (and buggy) since we limit some arrays
(e.g. dir hashes) to the first ten elements and other arrays (e.g.
extent mappings) that use compact index ranges to the first one.

Replace this whole weird mess with logic that can tease out the array
indices, unroll the compact indices if needed, and give the user more
flexible control over which array elements get used.
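
The unrolling step can be sketched as below. This is a simplified
illustration, not the logic in common/fuzzy: unroll_indices is a hypothetical
helper, the field name is just an example, and only a numeric range in the
trailing bracket is expanded.

```shell
# Hypothetical helper: expand "name[lo-hi]" into "name[lo]" ... "name[hi]",
# one per line. Fields without a trailing compact range are printed as-is.
unroll_indices() {
	field="$1"
	case "$field" in
	*\[*-*\])
		prefix="${field%\[*}"    # everything before the last "["
		range="${field##*\[}"    # text after the last "[", e.g. "0-2]"
		range="${range%\]}"      # strip the closing bracket
		lo="${range%-*}"
		hi="${range#*-}"
		i="$lo"
		while [ "$i" -le "$hi" ]; do
			echo "${prefix}[${i}]"
			i=$((i + 1))
		done
		;;
	*)
		echo "$field"
		;;
	esac
}

unroll_indices 'u3.bmx[0-2]'
unroll_indices 'core.mode'
```

A caller could then pick any subset of the printed elements instead of being
limited to a fixed prefix of the array.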

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: report the fuzzing repair strategy in seqres.full

Record in the seqres.full file the filesystem repair strategy that we're
going to use to detect the fuzzed metadata.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: compress coredumps created while fuzzing

Compress the coredumps and put them in the results directory.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: dump metadata state before fuzzing

When we start a fuzz test, dump the metadata to stdout so that anyone
analyzing a failure can see what was in the (supposedly) good image, and
what it turns into after fuzzing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: exercise the filesystem a little harder after repairing

Use fsstress to exercise the filesystem a little more strenuously after
we've run the fuzzing repair strategy, so that we have a better chance
of tripping over corruption problems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: check xfs health after doing an online scrub

After we've run xfs_scrub -n to perform a check of a mounted
filesystem's metadata, we should check the health reporting system to
make sure that the results got recorded. Also wire this up to the xfs
fuzz testing helpers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: evaluate xfs_check vs xfs_repair

When fuzzing a filesystem and using the offline repair strategy, compare
the outputs of xfs_check against xfs_repair to ensure that the newer
xfs_repair catches at least as many things as xfs_check does.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/{35[45],455}: fix bogus corruption errors

The AGFL fuzz tests first fuzz the entire block header, and second
extract flfirst from the AGF header to start a second round of targeted
fuzzing of live bno pointers in the AGFL. However, flfirst (and the
AGFL field detection at the start of the second round of fuzzing) are
detected after we've already been fuzz testing, which means that the
AGFL might be garbage because repair failed or was not called. If this
is the case, the test will fail because the _scratch_xfs_db -c 'agf 0' -c
'p flfirst' call emits things like this:

Fuzz AGFL flfirst
Metadata corruption detected at 0x55f4f789fbc0, xfs_agfl block 0x3/0x200
Metadata corruption detected at 0x55b7356e0bc0, xfs_agfl block 0x3/0x200
Done fuzzing AGFL flfirst

Fix this by restoring the scratch fs before probing flfirst and starting
the second round of fuzzing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: fix some problems with the post-repair fs modification code

While auditing the fuzz tester code, I noticed there were numerous
problems with the code that test-drives the filesystem after we've run
the repair strategy. Now that we've made sure that the repair strategy
checks its own efficacy, we can rearrange this function to try making
mods and then re-check the filesystem afterwards. Also, disable
xfs_repair prefetch to reduce the likelihood of OOM kills.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: fix some problems with the online-then-offline repair strategy

While auditing the fuzz tester code, I noticed there were numerous
problems with the online-then-offline repair strategy -- the stages of
the strategy are not consistently logged to the kernel log, some of the
error messages don't identify /which/ scrubber we're calling, we don't
do a pre-repair check to make sure we detect the fuzzed fields, and we
don't actually re-run online scrub after a repair to make sure that it's
ok. Disable xfs_repair prefetch to reduce the possibility of OOM kills.
Rework the error messages to make reading the golden output easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: fix some problems with the no-repair strategy

While auditing the fuzz tester code, I noticed there were numerous
problems with the no repair strategy -- the stages of the strategy
are not consistently logged to the kernel log, and we don't actually
verify that either online or offline scrubs notice the fuzz. Rework the
error messages to make reading the golden output easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: fix some problems with the offline repair strategy

While auditing the fuzz tester code, I noticed there were numerous
problems with the offline repair strategy -- the stages of the strategy
are not consistently logged to the kernel log, we don't actually check
that repair -n finds the fuzzed field, and since this is an offline
test, we don't need or want to mount or try to run the online scrubber.
Also, disable prefetch to reduce the chance of an OOM kill. Rework the
error messages to make reading the golden output easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: fix some problems with the online repair strategy

While auditing the fuzz tester code, I noticed there were numerous
problems with the online repair strategy -- the stages of the strategy
are not consistently logged to the kernel log, some of the error
messages don't identify /which/ scrubber we're calling, and we don't
actually re-run online scrub after a repair to make sure that its
verification is ok. Disable xfs_repair prefetch to reduce the chances
of an OOM kill, and abort the fuzz test if we can't mount. We also
reorganize the error messages to make reading the golden output easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: hoist the post-repair fs modification step

Hoist the code that tries to modify an fs after repairing our fuzz
damage into a separate function, so that we can further simplify the
caller.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: add an underline to the full log between sections

The fuzz scripts use __fuzz_notify in effect to log each step in the
fuzz process. Enhance it to print an "underline" to ease readability a
bit.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fuzzy: split out each repair strategy into a separate helper

Refactor __scratch_xfs_fuzz_field_test to split out each repair strategy
into a separate helper function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't fuzz xattr namespace flags and values

Extended attribute namespace flags are controlled by userspace, and
there is no validation imposed on the values. Don't bother fuzzing
either of these things.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't fuzz user-controllable inode flags

Don't fuzz the inode flags that are controlled by userspace and don't
actually have any other effects on the ondisk metadata.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't fuzz inode generation numbers

The inode generation number is a randomly selected 32-bit integer that
isn't itself validated anywhere. No need to fuzz that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't fuzz obsolete inode fields

We don't really care about inode fields that were used in V4 (deprecated)
or DMAPI (unsupported), so don't bother fuzzing them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't fuzz the log sequence number

Don't bother fuzzing log sequence numbers since xfs_db doesn't have
the ability to tell us the range of LSNs that would actually cause
validation failures.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: disable timestamp fuzzing by default

Don't fuzz timestamps since all bit patterns are valid and XFS itself
does not perform any validation on them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: disable per-field random fuzzing by default

Don't run the random fuzzer by default so that we can try to stabilize
the output somewhat.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

populate: fix some weirdness in __populate_check_xfs_agbtree_height

Use a for loop to scan the AGs, and make all the variables local like
you'd expect them to be.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

populate: take a snapshot of the filesystem if creation fails

There have been a few bug reports filed about people not being able to
use the filesystem metadata population code to create filesystems with
all types of metadata on them. Right now this is super-annoying to
debug because we don't capture a metadump for easy debugging. Fix that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/422: don't freeze while racing rmap repair and fsstress

Since we're moving away from freezing the filesystem for rmap repair,
remove the freeze/thaw race from this test to make it more interesting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair for summary counters

Create tests that race fs summary counter repair with fsstress running in
the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test fs summary counter online repair

Fuzz the fs summary counters in the primary super and see if online
repair can fix them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with inode link count check and repair

Race fsstress with inode link count checking and repair.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online scrub and repair for quotacheck

Create tests that race quota count check and repair with fsstress running
in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online scrub and repair for quota metadata

Create tests that race dquot repair with fsstress running in the
background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair for special file metadata

For each XFS_SCRUB_TYPE_* that looks at symbolic link and special file
metadata, create a test that runs that repairer in the foreground and
fsstress in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: ensure that online file data fork repairs don't hit EDQUOT

Add a test to ensure that the sysadmin doesn't get EDQUOT if they try to
repair file data fork metadata when we've already exceeded a quota limit
somewhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair for inode and fork metadata

For each XFS_SCRUB_TYPE_* that looks at inode and data/attr/cow fork
metadata, create a test that runs that repairer in the foreground and
fsstress in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test rebuilding xattrs when the data fork is btree format

Make sure we handle the case of rebuilding extended attributes properly
when the data fork is in btree format and we therefore cannot zap the
attr fork.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: race fsstress with online repair for inode record metadata

Create a test that runs the inode record repairer in the foreground and
fsstress in the background.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: stress test ag repair functions

Race fsstress and various AG repair functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test rebuilding the entire filesystem with online fsck

Add a new knob, TEST_XFS_SCRUB_REBUILD, that makes it so that we use
xfs_scrub to rebuild the ondisk metadata after every test.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: use FORCE_REBUILD over injecting force_repair

For stress testing online repair, try to use the FORCE_REBUILD ioctl
flag over the error injection knobs whenever possible because the knobs
are very noisy and are not always available.

[zlang: do not export the SCRUBSTRESS_USE_FORCE_REBUILD]

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/604: fix test to actually create dirty inodes

The test case generic/604 aims to test a scenario where at unmount time we
have many dirty inodes. However, the test does not actually create any
files because it calls xfs_io without the -f argument, so xfs_io fails, and
the error is ignored because stderr is redirected to /dev/null.

Fix this by passing -f to xfs_io and also stop redirecting stderr to
/dev/null, so that in case of any unexpected failure creating files, the
test fails.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: btrfs/249: add _wants_kernel_commit and _fixed_by_git_commit

Add the _wants_kernel_commit tag for the kernel and the
_fixed_by_git_commit tag for btrfs-progs, for the benefit of testing on
older kernels and older btrfs-progs.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: btrfs/185, 198 and 219 add _fixed_by_kernel_commit

Recently, these test cases were added to the auto group. To ensure we have
some clues if they fail in older kernels, add "_fixed_by_kernel_commit"
for the fix and update the test summary.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: test block group size class loading logic

Add a new test which checks that the size classes of freshly loaded block
groups after a mount cycle match the size classes from before the unmount.

Depends on the kernel patch:
btrfs: add size class stats to sysfs

Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/011: use $_btrfs_profile_configs to limit the tests

Generally, the tester needs BTRFS_PROFILE_CONFIGS to test certain
profiles. For example, skip raid56 as it's not supported.

For dup profile, add dup to default profile configs.

Signed-off-by: An Long <lan@suse.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: add 080 to the auto and quick groups

xfs/080 is not dangerous, isn't a known fail and runs very quickly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add 251 to the auto group

generic/251 isn't dangerous, doesn't take overly long to run and doesn't
produce spurious failures, so add it to the auto group.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add 125 to the auto group

This test is not dangerous and passes reliably. Add it to the auto
group.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add 042 to the auto and quick groups

generic/042 was removed from the auto group in 2015 by commit
7721b8501608 ("generic/042: remove from the 'auto' group") because it
always failed on XFS and wasn't run for other file systems back then.
Since then XFS fixed the problem it reproduces, and ext4 and f2fs
have grown shutdown support and also pass it reliably.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add 185 to the auto and quick groups

btrfs/185 runs in a second, so add it to the auto and quick groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add 125 to the auto and quick groups

btrfs/125 runs in 5 seconds on my VM setup, and found a regression in a
recent series. Add it to the auto and quick groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add 198 to the auto group

The quick group should be a strict subset of the auto group, so add these
two tests that are in the quick group to the auto group as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/xfs: use whole-word matching for _require_xfsrestore_xflag

On my system, the path to the xfsrestore binary is:

/code/xfsdump/build-x86_64/restore/xfsrestore

The grep command in _require_xfsrestore_xflag matches on the "build-x86"
part, even though my version of xfsrestore does not actually have a -x
flag. Fix the string parsing to match entire words so that we only look
for -x in the help output.

(Maybe someone should patch xfsrestore -h to report basename(argv[0])
instead of argv[0]...)
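
A small sketch of the difference, assuming the check boils down to grepping
the help text for "-x". The help text and both function names are
hypothetical placeholders, not real xfsrestore output or the actual fstests
helper; only the path is taken from the example above.

```shell
#!/bin/bash
# Hypothetical help output; the usage line echoes argv[0], as described above.
# The flags listed here are placeholders, not real xfsrestore output.
help_text='Usage: /code/xfsdump/build-x86_64/restore/xfsrestore [ -a ] [ -b ]'

# Substring match: "-x" also matches inside "build-x86_64".
has_flag_substring() {
    printf '%s\n' "$1" | grep -q -e "$2"
}

# Whole-word match: -w rejects a match that has a word character (letter,
# digit or underscore) directly before or after it, so the "-x" inside
# "build-x86_64" no longer counts.
has_flag_word() {
    printf '%s\n' "$1" | grep -qw -e "$2"
}

if has_flag_substring "$help_text" "-x"; then
    substring_result=found
else
    substring_result=missing
fi

if has_flag_word "$help_text" "-x"; then
    word_result=found
else
    word_result=missing
fi

echo "substring: $substring_result, whole-word: $word_result"
```

The -e option keeps grep from treating the pattern itself as an option.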

Fixes: 1ffa16c573 ("xfs: test for fixing wrong root inode number in dump")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: Chown mount even if already idmapped to account for remounts

This is a logical consequence of introducing the chown check in _idmapped_mount,
since now a read-only mount can be made idmapped successfully. But if the mount
is then remounted rw the chown never happens, as _idmapped_mount sees that it's
already idmapped and bows out early.

This patch fixes that by simply moving the chown ahead of the idmapped check,
so it will be performed in any case, even on already idmapped mounts.
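
A rough sketch of the resulting control flow, under the assumption that the
read-only check from the earlier patch still guards the chown. The function
names and the MOUNT_STATE/IDMAPPED variables are hypothetical stand-ins, and
the sketch only reports which steps would run instead of touching a mount.

```shell
#!/bin/bash
# Hypothetical stand-ins: a real helper would inspect the actual mount.
mount_is_readonly() { [ "$MOUNT_STATE" = ro ]; }
mount_is_idmapped() { [ "$IDMAPPED" = yes ]; }

# Sketch of the ordering: chown first (unless the mount is read-only), and
# only then bail out early if the mount is already idmapped.
idmapped_mount_sketch() {
    local actions=""

    if ! mount_is_readonly; then
        actions="chown"
    fi
    if mount_is_idmapped; then
        echo "${actions:-nothing}"
        return 0
    fi
    echo "${actions:+$actions }idmap"
}

# Read-only mount, not yet idmapped: no chown, idmapped mount is created.
first="$(MOUNT_STATE=ro IDMAPPED=no idmapped_mount_sketch)"
# Same mount after a rw remount, already idmapped: the chown now happens.
second="$(MOUNT_STATE=rw IDMAPPED=yes idmapped_mount_sketch)"
echo "first=$first second=$second"
```

With the old ordering, the second call would have returned before the chown.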

Signed-off-by: Gabriel Niebler <gniebler@suse.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: Do not chown ro mountpoint when creating idmapped mount

The function _idmapped_mount tries to change the ownership of the mountpoint
for which it aims to create an idmapped mount, to ensure that the mapped UID
and GID can actually create objects within it. Some tests set up a read-only
mount, however, which makes the chown call fail. This patch fixes the
function to check whether the mount is read-only and skip the chown, if so.

Signed-off-by: Gabriel Niebler <gniebler@suse.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a stress test for send v2 streams

Currently we don't have any test case in fstests to do randomized and
stress testing of the send stream v2, added in kernel 6.0 and support for
it in btrfs-progs v5.19. For the send v2 stream, we only have btrfs/281
that exercises a specific scenario which used to trigger a bug.

So add a test that uses fsstress to generate a filesystem and exercise
both full and incremental send operations using the v2 send stream with
compressed extents, and then receive the streams without and with
decompression, to verify they work and produce the same results as in
the original filesystem. This is the same base idea as btrfs/007, but
for the send v2 stream with compressed data.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/676: Unstable d_type handling for NFS READDIR

The NFS client may send READDIR or READDIRPLUS to populate the dentry
cache, and switch between them to minimize the number of RPC calls based
on the process's behavior. When using READDIR, dentries will have d_type =
DT_UNKNOWN, but with READDIRPLUS d_type will be set from the mode.

This heuristic will cause generic/676 to fail when comparing dentries
cached from one or the other call, since we compare d_type directly. Fix
this by bypassing the comparison of d_type if any entry is loaded with
DT_UNKNOWN.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>