git.apps.os.sepia.ceph.com Git - xfstests-dev.git/log

misc: xfs_fsop_geom_t -> struct xfs_fsop_geom

Stop using the typedef for the xfs geometry structure; the typedef will
be removed in a future patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: fix filtering of scratch device in test case 048

The recent commit 4529b20e1aa8f9 ("btrfs/048: amend property validation
cases") does not properly filter the scratch device because the error
messages are sent to stderr and not to stdout, and the pipe filter only
gets input from the stdout of the btrfs utility. We need to redirect the
stderr of the btrfs utility to its stdout.

Further, the golden output had the path "/mnt/scratch" hardcoded, instead
of using SCRATCH_MNT. Fix that as well.

The test was failing on any setup where the scratch device is not mounted
at "/mnt/scratch".
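
As a rough illustration of the redirection fix described above, the
sketch below sends a command's stderr through a path filter.
fake_btrfs_tool and filter_scratch are hypothetical stand-ins, not the
helpers btrfs/048 actually uses, and the mount point is made up:

```bash
# Made-up mount point, deliberately different from /mnt/scratch.
SCRATCH_MNT=/mnt/my-scratch

# Hypothetical stand-in for the btrfs utility: reports errors on stderr.
fake_btrfs_tool() {
    echo "ERROR: failed to set property on $1" >&2
}

# Hypothetical filter: replace the real mount point with a placeholder
# so the golden output does not depend on the local setup.
filter_scratch() {
    sed -e "s#$SCRATCH_MNT#SCRATCH_MNT#g"
}

# Without 2>&1 the error message would bypass the pipe (and the filter).
filtered=$(fake_btrfs_tool "$SCRATCH_MNT/file" 2>&1 | filter_scratch)
echo "$filtered"
```

The placeholder in the filtered text is what lets the same golden output
match regardless of where the scratch device is mounted.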

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: stress send with deduplication and balance running in parallel

Stress send running in parallel with balance and deduplication against
files that belong to the snapshots used by send. The goal is to verify
that these operations running in parallel do not lead to send crashing
(trigger assertion failures and BUG_ONs), or send finding an inconsistent
snapshot that leads to a failure (reported in dmesg/syslog). The test
needs big trees (snapshots) with large differences between the parent and
send snapshots in order to hit such issues with a good probability.

This currently fails on btrfs, often hitting a BUG_ON(), and with btrfs
error messages in dmesg/syslog. The problem is not new, but it probably
went unnoticed due to a lack of test cases that exercise these btrfs
features running in parallel.

The following patches for btrfs fix the problems:

"Btrfs: fix race between send and deduplication that lead to failures and
crashes"

"Btrfs: prevent send failures and crashes due to concurrent relocation"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

check: filter lockdep bugs when scanning dmesg

Ignore lockdep complaining about its own bugginess when scanning dmesg
output, because we shouldn't be failing filesystem tests on account of
lockdep.
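
A minimal sketch of this kind of filtering, assuming lockdep's
self-reports look like "BUG: MAX_LOCKDEP_... too low!". The function name
and the exact patterns are assumptions for illustration; the patterns
the patch really filters are not shown in this log:

```bash
# Return 0 if the log file in $1 contains a BUG/WARNING line that is not
# one of lockdep's complaints about its own limits, 1 otherwise.
dmesg_has_real_bug() {
    grep -v -e 'BUG: MAX_LOCKDEP_' "$1" | grep -q -e 'BUG:' -e 'WARNING:'
}

# Example run against a fake dmesg capture.
log=$(mktemp)
echo 'BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!' > "$log"
if dmesg_has_real_bug "$log"; then
    echo "test would fail"
else
    echo "only lockdep noise, test not failed"
fi
rm -f "$log"
```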

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

check: wipe scratch devices between tests

Wipe the scratch devices in between each test to ensure that tests are
formatting them and not making assumptions about previous contents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

check: remove require_{test,scratch}* after a test fails

Remove the require_{test,scratch}* sentinel files after a test fails.
This eliminates false fsck corruption reports such as the following:

1. Test A calls _require_scratch, which creates the sentinel file
$RESULT_DIR/require_scratch to facilitate fsck after the test completes.

2. Test A runs its workload, which corrupts the scratch filesystem
because of a kernel bug or similar.

3. Test A calls _fail because of the errors in (2). Note that the test
case returned 1, so ./check unmounts the test and scratch filesystems
without checking them or removing $RESULT_DIR/require_scratch.

4. Test B starts up, but does not call _require_scratch. The
$RESULT_DIR/require_scratch file is still there.

5. Test B completes successfully.

6. ./check calls _check_filesystems, which sees the
$RESULT_DIR/require_scratch file and runs fsck.

7. fsck reports the corrupt scratch device (which is associated with
test B) even though B did not ever touch the scratch device and it was
actually test A that corrupted the filesystem.

Note that with the "check: wipe scratch devices between tests" patch
applied, we can also reproduce this problem by running xfs/172 and
xfs/195 with a scratch device small enough that the files created in 172
span multiple AGs and therefore cause 172 to fail.
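
The sequence in steps 1-7 can be modelled with a toy harness. This is
only a sketch: require_scratch and after_test are simplified,
hypothetical stand-ins for what _require_scratch and ./check do, and
RESULT_DIR is a temporary directory:

```bash
RESULT_DIR=$(mktemp -d)

# Stand-in for _require_scratch: leave a sentinel so the harness knows
# to fsck the scratch device once the test is over.
require_scratch() {
    touch "$RESULT_DIR/require_scratch"
}

# Hypothetical post-test hook; $1 is the exit status of the test.
after_test() {
    if [ "$1" -ne 0 ]; then
        # The fix: a failed test removes its sentinels so they cannot
        # leak into the next test.
        rm -f "$RESULT_DIR"/require_test* "$RESULT_DIR"/require_scratch*
        return 0
    fi
    if [ -e "$RESULT_DIR/require_scratch" ]; then
        echo "fsck scratch device"
        rm -f "$RESULT_DIR/require_scratch"
    fi
}

# Test A requires scratch, then fails.
require_scratch
a_out=$(after_test 1)

# Test B never touches scratch and passes; no fsck is triggered for it.
b_out=$(after_test 0)

echo "after A: '$a_out', after B: '$b_out'"
rm -rf "$RESULT_DIR"
```

Without the rm in the failure branch, the sentinel from test A would
still exist when test B finishes, and B would be blamed for A's damage.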

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: test send on subvolume with delalloc after setting it to RO mode

Test that if we have a writable subvolume/snapshot with a file that has
unflushed delalloc (buffered writes not yet flushed), then turn the
subvolume to readonly mode and use it for a send operation, the send
stream will contain the delalloc data - that is, no data loss happens.

This currently fails on btrfs (data loss) but is fixed by a patch for
the linux kernel titled:

"Btrfs: send, flush dellaloc in order to avoid data loss"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fssum: add support for checking xattrs

Currently fssum, mostly used for btrfs test cases that test the btrfs send
feature, completely ignores the existence of xattrs. This change teaches
fssum to find xattrs and make them contribute to the checksum of a
filesystem, so that we can catch filesystem bugs involving missing,
corrupt or unexpected xattrs (e.g. an incremental btrfs send that forgets
to create, update or remove xattrs).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: add operation for listing xattrs from files and directories

The previous patches added support for operations to set, get and delete
xattrs on regular files and directories. This patch just adds an operation
to list the xattrs of a file/directory.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: add operation for deleting xattrs from files and directories

The previous patches added support for operations to set and get xattrs on
regular files and directories. This patch just adds one operation to delete
xattrs on files and directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: add operation for reading xattrs from files and directories

The previous patch added support for an operation to set xattrs on regular
files and directories. This patch just adds one operation to read (get)
them.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: add operation for setting xattrs on files and directories

Currently fsstress does not exercise creating, reading or deleting xattrs
on files or directories. This change adds support for setting xattrs on
files and directories, using only the xattr user namespace (the other
namespaces are not general purpose and are used for security, capabilities,
ACLs, etc). This adds a counter for each file entry structure that keeps
track of the number of xattrs set for the file entry, and each new xattr
has a name that includes the counter's value (example: "user.x4").
Values for the xattrs have at most 100 bytes, which is well below the
maximum size supported by all major filesystems.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: allow afsync on directories too

Currently the afsync function can only be performed against regular files.
Allow it to operate on directories too, to increase test coverage and
allow for chances of finding bugs in a filesystem's implementation of
fsync against directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: allow fsync on directories too

Currently the fsync function can only be performed against regular files.
Allow it to operate on directories too, to increase test coverage and
allow for chances of finding bugs in a filesystem's implementation of
fsync against directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

clonerange: test remapping the rainbow

Add some more clone range tests for various "wacky" combinations of file
state that were previously missed. Specifically, we test reflinking into
and out of rainbow ranges (a mix of real, unwritten, hole, delalloc, and
shared extents), and we also test that we can correctly handle
double-inode locking regardless of the order of the inodes or the
filesystem's locking rules.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/504: fix hard coded fd number

Bash can allocate a free file descriptor itself and store its number in
a variable, so remove the hard coded fd number. Also close the opened fd
in cleanup.
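
For illustration, a minimal sketch of the bash feature in question (it
needs bash 4.1 or newer). The variable name test_fd and the temporary
file are assumptions of this example, not taken from generic/504:

```bash
testfile=$(mktemp)

# Let bash pick an unused descriptor and store its number in $test_fd,
# instead of hard-coding something like "exec 9> file".
exec {test_fd}> "$testfile"
echo "written via fd $test_fd" >&"$test_fd"

# Close the descriptor again, as a test's cleanup would.
exec {test_fd}>&-

content=$(cat "$testfile")
echo "$content"
rm -f "$testfile"
```

Because bash chooses the number, the test cannot collide with a
descriptor that the harness or another helper already holds open.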

Signed-off-by: Murphy Zhou <xzhou@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs/048: amend property validation cases

Add more property validation cases, which are fixed by the patches [1].

[1]
btrfs: fix vanished compression property after failed set
btrfs: fix zstd compression parameter

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsx: test copy_file_range() using non-zero length copy

The copy_file_range() test detection code performs a zero-length
copy to determine whether to perform such calls during the test run.
While this detects the common case of syscall availability,
copy_file_range() has a somewhat variable implementation on the
kernel side that can depend on certain per-filesystem features, etc.
In some implementations, a zero length copy can shortcut and return
success before ever invoking per-filesystem functionality and thus
not thoroughly testing the copy mechanism on the current system.
This can cause the test detection code to pass only to run into an
immediate failure on the first copy_file_range() call during the
test.

Tweak test_copy_range() to perform a small single byte copy to avoid
this problem. Also fix a typo bug in the errno check of the clone
range detection logic.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/230: reset grace time before overcome hardlimit

Currently, we call repquota to report the latest quota information
after each test case. But repquota invokes Q_SYNC on an ext4 file
system with old quota, which may be time consuming on a slow or busy
scratch device. If we call repquota between the "overcome softlimit"
and the "overcome hardlimit" cases, the softlimit grace time may
already have expired when repquota returns, leading to a test failure.

We see the following failure when the disk is busy:

   pwrite: Disk quota exceeded
   Touch 3+4
   Touch 5+6
  +touch: cannot touch 'SCRATCH_MNT/file5': Disk quota exceeded
   touch: cannot touch 'SCRATCH_MNT/file6': Disk quota exceeded
   Touch 5
   touch: cannot touch 'SCRATCH_MNT/file5': Disk quota exceeded

This patch resets the grace time before the "overcome hardlimit" case to
avoid this failure.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

src/t_attr_corruption: convert value to little endian order

generic/529 always fails on big-endian machines such as ppc64 or s390x
with:

set posix acl: Operation not supported

The members of struct posix_acl_xattr_entry/header must be in
little-endian byte order, so use the htole*() helpers to ensure that.

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/rc: use _try_scratch_mount for scratch_remount

When _scratch_remount is called for cifs, it always asks for a password
to be entered. This makes generic/306 and generic/452 fail because the
cifs remount fails.

Signed-off-by: Xiaoli Feng <xifeng@redhat.com>
Reviewed-and-tested-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/populate: decrease the step of rm file

We now allocate 2*4096*64/16 (32768) inodes after "Inode btree", but the
step used when removing files is too large to create enough free inodes
in the agi. As a result the freecount is not large enough to make
free_level greater than 1, and calling _scratch_populate on xfs reports
the following failure (for example in xfs/083):

Failed to create fino of sufficient height!

By decreasing the step of rm file, xfs/083 will pass.

Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: Test that SEEK_HOLE can find a punched hole

Added a test case to seek_sanity_test and a test to run it.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: Add more sanity to seek_sanity_test

seek_sanity_test checks for one of several SEEK_DATA/HOLE
behaviors and allows for the default behavior of filesystems,
where SEEK_HOLE always returns EOF.

This means that if a filesystem has a regression in finding
holes, the sanity test won't catch it. And indeed this regression
happened in overlayfs on kernel v4.19 and went unnoticed.

To improve test coverage, add a flag -f to seek_sanity_test to
indicate that the default behavior is not acceptable.
Whitelist all filesystem types that are expected to detect holes
and use a wrapper when invoking seek_sanity_test to add the -f flag
for those filesystems.

Overlayfs inherits expected behavior from base fs type.
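
A sketch of what such a wrapper might look like. The function name and
the list of filesystem types are illustrative assumptions (a small
subset chosen for the example, not the whitelist from the patch), and
the overlayfs lookup of the base fs type is left out:

```bash
# Print "-f" for filesystem types that are expected to find holes,
# nothing for types where the default SEEK_HOLE behavior is acceptable.
seek_sanity_flags() {
    case "$1" in
    btrfs|ext4|xfs) echo "-f" ;;   # illustrative subset only
    *) echo "" ;;
    esac
}

# A caller would then run something along the lines of:
#   src/seek_sanity_test $(seek_sanity_flags "$FSTYP") "$testfile"
echo "xfs flags: '$(seek_sanity_flags xfs)'"
echo "somefs flags: '$(seek_sanity_flags somefs)'"
```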

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: try to use forget to unregister device

Some btrfs test cases use btrfs module-reload to unregister devices
in the btrfs kernel. The problem with the module-reload approach is
that if the test system uses btrfs as its rootfs, you can't run these
test cases.

Patches [1] introduced the btrfs forget feature, which can unregister
devices without the module-reload approach.

[1]
btrfs-progs: device scan: add new option to forget one or all scanned devices
btrfs: introduce new ioctl to unregister a btrfs device

And this patch makes relevant changes in the fstests to use this new
feature, when available.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: btrfs verify hardening against duplicate fsid

We have a known bug in btrfs: we let the device path be changed
after the device has been mounted. Through this loophole, a newly
copied device appears as if it were mounted immediately after it has
been copied. This test case reproduces the issue.

For example:

Initially.. /dev/mmcblk0p4 is mounted as /

lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
mmcblk0     179:0    0 29.2G  0 disk
|-mmcblk0p4 179:4    0    4G  0 part /
|-mmcblk0p2 179:2    0  500M  0 part /boot
|-mmcblk0p3 179:3    0  256M  0 part [SWAP]
`-mmcblk0p1 179:1    0  256M  0 part /boot/efi

btrfs fi show
Label: none  uuid: 07892354-ddaa-4443-90ea-f76a06accaba
    Total devices 1 FS bytes used 1.40GiB
    devid    1 size 4.00GiB used 3.00GiB path /dev/mmcblk0p4

Copy mmcblk0 to sda
dd if=/dev/mmcblk0 of=/dev/sda

Immediately after the copy completes, the change in the device
superblock is reported, the automounter scans the device using
btrfs device scan, and the new device sda becomes the mounted root
device.

lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    1 14.9G  0 disk
|-sda4        8:4    1    4G  0 part /
|-sda2        8:2    1  500M  0 part
|-sda3        8:3    1  256M  0 part
`-sda1        8:1    1  256M  0 part
mmcblk0     179:0    0 29.2G  0 disk
|-mmcblk0p4 179:4    0    4G  0 part
|-mmcblk0p2 179:2    0  500M  0 part /boot
|-mmcblk0p3 179:3    0  256M  0 part [SWAP]
`-mmcblk0p1 179:1    0  256M  0 part /boot/efi
btrfs fi show /
Label: none  uuid: 07892354-ddaa-4443-90ea-f76a06accaba
    Total devices 1 FS bytes used 1.40GiB
    devid    1 size 4.00GiB used 3.00GiB path /dev/sda4

The bug is nasty enough that you can unmount neither /dev/sda4 nor
/dev/mmcblk0p4. The problem is not solved until you move sda to
another system and change its fsid there using the 'btrfstune -u'
command.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs/003: enable test with virtio_blk devices in VM

For a long time this test has been failing on all kinds of VM
configurations that use virtio_blk devices. This is because scsi
devices are deletable and virtio_blk devices are not. However, this
only prevents the device replace case from running and has no
negative effect on the other useful test cases.

Re-enable btrfs/003 by making _require_deletable_scratch_dev_pool
private to the test case and modifying it to return success (0) if
the devices are deletable or failure (1) if they are not. Further
modify the replace test case to check the return value of this
function and skip the case if the devices are not deletable.
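
The shape of that change can be sketched as follows. This is an
illustration only: the function names and the SYSFS_ROOT override are
hypothetical, and the check assumes that a deletable (scsi) device
exposes a "delete" file under its sysfs device directory:

```bash
# SYSFS_ROOT is a hypothetical override so the sketch can run against a
# fake directory tree; on a real system it would be /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Return 0 if every device passed in exposes a "delete" control file,
# 1 otherwise (instead of aborting the whole test with "not run").
devices_are_deletable() {
    for dev in "$@"; do
        name=$(basename "$dev")
        [ -e "$SYSFS_ROOT/class/block/$name/device/delete" ] || return 1
    done
    return 0
}

# Caller: skip only the replace case when devices cannot be deleted.
maybe_run_replace_case() {
    if devices_are_deletable "$@"; then
        echo "running replace case"
    else
        echo "skipping replace case"
    fi
}
```

Returning a status instead of bailing out is what lets the remaining
cases in the test still run on virtio_blk setups.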

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: unaligned direct AIO write test

A simple reproducer from Frank Sorenson:

  ftruncate(fd, 65012224);
  io_prep_pwrite(iocbs[0], fd, buf[0], 1048576, 63963648);
  io_prep_pwrite(iocbs[1], fd, buf[1], 1048576, 65012224);

  io_submit(io_ctx, 1, &iocbs[0]);
  io_submit(io_ctx, 1, &iocbs[1]);

  io_getevents(io_ctx, 2, 2, events, NULL);

helps to find an ext4 corruption:
           **************** **************** ****************
           *    page 1    * *    page 2    * *    page 3    *
           **************** **************** ****************
  existing 0000000000000000 0000000000000000 0000000000000000
  write 1    AAAAAAAAAAAAAA AA
  write 2                     BBBBBBBBBBBBBB BB

  result   00AAAAAAAAAAAAAA 00BBBBBBBBBBBBBB BB00000000000000
  desired  00AAAAAAAAAAAAAA AABBBBBBBBBBBBBB BB00000000000000

This issue reminds us that we may have been missing unaligned AIO tests
for a long time. We thought fsx covered this part, but it looks like it
does not. So this case tries to cover unaligned direct AIO writes on
files with different initial truncated i_size.
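
The offsets in the reproducer can be checked with a little shell
arithmetic. The 4096-byte block size used below is an assumption made
for this illustration; the numbers themselves come from the reproducer:

```bash
isize=65012224      # ftruncate() size
len=1048576         # length of each write
off1=63963648       # offset of write 1
off2=65012224       # offset of write 2
blk=4096            # assumed block/page size

end1=$((off1 + len))
rem1=$((off1 % blk))
rem2=$((off2 % blk))

echo "write 1 ends at $end1 (i_size is $isize, write 2 starts at $off2)"
echo "off1 % $blk = $rem1, off2 % $blk = $rem2"
```

So write 1 ends exactly at the truncated i_size where write 2 begins,
and both offsets sit 512 bytes into a 4096-byte block, which is why the
two writes share a block as shown in the diagram.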

The following patches fix the issue on xfs and ext4.

xfs: serialize unaligned dio writes against all other dio writes
ext4: Fix data corruption caused by unaligned direct AIO

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsx: fix errors due to unsupported FIDEDUPERANGE

Older kernels (prior to commit 494633fac7896 "vfs: vfs_dedupe_file_range()
doesn't return EOPNOTSUPP") will return EINVAL when operation is not
supported. Make fsx treat this error as a sign of unsupported
deduplication as well to make it usable with these older kernels.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/065: make sure SCRATCH_DEV is unmounted before mkfs

Commit 8309b39a ("fstests: fix broken _require_scratch usage") made
the change below to xfs/065:

  -_scratch_unmount 2>/dev/null
  +_scratch_mkfs_xfs >> $seqres.full

This causes xfs/065 to always fail now, with:
  QA output created by 065
  mkfs.xfs: /dev/sdb2 contains a mounted filesystem
  ...

So use _require_scratch, to make sure the SCRATCH_DEV is unmounted
before mkfs.

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/077: Don't delete $seqres.full file after test

When this test finishes there is no 077.full file with output from
commands. Sometimes this information is useful for post mortem so
stop deleting the file upon test completion.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: prohibit fstrim on journalled filesystems with norecovery

This test makes sure that we can't use stale unrecovered fs metadata to
drive a DISCARD festival on a disk and thereby destroy user data by
accident.

The following patches fixed the bug on ext4, xfs and btrfs
ext4: prohibit fstrim in norecovery mode
xfs: prohibit fstrim in norecovery mode
Btrfs: do not allow trimming when a fs is mounted with the nologreplay option

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: Verify that removed device has its superblocks deleted

When a device is removed from a btrfs filesystem its superblock copies
must be deleted. This test ensures this is indeed the case.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: test stale data exposure after writeback crash

XFS has historically had a stale data exposure window if a crash
occurs after a delalloc->physical extent conversion but before
writeback completes to the associated extent. While this should be a
rare occurrence in production environments due to typical writeback
ordering and such, it is not guaranteed in all cases until data
extents are initialized as unwritten (or otherwise zeroed) before
they are written.

Add a test that performs selective writeback ordering to reproduce
stale data exposure after a crash. Note that this test currently
fails on XFS.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: Add missing call to _scratch_dev_pool_put

Every call to _scratch_dev_pool_get must be paired with a call to
_scratch_dev_pool_put, otherwise the SCRATCH_POOL variable will have
fewer devices than it should.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs: test xfs_copy and xfs_mdrestore on the populate images

Make sure that copy, metadump, and mdrestore work on a filesystem with
all known metadata types.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

_require_prjquota: Disable tests only when using realtime fs

$USE_EXTERNAL needs to be set when using external log devices. In such a
setup, tests which have "_require_prjquota
$SCRATCH_DEV" (e.g. generic/383) incorrectly end up being marked as
"not run" since the test "[ "$USE_EXTERNAL" = yes -a ! -z "$_dev" ]"
evaluates to true.

This commit fixes the bug by marking the test as "not run" only when
$USE_EXTERNAL is set and one of $TEST_RTDEV or $SCRATCH_RTDEV is set.
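
The corrected condition could be sketched like this.
prjquota_should_skip is a hypothetical name used only here, and the
function mirrors the description above rather than the exact code of
_require_prjquota:

```bash
# Default the variables so the sketch also works when they are unset.
USE_EXTERNAL=${USE_EXTERNAL:-}
TEST_RTDEV=${TEST_RTDEV:-}
SCRATCH_RTDEV=${SCRATCH_RTDEV:-}

# Return 0 ("not run") only when external devices are in use AND one of
# them is a realtime device; an external log alone no longer skips.
prjquota_should_skip() {
    [ "$USE_EXTERNAL" = yes ] || return 1
    [ -n "$TEST_RTDEV" ] || [ -n "$SCRATCH_RTDEV" ]
}

if prjquota_should_skip; then
    echo "not run: realtime device configured"
else
    echo "run"
fi
```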

Signed-off-by: Chandan Rajendra <chandan@linux.ibm.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: fix broken _require_scratch usage

_require_scratch doesn't actually format the scratch device with
anything, which means that tests are required to format them before
using them. Fix tests that don't do this correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/populate: support multiple cached images

Enhance the populated fs metadump image cache to support multiple
configurations per filesystem so that we reduce the image creation
overhead even further.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/populate: refactor _scratch_populate_cached

Refactor _scratch_populate_cached into smaller helper functions.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

ext4/023: don't require scrub for ext4 populated image creation

Don't require scrub for ext4's populated fs creation test because there
is no general online scrub program for ext*.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/454: stop the test if we run out of space

Certain filesystems (ext4 w/ 1k block size) can run out of space while
running this test because they have very limited xattr storage
capabilities. If we run out of space while setting an attr, don't
bother continuing the test.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/032: fix unwritten extent checks

Fix the unwritten extent detector in this test to ignore post-eof
allocations because those are harmless.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/042: fix stale disk contents check

This test doesn't call fsync or sync to force writeback of the first 60k
of the file, which means that we could end up with a file full of
zeroes or an empty file. Since this is a regression test that looks for
stale disk contents slipping through, change the test to look for the
stale bytes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

check: really improve test list randomization

coreutils provides the shuf(1) utility that randomizes the order of a
list and seeds its random number generator with /dev/urandom. It's a
bit speedier than awk, so use it if available.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

check: improve test list randomization

awk doesn't have a particularly good random number generator -- it seeds
from the Unix epoch time in seconds, which means that the run order
across a bunch of VMs started at exactly the same time is unsettlingly
predictable. Therefore, at least try to seed it with bash's $RANDOM,
which is slightly less predictable.
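
Taken together with the later shuf(1) change, the randomization could be
sketched as below. randomize_list is a hypothetical name and the awk
fallback is an illustration, not the script that check actually uses;
where $RANDOM is unavailable the sketch seeds from the shell's PID,
which is an assumption of this example:

```bash
# Read a list of tests (one per line) on stdin and print it in random
# order. Prefer shuf(1); otherwise tag each line with a random key from
# awk, seeded explicitly rather than from the wall clock, and sort.
randomize_list() {
    if command -v shuf >/dev/null 2>&1; then
        shuf
    else
        awk -v seed="${RANDOM:-$$}" \
            'BEGIN { srand(seed) } { printf "%.9f\t%s\n", rand(), $0 }' |
            sort -n | cut -f2-
    fi
}

printf '%s\n' generic/001 generic/002 xfs/010 btrfs/048 | randomize_list
```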

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/131: Create server.out manually.

When testing generic/131 on nfs, 'server.out' may be created later
than expected. Because the server runs in the background, we should
ensure that 'server.out' exists before we 'cat' it.

So, let's create server.out manually.

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/398: accept EXDEV for rename or link into encrypted dir

Update generic/398 to pass after kernel commit f5e55e777cc9 ("fscrypt:
return -EXDEV for incompatible rename or link into encrypted dir"),
which intentionally changed some error codes from EPERM to EXDEV in
order to allow standard tools like 'mv' to move files into an encrypted
directory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/42[01]: don't disturb unwritten status with md5sum

The way we decide whether an unwritten extent is considered a hole or
data is by checking if the page and/or blocks are marked uptodate,
that is, contain valid data in the page cache.

xfs/420 and xfs/421 try to exercise SEEK_HOLE / SEEK_DATA in the
presence of cowextsize preallocations over holes in the data fork.
The current XFS code never actually uses those for buffer writes,
but a pending patch changes that. For SEEK_HOLE / SEEK_DATA to work
properly in that case we also need to look at the COW fork in their
implementations and thus have to rely on the unwritten extent page
cache probing. But the tests for it ensure we do have valid data in
the pagecache by calling md5sum on the test files, and thus reading
their contents (including the zero-filled holes) in, and thus making
them all valid data.

Fix that by dropping the page cache content after the md5sum calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: test i_mode recovery after power failure

After fsync, a filesystem should guarantee that inode metadata,
including permission info, has been persisted, so that even after a
sudden power cut we recover the i_mode field correctly during mount
and do not lose this metadata.

So add this test case to check whether a generic filesystem can
guarantee that.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: add test for fsync after shrinking truncate and rename

Test that if we truncate a file to reduce its size, rename it and then
fsync it, after a power failure the file has a correct size and name.

This test is motivated by a bug found in btrfs, which is fixed by a
patch for the linux kernel titled:

"Btrfs: fix incorrect file size after shrinking truncate and fsync"

This test currently passes on ext4, xfs, f2fs and patched btrfs.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/097: split user.* xattr tests to new test

Split out most of the user.* tests from 097 and move them to a new
test that only tests user.* xattrs.

This makes it possible to use this test on filesystems that can only
provide user.* xattrs such as CIFS.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

populate: force large finobt creation on xfs

Teach the populate routines to create enough inodes that we end up with
multi-level inode btrees.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: test statx attribute_mask setting

Make sure the filesystem reports attribute_mask for the attributes it
supports.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

ext4/032: SCRATCH_DIR -> SCRATCH_MNT

Use SCRATCH_MNT, not SCRATCH_DIR.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

shared/298: unmount filesystem before examining underlying storage

This test does some weird things with live filesystems -- it seems to be
validating the behavior of fstrim by comparing the filesystem's free
space map to holes in the file image that backs the filesystem.
However, this doesn't account for the fact that some filesystems
maintain in-core preallocations and/or can perturb the free space data
during unmount. This causes sporadic test failures when the two become
out of sync.

Therefore, make sure we unmount the filesystem before we start running
tools against the filesystem image file to eliminate the possibility of
changes to the free space map. This was found by running shared/298 on
xfs with a 1k block size.

cc: enwlinux@gmail.com
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/010: use correct type for finobt corrupting

Use 'type finobt' for corrupting the finobt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/rc: fix get_max_lfs_filesize

Helper functions are supposed to have a leading underscore ('_') in the
function name, but this one doesn't have it. Unfortunately, the calling
test cases (generic/349-351) /do/ have the leading underscore, so now
they're broken.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/{436,445}: check falloc support

The sanity test cases in those tests (i.e. 13..17) are all skipped
on filesystems with no falloc support, but the tests are still
reported as passing.

For example, from 445.full:

File system supports the default behavior.
File system does not support fallocate.
Allocation size: 4096
17. Test file with unwritten extents, data-hole-data inside page
Test skipped as fs doesn't support unwritten extents.

Explicitly check for falloc support before running those tests
so they would be properly reported as skipped.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/482: use thin volume as data device

The dm-log-writes replay mechanism issues discards to provide
zeroing functionality to prevent out-of-order replay issues. These
discards don't always result in zeroing behavior, however, depending
on the underlying physical device. In turn, this causes test
failures on XFS v5 filesystems that enforce metadata log recovery
ordering if the filesystem ends up with stale data from the future
with respect to the active log at a particular recovery point.

To ensure reliable discard zeroing behavior, use a thinly
provisioned volume as the data device instead of using the scratch
device directly. This slows the test down slightly, but provides
reliable functional behavior at a reduced cost from active snapshot
management or forced zeroing.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/dmlogwrites: genericize log writes target device

The dm-log-writes infrastructure is currently implemented to use
SCRATCH_DEV as a hardcoded data device. In preparation to allow use
of specialized devices in certain circumstances, genericize the code
to allow an arbitrary data device. This requires passing the target
device as a parameter to several helper functions from various
tests. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

t_attr_corruption: fix this yet again

Jeff Mahoney pointed out that 'security.evm' actually has an expected
value format, which breaks the test if EVM is enabled. It turns out
that the 'security.evm' setxattr call in the original syzkaller report
was a total red herring, as this bug can be reproduced without it.

Fix the test case to do the minimum amount of work needed to reproduce
the corruption.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: don't oom the box opening tmpfiles

For the t_open_tmpfiles tests, limit ourselves to half of file-max
so that we don't OOM the test machine.

[Eryu: fix comments too to match the new limit]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: show correct offsets and length for copy_file_range

Copy original offsets and length and use them for logging as in
splice_f. Fix grammar mistakes in the comment about them.

Signed-off-by: Rostislav Skudnov <rostislav@tuxera.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/rc: add _get_max_lfs_filesize to return MAX_LFS_FILESIZE

Pick up the common function _get_max_lfs_filesize() to return
MAX_LFS_FILESIZE.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/299: limit max file size

For some filesystems, such as vfat, the maximum supported file size
is 4G. Limit the max file size so that the test can keep running.

Fix it by moving the function get_max_file_size() from generic/485 to
common/rc, and add the max filesize limit to generic/299.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

shared/298: Wire btrfs support in get_free_sectors

Add support for btrfs in shared/298. Achieve this by introducing 2
new awk scripts that parse relevant btrfs structures and print holes.
Additionally, modify the test to create a larger (3GB) filesystem in
the case of btrfs. This is needed so that distinct block groups are used
for data and metadata.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

src/Makefile: Link clock_gettime(2) with -lrt

Compiling t_open_tmpfiles.c failed on older glibc (before glibc v2.17)
because clock_gettime(2) was not linked with -lrt, as below:
--------------------------------------------------------------------
/home/yangxiao/xfstests/src/t_open_tmpfiles.c:36: undefined reference to `clock_gettime'
--------------------------------------------------------------------

According to clock_gettime(2) manpage, we should link clock_gettime(2)
with -lrt on older glibc.

Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

src/t_attr_corruption: fix xattr.h include problems

Apparently newer versions of libattr (which haven't yet been picked
up by Debian or Ubuntu) don't ship xattr.h anymore, because we're
supposed to use the libc version in sys/xattr.h. So do that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: add a seek group

This groups all tests exercising SEEK_DATA / SEEK_HOLE behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/42[01]: remove from the dedup group

No dedup functionality is exercised by these tests.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

src/t_open_tmpfiles: flush log when shutting down filesystem

If the caller of t_open_tmpfiles wants to shut down the filesystem,
be sure to flush the log when we shut down so that log recovery will
have to process all the unlinked temporary files.

This is apparently needed to force ext4 to flush updated inode
blocks through the journal at all.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/500: fix reflink support detection and add new groups

Fix some problems detecting reflink support in the test.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/450: require working falloc command

This test needs to check for working falloc command before using it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

shared: cgroup aware writeback accounting test

A test to perform reads/writes under various cgroups and verify that
I/Os are accounted properly according to cgroup aware writeback.
This is a generic test, but not all commonly used local filesystems
support cgroup aware writeback at the moment (i.e., XFS). Therefore,
this test currently requires ext4 or btrfs for the time being.

The common/cgroup2 file is copied from a separate cgroup related
patch from Shaohua Li that never made it upstream.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: avoid infinite zero byte reading

The copyrange_f and splice_f functions use a while loop to read a
file. That is fine if there is only one fsstress process (and its
children), but if some third-party test process removes the file while
copyrange_f is running, the copy may keep returning 0 and the while
loop never ends, as shown below:

root     47184  xxxxxx S+ ./fsstress -R -d /mnt/scratch -n 10000 -p 20 -v
root     47187  xxxxxx R+ ./fsstress -d /mnt/scratch -n 10000 -p 20 -v
root     47199  xxxxxx R+ ./fsstress -d /mnt/scratch -n 10000 -p 20 -v
root     47314  xxxxxx S+ grep --color=auto fsstress
...
...
copy_file_range(3, [372258], 4, [2658770], 71179, 0) = 0
copy_file_range(3, [372258], 4, [2658770], 71179, 0) = 0
copy_file_range(3, [372258], 4, [2658770], 71179, 0) = 0
copy_file_range(3, [372258], 4, [2658770], 71179, 0) = 0
...
...
lr-x------. 1 root root 64 Jan 28 11:34 /proc/47187/fd/3 -> '/mnt/scratch/p2/f2 (deleted)'
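
The failure mode, and one way out of it, can be sketched as follows.
The names copy_chunk_fn, copy_all(), always_zero() and copy_up_to_4k()
are hypothetical and exist only for this illustration; the real
fsstress loops call copy_file_range(2) or splice(2) directly.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical callback: copies up to len bytes and returns the number
 * of bytes copied, 0 if nothing could be copied, or -1 on error.
 */
typedef ssize_t (*copy_chunk_fn)(size_t len);

/* Stub that behaves like the strace output above: always returns 0. */
ssize_t always_zero(size_t len)
{
	(void)len;
	return 0;
}

/* Stub that makes progress in chunks of at most 4096 bytes. */
ssize_t copy_up_to_4k(size_t len)
{
	return (ssize_t)(len < 4096 ? len : 4096);
}

/* Copy len bytes in a loop, but stop as soon as no progress is made. */
size_t copy_all(copy_chunk_fn copy_chunk, size_t len)
{
	size_t done = 0;

	assert(copy_chunk != NULL);
	while (len > 0) {
		ssize_t ret = copy_chunk(len);

		if (ret < 0)
			break;		/* error: give up */
		if (ret == 0)
			break;		/* no progress: do not spin forever */
		if ((size_t)ret > len)
			ret = (ssize_t)len;
		done += (size_t)ret;
		len -= (size_t)ret;
	}
	return done;
}
```

Without the "ret == 0" check, copy_all(always_zero, 71179) would never
return, which matches the repeated zero-length results in the trace.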

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: check the behavior of programs opening a lot of O_TMPFILE files

Create a test (+ helper program) that opens as many unlinked files as it
possibly can on the scratch filesystem, then closes all the files at
once to stress-test unlinked file cleanup. Add an xfs-specific test to
make sure that the fallback code doesn't bitrot.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

inject: skip tests when knob dir exists but knob doesn't

If the XFS error injection knob directory exists but the knob itself
doesn't, then we know that this kernel doesn't support the knob and
can skip the test.
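
The decision described above can be sketched in C as a three-way probe.
The enum, the function name probe_knob() and the handling of a missing
directory are assumptions for illustration; the test suite itself
implements this check in its shell helpers.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

enum knob_state {
	KNOB_PRESENT,		/* knob file exists */
	KNOB_UNSUPPORTED,	/* directory exists, knob does not: skip */
	KNOB_DIR_MISSING	/* directory itself is absent */
};

/* Hypothetical helper: classify a knob by looking at the filesystem. */
enum knob_state probe_knob(const char *knob_dir, const char *knob)
{
	char path[4096];
	struct stat sb;
	int n;

	assert(knob_dir != NULL && knob != NULL);
	if (stat(knob_dir, &sb) != 0 || !S_ISDIR(sb.st_mode))
		return KNOB_DIR_MISSING;

	n = snprintf(path, sizeof(path), "%s/%s", knob_dir, knob);
	if (n < 0 || (size_t)n >= sizeof(path))
		return KNOB_UNSUPPORTED;	/* path cannot name a real knob */
	if (stat(path, &sb) != 0)
		return KNOB_UNSUPPORTED;
	return KNOB_PRESENT;
}
```

A caller would skip the test on KNOB_UNSUPPORTED and is free to decide
separately what a missing directory means.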

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: posix acl extended attribute memory corruption test

XFS had a use-after-free bug when xfs_xattr_put_listent runs out of
listxattr buffer space while trying to store the name
"system.posix_acl_access" and then corrupts memory by not checking
the seen_enough state and then trying to shove
"trusted.SGI_ACL_FILE" into the buffer as well.

In order to tickle the bug in a user visible way we must have
already put a name in the buffer, so we take advantage of the fact
that "security.evm" sorts before "system.posix_acl_access" to make
sure this happens.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: check for reasonable inode creation time

If statx returns inode creation time (aka btime), check it to make
sure that the filesystem is setting a creation time that's
reasonably close to when it creates a file.
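
What "reasonably close" means can be expressed as a window check. This
is only a sketch: the function name btime_is_reasonable() and the slack
parameter are assumptions, and the tolerance the test really uses is
not stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical check: the creation time must fall between the clock
 * samples taken just before and just after the file was created,
 * allowing slack_sec seconds of tolerance on either side.
 */
int btime_is_reasonable(int64_t btime_sec, int64_t before_sec,
			int64_t after_sec, int64_t slack_sec)
{
	assert(slack_sec >= 0 && before_sec <= after_sec);
	return btime_sec >= before_sec - slack_sec &&
	       btime_sec <= after_sec + slack_sec;
}
```

A btime of 0, which a filesystem that never fills in the field might
report, falls far outside any such window.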

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common: fix _require_btime for lazy filesystems

Filesystems are not required to try to fill the statx btime field
unless the caller actually sets STATX_BTIME. They're allowed to
volunteer that information "if it's cheap", but XFS doesn't
volunteer and there may be filesystems that support btime but not
cheaply.

Either way, we want to test btime on any filesystem that supports
it, cheaply or otherwise, so set STATX_BTIME when we're trying to
detect support for it.

[Eryu: fix _require_scratch_btime too]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common: fix kmemleak to work with sections

Refactor the kmemleak code to work correctly with sections. This
requires changing the location of the "is kmemleak enabled?" flag to
use /tmp instead of RESULT_BASE, scanning for leaks after every
test, and clarifying which functions get used when.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs: test mkfs extent size hint validation

Make sure mkfs won't format filesystems that fail extent size hint
validation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic/075,112: detect preallocation support for fsx tests

Currently generic/075 and generic/112 have two extra fsx passes each
that exercise fsx with preallocation, which are only enabled for
XFS.

These tests can also be run with other file systems, given that the
XFS prealloc ioctls are implemented in generic code since the
addition of the fallocate system call. This also means a version of
XFS that does not support preallocation (e.g. because it always
writes out of place) can skip the prealloc tests while still
completing the normal fsx tests just fine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: test for corruption when reading compressed files

Regression test for read corruption of compressed and shared extents
after punching holes into a file. The same extent is shared by the
same file in consecutive ranges (without other extents in between).

This is motivated by a bug recently found in btrfs for which there
is a patch for the linux kernel titled:

"Btrfs: fix corruption reading shared and compressed extents after hole
punching"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: test fsync after succession of renames and unlink

Test that after a combination of file renames, linking and creating
a new file with the old name of a renamed file, if we fsync the new
file, after a power failure we are able to mount the filesystem and
all file names correspond to the correct inodes.

This test is motivated by a bug found in btrfs, which is fixed by
applying the following two patches to the linux kernel:

"[PATCH 1/2] Btrfs: fix fsync after succession of renames of different files"
"[PATCH 2/2] Btrfs: fix fsync after succession of renames and unlink/rmdir"

The test passes on ext4, xfs and patched btrfs, however at least in
a 5.0-rc5 linux kernel, it fails on f2fs.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

generic: test fsync after succession of file renames

Test that after a combination of file renames, linking and creating
a new file with the old name of a renamed file, if we fsync the new
file, after a power failure we are able to mount the filesystem and
all file names correspond to the correct inodes.

This test is motivated by a bug found in btrfs which is fixed by a
patch for the linux kernel titled:

"Btrfs: fix fsync after succession of renames of different files"

The test passes on ext4, xfs and patched btrfs, however at least in
a 5.0-rc5 linux kernel, it fails on f2fs.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

src/t_mtab: Add error check for unlock_mtab()

When unlink() fails, that is, when the lock file is not deleted
successfully, variable we_created_lockfile is still set to 0.

On the next iteration, the 3 processes will not be able to
successfully create the lock file.
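
One way to add such a check is sketched below. The names unlock_mtab()
and we_created_lockfile come from this message, but the signature, the
return values and the choice to keep the flag set on failure are
assumptions; the actual patch may handle the error differently.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Set by the (not shown) locking code once the lock file is created. */
int we_created_lockfile;

/* Sketch: returns 0 on success, -1 if the lock file cannot be removed. */
int unlock_mtab(const char *lockfile)
{
	assert(lockfile != NULL);
	if (!we_created_lockfile)
		return 0;
	if (unlink(lockfile) != 0) {
		perror("unlink");
		return -1;	/* lock file is still there: report it */
	}
	we_created_lockfile = 0;
	return 0;
}
```

With the error reported, a stale lock file is no longer silently left
behind for the next iteration.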

Signed-off-by: Cui Yue <cuiyue-fnst@cn.fujitsu.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/068: Verify actual file count instead of reported file count

This test has the number of files/dirs created by xfsrestore hardcoded
in golden output.

When new ops are added to fsstress, the number of files/dirs created
with the same random seed changes, and this regularly breaks this test.
So whenever new fsstress ops are added, they either have to be added to
the dump test blacklist or the golden output of this test has to be
amended to reflect the change.

The golden output includes only the file count reported by xfsrestore
and the test does not even verify that this is the correct file count.
Instead, leave the golden output neutral and explicitly verify that
file count before and after the test are the same.

With this change, the test becomes agnostic to fsstress ops and we
could also stop blacklisting clone/dedup/copy ops if we want.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fstests: Check that high-offset reads and writes work on non-blockdev fs

This is a variant of test generic/466 for filesystems that
do not support mkfs_sized.

It is needed for testing high-offset reads and writes with overlayfs
over a basefs that supports huge files.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/252: requires fallocate support for preallocation

xfs/252 has a few feature tests, but misses checking for preallocation
support. Because of that it will fail instead of not being run for
an XFS file system in always COW mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

fsstress: avoid splice_f generating too large sparse file

Thanks to Darrick J. Wong for finding this issue! Currently splice_f
generates the file offset as below:

lr = ((int64_t)random() << 32) + random();
off2 = (off64_t)(lr % maxfsize);

It generates a pseudorandom 64-bit candidate offset for the
destination file where we'll land the splice data, and then caps the
offset at maxfsize (which is 2^63 - 1 on x64), which effectively means
that the data will appear at a very high file offset which creates
large (sparse) files very quickly.

That's not what we want, and some cases like shared/009 will take
forever to run md5sum on lots of huge files.
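
The effect of the modulo can be seen in a small sketch. The helper
names candidate_offset() and bounded_offset() and the 1 MiB slack used
in the usage note are illustrative assumptions, not the fsstress code;
the second helper shows one possible way to keep the destination offset
near the current file size.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a 64-bit candidate into [0, maxfsize), as in the snippet above. */
int64_t candidate_offset(int64_t lr, int64_t maxfsize)
{
	assert(lr >= 0 && maxfsize > 0);
	return lr % maxfsize;
}

/*
 * Hypothetical alternative: keep the offset less than `slack` bytes
 * past the current file size, so the file cannot become hugely sparse.
 */
int64_t bounded_offset(int64_t lr, int64_t cur_size,
		       int64_t slack, int64_t maxfsize)
{
	int64_t limit;

	assert(lr >= 0 && cur_size >= 0 && slack > 0 && maxfsize > 0);
	/* Compare before adding so that cur_size + slack cannot overflow. */
	if (cur_size > maxfsize - slack)
		limit = maxfsize;
	else
		limit = cur_size + slack;
	return lr % limit;
}
```

With maxfsize at 2^63 - 1, almost every candidate passes through the
first helper unchanged, so typical offsets are in the exabyte range. For
a 4096-byte file and a slack of 1 MiB, the second helper always returns
an offset below 4096 + 1048576.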

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

overlay/060: Use falloc to make sure a meta copy file got copied up

Overlayfs might copy up the data of a file on the first write to the
file (and not necessarily upon open). So call falloc on a file opened
with O_RDWR; after that, the data must have been copied up.

[Eryu: add _require_xfs_io_command "falloc" to make sure underlying
fs has fallocate(2) support]

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

common/dump: do not override test cleanup trap

Instead, call _cleanup_dump explicitly from a private _cleanup.
Remove the generic cleanup bits (rm $tmp.*) from _cleanup_dump.

The only xfs/dump test that had anything other than rm $tmp.* in
_cleanup is xfs/287, but that was _scratch_unmount, which is not
needed anyway.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/138: format the scratch device before using it

Format the scratch device before using it, or else xfs_db will fail,
particularly if the previous test left a corrupt fs behind.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

xfs/093: make sure the scratch directory still exists after repair

Make sure that we still have the scratch directory after repairing our
corrupted filesystem, because repair could have nuked it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

overlay/061: enhance mmap ro/rw inconsistencies test

overlay/061 is currently the only overlay test that is expected to
fail on upstream kernel.

It is a flavor of test overlay/016 with mread instead of pread.
The ro/rw inconsistencies related to file read()/write() API were
fixed with stacked file operations in v4.19, but the ro/rw
inconsistencies related to shared mmap read/write remain to be
fixed.

The test currently checks cache coherency between mmap read and file
write(), but this sort of cache coherency is a Linux implementation
detail not a requirement of the API.

Instead of mread vs. pwrite, check consistency of mread vs. mwrite
to shared mmap, which is required by the MAP_SHARED API.

Because we can, perform the test on shared memory that maps files
that are already closed and also check that mwrite after the file is
closed is persistent. This adds test coverage for future overlayfs
writeback code.
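
The mread vs. mwrite check can be sketched in plain C as below. The
function name shared_mmap_consistent(), the 4096-byte length and the
"abcd" pattern are assumptions for illustration; the test itself drives
xfs_io rather than a dedicated program.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_LEN 4096

/* Returns 1 if consistent, 0 if inconsistent, -1 on a setup error. */
int shared_mmap_consistent(const char *path)
{
	char *rmap = MAP_FAILED, *wmap = MAP_FAILED;
	char buf[4];
	int rfd, wfd, result = -1;

	assert(path != NULL);
	/* Create a MAP_LEN byte file of zeroes. */
	wfd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
	if (wfd < 0)
		return -1;
	if (ftruncate(wfd, MAP_LEN) != 0) {
		close(wfd);
		unlink(path);
		return -1;
	}
	close(wfd);

	/* Open read-only first, then read-write, and map both shared. */
	rfd = open(path, O_RDONLY);
	wfd = open(path, O_RDWR);
	if (rfd >= 0)
		rmap = mmap(NULL, MAP_LEN, PROT_READ, MAP_SHARED, rfd, 0);
	if (wfd >= 0)
		wmap = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
			    MAP_SHARED, wfd, 0);
	/* The mappings stay valid after the descriptors are closed. */
	if (rfd >= 0)
		close(rfd);
	if (wfd >= 0)
		close(wfd);
	if (rmap == MAP_FAILED || wmap == MAP_FAILED)
		goto out;

	memcpy(wmap, "abcd", 4);			/* mwrite */
	result = memcmp(rmap, "abcd", 4) == 0;		/* mread */

	/* Check that the write made after close reaches the file. */
	if (result == 1) {
		rfd = msync(wmap, MAP_LEN, MS_SYNC) == 0 ?
		      open(path, O_RDONLY) : -1;
		if (rfd < 0 || read(rfd, buf, 4) != 4 ||
		    memcmp(buf, "abcd", 4) != 0)
			result = 0;
		if (rfd >= 0)
			close(rfd);
	}
out:
	if (rmap != MAP_FAILED)
		munmap(rmap, MAP_LEN);
	if (wmap != MAP_FAILED)
		munmap(wmap, MAP_LEN);
	unlink(path);
	return result;
}
```

On an ordinary local filesystem both mappings refer to the same file,
so the function is expected to return 1. The interesting case is an
overlay mount where the read-only open happens before copy-up.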

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

overlay: Do not lose security.capability xattr over metadata only file copy-up

Extend test 064 to check security.capability xattr is not lost over
copy-up of a metadata only file. This requires mounting overlay with
option metacopy=on and first trigger metadata only copy-up and then
trigger data copy-up.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: Test if btrfs will report false ENOSPC error balancing small metadata chunk

This is a test case for a long existing bug, caused by
over-estimated metadata space_info::bytes_may_use.

There is one proposed patch for btrfs-progs to fix it, titled:
"btrfs-progs: balance: Sync the fs before balancing metadata chunks"

The test case itself is almost the same as btrfs/181, which uses
small files to bump the reserved space to trigger the false alert.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>

btrfs: Test if btrfs will commit too many transactions for balance

Kernel commit 64403612b73a ("btrfs: rework
btrfs_check_space_for_delayed_refs") is introducing a regression for
btrfs balance performance.

Since that commit will cause btrfs to commit too many transactions
for nothing during balance/relocation, it will slow balance
dramatically even if we only need to relocate several megabytes.

This test case will catch the problem by using super block
generation as failure criteria.

When relocating small chunks, we commit 6 transactions for each
block group. The test case should only have 2 block groups, so it
should only commit 12 transactions.

This test case will use 120 as the threshold to detect the failure.

In my test environment, btrfs committed 14 transactions with the
kernel fix and 209 transactions without it.

So the test case should be enough to detect the regression, while still
keeping the runtime short in the failure case.
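
The arithmetic behind the threshold can be written out as a sketch. The
function names are hypothetical, and treating only a count strictly
above the threshold as a failure is an assumption; the numbers are the
ones quoted in this message.

```c
#include <assert.h>

#define COMMITS_PER_BLOCK_GROUP	6	/* per small relocated chunk */
#define BLOCK_GROUPS		2	/* expected in the test fs */
#define COMMIT_THRESHOLD	120	/* failure criterion of the test */

/* Expected number of commits for a healthy balance: 6 * 2 = 12. */
int expected_commits(void)
{
	return COMMITS_PER_BLOCK_GROUP * BLOCK_GROUPS;
}

/*
 * Hypothetical check: the super block generation advances by one per
 * committed transaction, so the difference counts the commits.
 */
int balance_regressed(unsigned long long gen_before,
		      unsigned long long gen_after)
{
	assert(gen_after >= gen_before);
	return gen_after - gen_before > COMMIT_THRESHOLD;
}
```

The threshold is ten times the expected 12 commits, which leaves room
for the 14 commits seen with the fix while still catching the 209
commits seen without it.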

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>