From 5bb196119dff10adbf25f294104371226d307344 Mon Sep 17 00:00:00 2001
From: "Darrick J. Wong"
Date: Tue, 16 Apr 2019 15:34:59 -0700
Subject: [PATCH] check: remove require_{test,scratch}* after a test fails

Remove the require_{test,scratch}* sentinel files after a test fails.
This eliminates false fsck corruption reports such as the following:

1. Test A calls _require_scratch, which creates the sentinel file
   $RESULT_DIR/require_scratch to facilitate fsck after the test
   completes.

2. Test A runs some test, which corrupts the scratch filesystem due to
   a kernel bug or something.

3. Test A calls _fail because of the errors in (2).  Note that the test
   case returned 1, so ./check unmounts the test and scratch filesystems
   without checking them or removing $RESULT_DIR/require_scratch.

4. Test B starts up, but does not call _require_scratch.  The
   $RESULT_DIR/require_scratch file is still there.

5. Test B completes successfully.

6. ./check calls _check_filesystems, which sees the
   $RESULT_DIR/require_scratch file and runs fsck.

7. fsck reports the corrupt scratch device and the failure is attributed
   to test B, even though B never touched the scratch device and it was
   actually test A that corrupted the filesystem.

Note that with the "check: wipe scratch devices between tests" patch
applied, we can also reproduce this problem by running xfs/172 and
xfs/195 with a scratch device small enough that the files created in 172
span multiple AGs and therefore cause 172 to fail.

Signed-off-by: Darrick J. Wong
Reviewed-by: Eryu Guan
Signed-off-by: Eryu Guan
---
 check | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/check b/check
index a2c5ba21..0f141703 100755
--- a/check
+++ b/check
@@ -769,6 +769,8 @@ for section in $HOST_OPTIONS_SECTIONS; do
 			_dump_err_cont "[failed, exit status $sts]"
 			_test_unmount 2> /dev/null
 			_scratch_unmount 2> /dev/null
+			rm -f ${RESULT_DIR}/require_test*
+			rm -f ${RESULT_DIR}/require_scratch*
 			err=true
 		else
 			# the test apparently passed, so check for corruption
-- 
2.39.5