From 5bb196119dff10adbf25f294104371226d307344 Mon Sep 17 00:00:00 2001
From: "Darrick J. Wong" <darrick.wong@oracle.com>
Date: Tue, 16 Apr 2019 15:34:59 -0700
Subject: [PATCH] check: remove require_{test,scratch}* after a test fails

Remove the require_{test,scratch]* sentinel files after a test fails.
This eliminates false fsck corruption reports such as the following:

1. Test A calls _require_scratch, which creates the sentinel file
$RESULT_DIR/require_scratch to facilitate fsck after the test completes.

2. Test A runs some test, which corrupts the scratch filesystem due to
kernel bug or something.

3. Test A calls _fail because of the errors in (2).  Note that the test
case returned 1, so ./check unmounts the test and scratch filesystems
without checking them or removing $RESULT_DIR/require_scratch

4. Test B starts up, but does not call _require_scratch.  The
$RESULT_DIR/require_scratch file is still there.

5. Test B completes successfully.

6. ./check calls _check_filesystems, which sees the
$RESULT_DIR/require_scratch file and runs fsck.

7. fsck reports the corrupt scratch device (which is associated with
test B) even though B did not ever touch the scratch device and it was
actually test A that corrupted the filesystem.

Note that with the "check: wipe scratch devices between tests" patch
applied, we can also reproduce this problem by running xfs/172 and
xfs/195 with a scratch device small enough that the files created in 172
span multiple AGs and therefore cause 172 to fail.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
---
 check | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/check b/check
index a2c5ba21..0f141703 100755
--- a/check
+++ b/check
@@ -769,6 +769,8 @@ for section in $HOST_OPTIONS_SECTIONS; do
 			_dump_err_cont "[failed, exit status $sts]"
 			_test_unmount 2> /dev/null
 			_scratch_unmount 2> /dev/null
+			rm -f ${RESULT_DIR}/require_test*
+			rm -f ${RESULT_DIR}/require_scratch*
 			err=true
 		else
 			# the test apparently passed, so check for corruption
-- 
2.39.5