common/rc: use directio mode for the loop device when possible

Recently, I've been observing very high runtimes of tests that
format a filesystem atop a loop device and write enough data to fill
memory, such as generic/590 and generic/361. Logging into the test
VMs, I noticed that the writes to the file on the upper filesystem
started fast, but soon slowed down to about 500KB/s and stayed that
way for nearly 20 minutes. Looking through the D-state processes on
the system revealed:

Here's the xfs_io process performing a buffered write to the file on the
upper filesystem, which at this point has dirtied enough pages to be
ratelimited.

Here's the loop device worker handling the writeback IO submitted by the
flusher thread. Unfortunately, the loop device is using buffered write
mode, which means that /writeback/ is dirtying pages and being throttled
for that. This is stupid.

Fix this by trying to enable "directio" mode on the loop device, which
delivers two performance benefits: setting directio mode also enables
async io mode, which will allow multiple IOs at once; and using directio
nearly eliminates the chance that writeback will get throttled.
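
As a rough illustration of the approach described above (not the exact
common/rc hunk), a loop device helper could attach the backing file first
and then make a best-effort attempt to turn on directio. The function name
create_loop_device_sketch is hypothetical; the losetup options used
(-f, --show, --direct-io=on) are standard util-linux ones.

```shell
# Illustrative sketch only: the function name is hypothetical and this is
# not the literal change made to common/rc.
create_loop_device_sketch()
{
	local file="$1" dev

	# Attach the backing file to the first free loop device and
	# capture the device path that losetup prints.
	dev=$(losetup -f --show "$file") || return 1

	# Try to switch the device to directio mode. Older kernels, or
	# backing filesystems that cannot do direct IO, may reject this,
	# so ignore the failure and stay in buffered mode.
	if [ -b "$dev" ]; then
		losetup --direct-io=on "$dev" 2>/dev/null || true
	fi

	echo "$dev"
}
```

A caller would use it as dev=$(create_loop_device_sketch /path/to/image),
where the image path is a placeholder. Because the directio step is
optional, the helper behaves the same as before on systems that lack
support; "losetup --list" reports the resulting mode in its DIO column.
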
On the author's system with fast storage, this reduces the runtime of
g/590 from 20 minutes to 12 seconds, and g/361 from ~30s to ~3s.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>