Thanks to Darrick J. Wong for finding this issue! Currently, splice_f
generates the destination file offset as follows:
	lr = ((int64_t)random() << 32) + random();
	off2 = (off64_t)(lr % maxfsize);
This generates a pseudorandom 64-bit candidate offset for the
destination file where the splice data will land, and then reduces it
modulo maxfsize (which is 2^63 - 1 on x64). In practice the data
usually lands at a very high file offset, which creates large (sparse)
files very quickly.

That's not what we want, and some cases such as shared/009 take
forever to run md5sum on lots of huge files. So clamp the offset to at
most 1024 blocks past the current destination file EOF.
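
For illustration only, the clamping arithmetic can be sketched as a
standalone function. The helper name clamp_splice_offset and its
parameters are hypothetical and do not exist in fsstress; size and
blksize stand in for stat2.st_size and stat2.st_blksize, and the sketch
assumes all inputs are non-negative with a non-zero block size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of fsstress: reduce a random candidate
 * offset so it lands no further than 1024 blocks past the current EOF,
 * and never at or beyond maxfsize.
 */
static int64_t clamp_splice_offset(int64_t lr, int64_t size, int64_t blksize,
				   int64_t maxfsize)
{
	uint64_t limit;

	/* Assumption: sane inputs, so the modulus below is never zero. */
	assert(lr >= 0 && size >= 0 && blksize > 0 && maxfsize > 0);

	/* Upper bound: 1024 blocks past EOF, capped at maxfsize. */
	limit = (uint64_t)size + 1024ULL * (uint64_t)blksize;
	if (limit > (uint64_t)maxfsize)
		limit = (uint64_t)maxfsize;

	return (int64_t)((uint64_t)lr % limit);
}
```

With an empty destination file and a 4096-byte block size, the result
is always below 4 MiB (1024 * 4096), whatever the random candidate is.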
Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
/*
* splice can overlap write, so the offset of the target file can be
- * any number (< maxfsize)
+ * any number. But to avoid too large offset, add a clamp of 1024 blocks
+ * past the current dest file EOF
*/
lr = ((int64_t)random() << 32) + random();
- off2 = (off64_t)(lr % maxfsize);
+ off2 = (off64_t)(lr % MIN(stat2.st_size + (1024ULL * stat2.st_blksize), MAXFSIZE));
/*
* Due to len, off1 and off2 will be changed later, so record the