mkfs: adjust_nr_zones for zoned file system on conventional devices
When creating zoned file systems on conventional devices, mkfs doesn't
currently align the RT device size to the zone size, which can create
unmountable file systems. Fix this by moving the rgcount modification
to account for reserved zoned and then calling adjust_nr_zones
unconditionally, and thus ensuring that the rtblocks and rtextents values
are guaranteed to always be a multiple of the zone size.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Tue, 9 Dec 2025 20:57:38 +0000 (12:57 -0800)]
xfs_logprint: fix pointer bug
generic/055 captures a crash in xfs_logprint due to an incorrect
refactoring trying to increment a pointer-to-pointer whereas before it
incremented a pointer.
Fixes: 5a9b7e95140893 ("logprint: factor out a xlog_print_op helper") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
On big-endian architectures (e.g. s390x), restoring a filesystem from a
v2 metadump fails with "Invalid superblock disk address/length". This is
caused by restore_v2() treating a superblock extent length of 1 as an
error, even though a length of 1 is expected because the superblock fits
within a 512-byte sector.
On little-endian systems, the same raw extent length bytes that represent
a value of 1 on big-endian are misinterpreted as 16777216 due to byte
ordering, so the faulty check never triggers there and the bug is hidden.
Fix the issue by using an endian-correct comparison of xme_len so that
the superblock extent length is validated properly and consistently on
all architectures.
Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
repair: add canonical names for the XR_INO_ constants
Add an array with the canonical name for each inode type so that code
doesn't have to implement switch statements for that, and remove the now
trivial process_misc_ino_types and process_misc_ino_types_blocks
functions.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Tue, 9 Dec 2025 16:16:08 +0000 (08:16 -0800)]
mkfs: enable new features by default
Since the LTS is coming up, enable parent pointers and exchange-range by
default for all users. Also fix up an out of date comment.
I created a really stupid benchmarking script that does:
#!/bin/bash
# pptr overhead benchmark
umount /opt /mnt
rmmod xfs
for i in 1 0; do
umount /opt
mkfs.xfs -f /dev/sdb -n parent=$i | grep -i parent=
mount /dev/sdb /opt
mkdir -p /opt/foo
for ((i=0;i<5;i++)); do
time fsstress -n 100000 -p 4 -z -f creat=1 -d /opt/foo -s 1
done
done
This is the result of creating an enormous number of empty files in a
single directory:
# ./dumb.sh
naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
real 0m18.807s
user 0m2.169s
sys 0m54.013s
naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=1
real 0m20.654s
user 0m2.374s
sys 1m4.441s
As you can see, there's a 10% increase in runtime here. If I make the
workload a bit more representative by changing the -f argument to
include a directory tree workout:
libfrog: fix incorrect FS_IOC_FSSETXATTR argument to ioctl()
xfsprogs 6.17.0 has broken project quota due to incorrect argument
passed to FS_IOC_FSSETXATTR ioctl(). Instead of passing struct fsxattr,
struct file_attr was passed.
# LC_ALL=C /usr/sbin/xfs_quota -x -c "project -s -p /home/xxx 389701" /home
Setting up project 389701 (path /home/xxx)...
xfs_quota: cannot set project on /home/xxx: Invalid argument
Processed 1 (/etc/projects and cmdline) paths for project 389701 with
recursion depth infinite (-1).
There seems to be a double mistake which hides the original ioctl()
argument bug on old kernel with xfsprogs built against it. The size of
fa_xflags was also wrong in xfsprogs's linux.h header. This way when
xfsprogs is compiled on newer kernel but used with older kernel this bug
uncovers.
Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
When we are picking a zone for gc it might already be in the pipeline
which can lead to us moving the same data twice resulting in in write
amplification and a very unfortunate case where we keep on garbage
collecting the zone we just filled with migrated data stopping all
forward progress.
Fix this by introducing a count of on-going GC operations on a zone, and
skip any zone with ongoing GC when picking a new victim.
Fixes: 080d01c41 ("xfs: implement zoned garbage collection") Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
For regular block devices using the zoned allocator, the default
maximum number of open zones is set to 1/4 of the number of realtime
groups. For a large capacity device, this leads to a very large limit.
E.g. with a 26 TB HDD:
mount /dev/sdb /mnt
...
XFS (sdb): 95836 zones of 65536 blocks size (23959 max open)
In turn such large limit on the number of open zones can lead, depending
on the workload, on a very large number of concurrent write streams
which devices generally do not handle well, leading to poor performance.
Introduce the default limit XFS_DEFAULT_MAX_OPEN_ZONES, defined as 128
to match the hardware limit of most SMR HDDs available today, and use
this limit to set mp->m_max_open_zones in xfs_calc_open_zones() instead
of calling xfs_max_open_zones(), when the user did not specify a limit
with the max_open_zones mount option.
For the 26 TB HDD example, we now get:
mount /dev/sdb /mnt
...
XFS (sdb): 95836 zones of 65536 blocks (128 max open zones)
This change does not prevent the user from specifying a lareger number
for the open zones limit. E.g.
mount -o max_open_zones=4096 /dev/sdb /mnt
...
XFS (sdb): 95836 zones of 65536 blocks (4096 max open zones)
Finally, since xfs_calc_open_zones() checks and caps the
mp->m_max_open_zones limit against the value calculated by
xfs_max_open_zones() for any type of device, this new default limit does
not increase m_max_open_zones for small capacity devices.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Don't pass expr to XFS_TEST_ERROR. Most calls pass a constant false,
and the places that do pass an expression become cleaner by moving it
out.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
[aalbersh: remove argument from a macro and fix call in defer_item.c] Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
These are purely in-memory values and not used at all in xfsprogs.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
When mounting file systems with a log that was dirtied on i386 on
other architectures or vice versa, log recovery is unhappy:
[ 11.068052] XFS (vdb): Torn write (CRC failure) detected at log block 0x2. Truncating head block from 0xc.
This is because the CRCs generated by i386 and other architectures
always diff. The reason for that is that sizeof(struct xlog_rec_header)
returns different values for i386 vs the rest (324 vs 328), because the
struct is not sizeof(uint64_t) aligned, and i386 has odd struct size
alignment rules.
This issue goes back to commit 13cdc853c519 ("Add log versioning, and new
super block field for the log stripe") in the xfs-import tree, which
adds log v2 support and the h_size field that causes the unaligned size.
At that time it only mattered for the crude debug only log header
checksum, but with commit 0e446be44806 ("xfs: add CRC checks to the log")
it became a real issue for v5 file system, because now there is a proper
CRC, and regular builds actually expect it match.
Fix this by allowing checksums with and without the padding.
Fixes: 0e446be44806 ("xfs: add CRC checks to the log") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Also fix up the comment about the struct xfs_extent definition to be
correct and read more easily.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
There are almost no users of the typedef left, kill it and switch the
remaining users to use the underlying struct.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
These sysctl knobs were scheduled for removal in September 2025. That
time has come, so remove them.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
These four mount options were scheduled for removal in September 2025,
so remove them now.
Cc: preichl@redhat.com Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Darrick J. Wong [Tue, 2 Dec 2025 01:28:00 +0000 (17:28 -0800)]
man2: fix getparents ioctl manpage
Fix a silly typo in the manual page for the GETPARENTS ioctl.
Cc: linux-xfs@vger.kernel.org # v6.10.0 Fixes: a24294c252d4a6 ("man: document the XFS_IOC_GETPARENTS ioctl") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Tue, 2 Dec 2025 01:27:29 +0000 (17:27 -0800)]
libxfs: fix build warnings
gcc 14.2 with all the warnings turn on complains about missing
prototypes for these two functions:
util.c:147:1: error: no previous prototype for 'current_fixed_time' [-Werror=missing-prototypes]
147 | current_fixed_time(
| ^~~~~~~~~~~~~~~~~~
util.c:590:1: error: no previous prototype for 'get_deterministic_seed' [-Werror=missing-prototypes]
590 | get_deterministic_seed(
| ^~~~~~~~~~~~~~~~~~~~~~
Since they're not used outside of util.c, just make them static.
Fixes: 4a54700b4385bb ("libxfs: support reproducible filesystems using deterministic time/seed") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Re-indent, drop typedefs and invert a conditional to allow for an early
return.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[aalbersh: add one column of tabs to arguments and vars definitions] Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Toward the caller. And reindent it while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[aalbersh: add one tab column to arguments to make it align with in_f32] Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
The code has been stubbed out since the initial creation of the
xfsprogs repository. Open code the single-line printf in the
data fork caller (attr forks can't contain directories) and remove
the dead code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Darrick J. Wong [Fri, 21 Nov 2025 16:39:37 +0000 (08:39 -0800)]
xfs_scrub: fix null pointer crash in scrub_render_ino_descr
Starting in Debian 13's libc6, passing a NULL format string to vsnprintf
causes the program to segfault. Prior to this, the null format string
would be ignored. Because @format is optional, let's explicitly steer
around the vsnprintf if there is no format string. Also tidy whitespace
in the comment.
Found by generic/45[34] on Debian 13.
Cc: linux-xfs@vger.kernel.org # v6.10.0 Fixes: 9a8b09762f9a52 ("xfs_scrub: use parent pointers when possible to report file operations") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Carlos Maiolino [Thu, 13 Nov 2025 13:57:11 +0000 (14:57 +0100)]
metadump: catch used extent array overflow
An user reported a SIGSEGV when attempting to create a metadump image of
a filesystem.
The reason is because we fail to catch a possible overflow in the
used extents array in process_exinode() which may happen if the extent
count is corrupted.
This leads process_bmbt_reclist() to attempt to index into the array
using the bogus extent count with:
convert_extent(&rp[numrecs - 1], &o, &s, &c, &f);
Fix this by extending the used counter to uint64_t and
checking for the overflow possibility.
Reported-by: hubert . <hubjin657@outlook.com> Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Carlos Maiolino [Thu, 13 Nov 2025 13:46:13 +0000 (14:46 +0100)]
mkfs: fix zone capacity check for sequential zones
Sequential zones can have a different, smaller capacity than
conventional zones.
Currently mkfs assumes both sequential and conventional zones will have
the same capacity and and set the zone_info to the capacity of the first
found zone and use that value to validate all the remaining zones's
capacity.
Because conventional zones can't have a different capacity than its
size, the first zone always have the largest possible capacity, so, mkfs
will fail to validate any consecutive sequential zone if its capacity is
smaller than the conventional zones.
What we should do instead, is set the zone info capacity accordingly to
the settings of first zone found of the respective type and validate
the capacity based on that instead of assuming all zones will have the
same capacity.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Luca Di Maio [Sat, 8 Nov 2025 14:39:53 +0000 (15:39 +0100)]
libxfs: support reproducible filesystems using deterministic time/seed
Add support for reproducible filesystem creation through two environment
variables that enable deterministic behavior when building XFS filesystems.
SOURCE_DATE_EPOCH support:
When SOURCE_DATE_EPOCH is set, use its value for all filesystem timestamps
instead of the current time. This follows the reproducible builds
specification (https://reproducible-builds.org/specs/source-date-epoch/)
and ensures consistent inode timestamps across builds.
DETERMINISTIC_SEED support:
When DETERMINISTIC_SEED=1 is set, return a fixed seed value (0x53454544 =
"SEED") from get_random_u32() instead of reading from /dev/urandom.
get_random_u32() seems to be used mostly to set inode generation number, being
fixed should not be create collision issues at mkfs time.
The implementation introduces two helper functions to minimize changes
to existing code:
- current_fixed_time(): Parses and caches SOURCE_DATE_EPOCH on first
call. Returns fixed timestamp when set, falls back to gettimeofday() on
parse errors or when unset.
- get_deterministic_seed(): Checks for DETERMINISTIC_SEED=1 environment
variable on first call, and returns a fixed seed value (0x53454544).
Falls back to getrandom() when unset.
- Both helpers use one-time initialization to avoid repeated getenv() calls.
- Both quickly exit and noop if environment is not set or has invalid
variables, falling back to original behaviour.
This enables distributions and build systems to create bit-for-bit
identical XFS filesystems when needed for verification and debugging.
v1 -> v2:
- simplify deterministic seed by returning a fixed value instead
of using Middle Square Weyl Sequence PRNG
- fix timestamp type time_t -> time64_t
- fix timestamp initialization flag to allow negative epochs
- fix timestamp conversion type using strtoll
- fix timestamp conversion check to be sure the whole string was parsed
- print warning message when SOURCE_DATE_EPOCH is invalid
Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Right now 5 places in the kernel and one in xfsprogs need to be updated
for each new error tag. Add a bit of macro magic so that only the
error tag definition and a single table, which reside next to each
other, need to be updated.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>