]> git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-client.git/log
ceph-client.git
7 weeks agokernel: rpm-pkg: Restore find-debuginfo.sh approach to -debuginfo package
Nathan Chancellor [Tue, 10 Feb 2026 07:04:49 +0000 (00:04 -0700)]
kernel: rpm-pkg: Restore find-debuginfo.sh approach to -debuginfo package

Commit 62089b804895 ("kbuild: rpm-pkg: Generate debuginfo package
manually") effectively reverted commit a7c699d090a1 ("kbuild: rpm-pkg:
build a debuginfo RPM") but the approach it took is not safe for older
RPM releases. Restore commit a7c699d090a1 ("kbuild: rpm-pkg: build a
debuginfo RPM") for the !CONFIG_MODULE_SIG case to allow more
environments and configurations to take advantage of the separate debug
information package process.

Cc: stable@vger.kernel.org
Fixes: 62089b804895 ("kbuild: rpm-pkg: Generate debuginfo package manually")
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Steve French <stfrench@microsoft.com>
Tested-by: Juergen Gross <jgross@suse.com>
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260210-kbuild-fix-debuginfo-rpm-v1-2-0730b92b14bc@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
7 weeks agokbuild: rpm-pkg: Restrict manual debug package creation
Nathan Chancellor [Tue, 10 Feb 2026 07:04:48 +0000 (00:04 -0700)]
kbuild: rpm-pkg: Restrict manual debug package creation

Commit 62089b804895 ("kbuild: rpm-pkg: Generate debuginfo package
manually") moved away from the built-in RPM machinery for generating
-debuginfo packages to a more manual way to be compatible with module
signing, as the built-in machinery strips the modules after the
installation process, breaking the signatures.

Unfortunately, prior to rpm 4.20.0, there is a bug where a custom %files
directive is ignored for a -debuginfo subpackage [1], meaning builds
using older versions of RPM (such as on RHEL9 or RHEL10) fail with:

  Checking for unpackaged file(s): /usr/lib/rpm/check-files .../rpmbuild/BUILDROOT/kernel-6.19.0_dirty-1.x86_64
  error: Installed (but unpackaged) file(s) found:
     /debuginfo.list
     /usr/lib/debug/.build-id/09/748c214974bfba1522d434a7e0a02e2fd7f29b.debug
     /usr/lib/debug/.build-id/0b/b96dd9c7d3689d82e56d2e73b46f53103cc6c7.debug
     /usr/lib/debug/.build-id/0e/979a2f34967c7437fd30aabb41de1f0c8b6a66.debug
    ...

To workaround this, restrict the manual debug info package creation
process to when it is necessary (CONFIG_MODULE_SIG=y) and possible (when
using RPM >= 4.20.0). A follow up change will restore the RPM debuginfo
creation process using a separate internal flag to allow the package to
be built in more situations, as RPM 4.20.0 is a fairly recent version
and the built-in -debuginfo generation works fine when module signing is
disabled.

Cc: stable@vger.kernel.org
Fixes: 62089b804895 ("kbuild: rpm-pkg: Generate debuginfo package manually")
Link: https://github.com/rpm-software-management/rpm/commit/49f906998f3cf1f4152162ca61ac0869251c380f
Reported-by: Steve French <smfrench@gmail.com>
Closes: https://lore.kernel.org/CAH2r5mugbrHTwnaQwQiYEUVwbtqmvFYf0WZiLrrJWpgT8iwftw@mail.gmail.com/
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Steve French <stfrench@microsoft.com>
Tested-by: Juergen Gross <jgross@suse.com>
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260210-kbuild-fix-debuginfo-rpm-v1-1-0730b92b14bc@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
7 weeks agoMerge tag 'uml-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml...
Linus Torvalds [Fri, 13 Feb 2026 19:24:38 +0000 (11:24 -0800)]
Merge tag 'uml-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Johannes Berg:
 "UML was _really_ quiet, with just four small commits:

   - two signal handling fixes

   - dynamic addition of virtio devices

   - a single code cleanup"

* tag 'uml-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  arch/um: remove unused varible err in remove_files_and_dir()
  um: virtio_uml: Support adding devices via mconsole
  um: Handle SIGCHLD in seccomp mode like other IRQ signals
  um: Preserve errno within signal handler

7 weeks agoscripts/make_fit.py: Drop explicit LZMA parallel compression
Chen-Yu Tsai [Thu, 12 Feb 2026 07:43:07 +0000 (15:43 +0800)]
scripts/make_fit.py: Drop explicit LZMA parallel compression

Parallel compression for LZMA was added using the plzip tool. However
plzip produces lzip format output, which is different from the raw LZMA
format that the lzma tool produces. This causes depthcharge (the second
stage bootloader on Chromebooks) to fail to load the payload.

Drop the explicit LZMA parallel compression toolchain. If the lzma tool
on the build machine is from xz-utils, then there's a chance parallel
compression is already enabled.

The xz-utils manpage says the following for the -T (threads) argument:

    Specify the number of worker threads to use.  Setting threads to a
    special value 0 makes xz use up to as many threads as the processor(s)
    on the system support.  The actual number of threads can be fewer than
    threads if the input file is not big enough for threading with the
    given settings or if using more threads would exceed the memory usage
    limit.

    [...]

    The default value for threads is 0.  In xz 5.4.x and older the default
    is 1.

Fixes: fcdcf22a34b0 ("scripts/make_fit: Support a few more parallel compressors")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Simon Glass <simon.glass@canonical.com>
Link: https://patch.msgid.link/20260212074308.2189032-1-wenst@chromium.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
7 weeks agokbuild: Fix CC_CAN_LINK detection
Mickaël Salaün [Thu, 12 Feb 2026 13:35:43 +0000 (14:35 +0100)]
kbuild: Fix CC_CAN_LINK detection

Most samples cannot be build on some environments because they depend
on CC_CAN_LINK, which is set according to the result of
scripts/cc-can-link.sh called by cc_can_link_user.

Because cc-can-link.sh must now build without warning, it may fail
because it is calling printf() with an empty string:

  + cat
  + gcc -m32 -Werror -Wl,--fatal-warnings -x c - -o /dev/null
  <stdin>: In function ‘main’:
  <stdin>:4:9: error: zero-length gnu_printf format string [-Werror=format-zero-length]
  cc1: all warnings being treated as errors

Fix this warning and the samples build by actually printing something.

Cc: stable@vger.kernel.org
Fixes: d81d9d389b9b ("kbuild: don't enable CC_CAN_LINK if the dummy program generates warnings")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260212133544.1331437-1-mic@digikod.net
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
7 weeks agoARM: clean up the memset64() C wrapper
Thomas Weißschuh [Fri, 13 Feb 2026 07:39:29 +0000 (08:39 +0100)]
ARM: clean up the memset64() C wrapper

The current logic to split the 64-bit argument into its 32-bit halves is
byte-order specific and a bit clunky.  Use a union instead which is
easier to read and works in all cases.

GCC still generates the same machine code.

While at it, rename the arguments of the __memset64() prototype to
actually reflect their semantics.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 weeks agoselftests/sched_ext: Fix rt_stall flaky failure
Ihor Solodrai [Fri, 13 Feb 2026 18:21:36 +0000 (10:21 -0800)]
selftests/sched_ext: Fix rt_stall flaky failure

The rt_stall test measures the runtime ratio between an EXT and an RT
task pinned to the same CPU, verifying that the deadline server prevents
RT tasks from starving SCHED_EXT tasks. It expects the EXT task to get
at least 4% of CPU time.

The test is flaky because sched_stress_test() calls sleep(RUN_TIME)
immediately after fork(), without waiting for the RT child to complete
its setup (set_affinity + set_sched). If the RT child experiences
scheduling latency before completing setup, that delay eats into the
measurement window: the RT child runs for less than RUN_TIME seconds,
and the EXT task's measured ratio drops below the 4% threshold.

For example, in the failing CI run [1]:
  EXT=0.140s RT=4.750s total=4.890s (expected ~5.0s)
  ratio=2.86% < 4% → FAIL

The 110ms gap (5.0 - 4.89) corresponds to the RT child's setup time
being counted inside the measurement window, during which fewer
deadline server ticks fire for the EXT task.

Fix by using pipes to synchronize: each child signals the parent after
completing its setup, and the parent waits for both signals before
starting sleep(RUN_TIME). This ensures the measurement window only
counts time when both tasks are fully configured and competing.

[1] https://github.com/kernel-patches/bpf/actions/runs/21961895809/job/63442490449

Fixes: be621a76341c ("selftests/sched_ext: Add test for sched_ext dl_server")
Assisted-by: claude-opus-4-6-v1
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 weeks agoalpha: add missing address argument in call to page_table_check_pte_clear()
Thomas Weißschuh [Fri, 13 Feb 2026 07:35:14 +0000 (08:35 +0100)]
alpha: add missing address argument in call to page_table_check_pte_clear()

After the merge of the alpha and mm trees, this code does not compile,
as a parameter is missing in a call to page_table_check_pte_clear().

The parameter was re-added in commit d7b4b67eb6b3 ("mm/page_table_check:
reinstate address parameter in [__]page_table_check_pte_clear()").
The alpha-specific code was newly added in commit dd5712f3379c ("alpha:
fix user-space corruption during memory compaction").

Fixes: 4cff5c05e076 ("Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 weeks agoMerge tag 'nand/for-7.0' into mtd/next
Miquel Raynal [Fri, 13 Feb 2026 17:10:09 +0000 (18:10 +0100)]
Merge tag 'nand/for-7.0' into mtd/next

SPI NAND

- The major feature this release is the support for octal DTR
  modes (8D-8D-8D).
- There has been as well a series of conversion to scoped for each OF
  child loops.
- Support for Foresee F35SQB002G chips has been added.

Other changes are small fixes.

7 weeks agospi: wpcm-fiu: Fix potential NULL pointer dereference in wpcm_fiu_probe()
Felix Gu [Thu, 12 Feb 2026 12:41:40 +0000 (20:41 +0800)]
spi: wpcm-fiu: Fix potential NULL pointer dereference in wpcm_fiu_probe()

platform_get_resource_byname() can return NULL, which would cause a crash
when passed the pointer to resource_size().

Move the fiu->memory_size assignment after the error check for
devm_ioremap_resource() to prevent the potential NULL pointer dereference.

Fixes: 9838c182471e ("spi: wpcm-fiu: Add direct map support")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: J. Neuschäfer <j.ne@posteo.net>
Link: https://patch.msgid.link/20260212-wpcm-v1-1-5b7c4f526aac@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
7 weeks agoKVM: arm64: vgic: Handle const qualifier from gic_kvm_info allocation type
Kees Cook [Fri, 6 Feb 2026 22:30:23 +0000 (14:30 -0800)]
KVM: arm64: vgic: Handle const qualifier from gic_kvm_info allocation type

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct gic_kvm_info", but the returned type,
while matching, is const qualified. To get them exactly matching, just
use the dereferenced pointer for the sizeof().

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20260206223022.it.052-kees@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
7 weeks agoKVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()
Fuad Tabba [Fri, 13 Feb 2026 14:38:15 +0000 (14:38 +0000)]
KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()

The `sve_state` pointer in `hyp_vcpu->vcpu.arch` is initialized as a
hypervisor virtual address during vCPU initialization in
`pkvm_vcpu_init_sve()`.

`unpin_host_sve_state()` calls `kern_hyp_va()` on this address. Since
`kern_hyp_va()` is idempotent, it's not a bug. However, it is
unnecessary and potentially confusing. Remove the redundant conversion.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
7 weeks agoKVM: arm64: Fix ID register initialization for non-protected pKVM guests
Fuad Tabba [Fri, 13 Feb 2026 14:38:14 +0000 (14:38 +0000)]
KVM: arm64: Fix ID register initialization for non-protected pKVM guests

In protected mode, the hypervisor maintains a separate instance of
the `kvm` structure for each VM. For non-protected VMs, this structure is
initialized from the host's `kvm` state.

Currently, `pkvm_init_features_from_host()` copies the
`KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the
underlying `id_regs` data being initialized. This results in the
hypervisor seeing the flag as set while the ID registers remain zeroed.

Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for
non-protected VMs. This breaks logic that relies on feature detection,
such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain
system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not
saved/restored during the world switch, which could lead to state
corruption.

Fix this by explicitly copying the ID registers from the host `kvm` to
the hypervisor `kvm` for non-protected VMs during initialization, since
we trust the host with its non-protected guests' features. Also ensure
`KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in
`pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly
initialize them and set the flag once done.

Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
7 weeks agoKVM: arm64: Optimise away S1POE handling when not supported by host
Fuad Tabba [Fri, 13 Feb 2026 14:38:13 +0000 (14:38 +0000)]
KVM: arm64: Optimise away S1POE handling when not supported by host

Although ID register sanitisation prevents guests from seeing the
feature, adding this check to the helper allows the compiler to entirely
eliminate S1POE-specific code paths (such as context switching POR_EL1)
when the host kernel is compiled without support (CONFIG_ARM64_POE is
disabled).

This aligns with the pattern used for other optional features like SVE
(kvm_has_sve()) and FPMR (kvm_has_fpmr()), ensuring no POE logic if the
host lacks support, regardless of the guest configuration state.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
7 weeks agoKVM: arm64: Hide S1POE from guests when not supported by the host
Fuad Tabba [Fri, 13 Feb 2026 14:38:12 +0000 (14:38 +0000)]
KVM: arm64: Hide S1POE from guests when not supported by the host

When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
guests whenever the hardware supports it, ignoring the host kernel
configuration.

If a guest detects this feature and attempts to use it, the host will
fail to context-switch POR_EL1, potentially leading to state corruption.

Fix this by masking ID_AA64MMFR3_EL1.S1POE in the sanitised system
registers, preventing KVM from advertising the feature when the host
does not support it (i.e. system_supports_poe() is false).

Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
7 weeks agotools/power turbostat: Expunge logical_cpu_id
Len Brown [Fri, 13 Feb 2026 05:52:02 +0000 (23:52 -0600)]
tools/power turbostat: Expunge logical_cpu_id

There is only once cpu_id name space -- cpu_id.
Expunge the term logical_cpu_id.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Enhance HT enumeration
Len Brown [Fri, 13 Feb 2026 05:31:02 +0000 (23:31 -0600)]
tools/power turbostat: Enhance HT enumeration

Record the cpu_id of each CPU HT sibling -- will need this later.

Rename "thread_id" to "ht_id" to disambiguate that the scope
of this id is within a Core -- it is not a global cpu_id.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Simplify global core_id calculation
Len Brown [Fri, 13 Feb 2026 05:06:47 +0000 (23:06 -0600)]
tools/power turbostat: Simplify global core_id calculation

Standardize the generation of globally unique core_id's
in a macro, and simplify the related code.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Unify even/odd/average counter referencing
Len Brown [Sun, 8 Feb 2026 18:18:35 +0000 (12:18 -0600)]
tools/power turbostat: Unify even/odd/average counter referencing

Update the syntax of accesses to the even and odd counters
to match the average counters.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Allocate average counters dynamically
Len Brown [Sun, 8 Feb 2026 18:08:26 +0000 (12:08 -0600)]
tools/power turbostat: Allocate average counters dynamically

The current static definition of average{} is inconsistent with
the dynamically allocated even{} and odd{} counters.

Allocate average{} counters dynamically.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Delete core_data.core_id
Len Brown [Sun, 8 Feb 2026 15:34:34 +0000 (09:34 -0600)]
tools/power turbostat: Delete core_data.core_id

Delete redundant core_data.core_id.
Use cpus[].core_id as the single copy of the truth.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Rename physical_core_id to core_id
Len Brown [Sun, 8 Feb 2026 15:25:51 +0000 (09:25 -0600)]
tools/power turbostat: Rename physical_core_id to core_id

The Linux Kernel topology sysfs is flawed.
core_id is not globally unique, but is per-package.

Turbostat works around this when it needs to, with

        rapl_core_id = cpus[cpu].core_id;
        rapl_core_id += cpus[cpu].package_id * nr_cores_per_package

Otherwise, turbostat handles core_id as subservient to each package.

As there is only one core_id namespace, rename
physical_core_id to simply be core_id.

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Cleanup package_id
Len Brown [Sun, 8 Feb 2026 15:09:25 +0000 (09:09 -0600)]
tools/power turbostat: Cleanup package_id

The kernel topology sysfs uses the name "physical_package_id"
because it is allowed to be sparse.

Inside Turbostat, that physical package_id namespace is the only
package_id namespace, so re-name it to simply be "package_id"
in cpus[].

Delete the redundant copy of package_id in pkg_data.
Rely instead on the single copy of the truth in cpus[].

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Cleanup internal use of "base_cpu"
Len Brown [Sat, 7 Feb 2026 16:17:38 +0000 (10:17 -0600)]
tools/power turbostat: Cleanup internal use of "base_cpu"

Disambiguate the uses "base_cpu":

master_cpu: lowest permitted cpu#, read global MSRs here
package_data.first_cpu: lowest permitted cpu# in that package
core_data.first_cpu: lowest permitted cpu# in the core
current_cpu: where I'm running now

No functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agotools/power turbostat: Add L2 cache statistics
Len Brown [Thu, 22 Jan 2026 03:55:39 +0000 (21:55 -0600)]
tools/power turbostat: Add L2 cache statistics

version 2026.02.04

Add support for L2 cache statistics: L2MRPS and L2%hit
L2 statistics join the LLC in the "cache" counter group.

While the underlying LLC perf kernel support was architectural,
L2 perf counters are model-specific:

Support Intel Xeon -- Sapphire Rapids and newer.
Support Intel Atom -- Gracemont and newer.
Support Intel Hybrid -- Alder Lake and newer.

Example:

alder-lake-n$ sudo turbostat --quiet --show CPU,Busy%,cache my_workload
CPU Busy% LLCMRPS LLC%hit L2MRPS L2%hit
- 49.82 1210 85.02 2909 31.63
0 99.14 322 88.89 767 32.38
1 0.91 1 32.47 1 18.86
2 0.20 0 40.78 0 23.34
3 99.17 295 81.79 706 31.89
4 0.68 1 58.71 1 15.61
5 99.16 299 85.65 726 31.32
6 0.08 0 45.35 0 31.71
7 99.21 293 83.63 707 30.92

where "my_workload" is a wrapper for a yogini workload
that has 4 fully-busy threads with 2MB working set each.

Note that analogous to the system summary for multiple LLC systems,
the system summary row for the L2 is the aggregate of all CPUS in the
system -- there is no per-cache roll-up.

Signed-off-by: Len Brown <len.brown@intel.com>
7 weeks agonvme-pci: do not try to add queue maps at runtime
Keith Busch [Tue, 10 Feb 2026 18:38:09 +0000 (10:38 -0800)]
nvme-pci: do not try to add queue maps at runtime

The block layer allocates the set's maps once. We can't add special
purpose queues at runtime if they weren't allocated at initialization
time.

Tested-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
7 weeks agonvme-pci: cap queue creation to used queues
Keith Busch [Tue, 10 Feb 2026 19:00:12 +0000 (11:00 -0800)]
nvme-pci: cap queue creation to used queues

If the user reduces the special queue count at runtime and resets the
controller, we need to reduce the number of queues and interrupts
requested accordingly rather than start with the pre-allocated queue
count.

Tested-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
7 weeks agonvme-pci: ensure we're polling a polled queue
Keith Busch [Tue, 10 Feb 2026 17:26:54 +0000 (09:26 -0800)]
nvme-pci: ensure we're polling a polled queue

A user can change the polled queue count at run time. There's a brief
window during a reset where a hipri task may try to poll that queue
before the block layer has updated the queue maps, which would race with
the now interrupt driven queue and may cause double completions.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
7 weeks agofunction_graph: Restore direct mode when callbacks drop to one
Shengming Hu [Fri, 13 Feb 2026 06:29:32 +0000 (14:29 +0800)]
function_graph: Restore direct mode when callbacks drop to one

When registering a second fgraph callback, direct path is disabled and
array loop is used instead.  When ftrace_graph_active falls back to one,
we try to re-enable direct mode via ftrace_graph_enable_direct(true, ...).
But ftrace_graph_enable_direct() incorrectly disables the static key
rather than enabling it.  This leaves fgraph_do_direct permanently off
after first multi-callback transition, so direct fast mode is never
restored.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260213142932519cuWSpEXeS4-UnCvNXnK2P@zte.com.cn
Fixes: cc60ee813b503 ("function_graph: Use static_call and branch to optimize entry function")
Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
7 weeks agoACPI: button: Tweak acpi_button_remove()
Rafael J. Wysocki [Thu, 12 Feb 2026 13:33:00 +0000 (14:33 +0100)]
ACPI: button: Tweak acpi_button_remove()

Modify acpi_button_remove() to get the ACPI device object pointer
from button->adev instead of retrieving it with the help of the
ACPI_COMPANION() macro to reduce overhead slightly.

While at it, rename the struct acpi_device pointer variable in
acpi_button_remove() to adev.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/13948466.uLZWGnKmhe@rafael.j.wysocki
7 weeks agoACPI: button: Tweak system wakeup handling
Rafael J. Wysocki [Thu, 12 Feb 2026 13:31:33 +0000 (14:31 +0100)]
ACPI: button: Tweak system wakeup handling

Modify struct acpi_button to hold a struct device pointer instead
of a struct platform_device one to avoid unnecessary pointer
dereferences and use that pointer consistently for system wakeup
initialization, handling and cleanup.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/1948258.tdWV9SEqCh@rafael.j.wysocki
7 weeks agoACPI: battery: Drop redundant checks from acpi_battery_remove()
Rafael J. Wysocki [Thu, 12 Feb 2026 13:26:54 +0000 (14:26 +0100)]
ACPI: battery: Drop redundant checks from acpi_battery_remove()

In acpi_battery_remove(), "battery" cannot be NULL because it is the
driver data of the platform device passed to that function and it has
been set by acpi_battery_probe(), so drop the redundant check of it
against NULL.

Moreover, getting the ACPI device pointer from battery->device is
slightly less overhead than using the ACPI_COMPANION() macro on the
platform device to retrieve it, so do that and drop the check of that
pointer against NULL which is also redundant.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12836976.O9o76ZdvQC@rafael.j.wysocki
7 weeks agoMerge tag 'amd-drm-next-6.20-2026-02-06' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 13 Feb 2026 01:46:51 +0000 (11:46 +1000)]
Merge tag 'amd-drm-next-6.20-2026-02-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.20-2026-02-06:

amdgpu:
- DML 2.1 fixes
- Panel replay fixes
- Display writeback fixes
- MES 11 old firmware compat fix
- DC CRC improvements
- DPIA fixes
- XGMI fixes
- ASPM fix
- SMU feature bit handling fixes
- DC LUT fixes
- RAS fixes
- Misc memory leak in error path fixes
- SDMA queue reset fixes
- PG handling fixes
- 5 level GPUVM page table fix
- SR-IOV fix
- Queue reset fix

amdkfd:
- Fix possible double deletion of validate list
- Event setup fix
- Device disconnect regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260206192706.59396-1-alexander.deucher@amd.com
7 weeks agoMerge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 13 Feb 2026 03:17:44 +0000 (19:17 -0800)]
Merge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Paul Walmsley:

 - Add support for control flow integrity for userspace processes.

   This is based on the standard RISC-V ISA extensions Zicfiss and
   Zicfilp

 - Improve ptrace behavior regarding vector registers, and add some
   selftests

 - Optimize our strlen() assembly

 - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for
   EFI volume mounting

 - Clean up some code slightly, including defining copy_user_page() as
   copy_page() rather than memcpy(), aligning us with other
   architectures; and using max3() to slightly simplify an expression
   in riscv_iommu_init_check()

* tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  riscv: lib: optimize strlen loop efficiency
  selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function
  selftests: riscv: verify ptrace accepts valid vector csr values
  selftests: riscv: verify ptrace rejects invalid vector csr inputs
  selftests: riscv: verify syscalls discard vector context
  selftests: riscv: verify initial vector state with ptrace
  selftests: riscv: test ptrace vector interface
  riscv: ptrace: validate input vector csr registers
  riscv: csr: define vtype register elements
  riscv: vector: init vector context with proper vlenb
  riscv: ptrace: return ENODATA for inactive vector extension
  kselftest/riscv: add kselftest for user mode CFI
  riscv: add documentation for shadow stack
  riscv: add documentation for landing pad / indirect branch tracking
  riscv: create a Kconfig fragment for shadow stack and landing pad support
  arch/riscv: add dual vdso creation logic and select vdso based on hw
  arch/riscv: compile vdso with landing pad and shadow stack note
  riscv: enable kernel access to shadow stack memory via the FWFT SBI call
  riscv: add kernel command line option to opt out of user CFI
  riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe
  ...

7 weeks agoMerge branch 'net-mscc-ocelot-fix-missing-lock-in-ocelot_port_xmit'
Jakub Kicinski [Fri, 13 Feb 2026 03:01:17 +0000 (19:01 -0800)]
Merge branch 'net-mscc-ocelot-fix-missing-lock-in-ocelot_port_xmit'

Ziyi Guo says:

====================
net: mscc: ocelot: fix missing lock in ocelot_port_xmit()

ocelot_port_xmit() calls ocelot_can_inject() and
ocelot_port_inject_frame() without holding the injection group lock.
Both functions contain lockdep_assert_held() for the injection lock,
and the correct caller felix_port_deferred_xmit() properly acquires
the lock using ocelot_lock_inj_grp() before calling these functions.

this v3 splits the fix into a 3-patch series to separate
refactoring from the behavioral change:

  1/3: Extract the PTP timestamp handling into an ocelot_xmit_timestamp()
       helper so the logic isn't duplicated when the function is split.

  2/3: Split ocelot_port_xmit() into ocelot_port_xmit_fdma() and
       ocelot_port_xmit_inj(), keeping the FDMA and register injection
       code paths fully separate.

  3/3: Add ocelot_lock_inj_grp()/ocelot_unlock_inj_grp() in
       ocelot_port_xmit_inj() to fix the missing lock protection.

Patches 1-2 are pure refactors with no behavioral change.
Patch 3 is the actual bug fix.
====================

Link: https://patch.msgid.link/20260208225602.1339325-1-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: mscc: ocelot: add missing lock protection in ocelot_port_xmit_inj()
Ziyi Guo [Sun, 8 Feb 2026 22:56:02 +0000 (22:56 +0000)]
net: mscc: ocelot: add missing lock protection in ocelot_port_xmit_inj()

ocelot_port_xmit_inj() calls ocelot_can_inject() and
ocelot_port_inject_frame() without holding the injection group lock.
Both functions contain lockdep_assert_held() for the injection lock,
and the correct caller felix_port_deferred_xmit() properly acquires
the lock using ocelot_lock_inj_grp() before calling these functions.

Add ocelot_lock_inj_grp()/ocelot_unlock_inj_grp() around the register
injection path to fix the missing lock protection. The FDMA path is not
affected as it uses its own locking mechanism.

Fixes: c5e12ac3beb0 ("net: mscc: ocelot: serialize access to the injection/extraction groups")
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-4-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: mscc: ocelot: split xmit into FDMA and register injection paths
Ziyi Guo [Sun, 8 Feb 2026 22:56:01 +0000 (22:56 +0000)]
net: mscc: ocelot: split xmit into FDMA and register injection paths

Split ocelot_port_xmit() into two separate functions:
- ocelot_port_xmit_fdma(): handles the FDMA injection path
- ocelot_port_xmit_inj(): handles the register-based injection path

The top-level ocelot_port_xmit() now dispatches to the appropriate
function based on the ocelot_fdma_enabled static key.

This is a pure refactor with no behavioral change. Separating the two
code paths makes each one simpler and prepares for adding proper locking
to the register injection path without affecting the FDMA path.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-3-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: mscc: ocelot: extract ocelot_xmit_timestamp() helper
Ziyi Guo [Sun, 8 Feb 2026 22:56:00 +0000 (22:56 +0000)]
net: mscc: ocelot: extract ocelot_xmit_timestamp() helper

Extract the PTP timestamp handling logic from ocelot_port_xmit() into a
separate ocelot_xmit_timestamp() helper function. This is a pure
refactor with no behavioral change.

The helper returns false if the skb was consumed (freed) due to a
timestamp request failure, and true if the caller should continue with
frame injection. The rew_op value is returned via pointer.

This prepares for splitting ocelot_port_xmit() into separate FDMA and
register injection paths in a subsequent patch.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-2-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: sparx5/lan969x: fix DWRR cost max to match hardware register width
Daniel Machon [Tue, 10 Feb 2026 13:44:01 +0000 (14:44 +0100)]
net: sparx5/lan969x: fix DWRR cost max to match hardware register width

DWRR (Deficit Weighted Round Robin) scheduling distributes bandwidth
across traffic classes based on per-queue cost values, where lower cost
means higher bandwidth share.

The SPX5_DWRR_COST_MAX constant is 63 (6 bits) but the hardware
register field HSCH_DWRR_ENTRY_DWRR_COST is GENMASK(24, 20), only
5 bits wide (max 31). This causes sparx5_weight_to_hw_cost() to
compute cost values that silently overflow via FIELD_PREP, resulting
in incorrect scheduling weights.

Set SPX5_DWRR_COST_MAX to 31 to match the hardware register width.

Fixes: 211225428d65 ("net: microchip: sparx5: add support for offloading ets qdisc")
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210-sparx5-fix-dwrr-cost-max-v1-1-58fbdbc25652@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoatm: fore200e: fix use-after-free in tasklets during device removal
Duoming Zhou [Tue, 10 Feb 2026 09:45:37 +0000 (17:45 +0800)]
atm: fore200e: fix use-after-free in tasklets during device removal

When the PCA-200E or SBA-200E adapter is being detached, the fore200e
is deallocated. However, the tx_tasklet or rx_tasklet may still be running
or pending, leading to use-after-free bug when the already freed fore200e
is accessed again in fore200e_tx_tasklet() or fore200e_rx_tasklet().

One of the race conditions can occur as follows:

CPU 0 (cleanup)           | CPU 1 (tasklet)
fore200e_pca_remove_one() | fore200e_interrupt()
  fore200e_shutdown()     |   tasklet_schedule()
    kfree(fore200e)       | fore200e_tx_tasklet()
                          |   fore200e-> // UAF

Fix this by ensuring tx_tasklet or rx_tasklet is properly canceled before
the fore200e is released. Add tasklet_kill() in fore200e_shutdown() to
synchronize with any pending or running tasklets. Moreover, since
fore200e_reset() could prevent further interrupts or data transfers,
the tasklet_kill() should be placed after fore200e_reset() to prevent
the tasklet from being rescheduled in fore200e_interrupt(). Finally,
it only needs to do tasklet_kill() when the fore200e state is greater
than or equal to FORE200E_STATE_IRQ, since tasklets are uninitialized
in earlier states. In a word, the tasklet_kill() should be placed in
the FORE200E_STATE_IRQ branch within the switch...case structure.

This bug was identified through static analysis.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@kernel.org
Suggested-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210094537.9767-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: stmmac: fix oops when split header is enabled
Jie Zhang [Mon, 9 Feb 2026 22:50:32 +0000 (17:50 -0500)]
net: stmmac: fix oops when split header is enabled

For GMAC4, when split header is enabled, in some rare cases, the
hardware does not fill buf2 of the first descriptor with payload.
Thus we cannot assume buf2 is always fully filled if it is not
the last descriptor. Otherwise, the length of buf2 of the second
descriptor will be calculated wrong and cause an oops:

Unable to handle kernel paging request at virtual address ffff00019246bfc0
...
x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000
Call trace:
 dcache_inval_poc+0x28/0x58 (P)
 dma_direct_sync_single_for_cpu+0x38/0x6c
 __dma_sync_single_for_cpu+0x34/0x6c
 stmmac_napi_poll_rx+0x8f0/0xb60
 __napi_poll.constprop.0+0x30/0x144
 net_rx_action+0x160/0x274
 handle_softirqs+0x1b8/0x1fc
...

To fix this, the PL bit-field in RDES3 register is used for all
descriptors, whether it is the last descriptor or not.

Fixes: ec222003bd94 ("net: stmmac: Prepare to add Split Header support")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Link: https://patch.msgid.link/20260209225037.589130-1-jie.zhang@analog.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: intel: fix PCI device ID conflict between i40e and ipw2200
Ethan Nelson-Moore [Tue, 10 Feb 2026 02:12:34 +0000 (18:12 -0800)]
net: intel: fix PCI device ID conflict between i40e and ipw2200

The ID 8086:104f is matched by both i40e and ipw2200. The same device
ID should not be in more than one driver, because in that case, which
driver is used is unpredictable. Fix this by taking advantage of the
fact that i40e devices use PCI_CLASS_NETWORK_ETHERNET and ipw2200
devices use PCI_CLASS_NETWORK_OTHER to differentiate the devices.

Fixes: 2e45d3f4677a ("i40e: Add support for X710 B/P & SFP+ cards")
Cc: stable@vger.kernel.org
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20260210021235.16315-1-enelsonmoore@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoselftests: drv-net: limit RPS test CPUs to supported range
Gal Pressman [Tue, 10 Feb 2026 09:31:10 +0000 (11:31 +0200)]
selftests: drv-net: limit RPS test CPUs to supported range

The _get_unused_cpus() function can return CPU numbers >= 16, which
exceeds RPS_MAX_CPUS in toeplitz.c. When this happens, the test fails
with a cryptic message:

  # Exception| Traceback (most recent call last):
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/ksft.py", line 319, in ksft_run
  # Exception|     func(*args)
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/drivers/net/hw/toeplitz.py", line 189, in test
  # Exception|     with bkg(" ".join(rx_cmd), ksft_ready=True, exit_wait=True) as rx_proc:
  # Exception|          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/utils.py", line 124, in __init__
  # Exception|     super().__init__(comm, background=True,
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/utils.py", line 77, in __init__
  # Exception|     raise Exception("Did not receive ready message")
  # Exception| Exception: Did not receive ready message

Rename _get_unused_cpus() to _get_unused_rps_cpus() and cap the CPU
search range to RPS_MAX_CPUS.

Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210093110.1935149-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoselftests: net: lib: Fix jq parsing error
Yue Haibing [Wed, 11 Feb 2026 02:21:46 +0000 (10:21 +0800)]
selftests: net: lib: Fix jq parsing error

The testcase failed as below:
$./vlan_bridge_binding.sh
...
+ adf_ip_link_set_up d1
+ local name=d1
+ shift
+ ip_link_is_up d1
+ ip_link_has_flag d1 UP
+ local name=d1
+ shift
+ local flag=UP
+ shift
++ ip -j link show d1
++ jq --arg flag UP 'any(.[].flags.[]; . == $flag)'
jq: error: syntax error, unexpected '[', expecting FORMAT or QQSTRING_START
 (Unix shell quoting issues?) at <top-level>, line 1:
any(.[].flags.[]; . == $flag)
jq: 1 compile error

Remove the extra dot (.) after flags array to fix this.

Fixes: 4baa1d3a5080 ("selftests: net: lib: Add ip_link_has_flag()")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260211022146.190948-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agobng_en: Remove duplicate include
Chen Ni [Wed, 11 Feb 2026 03:20:21 +0000 (11:20 +0800)]
bng_en: Remove duplicate include

Remove duplicate inclusion of <net/netdev_queues.h>.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Link: https://patch.msgid.link/20260211032021.2719742-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoselftests: mlxsw: tc_restrictions: Fix test failure with new iproute2
Ido Schimmel [Mon, 9 Feb 2026 13:53:53 +0000 (14:53 +0100)]
selftests: mlxsw: tc_restrictions: Fix test failure with new iproute2

As explained in [1], iproute2 started rejecting tc-police burst sizes
that result in an overflow. This can happen when the burst size is high
enough and the rate is low enough.

A couple of test cases specify such configurations, resulting in
iproute2 errors and test failure.

Fix by reducing the burst size so that the test will pass with both new
and old iproute2 versions.

[1] https://lore.kernel.org/netdev/20250916215731.3431465-1-jay.vosburgh@canonical.com/

Fixes: cb12d1763267 ("selftests: mlxsw: tc_restrictions: Test tc-police restrictions")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/88b00c6e85188aa6a065dc240206119b328c46e1.1770643998.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: mctp: ensure our nlmsg responses are initialised
Jeremy Kerr [Mon, 9 Feb 2026 07:27:33 +0000 (15:27 +0800)]
net: mctp: ensure our nlmsg responses are initialised

Syed Faraz Abrar (@farazsth98) from Zellic, and Pumpkin (@u1f383) from
DEVCORE Research Team working with Trend Micro Zero Day Initiative
report that a RTM_GETNEIGH will return uninitalised data in the pad
bytes of the ndmsg data.

Ensure we're initialising the netlink data to zero, in the link, addr
and neigh response messages.

Fixes: 831119f88781 ("mctp: Add neighbour netlink interface")
Fixes: 06d2f4c583a7 ("mctp: Add netlink route management")
Fixes: 583be982d934 ("mctp: Add device handling and netlink interface")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260209-dev-mctp-nlmsg-v1-1-f1e30c346a43@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power...
Linus Torvalds [Fri, 13 Feb 2026 02:24:37 +0000 (18:24 -0800)]
Merge tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 "power-supply core:
   - sysfs: constify pointer passed to dev_attr_psp
   - extend DT binding documentation for battery cells to allow
     describing voltage drop behaviour

  power-supply drivers:
   - multiple: Remove unused gpio include header
   - multiple: Fix potential IRQ use-after-free on driver unload
   - bd71828: Add support for ROHM BD72720
   - misc small fixes

  reset drivers:
   - tdx-ec-poweroff: fix restart"

* tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (30 commits)
  power: supply: bd71828: Use dev_err_probe()
  dt-bindings: power: supply: google,goldfish-battery: Convert to DT schema
  power: supply: qcom_battmgr: Recognize "LiP" as lithium-polymer
  power: supply: wm97xx: Use devm_power_supply_register()
  power: supply: wm97xx: Use devm_kcalloc()
  power: supply: pm8916_lbc: Fix use-after-free for extcon in IRQ handler
  power: reset: tdx-ec-poweroff: fix restart
  docs: power: update documentation about removed function
  power: supply: wm97xx: Fix NULL pointer dereference in power_supply_changed()
  MAINTAINERS: adjust file entry in ROHM BD71828 CHARGER
  power: supply: ab8500_chargalg: improve kernel-doc
  power: supply: sysfs: Constify pointer passed to dev_attr_psp()
  power: supply: bq27xxx: fix wrong errno when bus ops are unsupported
  power: reset: nvmem-reboot-mode: respect cell size for nvmem_cell_write
  power: supply: sbs-battery: Fix use-after-free in power_supply_changed()
  power: supply: rt9455: Fix use-after-free in power_supply_changed()
  power: supply: pm8916_lbc: Fix use-after-free in power_supply_changed()
  power: supply: pm8916_bms_vm: Fix use-after-free in power_supply_changed()
  power: supply: pf1550: Fix use-after-free in power_supply_changed()
  power: supply: goldfish: Fix use-after-free in power_supply_changed()
  ...

7 weeks agomyri10ge: replace comma with semicolons
Chen Ni [Thu, 12 Feb 2026 05:50:28 +0000 (13:50 +0800)]
myri10ge: replace comma with semicolons

Commit fd24173439c0 ("myri10ge: avoid uninitialized variable use")
added commas instead of semicolons

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260212055028.3248491-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agoMerge tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Fri, 13 Feb 2026 01:49:33 +0000 (17:49 -0800)]
Merge tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New Features:
   - Use an LRU list for returning unused delegations
   - Introduce a KConfig option to disable NFS v4.0 and make NFS v4.1
     the default

  Bugfixes:
   - NFS/localio:
       - Handle short writes by retrying
       - Prevent direct reclaim recursion into NFS via nfs_writepages
       - Use GFP_NOIO and non-memreclaim workqueue in nfs_local_commit
       - Remove -EAGAIN handling in nfs_local_doio()
   - pNFS: fix a missing wake up while waiting on NFS_LAYOUT_DRAIN
   - fs/nfs: Fix a readdir slow-start regression
   - SUNRPC: fix gss_auth kref leak in gss_alloc_msg error path

  Other cleanups and improvements:
   - A few other NFS/localio cleanups
   - Various other delegation handling cleanups from Christoph
   - Unify security_inode_listsecurity() calls
   - Improvements to NFSv4 lease handling
   - Clean up SUNRPC *_debug fields when CONFIG_SUNRPC_DEBUG is not set"

* tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (60 commits)
  SUNRPC: fix gss_auth kref leak in gss_alloc_msg error path
  nfs: nfs4proc: Convert comma to semicolon
  SUNRPC: Change list definition method
  sunrpc: rpc_debug and others are defined even if CONFIG_SUNRPC_DEBUG unset
  NFSv4: limit lease period in nfs4_set_lease_period()
  NFSv4: pass lease period in seconds to nfs4_set_lease_period()
  nfs: unify security_inode_listsecurity() calls
  fs/nfs: Fix readdir slow-start regression
  pNFS: fix a missing wake up while waiting on NFS_LAYOUT_DRAIN
  NFS: fix delayed delegation return handling
  NFS: simplify error handling in nfs_end_delegation_return
  NFS: fold nfs_abort_delegation_return into nfs_end_delegation_return
  NFS: remove the delegation == NULL check in nfs_end_delegation_return
  NFS: use bool for the issync argument to nfs_end_delegation_return
  NFS: return void from ->return_delegation
  NFS: return void from nfs4_inode_make_writeable
  NFS: Merge CONFIG_NFS_V4_1 with CONFIG_NFS_V4
  NFS: Add a way to disable NFS v4.0 via KConfig
  NFS: Move sequence slot operations into minorversion operations
  NFS: Pass a struct nfs_client to nfs4_init_sequence()
  ...

7 weeks agoMerge branch 'net-phy_port-sfp-modules-representation-and-phy_port-listing'
Jakub Kicinski [Fri, 13 Feb 2026 01:45:02 +0000 (17:45 -0800)]
Merge branch 'net-phy_port-sfp-modules-representation-and-phy_port-listing'

Maxime Chevallier says:

====================
net: phy_port: SFP modules representation and phy_port listing (part)
====================

Applying just the initial cleanup + fixes from the series.

Link: https://patch.msgid.link/20260205092317.755906-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: phy_port: Correctly recompute the port's linkmodes
Maxime Chevallier [Thu, 5 Feb 2026 09:23:06 +0000 (10:23 +0100)]
net: phy: phy_port: Correctly recompute the port's linkmodes

a PHY-driven phy_port contains a 'supported' field containing the
linkmodes available on this port. This is populated based on :
 - The PHY's reported features
 - The DT representation of the connector
 - The PHY's attach_mdi() callback

As these different attrbution methods work in conjunction, the helper
phy_port_update_supported() recomputes the final 'supported' value based
on the populated mediums, linkmodes and pairs.

However this recompute wasn't correctly implemented, and added more
modes than necessary by or'ing the medium-specific modes to the existing
support. Let's fix this and properly filter the modes.

Fixes: 589e934d2735 ("net: phy: Introduce PHY ports representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: phy_port: Cleanup the of-parsing logic for phy_port
Maxime Chevallier [Thu, 5 Feb 2026 09:23:05 +0000 (10:23 +0100)]
net: phy: phy_port: Cleanup the of-parsing logic for phy_port

We don't need to maintain a mediums bitfield, let's drop it and drop a
bogus check for empty mediums, as we already check it above.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agonet: phy: initialize the port support based on the PHY's for OF ports
Maxime Chevallier [Thu, 5 Feb 2026 09:23:04 +0000 (10:23 +0100)]
net: phy: initialize the port support based on the PHY's for OF ports

With the phy_port infrastructure came an ethernet-connector binding,
allowing to represent the MDI of a PHY in devicetree. This allows
specifying the mediums and pairs of a port.

Let's initialize the port's supported list based on what the PHY
reports, so that we can then filter it with what the connector allows
using.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 weeks agotools/testing: keep legacy generated files around in .gitignore file
Linus Torvalds [Fri, 13 Feb 2026 01:19:36 +0000 (17:19 -0800)]
tools/testing: keep legacy generated files around in .gitignore file

People keep removing generated files from .gitignore files even when the
files stay around.  Please don't do that: just because the file is no
longer being generated doesn't make it magically go away, and doesn't
make it suddenly be something that should now not be ignored any more.

Fixes: dd2c6ec24fca ("selftests/mm: remove virtual_address_range test")
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 weeks agoMerge tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Fri, 13 Feb 2026 01:12:43 +0000 (17:12 -0800)]
Merge tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ATA updates from Damien Le Moal:

 - Cleanup IRQ masking in the handling of completed report zones
   commands (Niklas)

 - Improve the handling of Thunderbolt attached devices to speed up
   device removal (Henry)

 - Several patches to generalize the existing max_sec quirks to
   facilitates quirking the maximum command size of buggy drives, many
   of which have recently showed up with the recent increase of the
   default max_sectors block limit (Niklas)

 - Cleanup the ahci-platform and sata dt-bindings schema (Rob,
   Manivannan)

 - Improve device node scan in the ahci-dwc driver (Krzysztof)

 - Remove clang W=1 warnings with the ahci-imx and ahci-xgene drivers
   (Krzysztof)

 - Fix a long standing potential command starvation situation with
   non-NCQ commands issued when NCQ commands are on-going (me)

 - Limit max_sectors to 8191 on the INTEL SSDSC2KG480G8 SSD (Niklas)

 - Remove Vesa Local Bus (VLB) support in the pata_legacy driver (Ethan)

 - Simple fixes in the pata_cypress (typo) and pata_ftide010 (timing)
   drivers (Ethan, Linus W)

* tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: pata_ftide010: Fix some DMA timings
  ata: pata_cypress: fix typo in error message
  ata: pata_legacy: remove VLB support
  ata: libata-core: Quirk INTEL SSDSC2KG480G8 max_sectors
  dt-bindings: ata: sata: Document the graph port
  ata: libata-scsi: avoid Non-NCQ command starvation
  ata: libata-scsi: refactor ata_scsi_translate()
  ata: ahci-xgene: Fix Wvoid-pointer-to-enum-cast warning
  ata: ahci-imx: Fix Wvoid-pointer-to-enum-cast warning
  ata: ahci-dwc: Simplify with scoped for each OF child loop
  dt-bindings: ata: ahci-platform: Drop unnecessary select schema
  ata: libata: Allow more quirks
  ata: libata: Add libata.force parameter max_sec
  ata: libata: Add support to parse equal sign in libata.force
  ata: libata: Change libata.force to use the generic ATA_QUIRK_MAX_SEC quirk
  ata: libata: Add ata_force_get_fe_for_dev() helper
  ata: libata: Add ATA_QUIRK_MAX_SEC and convert all device quirks
  ata: libata: avoid long timeouts on hot-unplugged SATA DAS
  ata: libata-scsi: Remove superfluous local_irq_save()

7 weeks agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 13 Feb 2026 01:05:20 +0000 (17:05 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Usual smallish cycle. The NFS biovec work to push it down into RDMA
  instead of indirecting through a scatterlist is pretty nice to see,
  been talked about for a long time now.

   - Various code improvements in irdma, rtrs, qedr, ocrdma, irdma, rxe

   - Small driver improvements and minor bug fixes to hns, mlx5, rxe,
     mana, mlx5, irdma

   - Robusness improvements in completion processing for EFA

   - New query_port_speed() verb to move past limited IBA defined speed
     steps

   - Support for SG_GAPS in rts and many other small improvements

   - Rare list corruption fix in iwcm

   - Better support different page sizes in rxe

   - Device memory support for mana

   - Direct bio vec to kernel MR for use by NFS-RDMA

   - QP rate limiting for bnxt_re

   - Remote triggerable NULL pointer crash in siw

   - DMA-buf exporter support for RDMA mmaps like doorbells"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits)
  RDMA/mlx5: Implement DMABUF export ops
  RDMA/uverbs: Add DMABUF object type and operations
  RDMA/uverbs: Support external FD uobjects
  RDMA/siw: Fix potential NULL pointer dereference in header processing
  RDMA/umad: Reject negative data_len in ib_umad_write
  IB/core: Extend rate limit support for RC QPs
  RDMA/mlx5: Support rate limit only for Raw Packet QP
  RDMA/bnxt_re: Report QP rate limit in debugfs
  RDMA/bnxt_re: Report packet pacing capabilities when querying device
  RDMA/bnxt_re: Add support for QP rate limiting
  MAINTAINERS: Drop RDMA files from Hyper-V section
  RDMA/uverbs: Add __GFP_NOWARN to ib_uverbs_unmarshall_recv() kmalloc
  svcrdma: use bvec-based RDMA read/write API
  RDMA/core: add rdma_rw_max_sge() helper for SQ sizing
  RDMA/core: add MR support for bvec-based RDMA operations
  RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations
  RDMA/core: add bio_vec based RDMA read/write API
  RDMA/irdma: Use kvzalloc for paged memory DMA address array
  RDMA/rxe: Fix race condition in QP timer handlers
  RDMA/mana_ib: Add device‑memory support
  ...

7 weeks agoMerge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Linus Torvalds [Fri, 13 Feb 2026 00:33:05 +0000 (16:33 -0800)]
Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:

 - Introduce cxl_memdev_attach and pave way for soft reserved handling,
   type2 accelerator enabling, and LSA 2.0 enabling. All these series
   require the endpoint driver to settle before continuing the memdev
   driver probe.

 - Address CXL port error protocol handling and reporting.

   The large patch series was split into three parts. The first two
   parts are included here with the final part coming later.

   The first part consists of a series of code refactoring to PCI AER
   sub-system that addresses CXL and also CXL RAS code to prepare for
   port error handling.

   The second part refactors the CXL code to move management of
   component registers to cxl_port objects to allow all CXL AER errors
   to be handled through the cxl_port hierarchy.

 - Provide AMD Zen5 platform address translation for CXL using ACPI
   PRMT. This includes a conventions document to explain why this is
   needed and how it's implemented.

 - Misc CXL patches of fixes, cleanups, and updates. Including CXL
   address translation for unaligned MOD3 regions.

[ TLA service: CXL is "Compute Express Link" ]

* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl/atl: Lock decoders that need address translation
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Introduce callback for HPA address ranges translation
  cxl/region: Use region data to get the root decoder
  cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  cxl/region: Separate region parameter setup and region construction
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Store HPA range in struct cxl_region
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Rename misleading variable name @hpa to @hpa_range
  Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  cxl, doc: Moving conventions in separate files
  cxl, doc: Remove isonum.txt inclusion
  cxl/port: Unify endpoint and switch port lookup
  cxl/port: Move endpoint component register management to cxl_port
  cxl/port: Map Port RAS registers
  cxl/port: Move dport RAS setup to dport add time
  ...

7 weeks agoMerge tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio
Linus Torvalds [Thu, 12 Feb 2026 23:52:39 +0000 (15:52 -0800)]
Merge tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:
 "A small cycle with the bulk in selftests and reintroducing poison
  handling in the nvgrace-gpu driver. The rest are fixes, cleanups, and
  some dmabuf structure consolidation.

   - Update outdated mdev comment referencing the renamed
     mdev_type_add() function (Julia Lawall)

   - Introduce selftest support for IOMMU mapping of PCI MMIO BARs (Alex
     Mastro)

   - Relax selftest assertion relative to differences in huge page
     handling between legacy (v1) TYPE1 IOMMU mapping behavior and the
     compatibility mode supported by IOMMUFD (David Matlack)

   - Reintroduce memory poison handling support for non-struct-page-
     backed memory in the nvgrace-gpu variant driver (Ankit Agrawal)

   - Replace dma_buf_phys_vec with phys_vec to avoid duplicate structure
     and semantics (Leon Romanovsky)

   - Add missing upstream bridge locking across PCI function reset,
     resolving an assertion failure when secondary bus reset is used to
     provide that reset (Anthony Pighin)

   - Fixes to hisi_acc vfio-pci variant driver to resolve corner case
     issues related to resets, repeated migration, and error injection
     scenarios (Longfang Liu, Weili Qian)

   - Restrict vfio selftest builds to arm64 and x86_64, resolving
     compiler warnings on 32-bit archs (Ted Logan)

   - Un-deprecate the fsl-mc vfio bus driver as a new maintainer has
     stepped up (Ioana Ciornei)"

* tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio:
  vfio/fsl-mc: add myself as maintainer
  vfio: selftests: only build tests on arm64 and x86_64
  hisi_acc_vfio_pci: fix the queue parameter anomaly issue
  hisi_acc_vfio_pci: resolve duplicate migration states
  hisi_acc_vfio_pci: update status after RAS error
  hisi_acc_vfio_pci: fix VF reset timeout issue
  vfio/pci: Lock upstream bridge for vfio_pci_core_disable()
  types: reuse common phys_vec type instead of DMABUF open‑coded variant
  vfio/nvgrace-gpu: register device memory for poison handling
  mm: add stubs for PFNMAP memory failure registration functions
  vfio: selftests: Drop IOMMU mapping size assertions for VFIO_TYPE1_IOMMU
  vfio: selftests: Add vfio_dma_mapping_mmio_test
  vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
  vfio: selftests: Centralize IOMMU mode name definitions
  vfio/mdev: update outdated comment

7 weeks agosmb: client: fix data corruption due to racy lease checks
Paulo Alcantara [Thu, 12 Feb 2026 22:53:07 +0000 (19:53 -0300)]
smb: client: fix data corruption due to racy lease checks

Customer reported data corruption in some of their files.  It turned
out the client would end up calling cacheless IO functions while
having RHW lease, bypassing the pagecache and then leaving gaps in the
file while writing to it.  It was related to concurrent opens changing
the lease state while having writes in flight.  Lease breaks and
re-opens due to reconnect could also cause same issue.

Fix this by serialising the lease updates with
cifsInodeInfo::open_file_lock.  When handling oplock break, make sure
to use the downgraded oplock value rather than one in cifsInodeinfo as
it could be changed concurrently.

Reported-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
7 weeks agolib/group_cpus: handle const qualifier from clusters allocation type
Kees Cook [Fri, 6 Feb 2026 22:20:13 +0000 (14:20 -0800)]
lib/group_cpus: handle const qualifier from clusters allocation type

In preparation for making the kmalloc family of allocators type aware, we
need to make sure that the returned type from the allocation matches the
type of the variable being assigned.  (Before, the allocator would always
return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "const struct cpumask **", but the returned type,
while matching, is not const qualified.  To get them exactly matching,
just use the dereferenced pointer for the sizeof().

Link: https://lkml.kernel.org/r/20260206222010.work.349-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Cc: Wangyang Guo <wangyang.guo@intel.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agokho: remove unnecessary WARN_ON(err) in kho_populate()
Ran Xiaokai [Thu, 12 Feb 2026 11:11:46 +0000 (11:11 +0000)]
kho: remove unnecessary WARN_ON(err) in kho_populate()

The following pr_warn() provides detailed error and location information,
WARN_ON(err) adds no additional debugging value, so remove the redundant
WARN_ON() call.

Link: https://lkml.kernel.org/r/20260212111146.210086-3-ranxiaokai627@163.com
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agokho: fix missing early_memunmap() call in kho_populate()
Ran Xiaokai [Thu, 12 Feb 2026 11:11:45 +0000 (11:11 +0000)]
kho: fix missing early_memunmap() call in kho_populate()

Patch series "two fixes in kho_populate()", v3.

This patch (of 2):

kho_populate() returns without calling early_memunmap() on success path,
this will cause early ioremap virtual address space leak.

Link: https://lkml.kernel.org/r/20260212111146.210086-1-ranxiaokai627@163.com
Link: https://lkml.kernel.org/r/20260212111146.210086-2-ranxiaokai627@163.com
Fixes: b50634c5e84a ("kho: cleanup error handling in kho_populate()")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoscripts/gdb: implement x86_page_ops in mm.py
Seongjun Hong [Mon, 2 Feb 2026 03:42:41 +0000 (12:42 +0900)]
scripts/gdb: implement x86_page_ops in mm.py

Implement all member functions of x86_page_ops strictly following the
logic of aarch64_page_ops.

This includes full support for SPARSEMEM and standard page translation
functions.

This fixes compatibility with 'lx-' commands on x86_64, preventing
AttributeErrors when using lx-pfn_to_page and others.

Link: https://lkml.kernel.org/r/20260202034241.649268-1-hsj0512@snu.ac.kr
Signed-off-by: Seongjun Hong <hsj0512@snu.ac.kr>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoobjpool: fix the overestimation of object pooling metadata size
zhouwenhao [Mon, 2 Feb 2026 13:28:46 +0000 (21:28 +0800)]
objpool: fix the overestimation of object pooling metadata size

objpool uses struct objpool_head to store metadata information, and its
cpu_slots member points to an array of pointers that store the addresses
of the percpu ring arrays.  However, the memory size allocated during the
initialization of cpu_slots is nr_cpu_ids * sizeof(struct objpool_slot).
On a 64-bit machine, the size of struct objpool_slot is 16 bytes, which is
twice the size of the actual pointer required, and the extra memory is
never be used, resulting in a waste of memory.  Therefore, the memory size
required for cpu_slots needs to be corrected.

Link: https://lkml.kernel.org/r/20260202132846.68257-1-zhouwenhao7600@gmail.com
Fixes: b4edb8d2d464 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: zhouwenhao <zhouwenhao7600@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Matt Wu <wuqiang.matt@bytedance.com>
Cc: wuqiang.matt <wuqiang.matt@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoselftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT
Aristeu Rozanski [Mon, 2 Feb 2026 14:38:05 +0000 (09:38 -0500)]
selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT

selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT

In order to synchronize new processes to test inheritance of memfd_noexec
sysctl, memfd_test sets up the sysctl with a value before creating the new
process.  The new process then sends itself a SIGSTOP in order to wait for
the parent to flip the sysctl value and send a SIGCONT signal.

This would work as intended if it wasn't the fact that the new process is
being created with CLONE_NEWPID, which creates a new PID namespace and the
new process has PID 1 in this namespace.  There're restrictions on sending
signals to PID 1 and, although it's relaxed for other than root PID
namespace, it's biting us here.  In this specific case the SIGSTOP sent by
the new process is ignored (no error to kill() is returned) and it never
stops its execution.  This is usually not noticiable as the parent usually
manages to set the new sysctl value before the child has a chance to run
and the test succeeds.  But if you run the test in a loop, it eventually
reproduces:

while [ 1 ]; do ./memfd_test >log 2>&1 || break; done; cat log

So this patch replaces the SIGSTOP/SIGCONT synchronization with IPC
semaphore.

Link: https://lkml.kernel.org/r/a7776389-b3d6-4b18-b438-0b0e3ed1fd3b@work
Fixes: 6469b66e3f5a ("selftests: improve vm.memfd_noexec sysctl tests")
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: liuye <liuye@kylinos.cn>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agodelayacct: fix build regression on accounting tool
Arnd Bergmann [Tue, 10 Feb 2026 10:34:22 +0000 (11:34 +0100)]
delayacct: fix build regression on accounting tool

The accounting tool was modified for the original ABI using a custom
'timespec64' type in linux/taskstats.h, which I changed to use the regular
__kernel_timespec type, causing a build failure:

        getdelays.c:202:45: warning: 'struct timespec64' declared inside parameter list will not be visible outside of this definition or declaration

     202 | static const char *format_timespec64(struct timespec64 *ts)
         |                                             ^~~~~~~~~~

Change the tool to match the updated header.

Link: https://lkml.kernel.org/r/20260210103427.2984963-1-arnd@kernel.org
Fixes: 503efe850c74 ("delayacct: add timestamp of delay max")
Fixes: f06e31eef4c1 ("delayacct: fix uapi timespec64 definition")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202602091611.lxgINqXp-lkp@intel.com/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/page_alloc: clear page->private in free_pages_prepare()
Mikhail Gavrilov [Sat, 7 Feb 2026 17:36:14 +0000 (22:36 +0500)]
mm/page_alloc: clear page->private in free_pages_prepare()

Several subsystems (slub, shmem, ttm, etc.) use page->private but don't
clear it before freeing pages.  When these pages are later allocated as
high-order pages and split via split_page(), tail pages retain stale
page->private values.

This causes a use-after-free in the swap subsystem.  The swap code uses
page->private to track swap count continuations, assuming freshly
allocated pages have page->private == 0.  When stale values are present,
swap_count_continued() incorrectly assumes the continuation list is valid
and iterates over uninitialized page->lru containing LIST_POISON values,
causing a crash:

  KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
  RIP: 0010:__do_sys_swapoff+0x1151/0x1860

Fix this by clearing page->private in free_pages_prepare(), ensuring all
freed pages have clean state regardless of previous use.

Link: https://lkml.kernel.org/r/20260207173615.146159-1-mikhail.v.gavrilov@gmail.com
Fixes: 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Suggested-by: Zi Yan <ziy@nvidia.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Thu, 12 Feb 2026 23:43:02 +0000 (15:43 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted
  cleanups and fixes.

  The biggest core change is the massive code motion in the sd driver to
  remove forward declarations and the most significant change is to
  enumify the queuecommand return"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (78 commits)
  scsi: csiostor: Fix dereference of null pointer rn
  scsi: buslogic: Reduce stack usage
  scsi: ufs: host: mediatek: Require CONFIG_PM
  scsi: ufs: mediatek: Fix page faults in ufs_mtk_clk_scale() trace event
  scsi: smartpqi: Fix memory leak in pqi_report_phys_luns()
  scsi: mpi3mr: Make driver probing asynchronous
  scsi: ufs: core: Flush exception handling work when RPM level is zero
  scsi: efct: Use IRQF_ONESHOT and default primary handler
  scsi: ufs: core: Use a host-wide tagset in SDB mode
  scsi: qla2xxx: target: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qla2xxx: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qla4xxx: Add WQ_PERCPU to alloc_workqueue() users
  scsi: mpi3mr: Driver version update to 8.17.0.3.50
  scsi: mpi3mr: Fixed the W=1 compilation warning
  scsi: mpi3mr: Record and report controller firmware faults
  scsi: mpi3mr: Update MPI Headers to revision 39
  scsi: mpi3mr: Use negotiated link rate from DevicePage0
  scsi: mpi3mr: Avoid redundant diag-fault resets
  scsi: mpi3mr: Rename log data save helper to reflect threaded/BH context
  scsi: mpi3mr: Add module parameter to control threaded IRQ polling
  ...

7 weeks agoselftests/mm: add memory failure dirty pagecache test
Miaohe Lin [Fri, 6 Feb 2026 03:16:39 +0000 (11:16 +0800)]
selftests/mm: add memory failure dirty pagecache test

This patch adds a new testcase to validate memory failure handling for
dirty pagecache.  This performs similar operations as clean pagecaches
except fsync() is not used to keep pages dirty.

This test helps ensure that memory failure handling for dirty pagecache
works correctly, including proper SIGBUS delivery, page isolation, and
recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: kernel test robot <lkp@intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoselftests/mm: add memory failure clean pagecache test
Miaohe Lin [Fri, 6 Feb 2026 03:16:38 +0000 (11:16 +0800)]
selftests/mm: add memory failure clean pagecache test

This patch adds a new testcase to validate memory failure handling for
clean pagecache.  This test performs similar operations as anonymous pages
except allocating memory using mmap() with a file fd.

This test helps ensure that memory failure handling for clean pagecache
works correctly, including unchanged page content, page isolation, and
recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601221142.mDWA1ucw-lkp@intel.com/
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoselftests/mm: add memory failure anonymous page test
Miaohe Lin [Fri, 6 Feb 2026 03:16:37 +0000 (11:16 +0800)]
selftests/mm: add memory failure anonymous page test

Patch series "selftests/mm: add memory failure selftests", v4.

Introduce selftests to validate the functionality of memory failure.
These tests help ensure that memory failure handling for anonymous pages,
pagecaches pages works correctly, including proper SIGBUS delivery to user
processes, page isolation, and recovery paths.

Currently madvise syscall is used to inject memory failures.  And only
anonymous pages and pagecaches are tested.  More test scenarios, e.g.
hugetlb, shmem, thp, will be added.  Also more memory failure injecting
methods will be supported, e.g.  APEI Error INJection, if required.

This patch (of 3):

This patch adds a new kselftest to validate memory failure handling for
anonymous pages. The test performs the following operations:
1. Allocates anonymous pages using mmap().
2. Injects memory failure via madvise syscall.
3. Verifies expected error handling behavior.
4. Unpoison memory.

This test helps ensure that memory failure handling for anonymous pages
works correctly, including proper SIGBUS delivery to user processes, page
isolation and recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20260206031639.2707102-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: rmap: support batched unmapping for file large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:28 +0000 (22:07 +0800)]
mm: rmap: support batched unmapping for file large folios

Similar to folio_referenced_one(), we can apply batched unmapping for file
large folios to optimize the performance of file folios reclamation.

Barry previously implemented batched unmapping for lazyfree anonymous
large folios[1] and did not further optimize anonymous large folios or
file-backed large folios at that stage.  As for file-backed large folios,
the batched unmapping support is relatively straightforward, as we only
need to clear the consecutive (present) PTE entries for file-backed large
folios.

Note that it's not ready to support batched unmapping for uffd case, so
let's still fallback to per-page unmapping for the uffd case.

Performance testing:
Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and
try to reclaim 8G file-backed folios via the memory.reclaim interface.  I
can observe 75% performance improvement on my Arm64 32-core server (and
50%+ improvement on my X86 machine) with this patch.

W/o patch:
real    0m1.018s
user    0m0.000s
sys     0m1.018s

W/ patch:
real 0m0.249s
user 0m0.000s
sys 0m0.249s

[1] https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
Link: https://lkml.kernel.org/r/b53a16f67c93a3fe65e78092069ad135edf00eff.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoarm64: mm: implement the architecture-specific clear_flush_young_ptes()
Baolin Wang [Mon, 9 Feb 2026 14:07:27 +0000 (22:07 +0800)]
arm64: mm: implement the architecture-specific clear_flush_young_ptes()

Implement the Arm64 architecture-specific clear_flush_young_ptes() to
enable batched checking of young flags and TLB flushing, improving
performance during large folio reclamation.

Performance testing:
Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and
try to reclaim 8G file-backed folios via the memory.reclaim interface.  I
can observe 33% performance improvement on my Arm64 32-core server (and
10%+ improvement on my X86 machine).  Meanwhile, the hotspot
folio_check_references() dropped from approximately 35% to around 5%.

W/o patchset:
real 0m1.518s
user 0m0.000s
sys 0m1.518s

W/ patchset:
real 0m1.018s
user 0m0.000s
sys 0m1.018s

Link: https://lkml.kernel.org/r/ce749fbae3e900e733fa104a16fcb3ca9fe4f9bd.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoarm64: mm: support batch clearing of the young flag for large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:26 +0000 (22:07 +0800)]
arm64: mm: support batch clearing of the young flag for large folios

Currently, contpte_ptep_test_and_clear_young() and
contpte_ptep_clear_flush_young() only clear the young flag and flush TLBs
for PTEs within the contiguous range.  To support batch PTE operations for
other sized large folios in the following patches, adding a new parameter
to specify the number of PTEs that map consecutive pages of the same large
folio in a single VMA and a single page table.

While we are at it, rename the functions to maintain consistency with
other contpte_*() functions.

Link: https://lkml.kernel.org/r/5644250dcc0417278c266ad37118d27f541fd052.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agoarm64: mm: factor out the address and ptep alignment into a new helper
Baolin Wang [Mon, 9 Feb 2026 14:07:25 +0000 (22:07 +0800)]
arm64: mm: factor out the address and ptep alignment into a new helper

Factor out the contpte block's address and ptep alignment into a new
helper, and will be reused in the following patch.

No functional changes.

Link: https://lkml.kernel.org/r/8076d12cb244b2d9e91119b44dc6d5e4ad9c00af.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: rmap: support batched checks of the references for large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:24 +0000 (22:07 +0800)]
mm: rmap: support batched checks of the references for large folios

Patch series "support batch checking of references and unmapping for large
folios", v6.

Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios.  This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.

Moreover, on Arm architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range.  However, this is not sufficient.  We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).

Similar to folio_referenced_one(), we can also apply batched unmapping for
large file folios to optimize the performance of file folio reclamation.
By supporting batched checking of the young flags, flushing TLB entries,
and unmapping, I can observed a significant performance improvements in my
performance tests for file folios reclamation.  Please check the
performance data in the commit message of each patch.

This patch (of 5):

Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios.  This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.

Moreover, on Arm64 architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range.  However, this is not sufficient.  We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).

Introduce a new API: clear_flush_young_ptes() to facilitate batched
checking of the young flags and flushing TLB entries, thereby improving
performance during large folio reclamation.  And it will be overridden by
the architecture that implements a more efficient batch operation in the
following patches.

While we are at it, rename ptep_clear_flush_young_notify() to
clear_flush_young_ptes_notify() to indicate that this is a batch
operation.

Link: https://lkml.kernel.org/r/cover.1770645603.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/12132694536834262062d1fb304f8f8a064b6750.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agotools/testing/vma: add VMA userland tests for VMA flag functions
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:22 +0000 (16:06 +0000)]
tools/testing/vma: add VMA userland tests for VMA flag functions

Now we have the capability to test the new helpers for the bitmap VMA
flags in userland, do so.

We also update the Makefile such that both VMA (and while we're here)
mm_struct flag sizes can be customised on build.  We default to 128-bit to
enable testing of flags above word size even on 64-bit systems.

We add userland tests to ensure that we do not regress VMA flag behaviour
with the introduction when using bitmap VMA flags, nor accidentally
introduce unexpected results due to for instance higher bit values not
being correctly cleared/set.

As part of this change, make __mk_vma_flags() a custom function so we can
handle specifying invalid VMA bits.  This is purposeful so we can have the
VMA tests work at lower and higher number of VMA flags without having to
duplicate code too much.

Link: https://lkml.kernel.org/r/7fe6afe9c8c61e4d3cfc9a2d50a5d24da8528e68.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agotools/testing/vma: separate out vma_internal.h into logical headers
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:21 +0000 (16:06 +0000)]
tools/testing/vma: separate out vma_internal.h into logical headers

The vma_internal.h file is becoming entirely unmanageable.  It combines
duplicated kernel implementation logic that needs to be kept in-sync with
the kernel, stubbed out declarations that we simply ignore for testing
purposes and custom logic added to aid testing.

If we separate each of the three things into separate headers it makes
things far more manageable, so do so:

* include/stubs.h  contains the stubbed declarations,
* include/dup.h    contains the duplicated kernel declarations, and
* include/custom.h contains declarations customised for testing.

[lorenzo.stoakes@oracle.com: avoid a duplicate struct define]
Link: https://lkml.kernel.org/r/1e032732-61c3-485c-9aa7-6a09016fefc1@lucifer.local
Link: https://lkml.kernel.org/r/dd57baf5b5986cb96a167150ac712cbe804b63ee.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agotools/testing/vma: separate VMA userland tests into separate files
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:20 +0000 (16:06 +0000)]
tools/testing/vma: separate VMA userland tests into separate files

So far the userland VMA tests have been established as a rough expression
of what's been possible.

Adapt it into a more usable form by separating out tests and shared
helper functions.

Since we test functions that are declared statically in mm/vma.c, we make
use of the trick of #include'ing kernel C files directly.

In order for the tests to continue to function, we must therefore also
this way into the tests/ directory.

We try to keep as much shared logic actually modularised into a separate
compilation unit in shared.c, however the merge_existing() and
attach_vma() helpers rely on statically declared mm/vma.c functions so
these must be declared in main.c.

Link: https://lkml.kernel.org/r/a0455ccfe4fdcd1c962c64f76304f612e5662a4e.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: make vm_area_desc utilise vma_flags_t only
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:19 +0000 (16:06 +0000)]
mm: make vm_area_desc utilise vma_flags_t only

Now we have eliminated all uses of vm_area_desc->vm_flags, eliminate this
field, and have mmap_prepare users utilise the vma_flags_t
vm_area_desc->vma_flags field only.

As part of this change we alter is_shared_maywrite() to accept a
vma_flags_t parameter, and introduce is_shared_maywrite_vm_flags() for use
with legacy vm_flags_t flags.

We also update struct mmap_state to add a union between vma_flags and
vm_flags temporarily until the mmap logic is also converted to using
vma_flags_t.

Also update the VMA userland tests to reflect this change.

Link: https://lkml.kernel.org/r/fd2a2938b246b4505321954062b1caba7acfc77a.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: update all remaining mmap_prepare users to use vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:18 +0000 (16:06 +0000)]
mm: update all remaining mmap_prepare users to use vma_flags_t

We will be shortly removing the vm_flags_t field from vm_area_desc so we
need to update all mmap_prepare users to only use the dessc->vma_flags
field.

This patch achieves that and makes all ancillary changes required to make
this possible.

This lays the groundwork for future work to eliminate the use of
vm_flags_t in vm_area_desc altogether and more broadly throughout the
kernel.

While we're here, we take the opportunity to replace VM_REMAP_FLAGS with
VMA_REMAP_FLAGS, the vma_flags_t equivalent.

No functional changes intended.

Link: https://lkml.kernel.org/r/fb1f55323799f09fe6a36865b31550c9ec67c225.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Damien Le Moal <dlemoal@kernel.org> [zonefs]
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: update shmem_[kernel]_file_*() functions to use vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:17 +0000 (16:06 +0000)]
mm: update shmem_[kernel]_file_*() functions to use vma_flags_t

In order to be able to use only vma_flags_t in vm_area_desc we must adjust
shmem file setup functions to operate in terms of vma_flags_t rather than
vm_flags_t.

This patch makes this change and updates all callers to use the new
functions.

No functional changes intended.

[akpm@linux-foundation.org: comment fixes, per Baolin]
Link: https://lkml.kernel.org/r/736febd280eb484d79cef5cf55b8a6f79ad832d2.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: update secretmem to use VMA flags on mmap_prepare
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:16 +0000 (16:06 +0000)]
mm: update secretmem to use VMA flags on mmap_prepare

This patch updates secretmem to use the new vma_flags_t type which will
soon supersede vm_flags_t altogether.

In order to make this change we also have to update mlock_future_ok(), we
replace the vm_flags_t parameter with a simple boolean is_vma_locked one,
which also simplifies the invocation here.

This is laying the groundwork for eliminating the vm_flags_t in
vm_area_desc and more broadly throughout the kernel.

No functional changes intended.

[lorenzo.stoakes@oracle.com: fix check_brk_limits(), per Chris]
Link: https://lkml.kernel.org/r/3aab9ab1-74b4-405e-9efb-08fc2500c06e@lucifer.local
Link: https://lkml.kernel.org/r/a243a09b0a5d0581e963d696de1735f61f5b2075.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: update hugetlbfs to use VMA flags on mmap_prepare
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:15 +0000 (16:06 +0000)]
mm: update hugetlbfs to use VMA flags on mmap_prepare

In order to update all mmap_prepare users to utilising the new VMA flags
type vma_flags_t and associated helper functions, we start by updating
hugetlbfs which has a lot of additional logic that requires updating to
make this change.

This is laying the groundwork for eliminating the vm_flags_t from struct
vm_area_desc and using vma_flags_t only, which further lays the ground for
removing the deprecated vm_flags_t type altogether.

No functional changes intended.

Link: https://lkml.kernel.org/r/9226bec80c9aa3447cc2b83354f733841dba8a50.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: add basic VMA flag operation helper functions
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:14 +0000 (16:06 +0000)]
mm: add basic VMA flag operation helper functions

Now we have the mk_vma_flags() macro helper which permits easy
specification of any number of VMA flags, add helper functions which
operate with vma_flags_t parameters.

This patch provides vma_flags_test[_mask](), vma_flags_set[_mask]() and
vma_flags_clear[_mask]() respectively testing, setting and clearing flags
with the _mask variants accepting vma_flag_t parameters, and the non-mask
variants implemented as macros which accept a list of flags.

This allows us to trivially test/set/clear aggregate VMA flag values as
necessary, for instance:

if (vma_flags_test(&flags, VMA_READ_BIT, VMA_WRITE_BIT))
goto readwrite;

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT);

vma_flags_clear(&flags, VMA_READ_BIT, VMA_WRITE_BIT);

We also add a function for testing that ALL flags are set for convenience,
e.g.:

if (vma_flags_test_all(&flags, VMA_READ_BIT, VMA_MAYREAD_BIT)) {
/* Both READ and MAYREAD flags set */
...
}

The compiler generates optimal assembly for each such that they behave as
if the caller were setting the bitmap flags manually.

This is important for e.g.  drivers which manipulate flag values rather
than a VMA's specific flag values.

We also add helpers for testing, setting and clearing flags for VMA's and
VMA descriptors to reduce boilerplate.

Also add the EMPTY_VMA_FLAGS define to aid initialisation of empty flags.

Finally, update the userland VMA tests to add the helpers there so they
can be utilised as part of userland testing.

Link: https://lkml.kernel.org/r/885d4897d67a6a57c0b07fa182a7055ad752df11.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agotools: bitmap: add missing bitmap_[subset(), andnot()]
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:13 +0000 (16:06 +0000)]
tools: bitmap: add missing bitmap_[subset(), andnot()]

The bitmap_subset() and bitmap_andnot() functions are not present in the
tools version of include/linux/bitmap.h, so add them as subsequent patches
implement test code that requires them.

We also add the missing __bitmap_subset() to tools/lib/bitmap.c.

Link: https://lkml.kernel.org/r/0fd0d4ec868297f522003cb4b5898b53b498805b.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: add mk_vma_flags() bitmap flag macro helper
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:12 +0000 (16:06 +0000)]
mm: add mk_vma_flags() bitmap flag macro helper

This patch introduces the mk_vma_flags() macro helper to allow easy
manipulation of VMA flags utilising the new bitmap representation
implemented of VMA flags defined by the vma_flags_t type.

It is a variadic macro which provides a bitwise-or'd representation of all
of each individual VMA flag specified.

Note that, while we maintain VM_xxx flags for backwards compatibility
until the conversion is complete, we define VMA flags of type vma_flag_t
using VMA_xxx_BIT to avoid confusing the two.

This helper macro therefore can be used thusly:

vma_flags_t flags = mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT);

Testing has demonstrated that the compiler optimises this code such that
it generates the same assembly utilising this macro as it does if the
flags were specified manually, for instance:

vma_flags_t get_flags(void)
{
return mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
}

Generates the same code as:

vma_flags_t get_flags(void)
{
vma_flags_t flags;

vma_flags_clear_all(&flags);
vma_flag_set(&flags, VMA_READ_BIT);
vma_flag_set(&flags, VMA_WRITE_BIT);
vma_flag_set(&flags, VMA_EXEC_BIT);

return flags;
}

And:

vma_flags_t get_flags(void)
{
vma_flags_t flags;
unsigned long *bitmap = ACCESS_PRIVATE(&flags, __vma_flags);

*bitmap = 1UL << (__force int)VMA_READ_BIT;
*bitmap |= 1UL << (__force int)VMA_WRITE_BIT;
*bitmap |= 1UL << (__force int)VMA_EXEC_BIT;

return flags;
}

That is:

get_flags:
        movl    $7, %eax
        ret

Link: https://lkml.kernel.org/r/fde00df6ff7fb8c4b42cc0defa5a4924c7a1943a.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: rename vma_flag_test/set_atomic() to vma_test/set_atomic_flag()
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:11 +0000 (16:06 +0000)]
mm: rename vma_flag_test/set_atomic() to vma_test/set_atomic_flag()

In order to stay consistent between functions which manipulate a
vm_flags_t argument of the form of vma_flags_...() and those which
manipulate a VMA (in this case the flags field of a VMA), rename
vma_flag_[test/set]_atomic() to vma_[test/set]_atomic_flag().

This lays the groundwork for adding VMA flag manipulation functions in a
subsequent commit.

Link: https://lkml.kernel.org/r/033dcf12e819dee5064582bced9b12ea346d1607.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/vma: remove __private sparse decoration from vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:10 +0000 (16:06 +0000)]
mm/vma: remove __private sparse decoration from vma_flags_t

Patch series "mm: add bitmap VMA flag helpers and convert all mmap_prepare
to use them", v2.

We introduced the bitmap VMA type vma_flags_t in the aptly named commit
9ea35a25d51b ("mm: introduce VMA flags bitmap type") in order to permit
future growth in VMA flags and to prevent the asinine requirement that VMA
flags be available to 64-bit kernels only if they happened to use a bit
number about 32-bits.

This is a long-term project as there are very many users of VMA flags
within the kernel that need to be updated in order to utilise this new
type.

In order to further this aim, this series adds a number of helper
functions to enable ordinary interactions with VMA flags - that is
testing, setting and clearing them.

In order to make working with VMA bit numbers less cumbersome this series
introduces the mk_vma_flags() helper macro which generates a vma_flags_t
from a variadic parameter list, e.g.:

vma_flags_t flags = mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT,
 VMA_EXEC_BIT);

It turns out that the compiler optimises this very well to the point that
this is just as efficient as using VM_xxx pre-computed bitmap values.

This series then introduces the following functions:

bool vma_flags_test_mask(vma_flags_t flags, vma_flags_t to_test);
bool vma_flags_test_all_mask(vma_flags_t flags, vma_flags_t to_test);
void vma_flags_set_mask(vma_flags_t *flags, vma_flags_t to_set);
void vma_flags_clear_mask(vma_flags_t *flags, vma_flags_t to_clear);

Providing means of testing any flag, testing all flags, setting, and
clearing a specific vma_flags_t mask.

For convenience, helper macros are provided - vma_flags_test(),
vma_flags_set() and vma_flags_clear(), each of which utilise
mk_vma_flags() to make these operations easier, as well as an
EMPTY_VMA_FLAGS macro to make initialisation of an empty vma_flags_t value
easier, e.g.:

vma_flags_t flags = EMPTY_VMA_FLAGS;

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
...
if (vma_flags_test(flags, VMA_READ_BIT)) {
...
}
...
if (vma_flags_test_all_mask(flags, VMA_REMAP_FLAGS)) {
...
}
...
vma_flags_clear(&flags, VMA_READ_BIT);

Since callers are often dealing with a vm_area_struct (VMA) or
vm_area_desc (VMA descriptor as used in .mmap_prepare) object, this series
further provides helpers for these - firstly vma_set_flags_mask() and
vma_set_flags() for a VMA:

vma_flags_t flags = EMPTY_VMA_FLAGS:

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
...
vma_set_flags_mask(&vma, flags);
...
vma_set_flags(&vma, VMA_DONTDUMP_BIT);

Note that these do NOT ensure appropriate locks are taken and assume the
callers takes care of this.

For VMA descriptors this series adds vma_desc_[test, set,
clear]_flags_mask() and vma_desc_[test, set, clear]_flags() for a VMA
descriptor, e.g.:

static int foo_mmap_prepare(struct vm_area_desc *desc)
{
...
vma_desc_set_flags(desc, VMA_SEQ_READ_BIT);
vma_desc_clear_flags(desc, VMA_RAND_READ_BIT);
...
if (vma_desc_test_flags(desc, VMA_SHARED_BIT) {
...
}
...
}

With these helpers introduced, this series then updates all mmap_prepare
users to make use of the vma_flags_t vm_area_desc->vma_flags field rather
than the legacy vm_flags_t vm_area_desc->vm_flags field.

In order to do so, several other related functions need to be updated,
with separate patches for larger changes in hugetlbfs, secretmem and shmem
before finally removing vm_area_desc->vm_flags altogether.

This lays the foundations for future elimination of vm_flags_t and
associated defines and functionality altogether in the long run, and
elimination of the use of vm_flags_t in f_op->mmap() hooks in the near
term as mmap_prepare replaces these.

There is a useful synergy between the VMA flags and mmap_prepare work here
as with this change in place, converting f_op->mmap() to
f_op->mmap_prepare naturally also converts use of vm_flags_t to
vma_flags_t in all drivers which declare mmap handlers.

This accounts for the majority of the users of the legacy vm_flags_*()
helpers and thus a large number of drivers which need to interact with VMA
flags in general.

This series also updates the userland VMA tests to account for the change,
and adds unit tests for these helper functions to assert that they behave
as expected.

In order to faciliate this change in a sensible way, the series also
separates out the VMA unit tests into - code that is duplicated from the
kernel that should be kept in sync, code that is customised for test
purposes and code that is stubbed out.

We also separate out the VMA userland tests into separate files to make it
easier to manage and to provide a sensible baseline for adding the
userland tests for these helpers.

This patch (of 13):

We need to pass around these values and access them in a way that sparse
does not allow, as __private implies noderef, i.e.  disallowing
dereference of the value, which manifests as sparse warnings even when
passed around benignly.

Link: https://lkml.kernel.org/r/cover.1769097829.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/64fa89f416f22a60ae74cfff8fd565e7677be192.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: use unmap_desc struct for freeing page tables
Liam R. Howlett [Wed, 21 Jan 2026 16:49:46 +0000 (11:49 -0500)]
mm: use unmap_desc struct for freeing page tables

Pass through the unmap_desc to free_pgtables() because it almost has
everything necessary and is already on the stack.

Updates testing code as necessary.

No functional changes intended.

[Liam.Howlett@oracle.com: fix up unmap desc use on exit_mmap()]
Link: https://lkml.kernel.org/r/20260210214214.364856-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20260121164946.2093480-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/vma: use unmap_region() in vms_clear_ptes()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:45 +0000 (11:49 -0500)]
mm/vma: use unmap_region() in vms_clear_ptes()

There is no need to open code the vms_clear_ptes() now that unmap_desc
struct is used.

Link: https://lkml.kernel.org/r/20260121164946.2093480-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/vma: use unmap_desc in exit_mmap() and vms_clear_ptes()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:44 +0000 (11:49 -0500)]
mm/vma: use unmap_desc in exit_mmap() and vms_clear_ptes()

Convert vms_clear_ptes() to use unmap_desc to call unmap_vmas() instead of
the large argument list.  The UNMAP_STATE() cannot be used because the vma
iterator in the vms does not point to the correct maple state
(mas_detach), and the tree_end will be set incorrectly.  Setting up the
arguments manually avoids setting the struct up incorrectly and doing
extra work to get the correct pagetable range.

exit_mmap() also calls unmap_vmas() with many arguments.  Using the
unmap_all_init() function to set the unmap descriptor for all vmas makes
this a bit easier to read.

Update to the vma test code is necessary to ensure testing continues to
function.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: introduce unmap_desc struct to reduce function arguments
Liam R. Howlett [Wed, 21 Jan 2026 16:49:43 +0000 (11:49 -0500)]
mm: introduce unmap_desc struct to reduce function arguments

The unmap_region code uses a number of arguments that could use better
documentation.  With the addition of a descriptor for unmap (called
unmap_desc), the arguments can be more self-documenting and increase the
descriptions within the declaration.

No functional change intended

Link: https://lkml.kernel.org/r/20260121164946.2093480-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm: change dup_mmap() recovery
Liam R. Howlett [Wed, 21 Jan 2026 16:49:42 +0000 (11:49 -0500)]
mm: change dup_mmap() recovery

When the dup_mmap() fails during the vma duplication or setup, don't write
the XA_ZERO entry in the vma tree.  Instead, destroy the tree and free the
new resources, leaving an empty vma tree.

Using XA_ZERO introduced races where the vma could be found between
dup_mmap() dropping all locks and exit_mmap() taking the locks.  The race
can occur because the mm can be reached through the other trees via
successfully copied vmas and other methods such as the swapoff code.

XA_ZERO was marking the location to stop vma removal and pagetable
freeing.  The newly created arguments to the unmap_vmas() and
free_pgtables() serve this function.

Replacing the XA_ZERO entry use with the new argument list also means the
checks for xa_is_zero() are no longer necessary so these are also removed.

Note that dup_mmap() now cleans up when ALL vmas are successfully copied,
but the dup_mmap() fails to completely set up some other aspect of the
duplication.

Link: https://lkml.kernel.org/r/20260121164946.2093480-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/vma: add page table limit to unmap_region()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:41 +0000 (11:49 -0500)]
mm/vma: add page table limit to unmap_region()

The unmap_region() calls need to pass through the page table limit for a
future patch.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/memory: add tree limit to free_pgtables()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:40 +0000 (11:49 -0500)]
mm/memory: add tree limit to free_pgtables()

The ceiling and tree search limit need to be different arguments for the
future change in the failed fork attempt.  The ceiling and floor variables
are not very descriptive, so change them to pg_start/pg_end.

Adding a new variable for the vma_end to the function as it will differ
from the pg_end in the later patches in the series.

Add a kernel doc about the free_pgtables() function.

Test code also updated.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/vma: add limits to unmap_region() for vmas
Liam R. Howlett [Wed, 21 Jan 2026 16:49:39 +0000 (11:49 -0500)]
mm/vma: add limits to unmap_region() for vmas

Add a limit to the vma search instead of using the start and end of the
one passed in.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/mmap: abstract vma clean up from exit_mmap()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:38 +0000 (11:49 -0500)]
mm/mmap: abstract vma clean up from exit_mmap()

Create the new function tear_down_vmas() to remove a range of vmas.
exit_mmap() will be removing all the vmas.

This is necessary for future patches.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
7 weeks agomm/mmap: move exit_mmap() trace point
Liam R. Howlett [Wed, 21 Jan 2026 16:49:37 +0000 (11:49 -0500)]
mm/mmap: move exit_mmap() trace point

Move the trace point later in the function so that it is not skipped in
the event of a failed fork.

Link: https://lkml.kernel.org/r/20260121164946.2093480-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>