Raphael Zimmer [Tue, 19 May 2026 11:01:30 +0000 (13:01 +0200)]
libceph: Fix multiplication overflow in __decode_pg_upmap_items()
A message of type CEPH_MSG_OSD_MAP holds an OSD map, which typically
contains a pg_upmap part at its end. When decoding this part in
__decode_pg_upmap_items(), a len value is decoded from the message to
determine the number of items and the size of the allocation needed for
them. If the len value is greater than or equal to 2^31, an overflow
occurs in the multiplication that is performed to determine the needed
size of the incoming buffer to decode, as well as for the length of the
allocation for the ceph_pg_mapping struct. Subsequently, this results in
out-of-bounds writes (and reads) when decoding the incoming message
fields into the ceph_pg_mapping struct.
This patch fixes the issue by splitting the computation into multiple
parts and storing the results in local variables of type size_t. This
prevents the overflow by performing the multiplication in the
appropriate data type and ensures that the result is calculated only
once.
Alex Markuze [Thu, 7 May 2026 10:05:41 +0000 (10:05 +0000)]
selftests: ceph: wire up Ceph reset kselftests and documentation
Wire the CephFS reset test suite into the kselftest build:
- Add filesystems/ceph to the top-level selftests Makefile.
- Add the per-suite Makefile with run_validation.sh as TEST_PROGS.
- Add the settings file (kselftest timeout).
- Add the MAINTAINERS entry for the test directory.
- Add README with prerequisites, usage, and troubleshooting.
Alex Markuze [Tue, 28 Apr 2026 07:51:02 +0000 (07:51 +0000)]
selftests: ceph: add validation harness
Add a one-shot validation wrapper that orchestrates the full reset
test suite with per-stage watchdog timeouts and a final status check.
The harness runs five stages: baseline (no resets), corner cases,
moderate stress, aggressive stress, and a post-run status validation.
Each stage runs with an independent timeout so a hang in one stage
does not block the entire run.
Alex Markuze [Tue, 28 Apr 2026 07:50:51 +0000 (07:50 +0000)]
selftests: ceph: add reset corner-case tests
Add targeted corner-case tests for the CephFS manual session reset
feature. Four sequential tests cover:
[1/4] ebusy_rejection - second reset rejected while first in-flight
[2/4] dirty_caps_at_reset - reset with unflushed dirty caps
[3/4] flock_after_reset - stale lock EIO + fresh lock after holder exit
[4/4] unmount_during_reset - umount during active reset (ESHUTDOWN path)
Alex Markuze [Tue, 28 Apr 2026 07:50:39 +0000 (07:50 +0000)]
selftests: ceph: add reset stress test
Add a single-client stress test for the CephFS manual session reset
feature. The test runs concurrent I/O workers alongside periodic
reset injection, then validates data integrity via
validate_consistency.py.
Supports four profiles (baseline, moderate, aggressive, soak) with
configurable duration, reset interval, and worker counts.
Alex Markuze [Tue, 28 Apr 2026 07:50:27 +0000 (07:50 +0000)]
selftests: ceph: add reset consistency checker
Add a Python post-run validator for the CephFS client reset stress
test. The script reads data files written by the stress runner and
checks that every file was either written completely or is missing,
with no partial or corrupted content.
This is a prerequisite for the stress test script which invokes it
after each run.
Alex Markuze [Thu, 7 May 2026 08:54:07 +0000 (08:54 +0000)]
ceph: add manual reset debugfs control and tracepoints
Add the debugfs and trace plumbing used to trigger and observe
manual client reset.
The reset interface exposes a trigger file for operator-initiated
reset and a status file for tracking the most recent run. The
tracepoints record scheduling, completion, and blocked caller
behavior so reset progress can be diagnosed from the client side.
debugfs layout under /sys/kernel/debug/ceph/<client>/reset/:
trigger - write to initiate a manual reset
status - read to see the most recent reset result
The reset directory is cleaned up via debugfs_remove_recursive()
on the parent, so individual file dentries are not stored.
Tracepoints:
ceph_client_reset_schedule - reset queued
ceph_client_reset_complete - reset finished (success or failure)
ceph_client_reset_blocked - caller blocked waiting for reset
ceph_client_reset_unblocked - caller unblocked after reset
All tracepoints use a null-safe access for monc.auth->global_id
to guard against early-init or late-teardown edge cases.
Alex Markuze [Thu, 7 May 2026 09:11:59 +0000 (09:11 +0000)]
ceph: add client reset state machine and session teardown
Add the client-side reset state machine, request gating, and manual
session teardown implementation.
Manual reset is an operator-triggered escape hatch for client/MDS
stalemates in which caps, locks, or unsafe metadata state stop making
forward progress. The reset blocks new metadata work, attempts a
bounded best-effort drain of dirty client state while sessions are
still alive, and finally asks the MDS to close sessions before tearing
local session state down directly.
The reset state machine tracks four phases: IDLE -> QUIESCING ->
DRAINING -> TEARDOWN -> IDLE. QUIESCING is set synchronously by
schedule_reset() before the workqueue item is dispatched, so that new
metadata requests and file-lock acquisitions are gated immediately --
even before the work function begins running. All non-IDLE phases
block callers on blocked_wq, preventing races with session teardown.
The drain phase flushes mdlog state, dirty caps, and pending cap
releases for a bounded interval. State that still cannot make progress
within that interval is discarded during teardown, which is the point
of the reset: break the stalemate and allow fresh sessions to rebuild
clean state.
The session teardown follows the established check_new_map()
forced-close pattern: unregister sessions under mdsc->mutex, then clean
up caps and requests under s->s_mutex. Reconnect is not attempted
because the MDS only accepts reconnects during its own RECONNECT phase
after restart, not from an active client.
Blocked callers are released when reset completes and observe the final
result via -EAGAIN (reset failed) or 0 (success). Internal work-function
errors such as -ENOMEM are not propagated to unrelated callers like
open() or flock(); the detailed error remains in debugfs and
tracepoints.
The work function checks st->shutdown before each phase transition
(DRAINING, TEARDOWN) so that a concurrent ceph_mdsc_destroy() is not
overwritten. If destroy already took ownership, the work function
releases session references and returns without touching the state.
The timeout calculation for blocked-request waiters uses max_t() to
prevent jiffies underflow when the deadline has already passed.
The close-grace sleep before teardown is a best-effort nudge to let
queued REQUEST_CLOSE messages egress; it is not a correctness
requirement since the MDS still has session_autoclose as a fallback.
The destroy path marks reset as failed and wakes blocked waiters before
cancel_work_sync() so unmount does not stall.
Alex Markuze [Thu, 7 May 2026 08:45:27 +0000 (08:45 +0000)]
ceph: add diagnostic timeout loop to wait_caps_flush()
Convert wait_caps_flush() from a silent indefinite wait into a diagnostic
wait loop that periodically dumps pending cap flush state.
The underlying wait semantics remain intact: callers still wait until the
requested cap flushes complete. The difference is that long stalls now
produce actionable diagnostics instead of looking like a silent hang.
CEPH_CAP_FLUSH_MAX_DUMP_ENTRIES limits the number of entries
emitted per diagnostic dump, and CEPH_CAP_FLUSH_MAX_DUMP_ITERS
limits the number of timed diagnostic dumps before the wait
continues silently. When more entries exist than the per-dump
limit, a truncation count is reported. When the dump iteration
limit is reached, a final suppression message is emitted so the
transition to silence is explicit.
The diagnostic dump collects flush entry data under cap_dirty_lock into
a bounded on-stack array, then prints after releasing the lock. This
avoids holding the spinlock across printk calls.
A null cf->ci on the global flush list indicates a bug since all
cap_flush entries are initialized with a valid ci before being added.
Signal this with WARN_ON_ONCE while still printing enough context for
debugging.
READ_ONCE is used for the i_last_cap_flush_ack field, which is read
outside the inode lock domain. Flush tids are monotonically increasing
and acks are processed in order under i_ceph_lock, so the latest ack
tid is always the most recently written value.
Add a ci pointer to struct ceph_cap_flush so that the diagnostic
dump can identify which inode each pending flush belongs to. The
new i_last_cap_flush_ack field tracks the latest acknowledged flush
tid per inode for diagnostic correlation.
This improves reset-drain observability and is also useful for
existing sync and writeback troubleshooting paths.
Alex Markuze [Tue, 28 Apr 2026 07:43:03 +0000 (07:43 +0000)]
ceph: harden send_mds_reconnect and handle active-MDS peer reset
Change send_mds_reconnect() to return an error code so callers can detect
and report reconnect failures instead of silently ignoring them. Add early
bailout checks for sessions that are already closed, rejected, or
unregistered, which avoids sending reconnect messages for sessions that
can no longer be recovered.
The early -ESTALE and -ENOENT bailouts use a separate fail_return label
that skips the pr_err_client diagnostic, since these codes indicate
expected concurrent-teardown races rather than genuine reconnect build
failures.
Move the "reconnect start" log after the early-bailout checks so it
only appears for sessions that actually proceed with reconnect.
Save the prior session state before transitioning to RECONNECTING,
and restore it in the failure path. Without this, a transient
build or encoding failure (-ENOMEM, -ENOSPC) strands the session
in RECONNECTING indefinitely because check_new_map() only retries
sessions in RESTARTING state.
Rewrite mds_peer_reset() to handle the case where the MDS is past its
RECONNECT phase (i.e. active). An active MDS rejects CLIENT_RECONNECT
messages because it only accepts them during its own RECONNECT window
after restart. Previously, the client would send a doomed reconnect
that the MDS would reject or ignore. Now, the client tears the session
down locally and lets new requests re-open a fresh session, which is
the correct recovery for this scenario. The RECONNECTING state is
handled on the same teardown path, since the MDS will reject reconnect
attempts from an active client regardless of the session's local state.
Add explicit cases for CLOSED and REJECTED session states in
mds_peer_reset() since these are terminal states where a connection
drop is expected behavior.
The session teardown path in mds_peer_reset() follows the established
drop-and-reacquire locking pattern from check_new_map(): take
mdsc->mutex for session unregistration, release it, then take s->s_mutex
separately for cleanup. This avoids introducing a new simultaneous lock
nesting pattern.
Log reconnect failures from check_new_map() and mds_peer_reset() at
pr_warn level rather than pr_err, since return codes like -ESTALE
(closed/rejected session) and -ENOENT (unregistered session) are
expected during concurrent teardown. Log dropped messages for
unregistered sessions via doutc() (dynamic debug) rather than
pr_info, as post-reset message arrival is routine and does not
warrant unconditional logging.
Alex Markuze [Tue, 28 Apr 2026 07:41:33 +0000 (07:41 +0000)]
ceph: use proper endian conversion for flock_len in reconnect
Replace the __force __le32 cast with cpu_to_le32() for the flock_len field
in reconnect_caps_cb(). The old code used a type-system bypass to silence
sparse; the new form uses the proper endian conversion macro.
Also switch from a raw bitmask test against i_ceph_flags to test_bit() on
the named CEPH_I_ERROR_FILELOCK_BIT, which is the correct accessor for the
unsigned long flags field after the bit-position conversion.
Remove the now-unused CEPH_I_ERROR_FILELOCK mask define since all callers
use the _BIT form with test_bit/set_bit/clear_bit.
Alex Markuze [Thu, 7 May 2026 08:05:30 +0000 (08:05 +0000)]
ceph: convert inode flags to named bit positions and atomic bitops
Define named bit-position constants for all CEPH_I_* inode flags and
derive the bitmask values from them. This gives every flag a named
_BIT constant usable with the test_bit/set_bit/clear_bit family.
The intentionally unused bit position 1 is documented inline.
Convert all flag modifications to use atomic bitops (set_bit,
clear_bit, test_and_clear_bit). The previous code mixed lockless
atomic ops on some flags (ERROR_WRITE, ODIRECT) with non-atomic
read-modify-write (|= / &= ~) on other flags sharing the same
unsigned long. A concurrent non-atomic RMW can clobber an
adjacent lockless atomic update -- for example, a lockless
clear_bit(ERROR_WRITE) could be silently resurrected by a
concurrent ci->i_ceph_flags |= CEPH_I_FLUSH under the spinlock.
Using atomic bitops for all modifications eliminates this class
of race entirely.
Flags whose only users are now the _BIT form (ERROR_WRITE,
ASYNC_CHECK_CAPS) have their old mask defines removed to document
that callers must use the _BIT constant with the set_bit/test_bit
family. ERROR_FILELOCK and SHUTDOWN retain their mask defines
because they are still used via bitmask tests in lockless readers
(ceph_inode_is_shutdown, reconnect_caps_cb).
The direct assignment in ceph_finish_async_create() is converted
from i_ceph_flags = CEPH_I_ASYNC_CREATE to set_bit(). This
inode is I_NEW at this point -- still invisible to other threads
and guaranteed to have zero flags from alloc_inode -- so either
form is safe, but set_bit() keeps the conversion uniform.
Dhiraj Mishra [Thu, 7 May 2026 19:19:19 +0000 (23:19 +0400)]
libceph, ceph: reject oversized mon and MDS data segments
Monitor and MDS messages can be allocated with only front-buffer storage
and no data items. The messenger receive path copies the wire header
into the selected ceph_msg and later uses hdr.data_len to decide whether
to initialize a data cursor.
If a malicious or compromised peer advertises a non-zero data segment for
one of these front-only messages, the receive path can call
ceph_msg_data_cursor_init() with length greater than msg->data_length and
hit its BUG_ON() checks, crashing the client kernel.
I verified the monitor issue against v7.1-rc1-123-ge75a43c7cec4. The
msgr2 trigger path is present since commit cd1a677cad99 ("libceph,
ceph: implement msgr2.1 protocol (crc and secure modes)"), which is
contained in v5.11-rc1 and later. The allocator patterns are older, but
I have not tested older msgr1-only kernels.
A concrete monitor trigger is a monitor connection over msgr2 after
CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type
handled by mon_alloc_msg(), a valid front_len for that message type and
data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is
allocated with ceph_msg_new(), leaving msg->data_length and
msg->num_data_items as zero.
The MDS allocation path has the same issue: mds_alloc_msg() allocates the
incoming message with ceph_msg_new(type, front_len, GFP_NOFS, false),
leaving msg->data_length as zero for a front-only message selected from
an attacker-controlled wire header.
Reject monitor and MDS messages whose wire data segment is larger than
the data backing allocated for the selected ceph_msg, mirroring the
existing OSD reply hardening.
Initial analysis has shown that if we try to write out of
end of file, then ceph_write_begin() is responsible for
the issue because it calls netfs_write_begin() and we have
such logic:
The reason of the issue that somehow we have folio in uptodate
state and netfs_write_begin() simply skips the logic of
reading existing file's content.
Futher analysis revealed that we call ceph_fill_inode() and
ceph_fill_inline_data() before ceph_write_begin().
The race sequence:
1. Read completes -> netfs_read_collection() runs
2. netfs_wake_rreq_flag(rreq, NETFS_RREQ_IN_PROGRESS, ...)
3. netfs_wait_for_read() returns -EFAULT to netfs_write_begin()
4. The netfs_unlock_abandoned_read_pages() unlocks the folio
5. netfs_write_begin() calls folio_unlock(folio) -> VM_BUG_ON_FOLIO()
The key reason of the issue that netfs_unlock_abandoned_read_pages()
doesn't check the flag NETFS_RREQ_NO_UNLOCK_FOLIO and executes
folio_unlock() unconditionally. This patch implements in
netfs_unlock_abandoned_read_pages() logic similar to
netfs_unlock_read_folio().
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
cc: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: Ceph Development <ceph-devel@vger.kernel.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Ilya Dryomov [Tue, 19 May 2026 21:07:26 +0000 (23:07 +0200)]
rbd: eliminate a race in lock_dwork draining on unmap
Given how rbd_lock_add_request() and rbd_img_exclusive_lock() are
written, lock_dwork may be (re)queued more than it's actually needed:
for example in case a new I/O request comes in while we are in the
middle of rbd_acquire_lock() on behalf of another I/O request. This is
expected and with rbd_release_lock() preemptively canceling lock_dwork
is benign under normal operation.
A more problematic example is maybe_kick_acquire():
It's not unrealistic for lock_dwork to get canceled right after
delayed_work_pending() returns true and for mod_delayed_work() to
requeue it right there anyway. This is a classic TOCTOU race.
When it comes to unmapping the image, there is an implicit assumption
of no self-initiated exclusive lock activity past the point of return
from rbd_dev_image_unlock() which unlocks the lock if it happens to be
held. This unlock is assumed to be final and lock_dwork (as well as
all other exclusive lock tasks, really) isn't expected to get queued
again. However, lock_dwork is canceled only in cancel_tasks_sync()
(i.e. later in the unmap sequence) and on top of that the cancellation
can get in effect nullified by maybe_kick_acquire(). This may result
in rbd_acquire_lock() executing after rbd_dev_device_release() and
rbd_dev_image_release() run and free and/or reset a bunch of things.
One of the possible failure modes then is a violated
Linus Torvalds [Sun, 17 May 2026 19:02:31 +0000 (12:02 -0700)]
Merge tag 'trace-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Add more functions to the remote allowed list
randconfig found more functions that are allowed for the remote code
for s390 and arm. Add them to the allowed list.
- Fix remote_test error path
If one of the simple ring buffers fails to load, the code is supposed
to rollback its initialized buffers. Instead of rolling back the
buffers for the failed load, it uses the global variable and rolls
back all the successfully loaded buffers.
* tag 'trace-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix desc in error path for the trace remote test module
ring-buffer remote: Avoid unexpected symbol warnings (arm, s390)
Carlos López [Tue, 12 May 2026 10:00:41 +0000 (12:00 +0200)]
virt: sev-guest: Do not use host-controlled page order in cleanup path
When issuing an extended guest request (SVM_VMGEXIT_EXT_GUEST_REQUEST),
get_ext_report() allocates a buffer to retrieve a certificate blob from the
host, keeping track of its size in report_req->certs_len.
However, the host may return SNP_GUEST_VMM_ERR_INVALID_LEN, indicating
an invalid buffer size, as well as the expected length of such buffer.
get_ext_report() subsequently updates report_req->certs_len with the
host-controlled value, and cleans up the buffer by computing a page order
from such value. This is incorrect, as the host-provided length may not
match the page order of the original allocation, potentially resulting
in corruption in the page allocator.
Fix this by using alloc_pages_exact() instead, and reusing @npages to
compute the size passed to free_pages_exact(). For consistency, also
use @npages to compute the size when allocating the pages, even though
this last change has no functional effect.
Fixes: 3e385c0d6ce8 ("virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex") Signed-off-by: Carlos López <clopez@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Michael Roth <michael.roth@amd.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 17 May 2026 18:07:09 +0000 (11:07 -0700)]
Merge tag 'timers-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
- Fix potential garbage reads in the vDSO gettimeofday code
(Thomas Weißschuh)
* tag 'timers-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
vdso/gettimeofday: Reload sequence counter after switch to time page in do_aux()
Linus Torvalds [Sun, 17 May 2026 17:34:15 +0000 (10:34 -0700)]
Merge tag 'irq-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ fixes from Ingo Molnar:
- Fix use-after-free in irq_work_single() on PREEMPT_RT (Jiayuan Chen)
- Don't call add_interrupt_randomness() for NMIs in
handle_percpu_devid_irq() (Mark Rutland)
- Remove unused function in the ath79-cpu irqchip driver causing LKP
CI build warnings (Rosen Penev)
- Fix IRQ allocation/teardown leakage regressions in the GICv5 irqchip
driver (Sascha Bischoff)
- Fix an IRQ trigger type regression in the Meson S4 SoC irqchip driver
(Xianwei Zhao)
- Fix CPU offlining regression in the RiscV IMSIC irqchip driver
(Yong-Xuan Wang)
* tag 'irq-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT
irqchip/riscv-imsic: Clear interrupt move state during CPU offlining
irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()
irqchip/ath79-cpu: Remove unused function
genirq/chip: Don't call add_interrupt_randomness() for NMIs
irqchip/gic-v5: Allocate ITS parent LPIs as a range
irqchip/gic-v5: Support range allocation for LPIs
irqchip/gic-v5: Move LPI allocation into the LPI domain
Linus Torvalds [Sun, 17 May 2026 16:33:49 +0000 (09:33 -0700)]
Merge tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Relatively low-impact fixes. Probably the most notable one is that we
no longer ask the monitor-mode firmware to delegate misaligned access
handling to the kernel by default, since the kernel code needs
significant improvement to match the functionality of the firmware.
This change avoids functional problems at some cost in performance,
but shouldn't affect any system with misaligned access handling in
hardware.
- Disable satp register probing when no5lvl is specified on the
kernel command line
- Fix a CFI-related issue with the misaligned access speed
measurement code
- Reduce the CFI shadow stack size limit from 4GB to 2GB (following
ARM64 GCS)
- Prevent the kernel from requesting delegation of misaligned access
faults unless a new Kconfig option, RISCV_SBI_FWFT_DELEGATE_MISALIGNED,
is enabled. This will depend on CONFIG_NONPORTABLE until the
deficiencies of the kernel misaligned access fixup code are fixed
- Fix some potential uninitialized memory accesses in error paths in
compat_riscv_gpr_set() and compat_restore_sigcontext()
- Fix a bug in the RISC-V MIPS vendor errata patching code where a
logical-and was used in place of a bitwise-and
- Drop some unnecessary code in riscv_fill_hwcap_from_isa_string()
- Use macros for isa2hwcap indices in riscv_fill_hwcap(), rather than
open-coding them
- Fix some documentation typos (one affecting 'make htmldocs')"
* tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: misaligned: Make enabling delegation depend on NONPORTABLE
riscv: Docs: fix unmatched quote warning
riscv: cfi: reduce shadow stack size limit from 4GB to 2GB
riscv: cpufeature: Use pre-defined ISA ext macros to index isa2hwcap
riscv: mm: Fixup no5lvl failure when vaddr is invalid
riscv: Fix register corruption from uninitialized cregs on error
riscv: errata: Fix bitwise vs logical AND in MIPS errata patching
Documentation: riscv: cmodx: fix typos
riscv: cpufeature: Drop this_hwcap clear in T-Head vector workaround
riscv: Define __riscv_copy_{,vec_}{words,bytes}_unaligned() using SYM_TYPED_FUNC_START
- sy7636a: Fix sysfs attribute name in documentation
* tag 'hwmon-for-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (lm90) Add lock protection to lm90_alert
hwmon: (lm90) Stop work before releasing hwmon device
docs: hwmon: sy7636a: fix temperature sysfs attribute name
hwmon: (asus_atk0110) Check ACPI_COMPANION() against NULL
hwmon: (acpi_power_meter) Check ACPI_COMPANION() against NULL
tracing: Fix desc in error path for the trace remote test module
During initialisation in remote_test_load(), if one of the
simple_ring_buffer fails to initialise, the error path attempts to
rollback initialised buffers. However, the rollback incorrectly uses the
global pointer to the trace descriptor, which is only set upon
successful load completion. Fix the error path by using the local
pointer to the descriptor.
Linus Torvalds [Sat, 16 May 2026 16:53:14 +0000 (09:53 -0700)]
Merge tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix preempt count leak in sysfs show paths
- Fix error handling in pika_dtm_thread
- Remove pmac_low_i2c_{lock,unlock}()
- Enable all windfarms by default
- Fix dead default for GUEST_STATE_BUFFER_TEST
- Remove redundant preempt_disable|enable() calls from
arch_irq_work_raise()
Thanks to Aboorva Devarajan, Ally Heev, Amit Machhiwal, Bart Van Assche,
Christophe Leroy, Christophe Leroy (CS GROUP), Dan Carpenter, Gautam
Menghani, Harsh Prateek Bora, Julian Braha, Krzysztof Kozlowski, Linus
Walleij, Ma Ke, Ritesh Harjani (IBM), and Sayali Patil
* tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/time: Remove redundant preempt_disable|enable() calls from arch_irq_work_raise()
powerpc/hv-gpci: fix preempt count leak in sysfs show paths
powerpc: fix dead default for GUEST_STATE_BUFFER_TEST
powerpc/powermac: Remove pmac_low_i2c_{lock,unlock}()
powerpc/warp: Fix error handling in pika_dtm_thread
powerpc: 82xx: fix uninitialized pointers with free attribute
powerpc/g5: Enable all windfarms by default
Linus Torvalds [Sat, 16 May 2026 16:32:30 +0000 (09:32 -0700)]
Merge tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All device-specific small changes:
HD-audio:
- Fix NULL pointer dereference in snd_hda_ctl_add()
- ACPI and Kconfig fixes for Cirrus drivers
- A regression fix CA0132 codec
- Various device-specific quirks for HP, Lenovo, Samsung, Framework etc
- Documentation path fix
USB-audio:
- Boundary checks for MIDI endpoint descriptors
- Offload mapping error handling for Qualcomm
- A new device quirk for TTGK Technology USB-C Audio
- A fix for Focusrite Scarlett2 mixer"
* tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/ca0132: Disable auto-detect on manual output select
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion Laptop 16-ag0xxx
ALSA: hda/realtek: ALC269 fixup for Lenovo Yoga Pro 7 15ASH111 audio
ALSA: hda: Fix NULL pointer dereference in snd_hda_ctl_add()
ALSA: hda/realtek: Add quirk for Samsung Galaxy Book5 360 headphone
ALSA: hda/cs35l56: Drop malformed default N from Kconfig
ALSA: hda/realtek: fix mic boost on Framework PTL
ALSA: hda/realtek: Limit mic boost on Positivo DN50E
ALSA: doc: cs35l56: Update path to HDA driver source
ALSA: usb-audio: qcom: Check offload mapping failures
ALSA: hda/realtek: Fix Legion 7 16ITHG6 speaker amp binding
ALSA: usb-audio: Add iface reset and delay quirk for TTGK Technology USB-C Audio
ALSA: scarlett2: Add missing error check when initialise Autogain Status
ALSA: hda: cs35l41: Put ACPI device on missing physical node
ALSA: hda: cs35l56: Put ACPI device after setting companion
ALSA: usb-audio: Bound MIDI 2.0 endpoint descriptor scans
ALSA: usb-audio: Bound MIDI endpoint descriptor scans
ALSA: hda/realtek: Add codec SSID quirk for Lenovo Yoga Pro 9 16IMH9 (17aa:38d5)
Guenter Roeck [Thu, 14 May 2026 21:41:00 +0000 (14:41 -0700)]
hwmon: (lm90) Add lock protection to lm90_alert
Sashiko reports:
lm90_alert() executes in the smbus alert context and calls
lm90_update_confreg() to disable the hardware alert line, without
acquiring hwmon_lock.
Concurrently, sysfs write operations (such as lm90_write_convrate) hold
the hwmon_lock, temporarily modify data->config, and then restore it.
If an alert interrupt occurs concurrently with a sysfs write, the sysfs
path will overwrite the alert handler's modifications to data->config
and the hardware register.
This unintentionally re-enables the hardware alert line while the alarm is
still active, causing an interrupt storm.
Add the missing lock to lm90_alert() to solve the problem.
Fixes: 7a1d220ccb0cc ("hwmon: (lm90) Introduce function to update configuration register") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Guenter Roeck [Thu, 14 May 2026 21:31:49 +0000 (14:31 -0700)]
hwmon: (lm90) Stop work before releasing hwmon device
Sashiko reports:
In lm90_probe(), the devm action to cancel the alert_work and report_work
(lm90_restore_conf) is registered in lm90_init_client() before
devm_hwmon_device_register_with_info() is called.
Because devm executes cleanup actions in reverse order during module
unbind or probe failure, the hwmon device is unregistered and freed first.
If lm90_alert_work() or lm90_report_alarms() runs in the window between
the hwmon device being freed and the delayed works being cancelled,
lm90_update_alarms() will dereference the freed data->hwmon_dev here.
Fix the problem by canceling the workers separately after registering
the hwmon device and before registering the interrupt handler. This ensures
that the workers are canceled after interrupts are disabled and before
the hwmon device is released. Add "shutdown" flag to indicate that device
shutdown is in progress to prevent workers from being re-armed.
Linus Torvalds [Sat, 16 May 2026 00:00:45 +0000 (17:00 -0700)]
Merge tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes pull, small and all over fixes, mostly xe and amdgpu,
with some ttm and a core fix for the handle change pain.
core:
- fix for the fix for the handle change race
ttm:
- avoid infinite loop in swap out
- avoid infinite loop in BO shrinking
- convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC
bridge:
- imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup
i915:
- Skip __i915_request_skip() for already signaled requests
- Fix VSC dynamic range signaling for RGB formats [dp]
xe:
- Madvise fix around purgeability tracking
- Restore engine mask for specific blitter style
- Couple UAF fixes
- Drop unused ggtt_balloon field
loongson:
- use managed cleanup for connector polling
panfrost:
- handle results from reservation locking correctly
qaic:
- check for integer overflows in mmap logic
rocket:
- handle results from reservation locking correctly"
* tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
drm: Replace old pointer to new idr
drm/loongson: Use managed KMS polling
drm/ttm: Fix ttm_bo_shrink() infinite LRU walk on backup failure
drm/ttm: Convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC
drm/gma500/oaktrail_lvds: fix i2c adapter leaks on init
drm/gma500/oaktrail_lvds: fix hang on init failure
drm/gma500/oaktrail_hdmi: fix i2c adapter leak on setup
drm/xe: Drop unused ggtt_balloon field
accel/qaic: Add overflow check to remap_pfn_range during mmap
drm/i915/dp: Fix VSC dynamic range signaling for RGB formats
drm/i915: skip __i915_request_skip() for already signaled requests
drm/bridge: imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup
drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12
drm/amd/ras: Fix CPER ring debugfs read overflow
drm/amd/display: Wrap DCN32 phantom-plane allocation in DC_RUN_WITH_PREEMPTION_ENABLED
drm/amdgpu: fix userq hang detection and reset
drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues
drm/amdgpu: rework amdgpu_userq_signal_ioctl v3
drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset
drm/xe/dma-buf: fix UAF with retry loop
...
Commit 5e28b7b94408 introduced a logical error by failing to replace the
newly generated IDR pointer to old id's pointer at the correct location
within the "change handle" logic; this resulted in the issue reported by
syzbot [1].
Specifically, the new IDR object pointer is intended to replace the original
id's pointer during the normal execution flow.
Additionally, an unnecessary conditional check for the ret exit path has
been removed.
Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Reported-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7c9eed171647e421013 Cc: stable@vger.kernel.org Tested-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/tencent_C267296443AAA4567771176886DFF364A305@qq.com
Linus Torvalds [Fri, 15 May 2026 22:40:25 +0000 (15:40 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 MPAM fixes from Catalin Marinas:
- Fix NULL dereference and a false-positive warning when the driver
probes hardware with surprising version numbers
- Fix writing values to the wrong registers when probing
cache-utilisation counters. Replace 'NRDY' probing with a version
that is robust for platforms where the bit is writeable by both
hardware and software
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm_mpam: Check whether the config array is allocated before destroying it
arm_mpam: Fix false positive assert failure during mpam_disable()
arm_mpam: Improve check for whether or not NRDY is hardware managed
arm_mpam: Pretend that NRDY is always hardware managed
arm_mpam: Fix monitor instance selection when checking for hardware NRDY
Linus Torvalds [Fri, 15 May 2026 22:22:26 +0000 (15:22 -0700)]
Merge tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
"This is probably the largest fixes pull-request ever sent for IOMMU. I
partially blame it on AI code review which found some issues but there
is also some rework in here to fix issues in the iommu parts of PCI
device reset.
AMD-Vi:
- Add bounds checks to debugfs and table lookups
Intel VT-d:
- Apply an existing quirk for Q35 graphic device
- Skip dev_pasid teardown for the blocked domain to avoid
out-of-bounds access
- Return early if dev_pasid is missing to prevent NULL dereference
or UAF
Core:
- Fix bugs and corner cases in pci_dev_reset_iommu_prepare/done()
- Fix various issues found by AI in iommupt code
MAINTAINERS email address update for RISCV IOMMU"
* tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
MAINTAINERS: update Tomasz Jeznach's email address
iommupt: Fix the end_index calculation in __map_range_leaf()
iommupt: Check for missing PAGE_SIZE in the pgsize_bitmap
iommu: Handle unmap error when iommu_debug is enabled
iommu: Fix up map/unmap debugging for iommupt domains
iommu: Fix loss of errno on map failure for classic ops
iommu/vt-d: Avoid NULL pointer dereference or refcount corruption
iommu/vt-d: Fix oops due to out of scope access
iommu/vt-d: Disable DMAR for Intel Q35 IGFX
iommu: Warn on premature unblock during DMA aliased sibling reset
iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset
iommu: Fix ATS invalidation timeouts during __iommu_remove_group_pasid()
iommu: Fix nested pci_dev_reset_iommu_prepare/done()
iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done()
iommu: Replace per-group resetting_domain with per-gdev blocked flag
iommu: Fix kdocs of pci_dev_reset_iommu_done()
iommu: Fix NULL group->domain dereference in pci_dev_reset_iommu_done()
iommu/amd: Bounds-check devid in __rlookup_amd_iommu()
iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs
Linus Torvalds [Fri, 15 May 2026 22:13:02 +0000 (15:13 -0700)]
Merge tag 'vfio-v7.1-rc4' of https://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Convert vfio-pci BAR resource requests and iomaps initialization
from a lazy, on-demand model to an eager pre-allocation model to
avoid races while preserving legacy error behavior. Fix unchecked
barmap access in dma-buf export path (Matt Evans)
- Introduce an implicit unsigned cast in converting vfio-pci device
offsets to region indexes, closing a potential out-of-bounds
access through the vfio_pci_ioeventfd() interface (Matt Evans)
- Fix a dma-buf kref underflow and stuck wait_for_completion() when
closing a previously revoked dma-buf (Alex Williamson)
* tag 'vfio-v7.1-rc4' of https://github.com/awilliam/linux-vfio:
vfio/pci: Check BAR resources before exporting a DMABUF
vfio/pci: Set up BAR resources and maps in vfio_pci_core_enable()
vfio/pci: Make VFIO_PCI_OFFSET_TO_INDEX() return unsigned
vfio/pci: fix dma-buf kref underflow after revoke
Linus Torvalds [Fri, 15 May 2026 21:52:17 +0000 (14:52 -0700)]
Merge tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix integer overflow in read
- Fix smbdirect error cleanup
- Multichannel reconnect fix
- Add some missing defines and correct some references to protocol spec
- Fix oob symlink read
* tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()
smb: client: avoid integer overflow in SMB2 READ length check
cifs: client: stage smb3_reconfigure() updates and restore ctx on failure
smb/client: fix possible infinite loop and oob read in symlink_data()
SMB3.1.1: add missing QUERY_DIR info levels
Linus Torvalds [Fri, 15 May 2026 21:48:09 +0000 (14:48 -0700)]
Merge tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"An important patch from Hristo that squashes a folio reference leak
that could lead to OOM kills in CephFS and a number of miscellaneous
fixes from Raphael and Slava.
All but two are marked for stable"
* tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client:
libceph: Fix potential null-ptr-deref in decode_choose_args()
libceph: handle rbtree insertion error in decode_choose_args()
libceph: Fix potential out-of-bounds access in osdmap_decode()
ceph: put folios not suitable for writeback
ceph: add ceph_has_realms_with_quotas() check to ceph_quota_update_statfs()
libceph: Fix potential out-of-bounds access in __ceph_x_decrypt()
ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size
ceph: fix a buffer leak in __ceph_setxattr()
libceph: Fix unnecessarily high ceph_decode_need() for uniform bucket
libceph: Fix potential out-of-bounds access in crush_decode()
Linus Torvalds [Fri, 15 May 2026 20:22:07 +0000 (13:22 -0700)]
Merge tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fixup warning when allocating memory for readahead, __GFP_NOWARN was
accidentally dropped when setting mapping constraints
- in tracepoint of file sync, fix sleeping in atomic context when
handling dentries
- harden initial loading of block group on crafted/fuzzed images,
iterate all chunk mapping entries unconditionally
- fix freeing pages of submitted io after checking for errors
- fix incorrect inode size after remount when using fallocate KEEP_SIZE
mode (also requires disabled 'no-holes' feature)
* tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap
btrfs: only release the dirty pages io tree after successful writes
btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()
btrfs: always pass __GFP_NOWARN from add_ra_bio_pages()
btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
Linus Torvalds [Fri, 15 May 2026 20:17:46 +0000 (13:17 -0700)]
Merge tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"A few bug fixes, nothing really special stands out"
* tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Fix typo in comment
xfs: fix the "limiting open zones" message
xfs: flush delalloc blocks on ENOSPC in xfs_trans_alloc_icreate
xfs: check da node block pad field during scrub
xfs: fix memory leak for data allocated by xfs_zone_gc_data_alloc()
xfs: fix memory leak on error in xfs_alloc_zone_info()
xfs: check directory data block header padding in scrub
xfs: zero directory data block padding on write verification
xfs: zero entire directory data block header region at init
xfs: remove the meaningless XFS_ALLOC_FLAG_FREEING
Linus Torvalds [Fri, 15 May 2026 20:11:41 +0000 (13:11 -0700)]
Merge tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Fixes for this release:
- Correctness fix for the new sunrpc cache netlink protocol
Marked for stable:
- Correctness fixes for delegated attributes
- Prevent an infinite loop when revoking layouts"
* tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Fix infinite loop in layout state revocation
sunrpc: start cache request seqno at 1 to fix netlink GET_REQS
nfsd: update mtime/ctime on COPY in presence of delegated attributes
nfsd: update mtime/ctime on CLONE in presense of delegated attributes
nfsd: fix file change detection in CB_GETATTR
nfsd: fix GET_DIR_DELEGATION when VFS leases are disabled
Linus Torvalds [Fri, 15 May 2026 19:47:00 +0000 (12:47 -0700)]
Merge tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- NVMe merge request via Keith:
- Fix memory leak on a passthrough integrity mapping failure (Keith)
- Hide secrets behind debug option (Hannes)
- Fix pci use-after-free for host memory buffer (Chia-Lin Kao)
- Fix tcp taregt use-after-free for data digest (Sagi)
- Revert a mistaken quirk (Alan Cui)
- Fix uevent and controller state race condition (Maurizio)
- Fix apple submission queue re-initialization (Nick Chan)
- Three fixes for blk-integrity, fixing an issue with the user data
mapping and two problems with recomputing number of segments
- Two fixes for the iov_iter bounce buffering
- Fix for the handling of dead zoned write plugs
- ublk max_sectors validation fix, with associated selftest addition
* tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
nvme-apple: Reset q->sq_tail during queue init
block: align down bounces bios
block: pass a minsize argument to bio_iov_iter_bounce
selftests: ublk: cap nthreads to kernel's actual nr_hw_queues
block: fix handling of dead zone write plugs
block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()
block: recompute nr_integrity_segments in blk_insert_cloned_request
block: don't overwrite bip_vcnt in bio_integrity_copy_user()
nvme: fix race condition between connected uevent and STARTED_ONCE flag
Revert "nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808"
nvmet-tcp: Fix potential UAF when ddgst mismatch
nvme-pci: fix use-after-free in nvme_free_host_mem()
nvmet-auth: Do not print DH-HMAC-CHAP secrets
nvme: fix bio leak on mapping failure
nvme: make prp passthrough usage less scary
ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation
Linus Torvalds [Fri, 15 May 2026 19:34:02 +0000 (12:34 -0700)]
Merge tag 'io_uring-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Small series sanitizing the locking done for either modifying or
reading a chain of requests
- If the application has a pid namespace, ensure that the sqthread pid
is correctly printed in fdinfo
- Fix for a hashing issue in the io-wq thread pool, which could lead to
a use-after-free
- Kill dead argument from io_prep_rw_pi()
- Fix for a missed validation of the CQ ring head, affecting CQE refill
* tag 'io_uring-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: validate user-controlled cq.head in io_cqe_cache_refill()
io-wq: check that the predecessor is hashed in io_wq_remove_pending()
io_uring/rw: drop unused attr_type_mask from io_prep_rw_pi()
io_uring: hold uring_lock across io_kill_timeouts() in cancel path
io_uring: defer linked-timeout chain splice out of hrtimer context
io_uring: hold uring_lock when walking link chain in io_wq_free_work()
io_uring/fdinfo: translate SqThread PID through caller's pid_ns
Linus Torvalds [Fri, 15 May 2026 19:27:03 +0000 (12:27 -0700)]
Merge tag 'hardening-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fix from Kees Cook:
- gcc-plugins: Fix GCC 16 removal of CONST_CAST macros
* tag 'hardening-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: Always define CONST_CAST_GIMPLE and CONST_CAST_TREE
Linus Torvalds [Fri, 15 May 2026 19:24:09 +0000 (12:24 -0700)]
Merge tag 'docs-7.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux
Pull documentation fixes from Jonathan Corbet:
"This is Willy Tarreau's new document clarifying the definition and
handling of security-related bugs, which we're trying to get out there
quickly on the theory that some of the bug reporters might actually
read and pay attention to it"
* tag 'docs-7.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux:
docs: threat-model: don't limit root capabilities to CAP_SYS_ADMIN
docs: security-bugs: add a link to the threat-model documentation
Documentation: security-bugs: clarify requirements for AI-assisted reports
Documentation: security-bugs: explain what is and is not a security bug
Documentation: security-bugs: do not systematically Cc the security team
Arnd Bergmann [Fri, 15 May 2026 10:57:09 +0000 (12:57 +0200)]
ring-buffer remote: Avoid unexpected symbol warnings (arm, s390)
The now more verbose check found more architecture specific symbol
missing from the whitelist, during randconfig testing on s390
and 32-bit arm:
Unexpected symbols in kernel/trace/simple_ring_buffer.o:
U __aeabi_unwind_cpp_pr1
Unexpected symbols in kernel/trace/simple_ring_buffer.o:
U __s390_indirect_jump_r1
U __s390_indirect_jump_r10
U __s390_indirect_jump_r14
U __s390_indirect_jump_r2
U __s390_indirect_jump_r5
U __s390_indirect_jump_r7
U __s390_indirect_jump_r8
U __s390_indirect_jump_r9
make[6]: *** [/home/arnd/arm-soc/kernel/trace/Makefile:160: kernel/trace/simple_ring_buffer.o.checked] Error 1
Add these to the list and keep it roughly sorted into sanitizer
and architecture symbols.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://patch.msgid.link/20260515105717.1023007-1-arnd@kernel.org Fixes: 1211907ac0b5 ("tracing: Generate undef symbols allowlist for simple_ring_buffer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Fri, 15 May 2026 18:24:51 +0000 (11:24 -0700)]
Merge tag 'for-linus-7.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- one simple cleanup
- a fix for a corner case when running as Xen PV dom0
- a fix of a regression for Xen PV guests, introduced in 7.0
* tag 'for-linus-7.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Tolerate nested XEN_LAZY_MMU entering/leaving
x86/xen: Fix xen_e820_swap_entry_with_ram()
xen/arm: Replace __ASSEMBLY__ with __ASSEMBLER__ in interface.h
Linus Torvalds [Fri, 15 May 2026 18:12:54 +0000 (11:12 -0700)]
Merge tag 'platform-drivers-x86-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- asus-nb-wmi:
- Use existing keyboard quirk for ASUS Zenbook Duo UX8407AA
- hp-wmi:
- Add support for Victus 16-r0xxx (8BC2)
- intel/vsec_tpmi:
- Move debugfs register before creating devices
- Prevent fault during unbind
- lenovo-wmi-*:
- Fix memory leak in lwmi_dev_evaluate_int()
- Balance IDA id allocation and free
- Balance component bind and unbind
- Prevent sending uninitialized WMI arguments to the device
- Decouple lenovo-wmi-gamezone and lenovo-wmi-other to simplify
module dependency graph
- Limit adding attributes to supported devices
- samsung-galaxybook:
- Handle kbd backlight, mic mute and camera block hotkeys
* tag 'platform-drivers-x86-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8407AA
platform/x86: lenovo-wmi-other: Limit adding attributes to supported devices
platform/x86: lenovo-wmi-other: Add Attribute ID helper functions
platform/x86: lenovo-wmi-helpers: Move gamezone enums to wmi-helpers
platform/x86: lenovo: Decouple lenovo-wmi-gamezone and lenovo-wmi-other
platform/x86: lenovo-wmi-other: Fix tunable_attr_01 struct members
platform/x86: lenovo-wmi-other: Zero initialize WMI arguments
platform/x86: lenovo-wmi-other: Balance component bind and unbind
platform/x86: lenovo-wmi-other: Balance IDA id allocation and free
platform/x86: lenovo-wmi-helpers: Fix memory leak in lwmi_dev_evaluate_int()
platform/x86: hp-wmi: Add support for Victus 16-r0xxx (8BC2)
platform/x86/intel/tpmi/plr: Prevent fault during unbind
platform/x86: intel: Add notifiers support
platform/x86: intel: Move debugfs register before creating devices
platform/x86: samsung-galaxybook: Handle ACPI hotkey notifications
platform/x86: samsung-galaxybook: Refactor camera lens cover input device
Linus Torvalds [Fri, 15 May 2026 17:38:37 +0000 (10:38 -0700)]
Merge tag 'v7.1-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Fix potential dead-lock in rhashtable when used by xattr
- Avoid calling kvfree on atomic path in rhashtable
* tag 'v7.1-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
rhashtable: Add bucket_table_free_atomic() helper
mm/slab: Add kvfree_atomic() helper
rhashtable: drop ht->mutex in rhashtable_free_and_destroy()
Matt DeVillier [Thu, 7 May 2026 14:58:41 +0000 (09:58 -0500)]
ALSA: hda/ca0132: Disable auto-detect on manual output select
Commit 778031e1658d ("ALSA: hda/ca0132: Set HP/Speaker
auto-detect default from headphone pin verb") enables HP/Speaker
auto-detect by default when the headphone pin supports presence detect.
With auto-detect enabled, ca0132_select_out() and ca0132_alt_select_out()
choose the output from jack presence instead of the manual HP/Speaker
selection. This means selecting speaker output while headphones are
plugged in updates the control state, but audio still routes to the
headphones.
Treat an explicit manual output selection as a request to leave
auto-detect mode. Clear the HP/Speaker auto-detect switch before applying
the manual selection, and notify userspace so the auto-detect control
state is updated in mixers. Do this for both the normal HP/Speaker
Playback Switch and the alternate Output Select control used by desktop
cards.
This keeps auto-detect enabled by default for devices with jack presence
detection, while preserving the expected behavior that a manual output
choice takes effect immediately.
Adrien Burnett [Thu, 14 May 2026 16:59:05 +0000 (18:59 +0200)]
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion Laptop 16-ag0xxx
Add a SND_PCI_QUIRK entry for the HP Pavilion Laptop 16-ag0xxx
(subsystem 0x103c:0x8cbc, Realtek ALC245). The
ALC245_FIXUP_HP_X360_MUTE_LEDS fixup is already used by the
neighbouring HP Pavilion Aero Laptop 13-bg0xxx (0x103c:0x8cbd);
it chains the master-mute COEF handler with the GPIO mic-mute
LED handler, which is what this machine needs.
Tested on the affected hardware: both the mute and mic-mute key
LEDs respond correctly to the keyboard hotkeys after this change.
Jackie Dong [Thu, 14 May 2026 15:39:40 +0000 (23:39 +0800)]
ALSA: hda/realtek: ALC269 fixup for Lenovo Yoga Pro 7 15ASH111 audio
Volume control for the speakers on the Lenovo Yoga Pro 7 15ASH11 laptop
doesn't work.
The DAC routing is the same as on the ThinkPad X1 Gen7 function, so reuse
the alc285_fixup_thinkpad_x1_gen7 to get it working.
Quan Sun [Thu, 14 May 2026 13:22:45 +0000 (21:22 +0800)]
ALSA: hda: Fix NULL pointer dereference in snd_hda_ctl_add()
snd_hda_ctl_add() dereferences kctl->id.subdevice without checking
whether kctl is NULL. Multiple callers in sound/hda/codecs/ca0132.c
pass the return value of snd_ctl_new1() directly to snd_hda_ctl_add()
without a NULL check:
snd_ctl_new1() returns NULL when the underlying snd_ctl_new() fails
on memory allocation (kzalloc_flex),which can occur under memory
pressure or via fault injection.
Add a NULL check at the entry of snd_hda_ctl_add(), matching the
pattern already used by snd_ctl_add_replace() at the same call
path (sound/core/control.c:515). Return -EINVAL to let callers
handle the error gracefully.
Markus Kramer [Wed, 13 May 2026 22:28:18 +0000 (00:28 +0200)]
ALSA: hda/realtek: Add quirk for Samsung Galaxy Book5 360 headphone
The Samsung Galaxy Book5 360 (NP750QHA, PCI subsystem ID 0x144d:0xc902)
has severe audio distortion on the 3.5mm headphone jack. Applying
ALC256_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET corrects the output path
configuration, consistent with fixes already applied to other Samsung
Galaxy Book models using the same ALC256 codec.
Andy Shevchenko [Wed, 13 May 2026 16:27:58 +0000 (18:27 +0200)]
ALSA: hda/cs35l56: Drop malformed default N from Kconfig
First of all, it has to be 'default n' (small letter n), otherwise
it looks for CONFIG_N which is absent and in case of appearance
will enable something unrelated. Second and most important is that
'n' *is* the default 'default' already. Hence just drop malformed
line.
Daniel Schaefer [Wed, 13 May 2026 15:55:13 +0000 (23:55 +0800)]
ALSA: hda/realtek: fix mic boost on Framework PTL
In addition to the mic jack fix, also need to avoid boosting the
internal mic too much, otherwise >50% input volume clips a lot.
Also add a second SSID. We have one for the classic chassis/speaker and
one for the new Pro chassis/speaker.
To: Jaroslav Kysela <perex@perex.cz>
To: Takashi Iwai <tiwai@suse.com>
To: linux-sound@vger.kernel.org Cc: Dustin L. Howett <dustin@howett.net> Cc: linux@frame.work Signed-off-by: Daniel Schaefer <dhs@frame.work> Link: https://patch.msgid.link/20260513155513.11683-1-dhs@frame.work Signed-off-by: Takashi Iwai <tiwai@suse.de>
ALSA: hda/realtek: Limit mic boost on Positivo DN50E
The internal mic boost on the Positivo DN50E is too high.
Fix this by applying the ALC269_FIXUP_LIMIT_INT_MIC_BOOST fixup to the machine
to limit the gain.
uaudio_transfer_buffer_setup() calls dma_get_sgtable() and then passes
the sg_table to uaudio_iommu_map_xfer_buf() without checking whether sg
table construction succeeded. If dma_get_sgtable() fails, the sg_table
contents are not valid.
uaudio_iommu_map_pa() also ignores iommu_map() failures for the event and
transfer rings and still returns the allocated IOVA to the QMI response.
That can expose an unmapped IOVA to the audio DSP. For transfer rings,
the failed mapping also leaves the IOVA allocator state marked in use.
Check both operations. Free the coherent transfer buffer when sg table
construction fails, free the sg table when transfer-buffer IOMMU mapping
fails, and release the transfer-ring IOVA if iommu_map() fails. Also
return the existing event-ring IOVA when the event ring is already mapped,
matching the pre-split helper behavior.
Myeonghun Pak [Wed, 13 May 2026 06:57:00 +0000 (15:57 +0900)]
drm/loongson: Use managed KMS polling
lsdc_pci_probe() initializes KMS polling before setting up vblank support,
requesting the IRQ and registering the DRM device. If any of those later
steps fails, probe returns without finalizing polling. The driver also
never finalizes polling on regular removal.
Use drmm_kms_helper_poll_init() so polling is tied to the DRM device
lifetime and automatically finalized on probe failure and device removal.
This issue was identified during our ongoing static-analysis research while
reviewing kernel code.
Fixes: f39db26c5428 ("drm: Add kms driver for loongson display controller") Cc: stable@vger.kernel.org Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Jianmin Lv <lvjianmin@loongson.cn> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260513065706.23803-1-mhun512@gmail.com
The Lenovo Legion 7 16ITHG6 uses codec SSID 17aa:3855, but its PCI
SSID is 17aa:3811. The latter is now also used by the Legion S7 15IMH05
quirk, which is matched before codec SSID fallback and incorrectly
routes Legion 7 16ITHG6 machines to ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS.
That fixup does not bind the CLSA0101 CS35L41 companion amplifiers,
making the built-in speakers silent even though playback appears to be
active.
Add a codec SSID quirk for 17aa:3855 before the conflicting PCI SSID
quirk so that the Legion 7 16ITHG6 uses ALC287_FIXUP_LEGION_16ITHG6.
This restores CS35L41 firmware loading and binds both speaker
amplifiers.
Fixes: 67f4c61a73e9 ("ALSA: hda/realtek: Add quirk for Legion S7 15IMH") Cc: stable@vger.kernel.org Tested-by: Nicholas Bonello <hadobedo@gmail.com> Assisted-by: Codex:GPT-5 Signed-off-by: Nicholas Bonello <hadobedo@gmail.com> Link: https://patch.msgid.link/20260508225507.47667-1-hadobedo@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
Lianqin Hu [Fri, 8 May 2026 12:49:34 +0000 (12:49 +0000)]
ALSA: usb-audio: Add iface reset and delay quirk for TTGK Technology USB-C Audio
Setting up the interface when suspended/resumeing fail on this card.
Adding a reset and delay quirk will eliminate this problem.
usb 1-1: new full-speed USB device number 2 using xhci-hcd
usb 1-1: New USB device found, idVendor=3302, idProduct=17c2
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1: Product: USB-C Audio
usb 1-1: Manufacturer: TTGK Technology
usb 1-1: SerialNumber: 170120210706
ALSA: scarlett2: Add missing error check when initialise Autogain Status
When initialise new control with scarlett2_add_new_ctl() function for
Autogain Status, scarlett2_add_new_ctl() might throw an error. So, add
error check after initialise new control for Autogain Status.
This is reported by Coverity Scan with CID 1598781 as UNUSED_VALUE.
Jason Gunthorpe [Tue, 12 May 2026 16:46:17 +0000 (13:46 -0300)]
iommupt: Fix the end_index calculation in __map_range_leaf()
Sashiko noticed a mismatch of units in this math: num_leaves is
actually the number of leaf *entries* (so a 16-item contiguous leaf
is one num_leaves), while index is in items. The mismatch in maths
causes __map_range_leaf() to exit early instead of efficiently
filling a larger range of contiguous PTEs.
The early exit is caught by the functions above and then
__map_range_leaf() is re-invoked, so there is no functional issue.
Correct the misuse of units by adjusting num_leaves with the leaf
size and avoid the performance cost of looping externally.
There are also some mismatched types for num_leaves; simplify
things to remove the duplicated calculations.
Jason Gunthorpe [Tue, 12 May 2026 16:46:16 +0000 (13:46 -0300)]
iommupt: Check for missing PAGE_SIZE in the pgsize_bitmap
Sashiko pointed out that the driver could drop PAGE_SIZE from the
pgsize_bitmap. That is technically allowed but nothing does it, and
such an iommu_domain would not be used with the DMA API today.
Still, it is against the design and it is trivial to fix up. Lift
the PT_WARN_ON to the if branch and just skip the fast path.
Jason Gunthorpe [Tue, 12 May 2026 16:46:15 +0000 (13:46 -0300)]
iommu: Handle unmap error when iommu_debug is enabled
Sashiko noticed a latent bug where the map error flow called iommu_unmap()
which calls iommu_debug_unmap_begin()/iommu_debug_unmap_end() however
since this is an error path the map flow never actually established the
original iommu_debug_map() it will malfunction.
Lift the unmap error handling into iommu_map_nosync() and reorder it so
the trace_map()/iommu_debug_map() records the partial mapping and then
immediately unmaps it. This avoid creating the unbalanced tracking and
provides saner tracing instead of a unmap unmatched to any map.
Jason Gunthorpe [Tue, 12 May 2026 16:46:14 +0000 (13:46 -0300)]
iommu: Fix up map/unmap debugging for iommupt domains
Sashiko noticed a few issues in this path, and a few more were
found on review. Tidy them up further. These are intertwined
because the debug code depends on some of the WARN_ONs to function
right:
Lift into iommu_map_nosync():
- The might_sleep_if()
- 0 pgsize_bitmap WARN_ON
- Promote the illegal domain->type to a WARN_ON
- WARN_ON for illegal gfp flags
Then remove the return 0 since it is now safe to call
iommu_debug_map().
Lift into __iommu_unmap():
- 0 pgsize_bitmap WARN_ON
- Promote the illegal domain->type to a WARN_ON
- iommu_debug_unmap_begin()
This now pairs with the unconditional iommu_debug_map() on the
mapping side. Thus iommu debugging now works for iommupt along
with some of the other debugging features.
Jason Gunthorpe [Tue, 12 May 2026 16:46:13 +0000 (13:46 -0300)]
iommu: Fix loss of errno on map failure for classic ops
A typo, likely from a rebase, inverted the condition and caused
errors to be lost. Fix it to be "if (ret)".
This was breaking iommu_create_device_direct_mappings() on drivers
that don't use iommupt and don't fully set up their domain in
alloc_pages() (i.e., SMMUv2). In this case the first call of
iommu_create_device_direct_mappings() should fail due to the
incompletely initialized domain. Since it wrongly returns success,
the second call to iommu_create_device_direct_mappings() doesn't
happen and IOMMU_RESV_DIRECT is never set up.
Jens Axboe [Fri, 15 May 2026 01:14:33 +0000 (19:14 -0600)]
Merge tag 'nvme-7.1-2026-05-14' of git://git.infradead.org/nvme into block-7.1
Pull NVMe fixes from Keith:
"- Fix memory leak on a passthrough integrity mapping failure (Keith)
- Hide secrets behind debug option (Hannes)
- Fix pci use-after-free for host memory buffer (Chia-Lin Kao)
- Fix tcp taregt use-after-free for data digest (Sagi)
- Revert a mistaken quirk (Alan Cui)
- Fix uevent and controller state race condition (Maurizio)
- Fix apple submission queue re-initialization (Nick Chan)"
* tag 'nvme-7.1-2026-05-14' of git://git.infradead.org/nvme:
nvme-apple: Reset q->sq_tail during queue init
nvme: fix race condition between connected uevent and STARTED_ONCE flag
Revert "nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808"
nvmet-tcp: Fix potential UAF when ddgst mismatch
nvme-pci: fix use-after-free in nvme_free_host_mem()
nvmet-auth: Do not print DH-HMAC-CHAP secrets
nvme: fix bio leak on mapping failure
nvme: make prp passthrough usage less scary
Linus Torvalds [Thu, 14 May 2026 21:30:01 +0000 (14:30 -0700)]
Merge tag 'hid-for-linus-2026051401' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fixes for a few OOB/UAF in several HID drivers (Florian Pradines, Lee
Jones, Michael Zaidman, Rosalie Wanders, Sangyun Kim and Tomasz
Pakuła)
- more general sanitation of input data, dealing with potentially
malicious hardware in hid-core (Benjamin Tissoires)
- a few device-specific quirks and fixups
* tag 'hid-for-linus-2026051401' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (22 commits)
HID: logitech-hidpp: Add support for newer Bluetooth keyboards
HID: pidff: Fix integer overflow in pidff_rescale
HID: i2c-hid: add reset quirk for BLTP7853 touchpad
HID: core: introduce hid_safe_input_report()
HID: pass the buffer size to hid_report_raw_event
HID: google: hammer: stop hardware on devres action failure
HID: appletb-kbd: run inactivity autodim from workqueues
HID: appletb-kbd: fix UAF in inactivity-timer cleanup path
HID: playstation: Clamp num_touch_reports
HID: magicmouse: Prevent out-of-bounds (OOB) read during DOUBLE_REPORT_ID
HID: mcp2221: fix OOB write in mcp2221_raw_event()
HID: quirks: really enable the intended work around for appledisplay
HID: hid-sjoy: race between init and usage
HID: uclogic: Fix regression of input name assignment
HID: intel-thc-hid: Intel-quickspi: Fix some error codes
HID: hid-lenovo-go-s: restore OS_TYPE after resume from s2idle
HID: elan: Add support for ELAN SB974D touchpad
HID: sony: add missing size validation for Rock Band 3 Pro instruments
HID: sony: add missing size validation for SMK-Link remotes
HID: sony: remove unneeded WARN_ON() in sony_leds_init()
...
Linus Torvalds [Thu, 14 May 2026 21:06:31 +0000 (14:06 -0700)]
Merge tag 'acpi-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
"These fix several platform drivers that use the ACPI companion of the
given platform device without checking its presence, which may lead to
a NULL pointer dereference or other kind of malfunction if the driver
is forced to match a device without an ACPI companion via driver
override, and restore debug log level for some messages in the ACPI
CPPC library:
- Check ACPI_COMPANION() against NULL during probe in several core
ACPI device drivers (Rafael Wysocki)
- Restore log level of messages in amd_set_max_freq_ratio() (Mario
Limonciello)"
* tag 'acpi-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PAD: xen: Check ACPI_COMPANION() against NULL
ACPI: driver: Check ACPI_COMPANION() against NULL during probe
Revert "ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"
David Howells [Wed, 13 May 2026 18:50:02 +0000 (19:50 +0100)]
smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()
Fix smbdirect_map_sges_from_iter() to use pre-decrement, not post-decrement
so that it cleans up the correct slots.
Fixes: e5fbdde43017 ("cifs: Add a function to build an RDMA SGE list from an iterator") Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stefan Metzmacher <metze@samba.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Matt Evans [Mon, 11 May 2026 14:58:24 +0000 (07:58 -0700)]
vfio/pci: Check BAR resources before exporting a DMABUF
A DMABUF exports access to BAR resources and, although they are
requested at startup time, we need to ensure they really were reserved
before exporting. Otherwise, it's possible to access unreserved
resources through the export.
Matt Evans [Mon, 11 May 2026 14:58:23 +0000 (07:58 -0700)]
vfio/pci: Set up BAR resources and maps in vfio_pci_core_enable()
Previously BAR resource requests and the corresponding pci_iomap()
were performed on-demand and without synchronisation, which was racy.
Rather than add synchronisation, it's simplest to address this by
doing both activities from vfio_pci_core_enable().
The resource allocation and/or pci_iomap() can still fail; their
status is tracked and existing calls to vfio_pci_core_setup_barmap()
will fail in a similar way to before. This keeps the point of failure
as observed by userspace the same, i.e. failures to request/map unused
BARs are benign.
With the support of nested lazy mmu sections it can happen that
arch_enter_lazy_mmu_mode() is being called twice without a call of
arch_leave_lazy_mmu_mode() in between, as the lazy_mmu_*() helpers
are not disabling preemption when checking for nested lazy mmu
sections.
This is a problem when running as a Xen PV guest, as
xen_enter_lazy_mmu() and xen_leave_lazy_mmu() don't tolerate this
case.
Fix that in xen_enter_lazy_mmu() and xen_leave_lazy_mmu() in order
not to hurt all other lazy mmu mode users.
Fixes: 291b3abed657 ("x86/xen: use lazy_mmu_state when context-switching") Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260508143933.493013-1-jgross@suse.com>
Juergen Gross [Tue, 5 May 2026 10:24:17 +0000 (12:24 +0200)]
x86/xen: Fix xen_e820_swap_entry_with_ram()
When swapping a not page-aligned E820 map entry with RAM, the start
address of the modified entry is calculated wrong (the offset into the
page is subtracted instead of being added to the page address).
Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory") Reported-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260505102417.208138-1-jgross@suse.com>
- ipv6: flowlabel: enforce per-netns limit for unprivileged callers
- tls: fix off-by-one in sg_chain entry count for wrapped sk_msg ring
- smc: avoid NULL deref of conn->lnk in smc_msg_event tracepoint
- sctp: revalidate list cursor after sctp_sendmsg_to_asoc() in SCTP_SENDALL
- batman-adv:
- reject new tp_meter sessions during teardown
- purge non-released claims
- eth:
- i40e: cleanup PTP registration on probe failure
- idpf: fix double free and use-after-free in aux device error paths
- ena: fix potential use-after-free in get_timestamp"
* tag 'net-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
net: phy: DP83TC811: add reading of abilities
net: tls: prevent chain-after-chain in plain text SG
net: tls: fix off-by-one in sg_chain entry count for wrapped sk_msg ring
net/smc: reject CHID-0 ACCEPT that matches an empty ism_dev slot
macsec: use rcu_work to defer TX SA crypto cleanup out of softirq
macsec: use rcu_work to defer RX SA crypto cleanup out of softirq
macsec: introduce dedicated workqueue for SA crypto cleanup
net: net_failover: Fix the deadlock in slave register
MAINTAINERS: update atlantic driver maintainer
selftests/tc-testing: Add QFQ/CBS qlen underflow test
net/sched: sch_cbs: Call qdisc_reset for child qdisc
FDDI: defza: Sanitise the reset safety timer
net: ethernet: ravb: Do not check URAM suspension when WoL is active
ethtool: fix ethnl_bitmap32_not_zero() bit interval semantics
net/smc: avoid NULL deref of conn->lnk in smc_msg_event tracepoint
net/smc: fix sleep-inside-lock in __smc_setsockopt() causing local DoS
net: atm: fix skb leak in sigd_send() default branch
net: ethtool: phy: avoid NULL deref when PHY driver is unbound
net: atlantic: preserve PCI wake-from-D3 on shutdown when WOL enabled
net: shaper: reject QUEUE scope handle with missing id
...
Jeremy Erazo [Thu, 14 May 2026 12:03:34 +0000 (12:03 +0000)]
smb: client: avoid integer overflow in SMB2 READ length check
SMB2 READ response validation in cifs_readv_receive() and
handle_read_data() checks data_offset + data_len against the received
buffer length. Both values are attacker-controlled fields from the
server response and are stored as unsigned int, so the addition can
wrap before the bounds check:
fs/smb/client/transport.c:1259
if (!use_rdma_mr && (data_offset + data_len > buflen))
fs/smb/client/smb2ops.c:4839
else if (buf_len >= data_offset + data_len)
A malicious SMB server can use this to bypass validation. In the
non-encrypted receive path the client attempts an oversized socket
read and stalls for the SMB response timeout (180 seconds) before
reconnecting. In the SMB3 encrypted path, runtime testing shows the
malformed length can reach copy_to_iter() in handle_read_data() with
attacker-controlled size, where usercopy hardening stops the oversized
copy before bytes reach userspace.
Guard both call sites with check_add_overflow(), which is already
used elsewhere in this subsystem (smb2pdu.c). On overflow, treat the
response as malformed and reject with -EIO.
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Thu, 14 May 2026 15:53:24 +0000 (08:53 -0700)]
Merge tag 'audit-pr-20260513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fixes from Paul Moore:
- Correctly log the inheritable capabilities
- Honor AUDIT_LOCKED in the AUDIT_TRIM and AUDIT_MAKE_EQUIV commands
* tag 'audit-pr-20260513' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: enforce AUDIT_LOCKED for AUDIT_TRIM and AUDIT_MAKE_EQUIV
audit: fix incorrect inheritable capability in CAPSET records
Linus Torvalds [Wed, 13 May 2026 18:37:18 +0000 (11:37 -0700)]
ptrace: slightly saner 'get_dumpable()' logic
The 'dumpability' of a task is fundamentally about the memory image of
the task - the concept comes from whether it can core dump or not - and
makes no sense when you don't have an associated mm.
And almost all users do in fact use it only for the case where the task
has a mm pointer.
But we have one odd special case: ptrace_may_access() uses 'dumpable' to
check various other things entirely independently of the MM (typically
explicitly using flags like PTRACE_MODE_READ_FSCREDS). Including for
threads that no longer have a VM (and maybe never did, like most kernel
threads).
It's not what this flag was designed for, but it is what it is.
The ptrace code does check that the uid/gid matches, so you do have to
be uid-0 to see kernel thread details, but this means that the
traditional "drop capabilities" model doesn't make any difference for
this all.
Make it all make a *bit* more sense by saying that if you don't have a
MM pointer, we'll use a cached "last dumpability" flag if the thread
ever had a MM (it will be zero for kernel threads since it is never
set), and require a proper CAP_SYS_PTRACE capability to override.
DaeMyung Kang [Wed, 13 May 2026 13:26:22 +0000 (22:26 +0900)]
cifs: client: stage smb3_reconfigure() updates and restore ctx on failure
smb3_reconfigure() moves strings out of cifs_sb->ctx before the
multichannel update, so a later failure can leave the live context
with NULL strings or options that do not match the session.
Stage the new ctx separately, commit it only on success, and restore
the snapshot on failure. Also make smb3_sync_session_ctx_passwords()
all-or-nothing.
Commit session passwords before channel updates so newly added channels
authenticate with the staged credentials.
Fixes: ef529f655a2c ("cifs: client: allow changing multichannel mount options on remount") Reported-by: RAJASI MANDAL <rajasimandalos@gmail.com> Closes: https://lore.kernel.org/lkml/CAEY6_V1+dzW3OD5zqXhsWyXwrDTrg5tAMGZ1AJ7_GAuRE+aevA@mail.gmail.com/ Link: https://lore.kernel.org/lkml/xkr2dlvgibq5j6gkcxd3yhhnj4atgxw2uy4eug2pxm7wy7nbms@iq6cf5taa65v/ Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Nick Chan [Thu, 14 May 2026 13:16:01 +0000 (21:16 +0800)]
nvme-apple: Reset q->sq_tail during queue init
Fixes a "duplicate tag error for tag 0" firmware crash during controller
reset while setting up a queue on Apple A11 / T8015 caused by stale
entries in the submission queue due to an invalid sq_tail offset after
reset.
Fixes: 04d8ecf37b5e ("nvme: apple: Add Apple A11 support") Cc: stable@vger.kernel.org Suggested-by: Yuriy Havrylyuk <yhavry@gmail.com> Reviewed-by: Sven Peter <sven@kernel.org> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
Ye Bin [Thu, 14 May 2026 13:14:18 +0000 (21:14 +0800)]
smb/client: fix possible infinite loop and oob read in symlink_data()
On 32-bit architectures, the infinite loop is as follows:
len = p->ErrorDataLength == 0xfffffff8
u8 *next = p->ErrorContextData + len
next == p
On 32-bit architectures, the out-of-bounds read is as follows:
len = p->ErrorDataLength == 0xfffffff0
u8 *next = p->ErrorContextData + len
next == (u8 *)p - 8
Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Cc: stable@vger.kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Thomas Hellström [Mon, 11 May 2026 16:24:43 +0000 (18:24 +0200)]
drm/ttm: Fix ttm_bo_shrink() infinite LRU walk on backup failure
Apply the same fix as b2ed01e7ad ("drm/ttm: Fix ttm_bo_swapout()
infinite LRU walk on swapout failure") to the ttm_bo_shrink() path.
Move del_bulk_move from before the backup to after success only,
using ttm_resource_del_bulk_move_unevictable() since the resource
is now unevictable once fully backed up.
Fixes: 70d645deac98 ("drm/ttm: Add helpers for shrinking") Cc: Christian König <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v6.15+ Assisted-by: GitHub_Copilot:claude-opus-4.6 Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260511162443.24352-1-thomas.hellstrom@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Sven Schuchmann [Tue, 12 May 2026 07:19:47 +0000 (09:19 +0200)]
net: phy: DP83TC811: add reading of abilities
At this time the driver is not listing any speeds
it supports. This should be ETHTOOL_LINK_MODE_100baseT1_Full_BIT
for DP83TC811. Add the missing call for phylib to read the abilities.
Fixes: b753a9faaf9a ("net: phy: DP83TC811: Introduce support for the DP83TC811 phy") Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Sven Schuchmann <schuchmann@schleissheimer.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260512071949.6218-1-schuchmann@schleissheimer.de
[pabeni@redhat.com: dropped revision history] Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jonathan Corbet [Wed, 13 May 2026 20:58:53 +0000 (14:58 -0600)]
docs: threat-model: don't limit root capabilities to CAP_SYS_ADMIN
The threat-model document says that only users with CAP_SYS_ADMIN can carry
out a number of admin-level tasks, but there are numerous capabilities that
can confer that sort of power. Generalize the text slightly to make it
clear that CAP_SYS_ADMIN is not the only all-powerful capability.
Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Jakub Kicinski [Mon, 11 May 2026 17:49:18 +0000 (10:49 -0700)]
net: tls: prevent chain-after-chain in plain text SG
Sashiko points out that if end = 0 (start != 0) the current
code will create a chain link to content type right after
the wrap link:
This would create a chain where the wrap link points directly
to another chain link. The scatterlist API sg_next iterator
does not recursively resolve consecutive chain links.
meaning this is illegal input to crypto.
The wrapping link is unnecessary if end = 0. end is the entry after
the last one used so end = 0 means there's nothing pushed after
the wrap:
end start i
v v v
[ ]...[ ][ d ][ d ][ d ][ d ][rsv for wrap]
Skip the wrapping in this case.
TLS 1.3 can use the "wrapping slot" for it's chaining if end = 0.
This avoids the chain-after-chain.
Move the wrap chaining before marking END and chaining off content
type, that feels like more logical ordering to me, but should not
matter from functional perspective.
Reported-by: Sashiko <sashiko-bot@kernel.org> Fixes: 9aaaa56845a0 ("bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20260511174920.433155-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Mon, 11 May 2026 17:49:17 +0000 (10:49 -0700)]
net: tls: fix off-by-one in sg_chain entry count for wrapped sk_msg ring
When an sk_msg scatterlist ring wraps (sg.end < sg.start),
tls_push_record() chains the tail portion of the ring to the head
using sg_chain(). An extra entry in the sg array is reserved for
this:
struct sk_msg_sg {
[...]
/* The extra two elements:
* 1) used for chaining the front and sections when the list becomes
* partitioned (e.g. end < start). The crypto APIs require the
* chaining;
* 2) to chain tailer SG entries after the message.
*/
struct scatterlist data[MAX_MSG_FRAGS + 2];
The current code uses MAX_SKB_FRAGS + 1 as the ring size:
instead of the true last entry. This is likely due to a "race" of
the commit under Fixes landing close to
commit 031097d9e079 ("bpf: sk_msg, zap ingress queue on psock down")
Convert to ARRAY_SIZE and drop the data[start] / - start (as suggested
by Sabrina).
Reported-by: 钱一铭 <yimingqian591@gmail.com> Fixes: 9aaaa56845a0 ("bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260511174920.433155-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
drm/ttm: Convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC
dmem_cgroup_try_charge() returns -EAGAIN when the cgroup limit is
hit and the charge fails. TTM has no concept of -EAGAIN from resource
allocation; -ENOSPC is the canonical error meaning "no space, try
eviction". Convert at the source in ttm_resource_alloc() so no caller
needs to handle an unexpected error code, and clean up the now-redundant
-EAGAIN check in ttm_bo_alloc_resource().
Without this, -EAGAIN escaping ttm_resource_alloc() during an eviction
walk causes the walk to terminate early instead of continuing to the
next candidate.
Cc: Friedrich Vock <friedrich.vock@gmx.de> Cc: Maarten Lankhorst <dev@lankhorst.se> Cc: Tejun Heo <tj@kernel.org> Cc: Maxime Ripard <mripard@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.14+ Fixes: 2b624a2c1865 ("drm/ttm: Handle cgroup based eviction in TTM") Assisted-by: GitHub_Copilot:claude-sonnet-4.6 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <dev@lankhrost.se> Link: https://patch.msgid.link/20260508160920.230339-1-thomas.hellstrom@linux.intel.com
Thomas Weißschuh [Wed, 22 Apr 2026 09:42:32 +0000 (11:42 +0200)]
vdso/gettimeofday: Reload sequence counter after switch to time page in do_aux()
After switching to the real data pages, the sequence counter needs to be
reloaded from there. The code using vdso_read_begin_timens() assumed
this worked by 'continue' jumping to the *beginning* of the do-while
retry loop. However the 'continue' jumps to the *end* of said loop,
evaluating the exit condition. If the data page has a sequence counter
of '1' it will match the one from the time namespace page and prematurely
exit the retry loop. This would result in garbage returned to the caller.
Reload the sequence counter after switching the pages by using an inner
while loop again, which will loop at most once.
The loop generates slightly better code than an explicit reload through
'seq = vdso_read_begin()'.
Fixes: ed78b7b2c5ae ("vdso/gettimeofday: Add a helper to read the sequence lock of a time namespace aware clock") Reported-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/20260422-vdso-aux-timens-loop-v1-1-e2dd8c7164cc@linutronix.de Closes: https://lore.kernel.org/lkml/CANiDSCsOy0P1if-gJZqOM5pTJ0RDcwVfru1B7KFbTOEMqjPKJw@mail.gmail.com/
Xiang Mei [Mon, 11 May 2026 06:21:38 +0000 (23:21 -0700)]
net/smc: reject CHID-0 ACCEPT that matches an empty ism_dev slot
On the SMC-D client, slot 0 of ini->ism_dev[]/ini->ism_chid[] is
reserved for an SMC-Dv1 device. smc_find_ism_v2_device_clnt()
populates V2 entries starting at index 1, so when no V1 device is
selected slot 0 is left in its kzalloc()'ed state with ism_dev[0] ==
NULL and ism_chid[0] == 0.
smc_v2_determine_accepted_chid() then matches the peer's CHID against
the array starting from index 0 using the CHID alone. A malicious
peer replying to a SMC-Dv2-only proposal with d1.chid == 0 matches
the empty slot, ini->ism_selected becomes 0, and the subsequent
ism_dev[0]->lgr_lock dereference in smc_conn_create() faults at
offsetof(struct smcd_dev, lgr_lock) == 0x68:
BUG: KASAN: null-ptr-deref in _raw_spin_lock_bh+0x79/0xe0
Write of size 4 at addr 0000000000000068 by task exploit/144
Call Trace:
_raw_spin_lock_bh
smc_conn_create (net/smc/smc_core.c:1997)
__smc_connect (net/smc/af_smc.c:1447)
smc_connect (net/smc/af_smc.c:1720)
__sys_connect
__x64_sys_connect
do_syscall_64
Require ism_dev[i] to be non-NULL before accepting a CHID match.
Fixes: a7c9c5f4af7f ("net/smc: CLC accept / confirm V2") Reported-by: Weiming Shi <bestswngs@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260511062138.2839584-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>