Anna Schumaker [Fri, 7 Feb 2025 20:42:21 +0000 (15:42 -0500)]
NFS: Add implid to sysfs
The Linux NFS server added support for returning this information during
an EXCHANGE_ID in Linux v6.13. This is something and admin might want to
query, so let's add it to sysfs.
NFS: Extend rdirplus mount option with "force|none"
There are certain users that wish to force the NFS client to choose
READDIRPLUS over READDIR for a particular mount. Update the "rdirplus" mount
option to optionally accept values. For "rdirplus=force", the NFS client
will always attempt to use READDDIRPLUS. The setting of "rdirplus=none" is
aliased to the existing "nordirplus".
Chuck Lever [Mon, 13 Jan 2025 15:32:42 +0000 (10:32 -0500)]
NFS: Refactor trace_nfs4_offload_cancel
Add a trace_nfs4_offload_status trace point that looks just like
trace_nfs4_offload_cancel. Promote that event to an event class to
avoid duplicating code.
An alternative approach would be to expand trace_nfs4_offload_status
to report more of the actual OFFLOAD_STATUS result.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-16-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Chuck Lever [Mon, 13 Jan 2025 15:32:41 +0000 (10:32 -0500)]
NFS: Use NFSv4.2's OFFLOAD_STATUS operation
We've found that there are cases where a transport disconnection
results in the loss of callback RPCs. NFS servers typically do not
retransmit callback operations after a disconnect.
This can be a problem for the Linux NFS client's current
implementation of asynchronous COPY, which waits indefinitely for a
CB_OFFLOAD callback. If a transport disconnect occurs while an async
COPY is running, there's a good chance the client will never get the
completing CB_OFFLOAD.
Fix this by implementing the OFFLOAD_STATUS operation so that the
Linux NFS client can probe the NFS server if it doesn't see a
CB_OFFLOAD in a reasonable amount of time.
This patch implements a simplistic check. As future work, the client
might also be able to detect whether there is no forward progress on
the request asynchronous COPY operation, and CANCEL it.
Chuck Lever [Mon, 13 Jan 2025 15:32:40 +0000 (10:32 -0500)]
NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
Enable the Linux NFS client to observe the progress of an offloaded
asynchronous COPY operation. This new operation will be put to use
in a subsequent patch.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-14-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
NeilBrown [Wed, 4 Dec 2024 02:53:09 +0000 (13:53 +1100)]
NFS: fix open_owner_id_maxsz and related fields.
A recent change increased the size of an NFSv4 open owner, but didn't
increase the corresponding max_sz defines. This is not know to have
caused failure, but should be fixed.
This patch also fixes some relates _maxsz fields that are wrong.
Note that the XXX_owner_id_maxsz values now are only the size of the id
and do NOT include the len field that will always preceed the id in xdr
encoding. I think this is clearer.
Reported-by: David Disseldorp <ddiss@suse.com> Fixes: d98f72272500 ("nfs: simplify and guarantee owner uniqueness.") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Tue, 18 Feb 2025 23:37:51 +0000 (18:37 -0500)]
NFSv4: Avoid unnecessary scans of filesystems for delayed delegations
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to manage the
delayed return of delegations, then only scan those filesystems
containing delegations that were marked as being delayed.
Fixes: be20037725d1 ("NFSv4: Fix delegation return in cases where we have to retry") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Wed, 19 Feb 2025 00:03:21 +0000 (19:03 -0500)]
NFSv4: Avoid unnecessary scans of filesystems for expired delegations
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to reap the
expired delegations, it should scan only those filesystems that hold
delegations that need to be reaped.
Fixes: 7f156ef0bf45 ("NFSv4: Clean up nfs_delegation_reap_expired()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Tue, 18 Feb 2025 23:14:26 +0000 (18:14 -0500)]
NFSv4: Avoid unnecessary scans of filesystems for returning delegations
The amount of looping through the list of delegations is occasionally
leading to soft lockups. If the state manager was asked to return
delegations asynchronously, it should only scan those filesystems that
hold delegations that need to be returned.
Fixes: af3b61bf6131 ("NFSv4: Clean up nfs_client_return_marked_delegations()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Tue, 18 Feb 2025 21:50:30 +0000 (16:50 -0500)]
NFSv4: Don't trigger uneccessary scans for return-on-close delegations
The amount of looping through the list of delegations is occasionally
leading to soft lockups. Avoid at least some loops by not requiring the
NFSv4 state manager to scan for delegations that are marked for
return-on-close. Instead, either mark them for immediate return (if
possible) or else leave it up to nfs4_inode_return_delegation_on_close()
to return them once the file is closed by the application.
Fixes: b757144fd77c ("NFSv4: Be less aggressive about returning delegations for open files") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Linus Torvalds [Sun, 16 Mar 2025 19:09:44 +0000 (09:09 -1000)]
Merge tag 'i2c-for-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- omap: fix irq ACKS to avoid irq storming and system hang
- ali1535, ali15x3, sis630: fix error path at probe exit
* tag 'i2c-for-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sis630: Fix an error handling path in sis630_probe()
i2c: ali15x3: Fix an error handling path in ali15x3_probe()
i2c: ali1535: Fix an error handling path in ali1535_probe()
i2c: omap: fix IRQ storms
Linus Torvalds [Sun, 16 Mar 2025 19:05:00 +0000 (09:05 -1000)]
Merge tag 'trace-v6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix ref count of trace_array in error path of histogram file open
Tracing instances have a ref count to keep them around while files
within their directories are open. This prevents them from being
deleted while they are used.
The histogram code had some files that needed to take the ref count
and that was added, but the error paths did not decrement the ref
counts. This caused the instances from ever being removed if a
histogram file failed to open due to some error"
* tag 'trace-v6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Correct the refcount if the hist/hist_debug file fails to open
Linus Torvalds [Sun, 16 Mar 2025 06:39:55 +0000 (20:39 -1000)]
Merge tag 'usb-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes and new
usb-serial device ids. Included in here are:
- new usb-serial device ids
- typec driver bugfix
- thunderbolt driver resume bugfix
All of these have been in linux-next with no reported issues"
* tag 'usb-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tcpm: fix state transition for SNK_WAIT_CAPABILITIES state in run_state_machine()
USB: serial: ftdi_sio: add support for Altera USB Blaster 3
thunderbolt: Prevent use-after-free in resume from hibernate
USB: serial: option: fix Telit Cinterion FE990A name
USB: serial: option: add Telit Cinterion FE990B compositions
USB: serial: option: match on interface class for Telit FN990B
Linus Torvalds [Sun, 16 Mar 2025 01:46:29 +0000 (15:46 -1000)]
Merge tag 'input-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- several new device IDs added to xpad game controller driver
- support for imagis IST3038H variant of chip added to imagis touch
controller driver
- a fix for GPIO allocation for ads7846 touch controller driver
- a fix for iqs7222 driver to properly support status register
- a fix for goodix-berlin touch controller driver to use the right name
for the regulator
- more i8042 quirks to better handle several old Clevo devices.
* tag 'input-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
MAINTAINERS: Remove myself from the goodix touchscreen maintainers
Input: iqs7222 - preserve system status register
Input: i8042 - swap old quirk combination with new quirk for more devices
Input: i8042 - swap old quirk combination with new quirk for several devices
Input: i8042 - add required quirks for missing old boardnames
Input: i8042 - swap old quirk combination with new quirk for NHxxRZQ
Input: xpad - rename QH controller to Legion Go S
Input: xpad - add support for TECNO Pocket Go
Input: xpad - add support for ZOTAC Gaming Zone
Input: goodix-berlin - fix vddio regulator references
Input: goodix-berlin - fix comment referencing wrong regulator
Input: imagis - add support for imagis IST3038H
dt-bindings: input/touchscreen: imagis: add compatible for ist3038h
Input: xpad - add multiple supported devices
Input: xpad - add 8BitDo SN30 Pro, Hyperkin X91 and Gamesir G7 SE controllers
Input: ads7846 - fix gpiod allocation
Input: wdt87xx_i2c - fix compiler warning
Linus Torvalds [Sat, 15 Mar 2025 18:32:16 +0000 (08:32 -1000)]
Merge tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify reverts from Jan Kara:
"Syzbot has found out that fsnotify HSM events generated on page fault
can be generated while we already hold freeze protection for the
filesystem (when you do buffered write from a buffer which is mmapped
file on the same filesystem) which violates expectations for HSM
events and could lead to deadlocks of HSM clients with filesystem
freezing.
Since it's quite late in the cycle we've decided to revert changes
implementing HSM events on page fault for now and instead just
generate one event for the whole range on mmap(2) so that HSM client
can fetch the data at that moment"
* tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
Revert "fanotify: disable readahead if we have pre-content watches"
Revert "mm: don't allow huge faults for files with pre content watches"
Revert "fsnotify: generate pre-content permission event on page fault"
Revert "xfs: add pre-content fsnotify hook for DAX faults"
Revert "ext4: add pre-content fsnotify hook for DAX faults"
fsnotify: add pre-content hooks on mmap()
Linus Torvalds [Sat, 15 Mar 2025 04:43:37 +0000 (18:43 -1000)]
Merge tag 'v6.14-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Two fixes for oplock break/lease races
* tag 'v6.14-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: prevent connection release during oplock break notification
ksmbd: fix use-after-free in ksmbd_free_work_struct
Kent Overstreet [Fri, 14 Mar 2025 22:20:20 +0000 (18:20 -0400)]
bcachefs: fix build on 32 bit in get_random_u64_below()
bare 64 bit divides not allowed, whoops
arm-linux-gnueabi-ld: drivers/char/random.o: in function `__get_random_u64_below':
drivers/char/random.c:602:(.text+0xc70): undefined reference to `__aeabi_uldivmod'
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Fri, 14 Mar 2025 22:14:32 +0000 (12:14 -1000)]
Merge tag 'bcachefs-2025-03-14' of git://evilpiepirate.org/bcachefs
Pull bcachefs hotfix from Kent Overstreet:
"This one is high priority: a user hit an assertion in the upgrade to
6.14, and we don't have a reproducer, so this changes the assertion to
an emergency read-only with more info so we can debug it"
* tag 'bcachefs-2025-03-14' of git://evilpiepirate.org/bcachefs:
bcachefs: Change btree wb assert to runtime error
Linus Torvalds [Fri, 14 Mar 2025 21:31:57 +0000 (11:31 -1000)]
Merge tag 'for-6.14/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:
- dm-flakey: fix memory corruption in optional corrupt_bio_byte feature
* tag 'for-6.14/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-flakey: Fix memory corruption in optional corrupt_bio_byte feature
Linus Torvalds [Fri, 14 Mar 2025 21:22:05 +0000 (11:22 -1000)]
Merge tag 'block-6.14-20250313' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Concurrent pci error and hotplug handling fix (Keith)
- Endpoint function fixes (Damien)
- Fix for a regression introduced in this cycle with error checking for
batched request completions (Shin'ichiro)
* tag 'block-6.14-20250313' of git://git.kernel.dk/linux:
block: change blk_mq_add_to_batch() third argument type to bool
nvme: move error logging from nvme_end_req() to __nvme_end_req()
nvmet: pci-epf: Do not add an IRQ vector if not needed
nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created
nvme-pci: fix stuck reset on concurrent DPC and HP
Linus Torvalds [Fri, 14 Mar 2025 20:57:28 +0000 (10:57 -1000)]
Merge tag 'platform-drivers-x86-v6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
"Fixes and new HW support.
The diff is a bit larger than I'd prefer at this point due to
unwinding the amd/pmf driver's error handling properly instead of
calling a deinit function that was a can full of worms.
Summary:
- amd/pmf:
- Fix error handling in amd_pmf_init_smart_pc()
- Fix missing hidden options for Smart PC
- surface: aggregator_registry: Add Support for Surface Pro 11"
* tag 'platform-drivers-x86-v6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
MAINTAINERS: Update Ike Panhc's email address
platform/x86/amd: pmf: Fix missing hidden options for Smart PC
platform/surface: aggregator_registry: Add Support for Surface Pro 11
platform/x86/amd/pmf: fix cleanup in amd_pmf_init_smart_pc()
Linus Torvalds [Fri, 14 Mar 2025 20:39:41 +0000 (10:39 -1000)]
Merge tag 'gpio-fixes-for-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"The first fix is a backport from my v6.15-rc1 queue that turned out to
be needed in v6.14 as well but as the former diverged from my fixes
branch I had to adjust the patch a bit.
The second one fixes a regression observed in user-space where closing
a file descriptor associated with a GPIO device results in a ~10ms
delay due to the atomic notifier calling rcu_synchronize() when
unregistering.
Summary:
- don't check the return value of gpio_chip::get_direction() when
registering a GPIO chip
- use raw notifier for line state events"
* tag 'gpio-fixes-for-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: cdev: use raw notifier for line state events
gpiolib: don't check the retval of get_direction() when registering a chip
Linus Torvalds [Fri, 14 Mar 2025 20:35:39 +0000 (10:35 -1000)]
Merge tag 'sound-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of last-minute fixes.
Most of them are for ASoC, and the only one core fix is for reverting
the previous change, while the rest are all device-specific quirks and
fixes, which should be relatively safe to apply"
* tag 'sound-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: cs42l43: convert to SYSTEM_SLEEP_PM_OPS
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion x360 14-dy1xxx
ASoC: codecs: wm0010: Fix error handling path in wm0010_spi_probe()
ASoC: rt722-sdca: add missing readable registers
ASoC: amd: yc: Support mic on another Lenovo ThinkPad E16 Gen 2 model
ASoC: cs42l43: Fix maximum ADC Volume
ASoC: ops: Consistently treat platform_max as control value
ASoC: rt1320: set wake_capable = 0 explicitly
ASoC: cs42l43: Add jack delay debounce after suspend
ASoC: tegra: Fix ADX S24_LE audio format
ASoC: codecs: wsa884x: report temps to hwmon in millidegree of Celsius
ASoC: Intel: sof_sdw: Fix unlikely uninitialized variable use in create_sdw_dailinks()
Linus Torvalds [Fri, 14 Mar 2025 20:24:57 +0000 (10:24 -1000)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main one is a horrible macro fix for our TLB flushing code which
resulted in over-invalidation on the MMU notifier path.
Summary:
- Fix population of the vmemmap for regions of memory that are
smaller than a section (128 MiB)
- Fix range-based TLB over-invalidation when invoked via a MMU
notifier"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Fix mmu notifiers for range-based invalidates
arm64: mm: Populate vmemmap at the page level if not section aligned
Linus Torvalds [Fri, 14 Mar 2025 20:07:16 +0000 (10:07 -1000)]
Merge tag 'x86-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Fix the bootup of SEV-SNP enabled guests under VMware hypervisors"
* tag 'x86-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmware: Parse MP tables for SEV-SNP enabled guests under VMware hypervisors
Linus Torvalds [Fri, 14 Mar 2025 19:56:46 +0000 (09:56 -1000)]
Merge tag 'sched-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a sleeping-while-atomic bug caused by a recent optimization
utilizing static keys that didn't consider that the
static_key_disable() call could be triggered in atomic context.
Revert the optimization"
* tag 'sched-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/clock: Don't define sched_clock_irqtime as static key
Linus Torvalds [Fri, 14 Mar 2025 19:41:36 +0000 (09:41 -1000)]
Merge tag 'locking-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc locking fixes from Ingo Molnar:
- Restrict the Rust runtime from unintended access to dynamically
allocated LockClassKeys
- KernelDoc annotation fix
- Fix a lock ordering bug in semaphore::up(), related to trying to
printk() and wake up the console within critical sections
* tag 'locking-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/semaphore: Use wake_q to wake up processes outside lock critical section
locking/rtmutex: Use the 'struct' keyword in kernel-doc comment
rust: lockdep: Remove support for dynamically allocated LockClassKeys
Linus Torvalds [Fri, 14 Mar 2025 19:12:28 +0000 (09:12 -1000)]
Merge tag 'core-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fix from Ingo Molnar:
"Fix a Sparse false positive warning triggered by no_free_ptr()"
* tag 'core-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
<linux/cleanup.h>: Allow the passing of both iomem and non-iomem pointers to no_free_ptr()
Kent Overstreet [Fri, 14 Mar 2025 13:54:43 +0000 (09:54 -0400)]
bcachefs: Change btree wb assert to runtime error
We just had a report of the assert for "btree in write buffer for
non-write buffer btree" popping during the 6.14 upgrade.
- 150TB filesystem, after a reboot the upgrade was able to continue from
where it left off, so no major damage.
But with 6.14 about to come out we want to get this tracked down asap,
and need more data if other users hit this.
Convert the BUG_ON() to an emergency read-only, and print out btree, the
key itself, and stack trace from the original write buffer update (which
did not have this check before).
Reported-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We have a central definition for this function since 2023, used by
a number of different parts of the kernel.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Tengda Wu [Fri, 14 Mar 2025 06:53:35 +0000 (06:53 +0000)]
tracing: Correct the refcount if the hist/hist_debug file fails to open
The function event_{hist,hist_debug}_open() maintains the refcount of
'file->tr' and 'file' through tracing_open_file_tr(). However, it does
not roll back these counts on subsequent failure paths, resulting in a
refcount leak.
A very obvious case is that if the hist/hist_debug file belongs to a
specific instance, the refcount leak will prevent the deletion of that
instance, as it relies on the condition 'tr->ref == 1' within
__remove_instance().
Fix this by calling tracing_release_file_tr() on all failure paths in
event_{hist,hist_debug}_open() to correct the refcount.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Zheng Yejian <zhengyejian1@huawei.com> Link: https://lore.kernel.org/20250314065335.1202817-1-wutengda@huaweicloud.com Fixes: 1cc111b9cddc ("tracing: Fix uaf issue when open the hist or hist_debug file") Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Fri, 14 Mar 2025 08:45:25 +0000 (22:45 -1000)]
Merge tag 'drm-fixes-2025-03-14' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Regular weekly fixes pull, the usual leaders in amdgpu/xe, a couple of
i915, and some scattered misc fixes.
panic:
- two clippy fixes
dp_mst
- locking fix
atomic:
- fix redundant DPMS calls
i915:
- Do cdclk post plane programming later
- Bump MMAP_GTT_VERSION: missing indication of partial mmaps support
xe:
- Release guc ids before cancelling work
- Fix new warnings around userptr
- Temporaritly disable D3Cold on BMG
- Retry and wait longer for GuC PC to start
- Remove redundant check in xe_vm_create_ioctl
* tag 'drm-fixes-2025-03-14' of https://gitlab.freedesktop.org/drm/kernel: (23 commits)
drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags
drm/amd/amdkfd: Evict all queues even HWS remove queue failed
drm/i915: Increase I915_PARAM_MMAP_GTT_VERSION version to indicate support for partial mmaps
drm/dp_mst: Fix locking when skipping CSN before topology probing
drm/amdgpu/vce2: fix ip block reference
drm/amd/display: Fix slab-use-after-free on hdcp_work
drm/amd/display: Assign normalized_pix_clk when color depth = 14
drm/amd/display: Restore correct backlight brightness after a GPU reset
drm/amd/display: fix default brightness
drm/amd/display: Disable unneeded hpd interrupts during dm_init
drm/amd: Keep display off while going into S4
drm/amd/display: fix missing .is_two_pixels_per_container
drm/amdgpu/display: Allow DCC for video formats on GFX12
drm/xe: remove redundant check in xe_vm_create_ioctl()
drm/atomic: Filter out redundant DPMS calls
drm/xe/guc_pc: Retry and wait longer for GuC PC start
drm/xe/pm: Temporarily disable D3Cold on BMG
drm/i915/cdclk: Do cdclk post plane programming later
drm/xe/userptr: Fix an incorrect assert
drm/xe: Release guc ids before cancelling work
...
usb: typec: tcpm: fix state transition for SNK_WAIT_CAPABILITIES state in run_state_machine()
A subtle error got introduced while manually fixing merge conflict in
tcpm.c for commit 85c4efbe6088 ("Merge v6.12-rc6 into usb-next"). As a
result of this error, the next state is unconditionally set to
SNK_WAIT_CAPABILITIES_TIMEOUT while handling SNK_WAIT_CAPABILITIES state
in run_state_machine(...).
Fix this by setting new state of TCPM state machine to `upcoming_state`
(that is set to different values based on conditions).
Cc: stable@vger.kernel.org Fixes: 85c4efbe60888 ("Merge v6.12-rc6 into usb-next") Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250310-fix-snk-wait-timeout-v6-14-rc6-v1-1-5db14475798f@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'usb-serial-6.14-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial device ids for 6.14-rc7
Here are some new modem device ids and a couple of related fixes, and
support for Altera USB Blaster 3.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.14-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: ftdi_sio: add support for Altera USB Blaster 3
USB: serial: option: fix Telit Cinterion FE990A name
USB: serial: option: add Telit Cinterion FE990B compositions
USB: serial: option: match on interface class for Telit FN990B
Dave Airlie [Fri, 14 Mar 2025 03:42:13 +0000 (13:42 +1000)]
Merge tag 'drm-xe-fixes-2025-03-13' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Release guc ids before cancelling work (Tejas)
- Fix new warnings around userptr (Thomas)
- Temporaritly disable D3Cold on BMG (Rodrigo)
- Retry and wait longer for GuC PC to start (Rodrigo)
- Remove redundant check in xe_vm_create_ioctl (Xin)
Linus Torvalds [Fri, 14 Mar 2025 01:34:26 +0000 (15:34 -1000)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A few clk driver fixes for Samsung and Qualcomm clk drivers:
- Suspend on Google GS101 crashes when trying to save some clk
registers that we shouldn't be saving so we don't do that anymore
- The PLL lock time was wrong on the Tesla FSD which could lead to
the PLL never locking
- Qualcomm's display clk controller on SM8750 was trying to change
the frequency of a parent clk for the DSI device when it should
have stopped and adjusted the divider. The failure is that the clk
frequency was half what was expected, leading to broken display"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: samsung: update PLL locktime for PLL142XX used on FSD platform
clk: samsung: gs101: fix synchronous external abort in samsung_clk_save()
clk: qcom: dispcc-sm8750: Drop incorrect CLK_SET_RATE_PARENT on byte intf parent
Linus Torvalds [Fri, 14 Mar 2025 01:10:59 +0000 (15:10 -1000)]
Merge tag 'bcachefs-2025-03-13' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Roxana caught an unitialized value that might explain some of the
rebalance weirdness we're still tracking down - cool.
Otherwise pretty minor"
* tag 'bcachefs-2025-03-13' of git://evilpiepirate.org/bcachefs:
bcachefs: bch2_get_random_u64_below()
bcachefs: target_congested -> get_random_u32_below()
bcachefs: fix tiny leak in bch2_dev_add()
bcachefs: Make sure trans is unlocked when submitting read IO
bcachefs: Initialize from_inode members for bch_io_opts
bcachefs: Fix b->written overflow
Dave Airlie [Fri, 14 Mar 2025 01:09:31 +0000 (11:09 +1000)]
Merge tag 'drm-misc-fixes-2025-03-13' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A null pointer check for gma500, two clippy fixes for panic, a fix for
an interaction between DPMS and atomic leading to dropped frames, and
a locking fix for dp_mst
Ajay Kaher [Thu, 13 Mar 2025 17:31:11 +0000 (17:31 +0000)]
x86/vmware: Parse MP tables for SEV-SNP enabled guests under VMware hypervisors
Under VMware hypervisors, SEV-SNP enabled VMs are fundamentally able to boot
without UEFI, but this regressed a year ago due to:
0f4a1e80989a ("x86/sev: Skip ROM range scans and validation for SEV-SNP guests")
In this case, mpparse_find_mptable() has to be called to parse MP
tables which contains the necessary boot information.
[ mingo: Updated the changelog. ]
Fixes: 0f4a1e80989a ("x86/sev: Skip ROM range scans and validation for SEV-SNP guests") Co-developed-by: Ye Li <ye.li@broadcom.com> Signed-off-by: Ye Li <ye.li@broadcom.com> Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ye Li <ye.li@broadcom.com> Reviewed-by: Kevin Loughlin <kevinloughlin@google.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250313173111.10918-1-ajay.kaher@broadcom.com
Linus Torvalds [Thu, 13 Mar 2025 17:58:48 +0000 (07:58 -1000)]
Merge tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter, bluetooth and wireless.
No known regressions outstanding.
Current release - regressions:
- wifi: nl80211: fix assoc link handling
- eth: lan78xx: sanitize return values of register read/write
functions
Current release - new code bugs:
- ethtool: tsinfo: fix dump command
- bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL
- eth: mlx5: DR, use the right action structs for STEv3
Previous releases - regressions:
- netfilter: nf_tables: make destruction work queue pernet
- gre: fix IPv6 link-local address generation.
- wifi: iwlwifi: fix TSO preparation
- bluetooth: revert "bluetooth: hci_core: fix sleeping function
called from invalid context"
- ovs: revert "openvswitch: switch to per-action label counting in
conntrack"
- eth:
- ice: fix switchdev slow-path in LAG
- bonding: fix incorrect MAC address setting to receive NS
messages
Previous releases - always broken:
- core: prevent TX of unreadable skbs
- sched: prevent creation of classes with TC_H_ROOT
- netfilter: nft_exthdr: fix offset with ipv4_find_option()
- wifi: cfg80211: cancel wiphy_work before freeing wiphy
- mctp: copy headers if cloned
- phy: nxp-c45-tja11xx: add errata for TJA112XA/B
- eth:
- bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
- mlx5: bridge, fix the crash caused by LAG state check"
* tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
net: mana: cleanup mana struct after debugfs_remove()
net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices
net/mlx5: Bridge, fix the crash caused by LAG state check
net/mlx5: Lag, Check shared fdb before creating MultiPort E-Switch
net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs
net/mlx5: HWS, Rightsize bwc matcher priority
net/mlx5: DR, use the right action structs for STEv3
Revert "openvswitch: switch to per-action label counting in conntrack"
net: openvswitch: remove misbehaving actions length check
selftests: Add IPv6 link-local address generation tests for GRE devices.
gre: Fix IPv6 link-local address generation.
netfilter: nft_exthdr: fix offset with ipv4_find_option()
selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT
net_sched: Prevent creation of classes with TC_H_ROOT
ipvs: prevent integer overflow in do_ip_vs_get_ctl()
selftests: netfilter: skip br_netfilter queue tests if kernel is tainted
netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree()
wifi: mac80211: fix MPDU length parsing for EHT 5/6 GHz
qlcnic: fix memory leak issues in qlcnic_sriov_common.c
rtase: Fix improper release of ring list entries in rtase_sw_reset
...
Linus Torvalds [Thu, 13 Mar 2025 17:53:25 +0000 (07:53 -1000)]
Merge tag 'vfs-6.14-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Bring in an RCU pathwalk fix for afs. This is brought in as a merge
from the vfs-6.15.shared.afs branch that needs this commit and other
trees already depend on it.
- Fix vboxfs unterminated string handling.
* tag 'vfs-6.14-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
vboxsf: Add __nonstring annotations for unterminated strings
afs: Fix afs_atcell_get_link() to handle RCU pathwalk
Jens Axboe [Thu, 13 Mar 2025 15:41:57 +0000 (09:41 -0600)]
Merge tag 'nvme-6.14-2025-03-13' of git://git.infradead.org/nvme into block-6.14
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.14
- Concurrent pci error and hotplug handling fix (Keith)
- Endpoint function fixes (Damien)"
* tag 'nvme-6.14-2025-03-13' of git://git.infradead.org/nvme:
nvmet: pci-epf: Do not add an IRQ vector if not needed
nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created
nvme-pci: fix stuck reset on concurrent DPC and HP
In the use case of buffered write whose input buffer is mmapped file on a
filesystem with a pre-content mark, the prefaulting of the buffer can
happen under the filesystem freeze protection (obtained in vfs_write())
which breaks assumptions of pre-content hook and introduces potential
deadlock of HSM handler in userspace with filesystem freezing.
Now that we have pre-content hooks at file mmap() time, disable the
pre-content event hooks on page fault to avoid the potential deadlock.
Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/ Fixes: 8392bc2ff8c8 ("fsnotify: generate pre-content permission event on page fault") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250312073852.2123409-5-amir73il@gmail.com
Paolo Abeni [Thu, 13 Mar 2025 14:04:26 +0000 (15:04 +0100)]
Merge tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Missing initialization of cpu and jiffies32 fields in conncount,
from Kohei Enju.
2) Skip several tests in case kernel is tainted, otherwise tests bogusly
report failure too as they also check for tainted kernel,
from Florian Westphal.
3) Fix a hyphothetical integer overflow in do_ip_vs_get_ctl() leading
to bogus error logs, from Dan Carpenter.
4) Fix incorrect offset in ipv4 option match in nft_exthdr, from
Alexey Kashavkin.
netfilter pull request 25-03-13
* tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_exthdr: fix offset with ipv4_find_option()
ipvs: prevent integer overflow in do_ip_vs_get_ctl()
selftests: netfilter: skip br_netfilter queue tests if kernel is tainted
netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree()
====================
Murad Masimov [Tue, 11 Mar 2025 14:22:06 +0000 (17:22 +0300)]
cifs: Fix integer overflow while processing closetimeo mount option
User-provided mount parameter closetimeo of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 5efdd9122eff ("smb3: allow deferred close timeout to be configurable") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
Murad Masimov [Tue, 11 Mar 2025 14:22:05 +0000 (17:22 +0300)]
cifs: Fix integer overflow while processing actimeo mount option
User-provided mount parameter actimeo of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 6d20e8406f09 ("cifs: add attribute cache timeout (actimeo) tunable") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
Murad Masimov [Tue, 11 Mar 2025 14:22:04 +0000 (17:22 +0300)]
cifs: Fix integer overflow while processing acdirmax mount option
User-provided mount parameter acdirmax of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4c9f948142a5 ("cifs: Add new mount parameter "acdirmax" to allow caching directory metadata") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
Murad Masimov [Tue, 11 Mar 2025 14:22:03 +0000 (17:22 +0300)]
cifs: Fix integer overflow while processing acregmax mount option
User-provided mount parameter acregmax of type u32 is intended to have
an upper limit, but before it is validated, the value is converted from
seconds to jiffies which can lead to an integer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 5780464614f6 ("cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Wed, 12 Mar 2025 13:51:31 +0000 (10:51 -0300)]
smb: client: fix regression with guest option
When mounting a CIFS share with 'guest' mount option, mount.cifs(8)
will set empty password= and password2= options. Currently we only
handle empty strings from user= and password= options, so the mount
will fail with
cifs: Bad value for 'password2'
Fix this by handling empty string from password2= option as well.
Link: https://bbs.archlinux.org/viewtopic.php?id=303927 Reported-by: Adam Williamson <awilliam@redhat.com> Closes: https://lore.kernel.org/r/83c00b5fea81c07f6897a5dd3ef50fd3b290f56c.camel@redhat.com Fixes: 35f834265e0d ("smb3: fix broken reconnect when password changing on the server by allowing password rotation") Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
platform/x86/amd: pmf: Fix missing hidden options for Smart PC
amd_pmf_get_slider_info() checks the current profile to report correct
value to the TA inputs. If hidden options are in use then the wrong
values will be reported to TA.
Add the two compat options PLATFORM_PROFILE_BALANCED_PERFORMANCE and
PLATFORM_PROFILE_QUIET for this use.
Reported-by: Yijun Shen <Yijun.Shen@dell.com> Fixes: 9a43102daf64d ("platform/x86/amd: pmf: Add balanced-performance to hidden choices") Fixes: 44e94fece5170 ("platform/x86/amd: pmf: Add 'quiet' to hidden choices") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20250306034402.50478-1-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Shradha Gupta [Tue, 11 Mar 2025 10:17:40 +0000 (03:17 -0700)]
net: mana: cleanup mana struct after debugfs_remove()
When on a MANA VM hibernation is triggered, as part of hibernate_snapshot(),
mana_gd_suspend() and mana_gd_resume() are called. If during this
mana_gd_resume(), a failure occurs with HWC creation, mana_port_debugfs
pointer does not get reinitialized and ends up pointing to older,
cleaned-up dentry.
Further in the hibernation path, as part of power_down(), mana_gd_shutdown()
is triggered. This call, unaware of the failures in resume, tries to cleanup
the already cleaned up mana_port_debugfs value and hits the following bug:
Carolina Jubran [Mon, 10 Mar 2025 22:01:44 +0000 (00:01 +0200)]
net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices
mlx5_eswitch_get_vepa returns -EPERM if the device lacks
eswitch_manager capability, blocking mlx5e_bridge_getlink from
retrieving VEPA mode. Since mlx5e_bridge_getlink implements
ndo_bridge_getlink, returning -EPERM causes bridge link show to fail
instead of skipping devices without this capability.
To avoid this, return -EOPNOTSUPP from mlx5e_bridge_getlink when
mlx5_eswitch_get_vepa fails, ensuring the command continues processing
other devices while ignoring those without the necessary capability.
Fixes: 4b89251de024 ("net/mlx5: Support ndo bridge_setlink and getlink") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741644104-97767-7-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jianbo Liu [Mon, 10 Mar 2025 22:01:43 +0000 (00:01 +0200)]
net/mlx5: Bridge, fix the crash caused by LAG state check
When removing LAG device from bridge, NETDEV_CHANGEUPPER event is
triggered. Driver finds the lower devices (PFs) to flush all the
offloaded entries. And mlx5_lag_is_shared_fdb is checked, it returns
false if one of PF is unloaded. In such case,
mlx5_esw_bridge_lag_rep_get() and its caller return NULL, instead of
the alive PF, and the flush is skipped.
Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge
event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be
ignored in this case because the upper interface for bond is deleted,
and the entry will never be aged because lastuse is never updated.
To make things worse, as the entry is alive, mlx5 bridge workqueue
keeps sending that event, which is then handled by kernel bridge
notifier. It causes the following crash when accessing the passed bond
netdev which is already destroyed.
To fix this issue, remove such checks. LAG state is already checked in
commit 15f8f168952f ("net/mlx5: Bridge, verify LAG state when adding
bond to bridge"), driver still need to skip offload if LAG becomes
invalid state after initialization.
Shay Drory [Mon, 10 Mar 2025 22:01:41 +0000 (00:01 +0200)]
net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs
mlx5_irq_pool_get() is a getter for completion IRQ pool only.
However, after the cited commit, mlx5_irq_pool_get() is called during
ctrl IRQ release flow to retrieve the pool, resulting in the use of an
incorrect IRQ pool.
Hence, use the newly introduced mlx5_irq_get_pool() getter to retrieve
the correct IRQ pool based on the IRQ itself. While at it, rename
mlx5_irq_pool_get() to mlx5_irq_table_get_comp_irq_pool() which
accurately reflects its purpose and improves code readability.
Vlad Dogaru [Mon, 10 Mar 2025 22:01:40 +0000 (00:01 +0200)]
net/mlx5: HWS, Rightsize bwc matcher priority
The bwc layer was clamping the matcher priority from 32 bits to 16 bits.
This didn't show up until a matcher was resized, since the initial
native matcher was created using the correct 32 bit value.
The fix also reorders fields to avoid some padding.
Fixes: 2111bb970c78 ("net/mlx5: HWS, added backward-compatible API handling") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741644104-97767-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net/mlx5: DR, use the right action structs for STEv3
Some actions in ConnectX-8 (STEv3) have different structure,
and they are handled separately in ste_ctx_v3.
This separate handling was missing two actions: INSERT_HDR
and REMOVE_HDR, which broke SWS for Linux Bridge.
This patch resolves the issue by introducing dedicated
callbacks for the insert and remove header functions,
with version-specific implementations for each STE variant.
Fixes: 4d617b57574f ("net/mlx5: DR, add support for ConnectX-8 steering") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741644104-97767-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Xin Long [Sat, 8 Mar 2025 18:05:43 +0000 (13:05 -0500)]
Revert "openvswitch: switch to per-action label counting in conntrack"
Currently, ovs_ct_set_labels() is only called for confirmed conntrack
entries (ct) within ovs_ct_commit(). However, if the conntrack entry
does not have the labels_ext extension, attempting to allocate it in
ovs_ct_get_conn_labels() for a confirmed entry triggers a warning in
nf_ct_ext_add():
WARN_ON(nf_ct_is_confirmed(ct));
This happens when the conntrack entry is created externally before OVS
increments net->ct.labels_used. The issue has become more likely since
commit fcb1aa5163b1 ("openvswitch: switch to per-action label counting
in conntrack"), which changed to use per-action label counting and
increment net->ct.labels_used when a flow with ct action is added.
Since there’s no straightforward way to fully resolve this issue at the
moment, this reverts the commit to avoid breaking existing use cases.
The actions length check is unreliable and produces different results
depending on the initial length of the provided netlink attribute and
the composition of the actual actions inside of it. For example, a
user can add 4088 empty clone() actions without triggering -EMSGSIZE,
on attempt to add 4089 such actions the operation will fail with the
-EMSGSIZE verdict. However, if another 16 KB of other actions will
be *appended* to the previous 4089 clone() actions, the check passes
and the flow is successfully installed into the openvswitch datapath.
The reason for a such a weird behavior is the way memory is allocated.
When ovs_flow_cmd_new() is invoked, it calls ovs_nla_copy_actions(),
that in turn calls nla_alloc_flow_actions() with either the actual
length of the user-provided actions or the MAX_ACTIONS_BUFSIZE. The
function adds the size of the sw_flow_actions structure and then the
actually allocated memory is rounded up to the closest power of two.
So, if the user-provided actions are larger than MAX_ACTIONS_BUFSIZE,
then MAX_ACTIONS_BUFSIZE + sizeof(*sfa) rounded up is 32K + 24 -> 64K.
Later, while copying individual actions, we look at ksize(), which is
64K, so this way the MAX_ACTIONS_BUFSIZE check is not actually
triggered and the user can easily allocate almost 64 KB of actions.
However, when the initial size is less than MAX_ACTIONS_BUFSIZE, but
the actions contain ones that require size increase while copying
(such as clone() or sample()), then the limit check will be performed
during the reserve_sfa_size() and the user will not be allowed to
create actions that yield more than 32 KB internally.
This is one part of the problem. The other part is that it's not
actually possible for the userspace application to know beforehand
if the particular set of actions will be rejected or not.
Certain actions require more space in the internal representation,
e.g. an empty clone() takes 4 bytes in the action list passed in by
the user, but it takes 12 bytes in the internal representation due
to an extra nested attribute, and some actions require less space in
the internal representations, e.g. set(tunnel(..)) normally takes
64+ bytes in the action list provided by the user, but only needs to
store a single pointer in the internal implementation, since all the
data is stored in the tunnel_info structure instead.
And the action size limit is applied to the internal representation,
not to the action list passed by the user. So, it's not possible for
the userpsace application to predict if the certain combination of
actions will be rejected or not, because it is not possible for it to
calculate how much space these actions will take in the internal
representation without knowing kernel internals.
All that is causing random failures in ovs-vswitchd in userspace and
inability to handle certain traffic patterns as a result. For example,
it is reported that adding a bit more than a 1100 VMs in an OpenStack
setup breaks the network due to OVS not being able to handle ARP
traffic anymore in some cases (it tries to install a proper datapath
flow, but the kernel rejects it with -EMSGSIZE, even though the action
list isn't actually that large.)
Kernel behavior must be consistent and predictable in order for the
userspace application to use it in a reasonable way. ovs-vswitchd has
a mechanism to re-direct parts of the traffic and partially handle it
in userspace if the required action list is oversized, but that doesn't
work properly if we can't actually tell if the action list is oversized
or not.
Solution for this is to check the size of the user-provided actions
instead of the internal representation. This commit just removes the
check from the internal part because there is already an implicit size
check imposed by the netlink protocol. The attribute can't be larger
than 64 KB. Realistically, we could reduce the limit to 32 KB, but
we'll be risking to break some existing setups that rely on the fact
that it's possible to create nearly 64 KB action lists today.
Vast majority of flows in real setups are below 100-ish bytes. So
removal of the limit will not change real memory consumption on the
system. The absolutely worst case scenario is if someone adds a flow
with 64 KB of empty clone() actions. That will yield a 192 KB in the
internal representation consuming 256 KB block of memory. However,
that list of actions is not meaningful and also a no-op. Real world
very large action lists (that can occur for a rare cases of BUM
traffic handling) are unlikely to contain a large number of clones and
will likely have a lot of tunnel attributes making the internal
representation comparable in size to the original action list.
So, it should be fine to just remove the limit.
Commit in the 'Fixes' tag is the first one that introduced the
difference between internal representation and the user-provided action
lists, but there were many more afterwards that lead to the situation
we have today.
====================
gre: Fix regressions in IPv6 link-local address generation.
IPv6 link-local address generation has some special cases for GRE
devices. This has led to several regressions in the past, and some of
them are still not fixed. This series fixes the remaining problems,
like the ipv6.conf.<dev>.addr_gen_mode sysctl being ignored and the
router discovery process not being started (see details in patch 1).
To avoid any further regressions, patch 2 adds selftests covering
IPv4 and IPv6 gre/gretap devices with all combinations of currently
supported addr_gen_mode values.
====================
Guillaume Nault [Fri, 7 Mar 2025 19:28:58 +0000 (20:28 +0100)]
selftests: Add IPv6 link-local address generation tests for GRE devices.
GRE devices have their special code for IPv6 link-local address
generation that has been the source of several regressions in the past.
Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an
IPv6 link-link local address in accordance with the
net.ipv6.conf.<dev>.addr_gen_mode sysctl.
Guillaume Nault [Fri, 7 Mar 2025 19:28:53 +0000 (20:28 +0100)]
gre: Fix IPv6 link-local address generation.
Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
devices in most cases and fall back to using add_v4_addrs() only in
case the GRE configuration is incompatible with addrconf_addr_gen().
GRE used to use addrconf_addr_gen() until commit e5dd729460ca
("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
address") restricted this use to gretap and ip6gretap devices, and
created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
The original problem came when commit 9af28511be10 ("addrconf: refuse
isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
addr parameter was 0. The commit says that this would create an invalid
address, however, I couldn't find any RFC saying that the generated
interface identifier would be wrong. Anyway, since gre over IPv4
devices pass their local tunnel address to __ipv6_isatap_ifid(), that
commit broke their IPv6 link-local address generation when the local
address was unspecified.
Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
interfaces when computing v6LL address") tried to fix that case by
defining add_v4_addrs() and calling it to generate the IPv6 link-local
address instead of using addrconf_addr_gen() (apart for gretap and
ip6gretap devices, which would still use the regular
addrconf_addr_gen(), since they have a MAC address).
That broke several use cases because add_v4_addrs() isn't properly
integrated into the rest of IPv6 Neighbor Discovery code. Several of
these shortcomings have been fixed over time, but add_v4_addrs()
remains broken on several aspects. In particular, it doesn't send any
Router Sollicitations, so the SLAAC process doesn't start until the
interface receives a Router Advertisement. Also, add_v4_addrs() mostly
ignores the address generation mode of the interface
(/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
Fix the situation by using add_v4_addrs() only in the specific scenario
where the normal method would fail. That is, for interfaces that have
all of the following characteristics:
* run over IPv4,
* transport IP packets directly, not Ethernet (that is, not gretap
interfaces),
* tunnel endpoint is INADDR_ANY (that is, 0),
* device address generation mode is EUI64.
In all other cases, revert back to the regular addrconf_addr_gen().
Also, remove the special case for ip6gre interfaces in add_v4_addrs(),
since ip6gre devices now always use addrconf_addr_gen() instead.
netfilter: nft_exthdr: fix offset with ipv4_find_option()
There is an incorrect calculation in the offset variable which causes
the nft_skb_copy_to_reg() function to always return -EFAULT. Adding the
start variable is redundant. In the __ip_options_compile() function the
correct offset is specified when finding the function. There is no need
to add the size of the iphdr structure to the offset.
Fixes: dbb5281a1f84 ("netfilter: nf_tables: add support for matching IPv4 options") Signed-off-by: Alexey Kashavkin <akashavkin@gmail.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
gpio: cdev: use raw notifier for line state events
We use a notifier to implement the mechanism of informing the user-space
about changes in GPIO line status. We register with the notifier when
the GPIO character device file is opened and unregister when the last
reference to the associated file descriptor is dropped.
Since commit fcc8b637c542 ("gpiolib: switch the line state notifier to
atomic") we use the atomic notifier variant. Atomic notifiers call
rcu_synchronize in atomic_notifier_chain_unregister() which caused a
significant performance regression in some circumstances, observed by
user-space when calling close() on the GPIO device file descriptor.
Replace the atomic notifier with the raw variant and provide
synchronization with a read-write spinlock.
Fixes: fcc8b637c542 ("gpiolib: switch the line state notifier to atomic") Reported-by: David Jander <david@protonic.nl> Closes: https://lore.kernel.org/all/20250311110034.53959031@erd003.prtnl/ Tested-by: David Jander <david@protonic.nl> Tested-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20250311-gpiolib-line-state-raw-notifier-v2-1-138374581e1e@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
gpiolib: don't check the retval of get_direction() when registering a chip
During chip registration we should neither check the return value of
gc->get_direction() nor hold the SRCU lock when calling it. The former
is because pin controllers may have pins set to alternate functions and
return errors from their get_direction() callbacks. That's alright - we
should default to the safe INPUT state and not bail-out. The latter is
not needed because we haven't registered the chip yet so there's nothing
to protect against dynamic removal. In fact: we currently hit a lockdep
splat. Revert to calling the gc->get_direction() callback directly and
*not* checking its value.
Fixes: 9d846b1aebbe ("gpiolib: check the return value of gpio_chip::get_direction()") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Closes: https://lore.kernel.org/all/81f890fc-6688-42f0-9756-567efc8bb97a@samsung.com/ Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250226-retval-fixes-v2-1-c8dc57182441@linaro.org Tested-by: Gene C <arch@sapience.com> Link: https://lore.kernel.org/r/20250311175631.83779-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Takashi Iwai [Thu, 13 Mar 2025 06:33:48 +0000 (07:33 +0100)]
Merge tag 'asoc-fix-v6.14-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.14
The bulk of this is driver specific fixes, mostly unremarkable. There's
also one core fix from Charles, fixing up confusion around the limiting
of maximum control values.
Linus Torvalds [Wed, 12 Mar 2025 21:52:04 +0000 (11:52 -1000)]
Merge tag 'sched_ext-for-6.14-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fix from Tejun Heo:
"BPF schedulers could trigger a crash by passing in an invalid CPU to
the scx_bpf_select_cpu_dfl() helper.
Fix it by verifying input validity"
* tag 'sched_ext-for-6.14-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()
Linus Torvalds [Wed, 12 Mar 2025 21:47:24 +0000 (11:47 -1000)]
Merge tag 'spi-fix-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of driver specific fixes, an error handling fix for the Atmel
QuadSPI driver and a fix for a nasty synchronisation issue in the data
path for the Microchip driver which affects larger transfers.
There's also a MAINTAINERS update for the Samsung driver"
* tag 'spi-fix-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: microchip-core: prevent RX overflows when transmit size > FIFO size
MAINTAINERS: add tambarus as R for Samsung SPI
spi: atmel-quadspi: remove references to runtime PM on error path
Cong Wang [Thu, 6 Mar 2025 23:23:55 +0000 (15:23 -0800)]
selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT
Integrate the reproduer from Mingi to TDC.
All test results:
1..4
ok 1 0385 - Create DRR with default setting
ok 2 2375 - Delete DRR with handle
ok 3 3092 - Show DRR class
ok 4 4009 - Reject creation of DRR class with classid TC_H_ROOT
Cong Wang [Thu, 6 Mar 2025 23:23:54 +0000 (15:23 -0800)]
net_sched: Prevent creation of classes with TC_H_ROOT
The function qdisc_tree_reduce_backlog() uses TC_H_ROOT as a termination
condition when traversing up the qdisc tree to update parent backlog
counters. However, if a class is created with classid TC_H_ROOT, the
traversal terminates prematurely at this class instead of reaching the
actual root qdisc, causing parent statistics to be incorrectly maintained.
In case of DRR, this could lead to a crash as reported by Mingi Cho.
Prevent the creation of any Qdisc class with classid TC_H_ROOT
(0xFFFFFFFF) across all qdisc types, as suggested by Jamal.
Reported-by: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 066a3b5b2346 ("[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop") Link: https://patch.msgid.link/20250306232355.93864-2-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yifan Zha [Wed, 5 Mar 2025 05:14:55 +0000 (13:14 +0800)]
drm/amd/amdkfd: Evict all queues even HWS remove queue failed
[Why]
If reset is detected and kfd need to evict working queues, HWS moving queue will be failed.
Then remaining queues are not evicted and in active state.
After reset done, kfd uses HWS to termination remaining activated queues but HWS is resetted.
So remove queue will be failed again.
[How]
Keep removing all queues even if HWS returns failed.
It will not affect cpsch as it checks reset_domain->sem.
v2: If any queue failed, evict queue returns error.
v3: Declare err inside the if-block.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 42c854b8fb0cce512534aa2b7141948e80c6ebb0) Cc: stable@vger.kernel.org
Amir Goldstein [Wed, 12 Mar 2025 07:38:47 +0000 (08:38 +0100)]
fsnotify: add pre-content hooks on mmap()
Pre-content hooks in page faults introduces potential deadlock of HSM
handler in userspace with filesystem freezing.
The requirement with pre-content event is that for every accessed file
range an event covering at least this range will be generated at least
once before the file data is accesses.
In preparation to disabling pre-content event hooks on page faults,
add pre-content hooks at mmap() variants for the entire mmaped range,
so HSM can fill content when user requests to map a portion of the file.
Note that exec() variant also calls vm_mmap_pgoff() internally to map
code sections, so pre-content hooks are also generated in this case.
Boon Khai Ng [Wed, 12 Mar 2025 03:05:44 +0000 (11:05 +0800)]
USB: serial: ftdi_sio: add support for Altera USB Blaster 3
The Altera USB Blaster 3, available as both a cable and an on-board
solution, is primarily used for programming and debugging FPGAs.
It interfaces with host software such as Quartus Programmer,
System Console, SignalTap, and Nios Debugger. The device utilizes
either an FT2232 or FT4232 chip.
Enabling the support for various configurations of the on-board
USB Blaster 3 by including the appropriate VID/PID pairs,
allowing it to function as a serial device via ftdi_sio.
Note that this check-in does not include support for the
cable solution, as it does not support UART functionality.
The supported configurations are determined by the
hardware design and include:
1) PID 0x6022, FT2232, 1 JTAG port (Port A) + Port B as UART
2) PID 0x6025, FT4232, 1 JTAG port (Port A) + Port C as UART
3) PID 0x6026, FT4232, 1 JTAG port (Port A) + Port C, D as UART
4) PID 0x6029, FT4232, 1 JTAG port (Port B) + Port C as UART
5) PID 0x602a, FT4232, 1 JTAG port (Port B) + Port C, D as UART
6) PID 0x602c, FT4232, 1 JTAG port (Port A) + Port B as UART
7) PID 0x602d, FT4232, 1 JTAG port (Port A) + Port B, C as UART
8) PID 0x602e, FT4232, 1 JTAG port (Port A) + Port B, C, D as UART
These configurations allow for flexibility in how the USB Blaster 3 is
used, depending on the specific needs of the hardware design.
Signed-off-by: Boon Khai Ng <boon.khai.ng@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>