Pavan Chebbi [Fri, 13 Oct 2023 13:59:19 +0000 (06:59 -0700)]
tg3: Improve PTP TX timestamping logic
When we are trying to timestamp a TX packet, there may be
occasions when the TX timestamp register is still not
updated with the latest timestamp even if the timestamp
packet descriptor is marked as complete.
This usually happens in cases where the system is under
stress or flow control is affecting the transmit side.
We will solve this problem by saving the snapshot of the
timestamp register when we are posting the TX descriptor.
At this time, the register contains previously timestamped
packet's value and valid timestamp of the current packet must
be different than this.
Upon completion of the current descriptor, we will check if
the timestamp register is updated or not before timestamping
the skb. If not updated, we will schedule the ptp worker to
fetch the updated time later and timestamp the skb.
Also now we restrict number of outstanding PTP TX packet
requests to 1.
Reported-by: Simon White <Simon.White@viavisolutions.com> Link: https://lore.kernel.org/netdev/CACKFLikGdN9XPtWk-fdrzxdcD=+bv-GHBvfVfSpJzHY7hrW39g@mail.gmail.com/ Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 11 Oct 2023 02:42:24 +0000 (19:42 -0700)]
docs: try to encourage (netdev?) reviewers
Add a section to netdev maintainer doc encouraging reviewers
to chime in on the mailing list.
The questions about "when is it okay to share feedback"
keep coming up (most recently at netconf) and the answer
is "pretty much always".
Extend the section of 7.AdvancedTopics.rst which deals
with reviews a little bit to add stuff we had been recommending
locally.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 15 Oct 2023 13:25:03 +0000 (14:25 +0100)]
Merge branch 'sfc-conntrack-offload'
Edward Cree says:
====================
sfc: support conntrack NAT offload
The EF100 MAE supports performing NAT (and NPT) on packets which match in
the conntrack table. This series adds that capability to the driver.
====================
Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 10 Oct 2023 21:52:00 +0000 (22:52 +0100)]
sfc: support offloading ct(nat) action in RHS rules
If an IP address and/or L4 port for NAPT is available from a CT match,
the MAE will perform the edits; if no CT lookup has been performed for
this packet, the CT lookup did not return a match, or the matched CT
entry did not include NAPT, the action will have no effect.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 10 Oct 2023 21:51:59 +0000 (22:51 +0100)]
sfc: parse mangle actions (NAT) in conntrack entries
The MAE can edit either address, L4 port, or both, for either source
or destination. These can't be mixed; i.e. it can edit source addr
and source port, but not (say) source addr and dest port.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
this patchset is first of three parts of another big patchset for
MSG_ZEROCOPY flag support:
https://lore.kernel.org/netdev/20230701063947.3422088-1-AVKrasnov@sberdevices.ru/
During review of this series, Stefano Garzarella <sgarzare@redhat.com>
suggested to split it for three parts to simplify review and merging:
1) virtio and vhost updates (for fragged skbs) <--- this patchset
2) AF_VSOCK updates (allows to enable MSG_ZEROCOPY mode and read
tx completions) and update for Documentation/.
3) Updates for tests and utils.
This series enables handling of fragged skbs in virtio and vhost parts.
Newly logic won't be triggered, because SO_ZEROCOPY options is still
impossible to enable at this moment (next bunch of patches from big
set above will enable it).
I've included changelog to some patches anyway, because there were some
comments during review of last big patchset from the link above.
Link to v1:
https://lore.kernel.org/netdev/20230717210051.856388-1-AVKrasnov@sberdevices.ru/
Link to v2:
https://lore.kernel.org/netdev/20230718180237.3248179-1-AVKrasnov@sberdevices.ru/
Link to v3:
https://lore.kernel.org/netdev/20230720214245.457298-1-AVKrasnov@sberdevices.ru/
Link to v4:
https://lore.kernel.org/netdev/20230727222627.1895355-1-AVKrasnov@sberdevices.ru/
Link to v5:
https://lore.kernel.org/netdev/20230730085905.3420811-1-AVKrasnov@sberdevices.ru/
Link to v6:
https://lore.kernel.org/netdev/20230814212720.3679058-1-AVKrasnov@sberdevices.ru/
Link to v7:
https://lore.kernel.org/netdev/20230827085436.941183-1-avkrasnov@salutedevices.com/
Link to v8:
https://lore.kernel.org/netdev/20230911202234.1932024-1-avkrasnov@salutedevices.com/
Changelog:
v3 -> v4:
* Patchset rebased and tested on new HEAD of net-next (see hash above).
v4 -> v5:
* See per-patch changelog after ---.
v5 -> v6:
* Patchset rebased and tested on new HEAD of net-next (see hash above).
* See per-patch changelog after ---.
v6 -> v7:
* Patchset rebased and tested on new HEAD of net-next (see hash above).
* See per-patch changelog after ---.
v7 -> v8:
* Patchset rebased and tested on new HEAD of net-next (see hash above).
* See per-patch changelog after ---.
v8 -> v9:
* Patchset rebased and tested on new HEAD of net-next (see hash above).
* See per-patch changelog after ---.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:24 +0000 (22:15 +0300)]
test/vsock: io_uring rx/tx tests
This adds set of tests which use io_uring for rx/tx. This test suite is
implemented as separated util like 'vsock_test' and has the same set of
input arguments as 'vsock_test'. These tests only cover cases of data
transmission (no connect/bind/accept etc).
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:23 +0000 (22:15 +0300)]
test/vsock: MSG_ZEROCOPY support for vsock_perf
To use this option pass '--zerocopy' parameter:
./vsock_perf --zerocopy --sender <cid> ...
With this option MSG_ZEROCOPY flag will be passed to the 'send()' call.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:22 +0000 (22:15 +0300)]
test/vsock: MSG_ZEROCOPY flag tests
This adds three tests for MSG_ZEROCOPY feature:
1) SOCK_STREAM tx with different buffers.
2) SOCK_SEQPACKET tx with different buffers.
3) SOCK_STREAM test to read empty error queue of the socket.
Patch also works as preparation for the next patches for tools in this
patchset: vsock_perf and vsock_uring_test:
1) Adds several new functions to util.c - they will be also used by
vsock_uring_test.
2) Adds two new functions for MSG_ZEROCOPY handling to a new source
file - such source will be shared between vsock_test, vsock_perf and
vsock_uring_test, thus avoiding code copy-pasting.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:21 +0000 (22:15 +0300)]
docs: net: description of MSG_ZEROCOPY for AF_VSOCK
This adds description of MSG_ZEROCOPY flag support for AF_VSOCK type of
socket.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:20 +0000 (22:15 +0300)]
vsock: enable setting SO_ZEROCOPY
For AF_VSOCK, zerocopy tx mode depends on transport, so this option must
be set in AF_VSOCK implementation where transport is accessible (if
transport is not set during setting SO_ZEROCOPY: for example socket is
not connected, then SO_ZEROCOPY will be enabled, but once transport will
be assigned, support of this type of transmission will be checked).
To handle SO_ZEROCOPY, AF_VSOCK implementation uses SOCK_CUSTOM_SOCKOPT
bit, thus handling SOL_SOCKET option operations, but all of them except
SO_ZEROCOPY will be forwarded to the generic handler by calling
'sock_setsockopt()'.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:19 +0000 (22:15 +0300)]
vsock/loopback: support MSG_ZEROCOPY for transport
Add 'msgzerocopy_allow()' callback for loopback transport.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:18 +0000 (22:15 +0300)]
vsock/virtio: support MSG_ZEROCOPY for transport
Add 'msgzerocopy_allow()' callback for virtio transport.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:17 +0000 (22:15 +0300)]
vhost/vsock: support MSG_ZEROCOPY for transport
Add 'msgzerocopy_allow()' callback for vhost transport.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:16 +0000 (22:15 +0300)]
vsock: enable SOCK_SUPPORT_ZC bit
This bit is used by io_uring in case of zerocopy tx mode. io_uring code
checks, that socket has this feature. This patch sets it in two places:
1) For socket in 'connect()' call.
2) For new socket which is returned by 'accept()' call.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:15 +0000 (22:15 +0300)]
vsock: check for MSG_ZEROCOPY support on send
This feature totally depends on transport, so if transport doesn't
support it, return error.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:14 +0000 (22:15 +0300)]
vsock: read from socket's error queue
This adds handling of MSG_ERRQUEUE input flag in receive call. This flag
is used to read socket's error queue instead of data queue. Possible
scenario of error queue usage is receiving completions for transmission
with MSG_ZEROCOPY flag. This patch also adds new defines: 'SOL_VSOCK'
and 'VSOCK_RECVERR'.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arseniy Krasnov [Tue, 10 Oct 2023 19:15:13 +0000 (22:15 +0300)]
vsock: set EPOLLERR on non-empty error queue
If socket's error queue is not empty, EPOLLERR must be set. Otherwise,
reader of error queue won't detect data in it using EPOLLERR bit.
Currently for AF_VSOCK this is actual only with MSG_ZEROCOPY, as this
feature is the only user of an error queue of the socket.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lukas Bulwahn [Thu, 12 Oct 2023 06:34:43 +0000 (08:34 +0200)]
appletalk: remove special handling code for ipddp
After commit 1dab47139e61 ("appletalk: remove ipddp driver") removes the
config IPDDP, there is some minor code clean-up possible in the appletalk
network layer.
Remove some code in appletalk layer after the ipddp driver is gone.
Justin Stitt [Thu, 12 Oct 2023 18:35:41 +0000 (18:35 +0000)]
qed: replace uses of strncpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
This patch eliminates three uses of strncpy():
Firstly, `dest` is expected to be NUL-terminated which is evident by the
manual setting of a NUL-byte at size - 1. For this use specifically,
strscpy() is a viable replacement due to the fact that it guarantees
NUL-termination on the destination buffer.
The next two cases should simply be memcpy() as the size of the src
string is always 3 and the destination string just wants the first 3
bytes changed.
To be clear, there are no buffer overread bugs in the current code as
the sizes and offsets are carefully managed such that buffers are
NUL-terminated. However, with these changes, the code is now more robust
and less ambiguous (and hopefully easier to read).
Heiner Kallweit [Thu, 12 Oct 2023 06:51:13 +0000 (08:51 +0200)]
r8169: fix rare issue with broken rx after link-down on RTL8125
In very rare cases (I've seen two reports so far about different
RTL8125 chip versions) it seems the MAC locks up when link goes down
and requires a software reset to get revived.
Realtek doesn't publish hw errata information, therefore the root cause
is unknown. Realtek vendor drivers do a full hw re-initialization on
each link-up event, the slimmed-down variant here was reported to fix
the issue for the reporting user.
It's not fully clear which parts of the NIC are reset as part of the
software reset, therefore I can't rule out side effects.
====================
net: netconsole: configfs entries for boot target
There is a limitation in netconsole, where it is impossible to
disable or modify the target created from the command line parameter.
(netconsole=...).
"netconsole" cmdline parameter sets the remote IP, and if the remote IP
changes, the machine needs to be rebooted (with the new remote IP set in
the command line parameter).
This allows the user to modify a target without the need to restart the
machine.
This functionality sits on top of the dynamic target reconfiguration that is
already implemented in netconsole.
The way to modify a boot time target is creating special named configfs
directories, that will be associated with the targets coming from
`netconsole=...`.
Example:
Let's suppose you have two netconsole targets defined at boot time::
Breno Leitao [Thu, 12 Oct 2023 11:14:00 +0000 (04:14 -0700)]
netconsole: Attach cmdline target to dynamic target
Enable the attachment of a dynamic target to the target created during
boot time. The boot-time targets are named as "cmdline\d", where "\d" is
a number starting at 0.
If the user creates a dynamic target named "cmdline0", it will attach to
the first target created at boot time (as defined in the
`netconsole=...` command line argument). `cmdline1` will attach to the
second target and so forth.
If there is no netconsole target created at boot time, then, the target
name could be reused.
Breno Leitao [Thu, 12 Oct 2023 11:13:59 +0000 (04:13 -0700)]
netconsole: Initialize configfs_item for default targets
For netconsole targets allocated during the boot time (passing
netconsole=... argument), netconsole_target->item is not initialized.
That is not a problem because it is not used inside configfs.
An upcoming patch will be using it, thus, initialize the targets with
the name 'cmdline' plus a counter starting from 0. This name will match
entries in the configfs later.
Breno Leitao [Thu, 12 Oct 2023 11:13:58 +0000 (04:13 -0700)]
netconsole: move init/cleanup functions lower
Move alloc_param_target() and its counterpart (free_param_target())
to the bottom of the file. These functions are called mostly at
initialization/cleanup of the module, and they should be just above the
callers, at the bottom of the file.
From a practical perspective, having alloc_param_target() at the bottom
of the file will avoid forward declaration later (in the following
patch).
Nothing changed other than the functions location.
Justin Stitt [Thu, 12 Oct 2023 20:38:19 +0000 (20:38 +0000)]
sfc: replace deprecated strncpy with strscpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
`desc` is expected to be NUL-terminated as evident by the manual
NUL-byte assignment. Moreover, NUL-padding does not seem to be
necessary.
The only caller of efx_mcdi_nvram_metadata() is
efx_devlink_info_nvram_partition() which provides a NULL for `desc`:
| rc = efx_mcdi_nvram_metadata(efx, partition_type, NULL, version, NULL, 0);
Due to this, I am not sure this code is even reached but we should still
favor something other than strncpy.
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
Justin Stitt [Thu, 12 Oct 2023 22:25:12 +0000 (22:25 +0000)]
net: phy: tja11xx: replace deprecated strncpy with ethtool_sprintf
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
ethtool_sprintf() is designed specifically for get_strings() usage.
Let's replace strncpy in favor of this dedicated helper function.
Justin Stitt [Wed, 11 Oct 2023 21:53:44 +0000 (21:53 +0000)]
ionic: replace deprecated strncpy with strscpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
NUL-padding is not needed due to `ident` being memset'd to 0 just before
the copy.
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
Justin Stitt [Wed, 11 Oct 2023 21:37:18 +0000 (21:37 +0000)]
net: sparx5: replace deprecated strncpy with ethtool_sprintf
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
ethtool_sprintf() is designed specifically for get_strings() usage.
Let's replace strncpy() in favor of this more robust and easier to
understand interface.
Justin Stitt [Wed, 11 Oct 2023 21:04:37 +0000 (21:04 +0000)]
net/mlx4_core: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
We expect `dst` to be NUL-terminated based on its use with format
strings:
| mlx4_dbg(dev, "Reporting Driver Version to FW: %s\n", dst);
Moreover, NUL-padding is not required.
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
Justin Stitt [Wed, 11 Oct 2023 21:48:39 +0000 (21:48 +0000)]
nfp: replace deprecated strncpy with strscpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
We expect res->name to be NUL-terminated based on its usage with format
strings:
| dev_err(cpp->dev.parent, "Dangling area: %d:%d:%d:0x%0llx-0x%0llx%s%s\n",
| NFP_CPP_ID_TARGET_of(res->cpp_id),
| NFP_CPP_ID_ACTION_of(res->cpp_id),
| NFP_CPP_ID_TOKEN_of(res->cpp_id),
| res->start, res->end,
| res->name ? " " : "",
| res->name ? res->name : "");
... and with strcmp()
| if (!strcmp(res->name, NFP_RESOURCE_TBL_NAME)) {
Moreover, NUL-padding is not required as `res` is already
zero-allocated:
| res = kzalloc(sizeof(*res), GFP_KERNEL);
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
Let's also opt to use the more idiomatic strscpy() usage of (dest, src,
sizeof(dest)) rather than (dest, src, SOME_LEN).
Typically the pattern of 1) allocate memory for string, 2) copy string
into freshly-allocated memory is a candidate for kmemdup_nul() but in
this case we are allocating the entirety of the `res` struct and that
should stay as is. As mentioned above, simple 1:1 replacement of strncpy
-> strscpy :)
Ido Schimmel [Wed, 11 Oct 2023 14:39:12 +0000 (16:39 +0200)]
mlxsw: pci: Allocate skbs using GFP_KERNEL during initialization
The driver allocates skbs during initialization and during Rx
processing. Take advantage of the fact that the former happens in
process context and allocate the skbs using GFP_KERNEL to decrease the
probability of allocation failure.
octeontx2-af: Enable hardware timestamping for VFs
Currently for VFs, mailbox returns ENODEV error when hardware timestamping
enable is requested. This patch fixes this issue. Modified this patch to
return EPERM error for the PF/VFs which are not attached to CGX/RPM.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231011121551.1205211-1-saikrishnag@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Stitt [Tue, 10 Oct 2023 22:32:35 +0000 (22:32 +0000)]
net: dsa: vsc73xx: replace deprecated strncpy with ethtool_sprintf
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
ethtool_sprintf() is designed specifically for get_strings() usage.
Let's replace strncpy in favor of this more robust and easier to
understand interface.
This change could result in misaligned strings when if(cnt) fails. To
combat this, use ternary to place empty string in buffer and properly
increment pointer to next string slot.
Heng Guo [Wed, 11 Oct 2023 01:51:37 +0000 (09:51 +0800)]
net: fix IPSTATS_MIB_OUTFORWDATAGRAMS increment after fragment check
Reproduce environment:
network with 3 VM linuxs is connected as below:
VM1<---->VM2(latest kernel 6.5.0-rc7)<---->VM3
VM1: eth0 ip: 192.168.122.207 MTU 1800
VM2: eth0 ip: 192.168.122.208, eth1 ip: 192.168.123.224 MTU 1500
VM3: eth0 ip: 192.168.123.240 MTU 1800
Reproduce:
VM1 send 1600 bytes UDP data to VM3 using tools scapy with flags='DF'.
scapy command:
send(IP(dst="192.168.123.240",flags='DF')/UDP()/str('0'*1600),count=1,
inter=1.000000)
Issue description and patch:
ip_exceeds_mtu() in ip_forward() drops this IP datagram because skb len
(1600 sending by scapy) is over MTU(1500 in VM2) if "DF" is set.
According to RFC 4293 "3.2.3. IP Statistics Tables",
+-------+------>------+----->-----+----->-----+
| InForwDatagrams (6) | OutForwDatagrams (6) |
| V +->-+ OutFragReqds
| InNoRoutes | | (packets)
/ (local packet (3) | |
| IF is that of the address | +--> OutFragFails
| and may not be the receiving IF) | | (packets)
the IPSTATS_MIB_OUTFORWDATAGRAMS should be counted before fragment
check.
The existing implementation, instead, would incease the counter after
fragment check: ip_exceeds_mtu() in ipv4 and ip6_pkt_too_big() in ipv6.
So do patch to move IPSTATS_MIB_OUTFORWDATAGRAMS counter to ip_forward()
for ipv4 and ip6_forward() for ipv6.
Test result with patch:
Before IP data is sent.
----------------------------------------------------------------------
root@qemux86-64:~# cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss
Ip: 1 64 6 0 2 2 0 0 2 4 0 0 0 0 0 0 0 0 0
......
root@qemux86-64:~#
----------------------------------------------------------------------
After IP data is sent.
----------------------------------------------------------------------
root@qemux86-64:~# cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss
Ip: 1 64 7 0 2 3 0 0 2 5 0 0 0 0 0 0 0 1 0
......
root@qemux86-64:~#
----------------------------------------------------------------------
ForwDatagrams is updated from 2 to 3.
Jakub Kicinski [Fri, 13 Oct 2023 16:35:34 +0000 (09:35 -0700)]
Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
This PR is collected from
https://lore.kernel.org/all/cover.1695296682.git.leon@kernel.org
This series from Patrisious extends mlx5 to support IPsec packet offload
in multiport devices (MPV, see [1] for more details).
These devices have single flow steering logic and two netdev interfaces,
which require extra logic to manage IPsec configurations as they performed
on netdevs.
* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Handle IPsec steering upon master unbind/bind
net/mlx5: Configure IPsec steering for ingress RoCEv2 MPV traffic
net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic
net/mlx5: Add create alias flow table function to ipsec roce
net/mlx5: Implement alias object allow and create functions
net/mlx5: Add alias flow table bits
net/mlx5: Store devcom pointer inside IPsec RoCE
net/mlx5: Register mlx5e priv to devcom in MPV mode
RDMA/mlx5: Send events from IB driver about device affiliation state
net/mlx5: Introduce ifc bits for migration in a chunk mode
David S. Miller [Fri, 13 Oct 2023 10:26:11 +0000 (11:26 +0100)]
Merge branch 'tls-cleanups'
Sabrina Dubroca says:
====================
net: tls: various code cleanups and improvements
This series contains multiple cleanups and simplifications for the
config code of both TLS_SW and TLS_HW.
It also modifies the chcr_ktls driver to use driver_state like all
other drivers, so that we can then make driver_state fixed size
instead of a flex array always allocated to that same fixed size. As
reported by Gustavo A. R. Silva, the way chcr_ktls misuses
driver_state irritates GCC [1].
Patches 1 and 2 are follow-ups to my previous cipher_desc series.
Sabrina Dubroca [Mon, 9 Oct 2023 20:50:54 +0000 (22:50 +0200)]
tls: use fixed size for tls_offload_context_{tx,rx}.driver_state
driver_state is a flex array, but is always allocated by the tls core
to a fixed size (TLS_DRIVER_STATE_SIZE_{TX,RX}). Simplify the code by
making that size explicit so that sizeof(struct
tls_offload_context_{tx,rx}) works.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Mon, 9 Oct 2023 20:50:53 +0000 (22:50 +0200)]
chcr_ktls: use tls_offload_context_tx and driver_state like other drivers
chcr_ktls uses the space reserved in driver_state by
tls_set_device_offload, but makes up into own wrapper around
tls_offload_context_tx instead of accessing driver_state via the
__tls_driver_ctx helper.
In this driver, driver_state is only used to store a pointer to a
larger context struct allocated by the driver.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Mon, 9 Oct 2023 20:50:51 +0000 (22:50 +0200)]
tls: remove tls_context argument from tls_set_device_offload
It's not really needed since we end up refetching it as tls_ctx. We
can also remove the NULL check, since we have already dereferenced ctx
in do_tls_setsockopt_conf.
While at it, fix up the reverse xmas tree ordering.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Mon, 9 Oct 2023 20:50:50 +0000 (22:50 +0200)]
tls: remove tls_context argument from tls_set_sw_offload
It's not really needed since we end up refetching it as tls_ctx. We
can also remove the NULL check, since we have already dereferenced ctx
in do_tls_setsockopt_conf.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Mon, 9 Oct 2023 20:50:42 +0000 (22:50 +0200)]
tls: drop unnecessary cipher_type checks in tls offload
We should never reach tls_device_reencrypt, tls_enc_record, or
tls_enc_skb with a cipher_type that can't be offloaded. Replace those
checks with a DEBUG_NET_WARN_ON_ONCE, and use cipher_desc instead of
hard-coding offloadable cipher types.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 10 Oct 2023 14:44:00 +0000 (16:44 +0200)]
selftests: netdevsim: use suitable existing dummy file for flash test
The file name used in flash test was "dummy" because at the time test
was written, drivers were responsible for file request and as netdevsim
didn't do that, name was unused. However, the file load request is
now done in devlink code and therefore the file has to exist.
Use first random file from /lib/firmware for this purpose.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Luca Fancellu [Tue, 10 Oct 2023 14:26:30 +0000 (15:26 +0100)]
xen-netback: add software timestamp capabilities
Add software timestamp capabilities to the xen-netback driver
by advertising it on the struct ethtool_ops and calling
skb_tx_timestamp before passing the buffer to the queue.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Justin Stitt [Mon, 9 Oct 2023 23:19:57 +0000 (23:19 +0000)]
ibmvnic: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
NUL-padding is not required as the buffer is already memset to 0:
| memset(adapter->fw_version, 0, 32);
Note that another usage of strscpy exists on the same buffer:
| strscpy((char *)adapter->fw_version, "N/A", sizeof(adapter->fw_version));
Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.
Justin Stitt [Mon, 9 Oct 2023 23:05:41 +0000 (23:05 +0000)]
net: fec: replace deprecated strncpy with ethtool_sprintf
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
ethtool_sprintf() is designed specifically for get_strings() usage.
Let's replace strncpy in favor of this more robust and easier to
understand interface.
Also, while we're here, let's change memcpy() over to ethtool_sprintf()
for consistency.
Rob Herring [Mon, 9 Oct 2023 17:29:04 +0000 (12:29 -0500)]
net: mdio: xgene: Use device_get_match_data()
Use preferred device_get_match_data() instead of of_match_device() and
acpi_match_device() to get the driver match data. With this, adjust the
includes to explicitly include the correct headers.
Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: wwan: t7xx: Add __counted_by for struct t7xx_fsm_event and use struct_size()
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
While there, use struct_size() helper, instead of the open-coded
version, to calculate the size for the allocation of the whole
flexible structure, including of course, the flexible-array member.
This code was found with the help of Coccinelle, and audited and
fixed manually.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Mon, 9 Oct 2023 17:29:00 +0000 (12:29 -0500)]
net: ethernet: wiznet: Use spi_get_device_match_data()
Use preferred spi_get_device_match_data() instead of of_match_device() and
spi_get_device_id() to get the driver match data. With this, adjust the
includes to explicitly include the correct headers.
Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Mon, 9 Oct 2023 17:28:58 +0000 (12:28 -0500)]
net: ethernet: Use device_get_match_data()
Use preferred device_get_match_data() instead of of_match_device() to
get the driver match data. With this, adjust the includes to explicitly
include the correct headers.
Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Wolsieffer [Mon, 9 Oct 2023 14:59:04 +0000 (10:59 -0400)]
net: stmmac: dwmac-stm32: refactor clock config
Currently, clock configuration is spread throughout the driver and
partially duplicated for the STM32MP1 and STM32 MCU variants. This makes
it difficult to keep track of which clocks need to be enabled or disabled
in various scenarios.
This patch adds symmetric stm32_dwmac_clk_enable/disable() functions
that handle all clock configuration, including quirks required while
suspending or resuming. syscfg_clk and clk_eth_ck are not present on
STM32 MCUs, but it is fine to try to configure them anyway since NULL
clocks are ignored.
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 13 Oct 2023 09:00:32 +0000 (10:00 +0100)]
Merge branch 'vxlan-fdb-flushing'
Amit Cohen says:
====================
Extend VXLAN driver to support FDB flushing
The merge commit 92716869375b ("Merge branch 'br-flush-filtering'") added
support for FDB flushing in bridge driver. Extend VXLAN driver to support
FDB flushing also. Add support for filtering by fields which are relevant
for VXLAN FDBs:
* Source VNI
* Nexthop ID
* 'router' flag
* Destination VNI
* Destination Port
* Destination IP
Without this set, flush for VXLAN device fails:
$ bridge fdb flush dev vx10
RTNETLINK answers: Operation not supported
With this set, such flush works with the relevant arguments, for example:
$ bridge fdb flush dev vx10 vni 5000 dst 193.2.2.1
< flush all vx10 entries with VNI 5000 and destination IP 193.2.2.1>
Some preparations are required, handle them before adding flushing support
in VXLAN driver. See more details in commit messages.
Patch set overview:
Patch #1 prepares flush policy to be used by VXLAN driver
Patches #2-#3 are preparations in VXLAN driver
Patch #4 adds an initial support for flushing in VXLAN driver
Patches #5-#9 add support for filtering by several attributes
Patch #10 adds a test for FDB flush with VXLAN
Patch #11 extends the test to check FDB flush with bridge
====================
Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:18 +0000 (13:06 +0300)]
selftests: fdb_flush: Add test cases for FDB flush with bridge device
Extend the test to check flushing with bridge device, test flush by device
and by VID.
Add test case for flushing with "self" and "master" and attributes that are
supported only in one driver, this is unrecommended configuration, check it
to verify that user gets an error.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:17 +0000 (13:06 +0300)]
selftests: Add test cases for FDB flush with VXLAN device
Test all the supported arguments for FDB flush. The test checks
configuration, not traffic. Note that the flag 'offloaded' is not checked
as it is not relevant when there is no hardware.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:16 +0000 (13:06 +0300)]
vxlan: vxlan_core: Support FDB flushing by destination IP
Add support for flush VXLAN FDB entries by destination IP. FDB entry is
stored as {MAC, SRC_VNI} + remote. The destination IP is an attribute of
the remote. For multicast entries, the VXLAN driver stores a linked list
of remotes for a given key.
In user space, each remote is represented as a separate entry, so when
flush is sent with filter of 'destination IP', flush only the match
remotes. In case that there are no additional remotes, destroy the entry.
For example, the following are stored as one entry with several remotes:
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.3 self permanent
00:00:00:00:00:00 dst 192.1.1.1 self permanent
00:00:00:00:00:00 dst 192.1.1.2 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 1000 self permanent
When user flush by destination IP x, only the relevant remotes will be
flushed:
$ bridge fdb flush dev vx10 dst 192.1.1.1
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.3 self permanent
00:00:00:00:00:00 dst 192.1.1.2 self permanent
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:15 +0000 (13:06 +0300)]
vxlan: vxlan_core: Support FDB flushing by destination port
Add support for flush VXLAN FDB entries by destination port. FDB entry
is stored as {MAC, SRC_VNI} + remote. The destination port is an attribute
of the remote. For multicast entries, the VXLAN driver stores a linked list
of remotes for a given key.
In user space, each remote is represented as a separate entry, so when
flush is sent with filter of 'destination port', flush only the match
remotes. In case that there are no additional remotes, destroy the entry.
For example, the following are stored as one entry with several remotes:
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.1 port 1111 vni 2000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 port 1111 vni 3000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 port 2222 vni 2000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 3000 self permanent
When user flush by port x, only the relevant remotes will be flushed:
$ bridge fdb flush dev vx10 port 1111
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.1 port 2222 vni 2000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 3000 self permanent
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:14 +0000 (13:06 +0300)]
vxlan: vxlan_core: Support FDB flushing by destination VNI
Add support for flush VXLAN FDB entries by destination VNI. FDB entry is
stored as {MAC, SRC_VNI} + remote. The destination VNI is an attribute
of the remote. For multicast entries, the VXLAN driver stores a linked list
of remotes for a given key.
In user space, each remote is represented as a separate entry, so when
flush is sent with filter of 'destination VNI', flush only the match
remotes. In case that there are no additional remotes, destroy the entry.
For example, the following are stored as one entry with several remotes:
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.1 vni 3000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 4000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 2000 self permanent
00:00:00:00:00:00 dst 192.1.1.2 vni 2000 self permanent
When user flush by VNI x, only the relevant remotes will be flushed:
$ bridge fdb flush dev vx10 vni 2000
$ bridge fdb show dev vx10
00:00:00:00:00:00 dst 192.1.1.1 vni 3000 self permanent
00:00:00:00:00:00 dst 192.1.1.1 vni 4000 self permanent
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:13 +0000 (13:06 +0300)]
vxlan: vxlan_core: Support FDB flushing by nexthop ID
Add support for flush VXLAN FDB entries by nexthop ID.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:12 +0000 (13:06 +0300)]
vxlan: vxlan_core: Support FDB flushing by source VNI
Add support for flush VXLAN FDB entries by source VNI.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:11 +0000 (13:06 +0300)]
vxlan: vxlan_core: Add support for FDB flush
The merge commit 92716869375b ("Merge branch 'br-flush-filtering'")
added support for FDB flushing in bridge driver only, the VXLAN driver does
not support such flushing. Extend VXLAN driver to support FDB flushing.
In this commit, add support for flushing with state and flags, which are
the fields that supported in the bridge driver.
Note that bridge driver supports 'NTF_USE' flag, but there is no point to
support this flag for flushing as it is ignored when flags are stored.
'NTF_STICKY' is not relevant for VXLAN driver.
'NTF_ROUTER' is not supported in bridge driver for flush as it is not
relevant for bridge, add it for VXLAN.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:10 +0000 (13:06 +0300)]
vxlan: vxlan_core: Do not skip default entry in vxlan_flush() by default
Currently, the function vxlan_flush() does not flush the default FDB entry
(an entry with all_zeros_mac and default VNI), as it is deleted at
vxlan_uninit(). When this function will be used for flushing FDB entries
from user space, it will have to flush also the default entry in case that
other parameters match (e.g., VNI, flags).
Extend 'struct vxlan_fdb_flush_desc' to include an indication whether
the default entry should be flushed or not. The default value (false)
indicates to flush it, adjust all the existing callers to set
'.ignore_default_entry' to true, so the current behavior will not be
changed.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:09 +0000 (13:06 +0300)]
vxlan: vxlan_core: Make vxlan_flush() more generic for future use
The function vxlan_flush() gets a boolean called 'do_all' and in case
that it is false, it does not flush entries with state 'NUD_PERMANENT'
or 'NUD_NOARP'. The following patches will add support for FDB flush
with parameters from user space. Make the function more generic, so it
can be used later.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 9 Oct 2023 10:06:08 +0000 (13:06 +0300)]
net: Handle bulk delete policy in bridge driver
The merge commit 92716869375b ("Merge branch 'br-flush-filtering'")
added support for FDB flushing in bridge driver. The following patches
will extend VXLAN driver to support FDB flushing as well. The netlink
message for bulk delete is shared between the drivers. With the existing
implementation, there is no way to prevent user from flushing with
attributes that are not supported per driver. For example, when VNI will
be added, user will not get an error for flush FDB entries in bridge
with VNI, although this attribute is not relevant for bridge.
As preparation for support of FDB flush in VXLAN driver, move the policy
to be handled in bridge driver, later a new policy for VXLAN will be
added in VXLAN driver. Do not pass 'vid' as part of ndo_fdb_del_bulk(),
as this field is relevant only for bridge.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
kernel/bpf/verifier.c 829955981c55 ("bpf: Fix verifier log for async callback return values") a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw")
Linus Torvalds [Thu, 12 Oct 2023 20:07:00 +0000 (13:07 -0700)]
Merge tag 'net-6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN and BPF.
We have a regression in TC currently under investigation, otherwise
the things that stand off most are probably the TCP and AF_PACKET
fixes, with both issues coming from 6.5.
Previous releases - regressions:
- af_packet: fix fortified memcpy() without flex array.
- tcp: fix crashes trying to free half-baked MTU probes
- xdp: fix zero-size allocation warning in xskq_create()
- can: sja1000: always restart the tx queue after an overrun
- eth: mlx5e: again mutually exclude RX-FCS and RX-port-timestamp
- eth: nfp: avoid rmmod nfp crash issues
- eth: octeontx2-pf: fix page pool frag allocation warning
Previous releases - always broken:
- mctp: perform route lookups under a RCU read-side lock
- bpf: s390: fix clobbering the caller's backchain in the trampoline
- phy: lynx-28g: cancel the CDR check work item on the remove path
- dsa: qca8k: fix qca8k driver for Turris 1.x
- eth: ravb: fix use-after-free issue in ravb_tx_timeout_work()
- eth: ixgbe: fix crash with empty VF macvlan list"
* tag 'net-6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
rswitch: Fix imbalance phy_power_off() calling
rswitch: Fix renesas_eth_sw_remove() implementation
octeontx2-pf: Fix page pool frag allocation warning
nfc: nci: assert requested protocol is valid
af_packet: Fix fortified memcpy() without flex array.
net: tcp: fix crashes trying to free half-baked MTU probes
net/smc: Fix pos miscalculation in statistics
nfp: flower: avoid rmmod nfp crash issues
net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read
ethtool: Fix mod state of verbose no_mask bitset
net: nfc: fix races in nfc_llcp_sock_get() and nfc_llcp_sock_get_sn()
mctp: perform route lookups under a RCU read-side lock
net: skbuff: fix kernel-doc typos
s390/bpf: Fix unwinding past the trampoline
s390/bpf: Fix clobbering the caller's backchain in the trampoline
net/mlx5e: Again mutually exclude RX-FCS and RX-port-timestamp
net/smc: Fix dependency of SMC on ISM
ixgbe: fix crash with empty VF macvlan list
net/mlx5e: macsec: use update_pn flag instead of PN comparation
net: phy: mscc: macsec: reject PN update requests
...
Linus Torvalds [Thu, 12 Oct 2023 18:52:23 +0000 (11:52 -0700)]
Merge tag 'soc-fixes-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"AngeloGioacchino Del Regno is stepping in as co-maintainer for the
MediaTek SoC platform and starts by sending some dts fixes for the
mt8195 platform that had been pending for a while.
On the ixp4xx platform, Krzysztof Halasa steps down as co-maintainer,
reflecting that Linus Walleij has been handling this on his own for
the past few years.
Generic RISC-V kernels are now marked as incompatible with the RZ/Five
platform that requires custom hacks both for managing its DMA bounce
buffers and for addressing low virtual memory.
Finally, there is one bugfix for the AMDTEE firmware driver to prevent
a use-after-free bug"
* tag 'soc-fixes-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
IXP4xx MAINTAINERS entries
arm64: dts: mediatek: mt8195: Set DSU PMU status to fail
arm64: dts: mediatek: fix t-phy unit name
arm64: dts: mediatek: mt8195-demo: update and reorder reserved memory regions
arm64: dts: mediatek: mt8195-demo: fix the memory size to 8GB
MAINTAINERS: Add Angelo as MediaTek SoC co-maintainer
soc: renesas: Make ARCH_R9A07G043 (riscv version) depend on NONPORTABLE
tee: amdtee: fix use-after-free vulnerability in amdtee_close_session
Linus Torvalds [Thu, 12 Oct 2023 17:48:19 +0000 (10:48 -0700)]
Merge tag 'pinctrl-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Some pin control fixes for v6.6 which have been stacking up in my
tree.
Dmitry's fix to some locking in the core is the most substantial, that
was a really neat fix.
The rest is the usual assorted spray of minor driver fixes.
- Drop some minor code causing warnings in the Lantiq driver
- Fix out of bounds write in the Nuvoton driver
- Fix lost IRQs with CONFIG_PM in the Starfive driver
- Fix a locking issue in find_pinctrl()
- Revert a regressive Tegra debug patch
- Fix the Renesas RZN1 pin muxing"
* tag 'pinctrl-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: renesas: rzn1: Enable missing PINMUX
Revert "pinctrl: tegra: Add support to display pin function"
pinctrl: avoid unsafe code pattern in find_pinctrl()
pinctrl: starfive: jh7110: Add system pm ops to save and restore context
pinctrl: starfive: jh7110: Fix failure to set irq after CONFIG_PM is enabled
pinctrl: nuvoton: wpcm450: fix out of bounds write
pinctrl: lantiq: Remove unsued declaration ltq_pinctrl_unregister()
Florian Westphal [Thu, 12 Oct 2023 12:08:56 +0000 (14:08 +0200)]
net: gso_test: fix build with gcc-12 and earlier
gcc 12 errors out with:
net/core/gso_test.c:58:48: error: initializer element is not constant
58 | .segs = (const unsigned int[]) { gso_size },
This version isn't old (2022), so switch to preprocessor-bsaed constant
instead of 'static const int'.
Cc: Willem de Bruijn <willemb@google.com> Reported-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com> Closes: https://lore.kernel.org/netdev/79fbe35c-4dd1-4f27-acb2-7a60794bc348@linux.vnet.ibm.com/ Fixes: 1b4fa28a8b07 ("net: parametrize skb_segment unit test to expand coverage") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231012120901.10765-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fix functions calling order and a condition in renesas_eth_sw_remove().
Otherwise, kernel NULL pointer dereference happens from phy_stop() if
a net device opens.
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jeremy Cline [Mon, 9 Oct 2023 20:00:54 +0000 (16:00 -0400)]
nfc: nci: assert requested protocol is valid
The protocol is used in a bit mask to determine if the protocol is
supported. Assert the provided protocol is less than the maximum
defined so it doesn't potentially perform a shift-out-of-bounds and
provide a clearer error for undefined protocols vs unsupported ones.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reported-and-tested-by: syzbot+0839b78e119aae1fec78@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0839b78e119aae1fec78 Signed-off-by: Jeremy Cline <jeremy@jcline.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231009200054.82557-1-jeremy@jcline.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
af_packet: Fix fortified memcpy() without flex array.
Sergei Trofimovich reported a regression [0] caused by commit a0ade8404c3b
("af_packet: Fix warning of fortified memcpy() in packet_getname().").
It introduced a flex array sll_addr_flex in struct sockaddr_ll as a
union-ed member with sll_addr to work around the fortified memcpy() check.
However, a userspace program uses a struct that has struct sockaddr_ll in
the middle, where a flex array is illegal to exist.
include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t'
24 | __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex);
| ^~~~~~~~~~~~~~~~~~~~
To fix the regression, let's go back to the first attempt [1] telling
memcpy() the actual size of the array.
Ralph Siemsen [Wed, 4 Oct 2023 20:00:08 +0000 (16:00 -0400)]
pinctrl: renesas: rzn1: Enable missing PINMUX
Enable pin muxing (eg. programmable function), so that the RZ/N1 GPIO
pins will be configured as specified by the pinmux in the DTS.
This used to be enabled implicitly via CONFIG_GENERIC_PINMUX_FUNCTIONS,
however that was removed, since the RZ/N1 driver does not call any of
the generic pinmux functions.
Fixes: 1308fb4e4eae14e6 ("pinctrl: rzn1: Do not select GENERIC_PIN{CTRL_GROUPS,MUX_FUNCTIONS}") Signed-off-by: Ralph Siemsen <ralph.siemsen@linaro.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20231004200008.1306798-1-ralph.siemsen@linaro.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Jakub Kicinski [Thu, 12 Oct 2023 00:37:57 +0000 (17:37 -0700)]
Merge tag 'nf-next-23-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter updates for next
First 5 patches, from Phil Sutter, clean up nftables dumpers to
use the context buffer in the netlink_callback structure rather
than a kmalloc'd buffer.
Patch 6, from myself, zaps dead code and replaces the helper function
with a small inlined helper.
Patch 7, also from myself, removes another pr_debug and replaces it
with the existing nf_log-based debug helpers.
Last patch, from George Guo, gets nft_table comments back in
sync with the structure members.
* tag 'nf-next-23-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: cleanup struct nft_table
netfilter: conntrack: prefer tcp_error_log to pr_debug
netfilter: conntrack: simplify nf_conntrack_alter_reply
netfilter: nf_tables: Don't allocate nft_rule_dump_ctx
netfilter: nf_tables: Carry s_idx in nft_rule_dump_ctx
netfilter: nf_tables: Carry reset flag in nft_rule_dump_ctx
netfilter: nf_tables: Drop pointless memset when dumping rules
netfilter: nf_tables: Always allocate nft_rule_dump_ctx
====================
Rework network interface logic. Before this change, the code flow was:
1. Disable interrupt
2. Try to schedule a NAPI
3. Check if it was possible (NAPI is not already scheduled)
4. emit BUG() if we receive interrupt while a NAPI is scheduled
If some application busy poll or set gro_flush_timeout low enough, it's
possible to reach the BUG() condition. Given that the condition may
happen and it wouldn't be a bug, rework the logic to permit such case
and prevent stall with interrupt never enabled again.
Disable the interrupt only if the NAPI can be scheduled (aka it's not
already scheduled) and drop the printk and BUG() call. With these
change, in the event of a NAPI already scheduled, the interrupt is
simply ignored with nothing done.
netdev: replace napi_reschedule with napi_schedule
Now that napi_schedule return a bool, we can drop napi_reschedule that
does the same exact function. The function comes from a very old commit bfe13f54f502 ("ibm_emac: Convert to use napi_struct independent of struct
net_device") and the purpose is actually deprecated in favour of
different logic.
Convert every user of napi_reschedule to napi_schedule.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> # ath10k Acked-by: Nick Child <nnac123@linux.ibm.com> # ibm Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for can/dev/rx-offload.c Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20231009133754.9834-3-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
netdev: make napi_schedule return bool on NAPI successful schedule
Change napi_schedule to return a bool on NAPI successful schedule.
This might be useful for some driver to do additional steps after a
NAPI has been scheduled.
Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231009133754.9834-2-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Stitt [Mon, 9 Oct 2023 17:45:33 +0000 (17:45 +0000)]
bna: replace deprecated strncpy with strscpy_pad
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
bfa_ioc_get_adapter_manufacturer() simply copies a string literal into
`manufacturer`.