git.apps.os.sepia.ceph.com Git - ceph-client.git/log
ceph-client.git
13 months ago ice: reword comments referring to control queues
Jacob Keller [Tue, 6 Aug 2024 20:46:25 +0000 (13:46 -0700)]
ice: reword comments referring to control queues

Many comments in ice_controlq.c use the term "Admin queue" despite the code
being intended for arbitrary control queues, not just the Admin queue.
Reword the comments to make it clear that this code is the generic control
queue logic that is shared by all of the control queues, and is not
specific to the Admin queue.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
13 months ago ice: stop intermixing AQ commands/responses debug dumps
Przemek Kitszel [Tue, 6 Aug 2024 20:46:24 +0000 (13:46 -0700)]
ice: stop intermixing AQ commands/responses debug dumps

The ice_debug_cq() function is called to generate a debug log of control
queue messages both sent and received. It currently does this over a
potential total of 6 different printk invocations.

The main logic prints over 4 calls to ice_debug():

 1. The metadata including opcode, flags, datalength and return value.
 2. The cookie in the descriptor.
 3. The parameter values.
 4. The address for the databuffer.

In addition, if the descriptor has a data buffer, it can be logged with two
additional prints:

 5. A message indicating the start of the data buffer.
 6. The actual data buffer, printed using print_hex_dump_debug.

This can lead to trouble in the event that two different PFs are logging
messages. The messages become intermixed and it may not be possible to
determine which part of the output belongs to which control queue message.

To fix this, it needs to be possible to unambiguously determine which
messages belong together. This is trivial for the messages that comprise
the main printing. Combine them together into a single invocation of
ice_debug().

The message containing a hex-dump of the data buffer is a bit more
complicated. This is printed separately as part of print_hex_dump_debug.
This function takes a prefix, which is currently always set to
KBUILD_MODNAME. Extend this prefix to include the buffer address for the
databuffer, which is printed as part of the main print, and which is
guaranteed to be unique for each buffer.

Refactor the ice_debug_array(), introducing an ice_debug_array_w_prefix().
Build the prefix by combining KBUILD_MODNAME with the databuffer address
using snprintf().

These changes make it possible to unambiguously determine what data belongs
to what control queue message.
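
The prefix construction described above can be sketched in user space as follows. This is only an illustration: MODNAME stands in for KBUILD_MODNAME, build_dump_prefix() is a hypothetical helper name, and the exact format string used by the driver is an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's KBUILD_MODNAME; "ice" is assumed here. */
#define MODNAME "ice"

/*
 * Build a hex-dump prefix that combines the module name with the data
 * buffer address, so every dump line can be tied back to the message
 * that printed the same address.
 * Returns the prefix length, or -1 if the prefix did not fit.
 */
static int build_dump_prefix(char *out, size_t len, const void *buf)
{
	int n = snprintf(out, len, "%s 0x%016llx ", MODNAME,
			 (unsigned long long)(uintptr_t)buf);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

Because the address is part of every hex-dump line, lines from two PFs that interleave in the log can still be grouped by prefix.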

Reported-by: Jacek Wierzbicki <jacek.wierzbicki@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
13 months ago ice: do not clutter debug logs with unused data
Bruce Allan [Tue, 6 Aug 2024 20:46:23 +0000 (13:46 -0700)]
ice: do not clutter debug logs with unused data

Currently, debug logs are unnecessarily cluttered with the contents of
command data buffers even if the receiver of that command (i.e. FW or MBX)
is not told to read the buffer.  Change to only log command data buffers
when the RD flag (indicates receiver needs to read the buffer) is set.
Continue to log response data buffers when the returned datalen is non-zero.
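
The logging rule above can be sketched as a small predicate. The flag value DESC_FLAG_RD and the function name should_log_data_buffer() are assumptions made for illustration, not the driver's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit for the "receiver must read the buffer" flag; illustrative. */
#define DESC_FLAG_RD 0x0400u

/*
 * Decide whether the data buffer attached to a descriptor is worth logging:
 *  - responses: only when the returned datalen is non-zero
 *  - commands:  only when the RD flag tells the receiver to read the buffer
 */
static bool should_log_data_buffer(bool is_response, uint16_t flags,
				   uint16_t datalen)
{
	if (is_response)
		return datalen != 0;
	return (flags & DESC_FLAG_RD) != 0;
}
```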

Also, rename a local variable to reflect what is in the hardware
specification and how it is used elsewhere in the code, use local variables
instead of duplicating endian conversions unnecessarily and remove an
unnecessary assignment.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
13 months ago ice: improve debug print for control queue messages
Jacob Keller [Tue, 6 Aug 2024 20:46:22 +0000 (13:46 -0700)]
ice: improve debug print for control queue messages

The ice_debug_cq function is called to print debug data for a control queue
descriptor in multiple places: before we send a message on a transmit
queue, after the writeback completion of a message on the transmit queue,
and when we receive a message on a receive queue.

This function does not include data about *which* control queue the message
is on, nor whether it was what we sent to the queue or what we received
from the queue.

Modify ice_debug_cq to take two extra parameters, a pointer to the control
queue and a boolean indicating if this was a response or a command. Improve
the debug messages by replacing "CQ CMD" with a string indicating which
specific control queue (based on cq->qtype) and whether this was a command
sent by the PF or a response from the queue.

This helps make the log output easier to understand and consume when
debugging.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
13 months ago ice: implement and use rd32_poll_timeout for ice_sq_done timeout
Jacob Keller [Tue, 6 Aug 2024 20:46:21 +0000 (13:46 -0700)]
ice: implement and use rd32_poll_timeout for ice_sq_done timeout

The ice_sq_done function is used to check the control queue head register
and determine whether or not the control queue processing is done. This
function is called in a loop checking against jiffies for a specified
timeout.

The pattern of reading a register in a loop until a condition is true or a
timeout is reached is a relatively common pattern. In fact, the kernel
provides a read_poll_timeout function implementing this behavior in
<linux/iopoll.h>

Use of read_poll_timeout is preferred over directly coding these loops.
However, using it in the ice driver is a bit more difficult because of the
rd32 wrapper. Implement a rd32_poll_timeout wrapper based on
read_poll_timeout.

Refactor ice_sq_done to use rd32_poll_timeout, replacing the loop calling
ice_sq_done in ice_sq_send_cmd. This simplifies the logic down to a single
ice_sq_done() call.

The implementation of rd32_poll_timeout uses microseconds for its timeout
value, so update the CQ timeout macros to be specified in microseconds as
well, instead of using HZ for jiffies.
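
The poll-until-condition-or-timeout pattern described above can be sketched in user space as follows. This is not the kernel macro: reg_read_fn, struct fake_reg and poll_reg_timeout() are hypothetical names, and clock_gettime()/nanosleep() stand in for the kernel's time and sleep primitives. The timeout is given in microseconds, matching the unit mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register accessor; stands in for a wrapper such as rd32. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/* Fake register for demonstration: the head advances by one per read. */
struct fake_reg {
	uint32_t head;
	uint32_t limit;
};

static uint32_t fake_reg_read(void *ctx)
{
	struct fake_reg *r = ctx;

	if (r->head < r->limit)
		r->head++;
	return r->head;
}

static uint64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000ULL + (uint64_t)ts.tv_nsec / 1000ULL;
}

/*
 * Read the register until it equals `want` or `timeout_us` elapses,
 * sleeping `sleep_us` between reads. Returns 0 on success, -ETIMEDOUT
 * otherwise. The last value read is stored in *val.
 */
static int poll_reg_timeout(reg_read_fn read_reg, void *ctx, uint32_t want,
			    uint32_t *val, uint64_t sleep_us,
			    uint64_t timeout_us)
{
	uint64_t deadline = now_us() + timeout_us;

	for (;;) {
		*val = read_reg(ctx);
		if (*val == want)
			break;
		if (now_us() > deadline) {
			/* One final read so a late update is not missed. */
			*val = read_reg(ctx);
			break;
		}
		if (sleep_us) {
			struct timespec ts = {
				.tv_sec = (time_t)(sleep_us / 1000000ULL),
				.tv_nsec = (long)(sleep_us % 1000000ULL) * 1000L,
			};
			nanosleep(&ts, NULL);
		}
	}
	return *val == want ? 0 : -ETIMEDOUT;
}
```

With a helper of this shape, the caller makes a single call and checks one return value instead of open-coding a loop against a deadline.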

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
13 months ago net: netlink: Remove the dump_cb_mutex field from struct netlink_sock
Christophe JAILLET [Thu, 22 Aug 2024 07:03:20 +0000 (09:03 +0200)]
net: netlink: Remove the dump_cb_mutex field from struct netlink_sock

Commit 5fbf57a937f4 ("net: netlink: remove the cb_mutex "injection" from
netlink core") has removed the usage of the 'dump_cb_mutex' field from the
struct netlink_sock.

Remove the field itself now. It saves a few bytes in the structure.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: refactor ->ndo_bpf calls into dev_xdp_propagate
Mina Almasry [Thu, 22 Aug 2024 05:51:54 +0000 (05:51 +0000)]
net: refactor ->ndo_bpf calls into dev_xdp_propagate

When net devices propagate xdp configurations to slave devices,
we will need to perform a memory provider check to ensure we're
not binding xdp to a device using unreadable netmem.

Currently, ->ndo_bpf is called in a few places. Adding checks to all
these places would not be ideal.

Refactor all the ->ndo_bpf calls into one place where we can add this
check in the future.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago Merge branch 'net-redundant-judgments'
David S. Miller [Fri, 23 Aug 2024 13:27:46 +0000 (14:27 +0100)]
Merge branch 'net-redundant-judgments'

Li Zetao says:

====================
net: Delete some redundant judgments

This patchset removes some unnecessary checks to make the code more
concise. In some network modules, rtnl_set_sk_err is used to record error
information, but err is checked again for being less than 0 on the error
path, where it is already known to be negative. Delete these redundant
checks.

No functional change intended.
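
The pattern being cleaned up can be sketched as follows. The names notify() and report_err() are hypothetical; report_err() stands in for rtnl_set_sk_err, and the two parameters merely simulate the outcome of the allocation and fill steps.

```c
#include <errno.h>

/* Hypothetical stand-in for rtnl_set_sk_err(): records the last error. */
static int last_reported_err;

static void report_err(int err)
{
	last_reported_err = err;
}

/*
 * Sketch of a notification helper. `alloc_ok` and `fill_err` simulate the
 * outcome of the allocation and fill steps.
 */
static int notify(int alloc_ok, int fill_err)
{
	int err = -ENOBUFS;

	if (!alloc_ok)
		goto errout;

	err = fill_err;
	if (err < 0)
		goto errout;

	return 0;

errout:
	/*
	 * Every path to this label carries err < 0, so the former
	 * "if (err < 0)" guard around the call was always true.
	 */
	report_err(err);
	return err;
}
```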
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: mpls: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:52 +0000 (12:32 +0800)]
net: mpls: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net/ipv6: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:51 +0000 (12:32 +0800)]
net/ipv6: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago ip6mr: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:50 +0000 (12:32 +0800)]
ip6mr: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: nexthop: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:49 +0000 (12:32 +0800)]
net: nexthop: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago ipmr: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:48 +0000 (12:32 +0800)]
ipmr: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago ipv4: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:47 +0000 (12:32 +0800)]
ipv4: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago rtnetlink: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:46 +0000 (12:32 +0800)]
rtnetlink: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago neighbour: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:45 +0000 (12:32 +0800)]
neighbour: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago fib: rules: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:44 +0000 (12:32 +0800)]
fib: rules: delete redundant judgment statements

The initial value of err is -ENOMEM, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: vxlan: delete redundant judgment statements
Li Zetao [Thu, 22 Aug 2024 04:32:43 +0000 (12:32 +0800)]
net: vxlan: delete redundant judgment statements

The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to check again that err is less than 0.
Delete the redundant checks to make the code more concise.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago Merge branch 'phy-listing-and-topology-tracking'
David S. Miller [Fri, 23 Aug 2024 12:04:35 +0000 (13:04 +0100)]
Merge branch 'phy-listing-and-topology-tracking'

Maxime Chevallier says:

====================
Introduce PHY listing and link_topology tracking

This is V18 of the phy_link_topology series, aiming at improving support
for multiple PHYs being attached to the same MAC.

V18 is a simple rebase of the V17 on top of net-next, gathering the
tested-by and reviewed-by tags from Christophe (thanks !).

This iteration is also one patch shorter than V17 (patch 12/14 in V17 is gone),
as one of the patches used to fix an issue that has now been resolved by
Simon Horman in

743ff02152bc ethtool: Don't check for NULL info in prepare_data callbacks

As a reminder, here's what the PHY listings would look like:
 - eth0 has a 88x3310 acting as media converter, and an SFP module with
   an embedded 88e1111 PHY
 - eth2 has a 88e1510 PHY

PHY for eth0:
PHY index: 1
Driver name: mv88x3310
PHY device name: f212a600.mdio-mii:00
Downstream SFP bus name: sfp-eth0
Upstream type: MAC

PHY for eth0:
PHY index: 2
Driver name: Marvell 88E1111
PHY device name: i2c:sfp-eth0:16
Upstream type: PHY
Upstream PHY index: 1
Upstream SFP name: sfp-eth0

PHY for eth2:
PHY index: 1
Driver name: Marvell 88E1510
PHY device name: f212a200.mdio-mii:00
Upstream type: MAC

Ethtool patches : https://github.com/minimaxwell/ethtool/tree/mc/topo-v16
(this branch is compatible with this V18 series)

Link to V17: https://lore.kernel.org/netdev/20240709063039.2909536-1-maxime.chevallier@bootlin.com/
Link to V16: https://lore.kernel.org/netdev/20240705132706.13588-1-maxime.chevallier@bootlin.com/
Link to V15: https://lore.kernel.org/netdev/20240703140806.271938-1-maxime.chevallier@bootlin.com/
Link to V14: https://lore.kernel.org/netdev/20240701131801.1227740-1-maxime.chevallier@bootlin.com/
Link to V13: https://lore.kernel.org/netdev/20240607071836.911403-1-maxime.chevallier@bootlin.com/
Link to v12: https://lore.kernel.org/netdev/20240605124920.720690-1-maxime.chevallier@bootlin.com/
Link to v11: https://lore.kernel.org/netdev/20240404093004.2552221-1-maxime.chevallier@bootlin.com/
Link to V10: https://lore.kernel.org/netdev/20240304151011.1610175-1-maxime.chevallier@bootlin.com/
Link to V9: https://lore.kernel.org/netdev/20240228114728.51861-1-maxime.chevallier@bootlin.com/
Link to V8: https://lore.kernel.org/netdev/20240220184217.3689988-1-maxime.chevallier@bootlin.com/
Link to V7: https://lore.kernel.org/netdev/20240213150431.1796171-1-maxime.chevallier@bootlin.com/
Link to V6: https://lore.kernel.org/netdev/20240126183851.2081418-1-maxime.chevallier@bootlin.com/
Link to V5: https://lore.kernel.org/netdev/20231221180047.1924733-1-maxime.chevallier@bootlin.com/
Link to V4: https://lore.kernel.org/netdev/20231215171237.1152563-1-maxime.chevallier@bootlin.com/
Link to V3: https://lore.kernel.org/netdev/20231201163704.1306431-1-maxime.chevallier@bootlin.com/
Link to V2: https://lore.kernel.org/netdev/20231117162323.626979-1-maxime.chevallier@bootlin.com/
Link to V1: https://lore.kernel.org/netdev/20230907092407.647139-1-maxime.chevallier@bootlin.com/

More discussions on specific issues that happened in 6.9-rc:

https://lore.kernel.org/netdev/20240412104615.3779632-1-maxime.chevallier@bootlin.com/
https://lore.kernel.org/netdev/20240429131008.439231-1-maxime.chevallier@bootlin.com/
https://lore.kernel.org/netdev/20240507102822.2023826-1-maxime.chevallier@bootlin.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago Documentation: networking: document phy_link_topology
Maxime Chevallier [Wed, 21 Aug 2024 15:10:07 +0000 (17:10 +0200)]
Documentation: networking: document phy_link_topology

The newly introduced phy_link_topology tracks all ethernet PHYs that are
attached to a netdevice. Document the base principle, internal and
external APIs. As the phy_link_topology is expected to be extended, this
documentation will hold any further improvements and additions made
relative to topology handling.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: strset: Allow querying phy stats by index
Maxime Chevallier [Wed, 21 Aug 2024 15:10:06 +0000 (17:10 +0200)]
net: ethtool: strset: Allow querying phy stats by index

The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer
from the ethnl request to allow querying PHY stats from each PHY on the
link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: cable-test: Target the command to the requested PHY
Maxime Chevallier [Wed, 21 Aug 2024 15:10:05 +0000 (17:10 +0200)]
net: ethtool: cable-test: Target the command to the requested PHY

Cable testing is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: pse-pd: Target the command to the requested PHY
Maxime Chevallier [Wed, 21 Aug 2024 15:10:04 +0000 (17:10 +0200)]
net: ethtool: pse-pd: Target the command to the requested PHY

PSE and PD configuration is a PHY-specific command. Instead of targeting
the command towards dev->phydev, use the request to pick the targeted
PHY device.

As we don't get the PHY directly from the netdev's attached phydev, also
adjust the error messages.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: plca: Target the command to the requested PHY
Maxime Chevallier [Wed, 21 Aug 2024 15:10:03 +0000 (17:10 +0200)]
net: ethtool: plca: Target the command to the requested PHY

PLCA is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago netlink: specs: add ethnl PHY_GET command set
Maxime Chevallier [Wed, 21 Aug 2024 15:10:02 +0000 (17:10 +0200)]
netlink: specs: add ethnl PHY_GET command set

The PHY_GET command, supporting both DUMP and GET operations, is used to
retrieve the list of PHYs connected to a netdevice, and to get topology
information showing exactly where each PHY sits on the physical link.

Add the netlink specs corresponding to that command.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: Introduce a command to list PHYs on an interface
Maxime Chevallier [Wed, 21 Aug 2024 15:10:01 +0000 (17:10 +0200)]
net: ethtool: Introduce a command to list PHYs on an interface

As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.

Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago netlink: specs: add phy-index as a header parameter
Maxime Chevallier [Wed, 21 Aug 2024 15:10:00 +0000 (17:10 +0200)]
netlink: specs: add phy-index as a header parameter

Update the spec to take the newly introduced phy-index as a generic
request parameter.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: ethtool: Allow passing a phy index for some commands
Maxime Chevallier [Wed, 21 Aug 2024 15:09:59 +0000 (17:09 +0200)]
net: ethtool: Allow passing a phy index for some commands

Some netlink commands are targeted towards ethernet PHYs, to control some
of their features. As there are several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the passed phy_index.

Add a helper that netlink command handlers need to use to grab the
targeted PHY from the req_info. This helper needs to hold rtnl_lock()
while interacting with the PHY, as it may be removed at any point.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: sfp: Add helper to return the SFP bus name
Maxime Chevallier [Wed, 21 Aug 2024 15:09:58 +0000 (17:09 +0200)]
net: sfp: Add helper to return the SFP bus name

Knowing the bus name is helpful when we want to expose the link topology
to userspace. Add a helper to return the SFP bus name.

This call will always be made while holding the RTNL which ensures
that the SFP driver won't unbind from the device. The returned pointer
to the bus name will only be used while RTNL is held.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: phy: add helpers to handle sfp phy connect/disconnect
Maxime Chevallier [Wed, 21 Aug 2024 15:09:57 +0000 (17:09 +0200)]
net: phy: add helpers to handle sfp phy connect/disconnect

There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.

By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: sfp: pass the phy_device when disconnecting an sfp module's PHY
Maxime Chevallier [Wed, 21 Aug 2024 15:09:56 +0000 (17:09 +0200)]
net: sfp: pass the phy_device when disconnecting an sfp module's PHY

Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago net: phy: Introduce ethernet link topology representation
Maxime Chevallier [Wed, 21 Aug 2024 15:09:55 +0000 (17:09 +0200)]
net: phy: Introduce ethernet link topology representation

Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason is that most of the logic for these configurations, coming
from either ethtool netlink or ioctls, tends to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.

Introduce a numbering scheme that allows enumerating the PHY devices that
belong to any netdev, which in turn allows userspace to make more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.
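
The ifindex-like numbering can be sketched as a cyclic allocator: a released index is not handed out again until the counter has wrapped past INT_MAX. This is a user-space illustration only; struct phy_index_table, MAX_PHYS and the phy_index_*() helpers are hypothetical names, and the kernel implementation may use a different data structure.

```c
#include <limits.h>

#define MAX_PHYS 8 /* arbitrary capacity for this sketch */

struct phy_index_table {
	int in_use[MAX_PHYS]; /* 0 marks a free slot */
	int next;             /* next candidate index, always >= 1 */
};

static void phy_index_init(struct phy_index_table *t)
{
	for (int i = 0; i < MAX_PHYS; i++)
		t->in_use[i] = 0;
	t->next = 1;
}

/* Step to the following index, wrapping from INT_MAX back to 1. */
static int phy_index_advance(int idx)
{
	return idx == INT_MAX ? 1 : idx + 1;
}

static int phy_index_is_used(const struct phy_index_table *t, int idx)
{
	for (int i = 0; i < MAX_PHYS; i++)
		if (t->in_use[i] == idx)
			return 1;
	return 0;
}

/* Returns a fresh index, or -1 when the table is full. */
static int phy_index_alloc(struct phy_index_table *t)
{
	int slot = -1;

	for (int i = 0; i < MAX_PHYS; i++) {
		if (t->in_use[i] == 0) {
			slot = i;
			break;
		}
	}
	if (slot < 0)
		return -1;

	/* Skip indices still held by attached PHYs (only possible after a wrap). */
	while (phy_index_is_used(t, t->next))
		t->next = phy_index_advance(t->next);

	t->in_use[slot] = t->next;
	t->next = phy_index_advance(t->next);
	return t->in_use[slot];
}

static void phy_index_release(struct phy_index_table *t, int idx)
{
	for (int i = 0; i < MAX_PHYS; i++)
		if (t->in_use[i] == idx)
			t->in_use[i] = 0;
}
```

In this sketch, releasing index 1 and allocating again yields 3 rather than 1, so a listing taken before an SFP module was swapped cannot silently refer to the new PHY.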

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 23 Aug 2024 00:05:09 +0000 (17:05 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt.h
  c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops")
  f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC")

Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months ago Merge branch 'unmask-upper-dscp-bits-part-1'
Jakub Kicinski [Fri, 23 Aug 2024 00:00:30 +0000 (17:00 -0700)]
Merge branch 'unmask-upper-dscp-bits-part-1'

Ido Schimmel says:

====================
Unmask upper DSCP bits - part 1

tl;dr - This patchset starts to unmask the upper DSCP bits in the IPv4
flow key in preparation for allowing IPv4 FIB rules to match on DSCP.
No functional changes are expected.

The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
lookup to match against the TOS selector in FIB rules and routes.

It is currently impossible for user space to configure FIB rules that
match on the DSCP value as the upper DSCP bits are either masked in the
various call sites that initialize the IPv4 flow key or along the path
to the FIB core.

In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
need to make sure the entire DSCP value is present in the IPv4 flow key.
This patchset starts to unmask the upper DSCP bits in the various places
that invoke the core FIB lookup functions directly (patches #1-#7) and
in the input route path (patches #8-#12). Future patchsets will do the
same in the output route path.

No functional changes are expected as commit 1fa3314c14c6 ("ipv4:
Centralize TOS matching") moved the masking of the upper DSCP bits to
the core where 'flowi4_tos' is matched against the TOS selector.
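
Why no functional change is expected can be sketched as follows. The mask values and function names are assumptions for illustration (modeled on the legacy TOS routing mask and the six-bit DSCP field), not the kernel's definitions. Since the match itself applies the legacy mask, it gives the same answer whether or not the call site masked the upper DSCP bits first.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed mask values for illustration, modeled on the legacy TOS routing
 * mask and the six-bit DSCP field.
 */
#define LEGACY_TOS_MASK 0x1cu
#define DSCP_FIELD_MASK 0xfcu

/* Old call-site behavior: mask the upper DSCP bits before the lookup. */
static uint8_t flow_tos_masked(uint8_t ds_field)
{
	return ds_field & LEGACY_TOS_MASK;
}

/* New call-site behavior: keep the full DSCP value (ECN bits dropped). */
static uint8_t flow_tos_unmasked(uint8_t ds_field)
{
	return ds_field & DSCP_FIELD_MASK;
}

/*
 * Centralized match: masking happens here, whatever the caller passed in.
 * A selector of 0 is treated as a wildcard in this sketch.
 */
static bool tos_selector_matches(uint8_t selector, uint8_t flow_tos)
{
	return selector == 0 || selector == (flow_tos & LEGACY_TOS_MASK);
}
```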
====================

Link: https://patch.msgid.link/20240821125251.1571445-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months ago ipv4: Unmask upper DSCP bits when using hints
Ido Schimmel [Wed, 21 Aug 2024 12:52:51 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits when using hints

Unmask the upper DSCP bits when performing source validation and routing
a packet using the same route from a previously processed packet (hint).
In the future, this will allow us to perform the FIB lookup that is
performed as part of source validation according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-13-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months ago ipv4: udp: Unmask upper DSCP bits during early demux
Ido Schimmel [Wed, 21 Aug 2024 12:52:50 +0000 (15:52 +0300)]
ipv4: udp: Unmask upper DSCP bits during early demux

Unmask the upper DSCP bits when performing source validation for
multicast packets during early demux. In the future, this will allow us
to perform the FIB lookup which is performed as part of source
validation according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-12-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: icmp: Pass full DS field to ip_route_input()
Ido Schimmel [Wed, 21 Aug 2024 12:52:49 +0000 (15:52 +0300)]
ipv4: icmp: Pass full DS field to ip_route_input()

Align the ICMP code to other callers of ip_route_input() and pass the
full DS field. In the future this will allow us to perform a route
lookup according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-11-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: Unmask upper DSCP bits in RTM_GETROUTE input route lookup
Ido Schimmel [Wed, 21 Aug 2024 12:52:48 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in RTM_GETROUTE input route lookup

Unmask the upper DSCP bits when looking up an input route via the
RTM_GETROUTE netlink message so that in the future the lookup could be
performed according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-10-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: Unmask upper DSCP bits in input route lookup
Ido Schimmel [Wed, 21 Aug 2024 12:52:47 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in input route lookup

Unmask the upper DSCP bits in input route lookup so that in the future
the lookup could be performed according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-9-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: Unmask upper DSCP bits in fib_compute_spec_dst()
Ido Schimmel [Wed, 21 Aug 2024 12:52:46 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in fib_compute_spec_dst()

As explained in commit 35ebf65e851c ("ipv4: Create and use
fib_compute_spec_dst() helper."), the function is used - for example -
to determine the source address for an ICMP reply. If we are responding
to a multicast or broadcast packet, the source address is set to the
source address that we would use if we were to send a packet to the
unicast source of the original packet. This address is determined by
performing a FIB lookup and using the preferred source address of the
resulting route.

Unmask the upper DSCP bits of the DS field of the packet that triggered
the reply so that in the future the FIB lookup could be performed
according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-8-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: ipmr: Unmask upper DSCP bits in ipmr_rt_fib_lookup()
Ido Schimmel [Wed, 21 Aug 2024 12:52:45 +0000 (15:52 +0300)]
ipv4: ipmr: Unmask upper DSCP bits in ipmr_rt_fib_lookup()

Unmask the upper DSCP bits when calling ipmr_fib_lookup() so that in the
future it could perform the FIB lookup according to the full DSCP value.

Note that ipmr_fib_lookup() performs a FIB rule lookup (returning the
relevant routing table) and that IPv4 multicast FIB rules do not support
matching on TOS / DSCP. However, it is still worth unmasking the upper
DSCP bits in case support for DSCP matching is ever added.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-7-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonetfilter: nft_fib: Unmask upper DSCP bits
Ido Schimmel [Wed, 21 Aug 2024 12:52:44 +0000 (15:52 +0300)]
netfilter: nft_fib: Unmask upper DSCP bits

In a similar fashion to the iptables rpfilter match, unmask the upper
DSCP bits of the DS field of the currently tested packet so that in the
future the FIB lookup could be performed according to the full DSCP
value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-6-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonetfilter: rpfilter: Unmask upper DSCP bits
Ido Schimmel [Wed, 21 Aug 2024 12:52:43 +0000 (15:52 +0300)]
netfilter: rpfilter: Unmask upper DSCP bits

The rpfilter match performs a reverse path filter test on a packet by
performing a FIB lookup with the source and destination addresses
swapped.

Unmask the upper DSCP bits of the DS field of the tested packet so that
in the future the FIB lookup could be performed according to the full
DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: Unmask upper DSCP bits when constructing the Record Route option
Ido Schimmel [Wed, 21 Aug 2024 12:52:42 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits when constructing the Record Route option

The Record Route IP option records the addresses of the routers that
routed the packet. In the case of forwarded packets, the kernel performs
a route lookup via fib_lookup() and fills in the preferred source
address of the matched route.

Unmask the upper DSCP bits when performing the lookup so that in the
future the lookup could be performed according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv4: Unmask upper DSCP bits in NETLINK_FIB_LOOKUP family
Ido Schimmel [Wed, 21 Aug 2024 12:52:41 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in NETLINK_FIB_LOOKUP family

The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
lookup according to user provided parameters and communicate the result
back to user space.

Unmask the upper DSCP bits of the user-provided DS field before invoking
the IPv4 FIB lookup API so that in the future the lookup could be
performed according to the full DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agobpf: Unmask upper DSCP bits in bpf_fib_lookup() helper
Ido Schimmel [Wed, 21 Aug 2024 12:52:40 +0000 (15:52 +0300)]
bpf: Unmask upper DSCP bits in bpf_fib_lookup() helper

The helper performs a FIB lookup according to the parameters in the
'params' argument, one of which is 'tos'. According to the test in
test_tc_neigh_fib.c, it seems that BPF programs are expected to
initialize the 'tos' field to the full 8 bit DS field from the IPv4
header.

Unmask the upper DSCP bits before invoking the IPv4 FIB lookup APIs so
that in the future the lookup could be performed according to the full
DSCP value.

No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'enhance-network-interface-feature-testing'
Jakub Kicinski [Thu, 22 Aug 2024 23:56:09 +0000 (16:56 -0700)]
Merge branch 'enhance-network-interface-feature-testing'

Abhinav Jain says:

====================
Enhance network interface feature testing

This small series includes fixes for creation of veth pairs for
networkless kernels and adds tests for turning the different network
interface features on and off in the selftests/net/netdevice.sh script.
Tested using vng; compiles for both networked and networkless kernels.

   # selftests: net: netdevice.sh
   # No valid network device found, creating veth pair
   # PASS: veth0: set interface up
   # PASS: veth0: set MAC address
   # XFAIL: veth0: set IP address unsupported for veth*
   # PASS: veth0: ethtool list features
   # PASS: veth0: Turned off feature: rx-checksumming
   # PASS: veth0: Turned on feature: rx-checksumming
   # PASS: veth0: Restore feature rx-checksumming to initial state on
   # Actual changes:
   # tx-checksum-ip-generic: off

   ...

   # PASS: veth0: Turned on feature: rx-udp-gro-forwarding
   # PASS: veth0: Restore feature rx-udp-gro-forwarding to initial state off
   # Cannot get register dump: Operation not supported
   # XFAIL: veth0: ethtool dump not supported
   # PASS: veth0: ethtool stats
   # PASS: veth0: stop interface
====================

Link: https://patch.msgid.link/20240821171903.118324-1-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoselftests: net: Use XFAIL for operations not supported by the driver
Abhinav Jain [Wed, 21 Aug 2024 17:19:03 +0000 (22:49 +0530)]
selftests: net: Use XFAIL for operations not supported by the driver

Check whether a veth pair was created and, if so, XFAIL the IP address
test and log an informational message.
Use XFAIL instead of SKIP for unsupported ethtool APIs.

Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-4-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoselftests: net: Add on/off checks for non-fixed features of interface
Abhinav Jain [Wed, 21 Aug 2024 17:19:02 +0000 (22:49 +0530)]
selftests: net: Add on/off checks for non-fixed features of interface

Implement on/off testing for all non-fixed features via a while loop.

Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-3-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoselftests: net: Create veth pair for testing in networkless kernel
Abhinav Jain [Wed, 21 Aug 2024 17:19:01 +0000 (22:49 +0530)]
selftests: net: Create veth pair for testing in networkless kernel

Check if the netdev list is empty and, if so, create a veth pair to be
used for feature on/off testing.
Remove the veth pair after testing is complete.

Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-2-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: atlantic: Avoid warning about potential string truncation
Simon Horman [Wed, 21 Aug 2024 15:58:57 +0000 (16:58 +0100)]
net: atlantic: Avoid warning about potential string truncation

W=1 builds with GCC 14.2.0 warn that:

.../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
  278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
      |                                                           ^~
.../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
  278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
      |                                                        ^~~~~~~
.../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
  278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
the range is 0 - 255. Further, on inspecting the code, it seems
that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
the range is actually 0 - 8.

So, it seems that the condition that GCC flags will not occur.
But, nonetheless, it would be nice if it didn't emit the warning.

It seems that this can be achieved by changing the format specifier
from %d to %u, in which case I believe GCC recognises an upper bound
on the range of tc of 0 - 255. After some experimentation I think
this is due to the combination of the use of %u and the type of
cfg->tcs (u8).

Empirically, updating the type of the tc variable to unsigned int
has the same effect.

As both of these changes seem to make sense in relation to what the code
is actually doing - iterating over unsigned values - do both.
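
A small standalone sketch of the resulting pattern, assuming an 8-byte buffer as in the warning above; the function and macro names are hypothetical, not the driver's:

```c
#include <stdio.h>

#define TC_STRING_LEN 8 /* same size as the buffer in the warning */

/* Format the "TC<n> " prefix for traffic class tc into buf.
 * With an 8-bit unsigned tc, the longest output is "TC255 "
 * (6 characters plus the terminating NUL), which fits in 8 bytes.
 * Returns the number of characters snprintf() wanted to write. */
int format_tc_prefix(char buf[TC_STRING_LEN], unsigned char tc)
{
	return snprintf(buf, TC_STRING_LEN, "TC%u ", (unsigned int)tc);
}
```

Whether a given compiler version still warns depends on how much range information it can see, so this only illustrates the unsigned type plus %u combination described above.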

Compile tested only.

Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 22 Aug 2024 23:47:01 +0000 (07:47 +0800)]
Merge tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth and netfilter.

  Current release - regressions:

   - virtio_net: avoid crash on resume - move netdev_tx_reset_queue()
     call before RX napi enable

  Current release - new code bugs:

   - net/mlx5e: fix page leak and incorrect header release w/ HW GRO

  Previous releases - regressions:

   - udp: fix receiving fraglist GSO packets

   - tcp: prevent refcount underflow due to concurrent execution of
     tcp_sk_exit_batch()

  Previous releases - always broken:

   - ipv6: fix possible UAF when incrementing error counters on output

   - ip6: tunnel: prevent merging of packets with different L2

   - mptcp: pm: fix IDs not being reusable

   - bonding: fix potential crashes in IPsec offload handling

   - Bluetooth: HCI:
      - MGMT: add error handling to pair_device() to avoid a crash
      - invert LE State quirk to be opt-out rather than opt-in
      - fix LE quote calculation

   - drv: dsa: VLAN fixes for Ocelot driver

   - drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings

   - drv: ice: fix Rx data path on architectures with PAGE_SIZE >= 8192

  Misc:

   - netpoll: do not export netpoll_poll_[disable|enable]()

   - MAINTAINERS: update the list of networking headers"

* tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
  s390/iucv: Fix vargs handling in iucv_alloc_device()
  net: ovs: fix ovs_drop_reasons error
  net: xilinx: axienet: Fix dangling multicast addresses
  net: xilinx: axienet: Always disable promiscuous mode
  MAINTAINERS: Mark JME Network Driver as Odd Fixes
  MAINTAINERS: Add header files to NETWORKING sections
  MAINTAINERS: Add limited globs for Networking headers
  MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
  MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
  octeontx2-af: Fix CPT AF register offset calculation
  net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
  net: ngbe: Fix phy mode set to external phy
  netfilter: flowtable: validate vlan header
  bnxt_en: Fix double DMA unmapping for XDP_REDIRECT
  ipv6: prevent possible UAF in ip6_xmit()
  ipv6: fix possible UAF in ip6_finish_output2()
  ipv6: prevent UAF in ip6_send_skb()
  netpoll: do not export netpoll_poll_[disable|enable]()
  selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
  udp: fix receiving fraglist GSO packets
  ...

14 months agoMerge tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 22 Aug 2024 23:43:15 +0000 (07:43 +0800)]
Merge tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Eliminate the fdtoverlay command duplication in scripts/Makefile.lib

 - Fix 'make compile_commands.json' for external modules

 - Ensure scripts/kconfig/merge_config.sh handles missing newlines

 - Fix some build errors on macOS

* tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: fix typos "prequisites" to "prerequisites"
  Documentation/llvm: turn make command for ccache into code block
  kbuild: avoid scripts/kallsyms parsing /dev/null
  treewide: remove unnecessary <linux/version.h> inclusion
  scripts: kconfig: merge_config: config files: add a trailing newline
  Makefile: add $(srctree) to dependency of compile_commands.json target
  kbuild: clean up code duplication in cmd_fdtoverlay

14 months agos390/iucv: Fix vargs handling in iucv_alloc_device()
Alexandra Winter [Wed, 21 Aug 2024 09:13:37 +0000 (11:13 +0200)]
s390/iucv: Fix vargs handling in iucv_alloc_device()

iucv_alloc_device() takes a format string and a variable number of
arguments. It incorrectly forwards them by calling dev_set_name() with
the format string and a va_list, although dev_set_name() itself expects
a variable number of arguments.
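
A userspace analogue of the forwarding rule, as a sketch only: set_name() is a hypothetical wrapper, and the actual kernel fix may use a different mechanism. A variadic wrapper has to hand its va_list to the v*() variant of the formatter rather than to another "..." function:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variadic wrapper. Passing args to snprintf() instead of
 * vsnprintf() would make the callee treat the va_list as one ordinary
 * argument, producing garbage in place of the intended values. */
int set_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = vsnprintf(buf, len, fmt, args); /* va_list-aware variant */
	va_end(args);
	return ret;
}
```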

Symptoms:
Corrupted iucv device names, which can result in log messages like:
sysfs: cannot create duplicate filename '/devices/iucv/hvc_iucv1827699952'

Fixes: 4452e8ef8c36 ("s390/iucv: Provide iucv_alloc_device() / iucv_release_device()")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1228425
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20240821091337.3627068-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: ovs: fix ovs_drop_reasons error
Menglong Dong [Wed, 21 Aug 2024 12:32:52 +0000 (20:32 +0800)]
net: ovs: fix ovs_drop_reasons error

There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is
"OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1,
which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION".

And as Adrian tested, without the patch, adding flow to drop packets
results in:

drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_ACTION_ERROR

With the patch, the same results in:

drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_LAST_ACTION

Fix this by initializing ovs_drop_reasons with index.
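
The idea of initializing by index can be sketched with designated initializers. All names below (DEMO_*) are hypothetical and only mirror the shape of the problem, where the first real reason is the base value plus one:

```c
#include <stddef.h>

/* Hypothetical reason codes that start at a subsystem base value. */
enum demo_drop_reason {
	DEMO_DROP_BASE = 0x30000, /* marker only, never reported */
	DEMO_DROP_LAST_ACTION,    /* BASE + 1 */
	DEMO_DROP_ACTION_ERROR,   /* BASE + 2 */
	DEMO_DROP_MAX,
};

/* Place each name at its offset from the base instead of relying on
 * the order in which the strings are listed. */
#define DEMO_REASON(r) [(r) - DEMO_DROP_BASE] = #r

static const char * const demo_drop_reasons[] = {
	DEMO_REASON(DEMO_DROP_LAST_ACTION),
	DEMO_REASON(DEMO_DROP_ACTION_ERROR),
};

/* Look up the name for a reason; returns NULL for unknown codes. */
const char *demo_reason_name(enum demo_drop_reason reason)
{
	size_t idx = (size_t)(reason - DEMO_DROP_BASE);

	if (idx >= sizeof(demo_drop_reasons) / sizeof(demo_drop_reasons[0]))
		return NULL;
	return demo_drop_reasons[idx]; /* slot 0 stays NULL */
}
```

With a plain positional initializer, the first string would land in slot 0 and every lookup would be off by one, which matches the symptom in the two dumps above.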

Fixes: 9d802da40b7c ("net: openvswitch: add last-action drop reason")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 22 Aug 2024 20:06:24 +0000 (13:06 -0700)]
Merge tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

Patch #1 disables BH when collecting stats via hardware offload to ensure
         concurrent updates from packet path do not result in losing stats.
         From Sebastian Andrzej Siewior.

Patch #2 uses a write seqcount when resetting counters to serialize
         against readers. Also from Sebastian Andrzej Siewior.

Patch #3 ensures vlan header is in place before accessing its fields,
         according to KMSAN splat triggered by syzbot.

* tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: flowtable: validate vlan header
  netfilter: nft_counter: Synchronize nft_counter_reset() against reader.
  netfilter: nft_counter: Disable BH in nft_counter_offload_stats().
====================

Link: https://patch.msgid.link/20240822101842.4234-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'net-xilinx-axienet-multicast-fixes-and-improvements'
Jakub Kicinski [Thu, 22 Aug 2024 20:03:59 +0000 (13:03 -0700)]
Merge branch 'net-xilinx-axienet-multicast-fixes-and-improvements'

Sean Anderson says:

====================
net: xilinx: axienet: Multicast fixes and improvements [part]
====================

First two patches of the series which are fixes.

Link: https://patch.msgid.link/20240822154059.1066595-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: xilinx: axienet: Fix dangling multicast addresses
Sean Anderson [Thu, 22 Aug 2024 15:40:56 +0000 (11:40 -0400)]
net: xilinx: axienet: Fix dangling multicast addresses

If a multicast address is removed but there are still some multicast
addresses, that address would remain programmed into the frame filter.
Fix this by explicitly setting the enable bit for each filter.

Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: xilinx: axienet: Always disable promiscuous mode
Sean Anderson [Thu, 22 Aug 2024 15:40:55 +0000 (11:40 -0400)]
net: xilinx: axienet: Always disable promiscuous mode

If promiscuous mode is disabled when there are fewer than four multicast
addresses, then it will not be reflected in the hardware. Fix this by
always clearing the promiscuous mode flag even when we program multicast
addresses.

Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agokbuild: fix typos "prequisites" to "prerequisites"
Masahiro Yamada [Sun, 18 Aug 2024 07:07:11 +0000 (16:07 +0900)]
kbuild: fix typos "prequisites" to "prerequisites"

This typo in scripts/Makefile.build has been present for more than 20
years. It was accidentally copy-pasted to other scripts/Makefile.* files.
Fix them all.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
14 months agoMerge branch 'maintainers-networking-updates'
Paolo Abeni [Thu, 22 Aug 2024 13:24:07 +0000 (15:24 +0200)]
Merge branch 'maintainers-networking-updates'

Simon Horman says:

====================
MAINTAINERS: Networking updates

This series includes Networking-related updates to MAINTAINERS.

* Patches 1-4 aim to assign header files with '*net*' and '*skbuff*'
  in their name to Networking-related sections within MAINTAINERS.

  There are a few such files left over after these patches.
  I have sent separate patches to add them to the SCSI SUBSYSTEM
  and NETWORKING DRIVERS (WIRELESS) sections [1][2].

  [1] https://lore.kernel.org/linux-scsi/20240816-scsi-mnt-v1-1-439af8b1c28b@kernel.org/
  [2] https://lore.kernel.org/linux-wireless/20240816-wifi-mnt-v1-1-3fb3bf5d44aa@kernel.org/

* Patch 5 updates the status of the JME driver to 'Odd Fixes'
====================

Link: https://patch.msgid.link/20240821-net-mnt-v2-0-59a5af38e69d@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMAINTAINERS: Mark JME Network Driver as Odd Fixes
Simon Horman [Wed, 21 Aug 2024 08:46:48 +0000 (09:46 +0100)]
MAINTAINERS: Mark JME Network Driver as Odd Fixes

This driver only appears to have received sporadic clean-ups, typically
part of some tree-wide activity, and fixes for quite some time.  And
according to the maintainer, Guo-Fu Tseng, the device has been EOLed for
a long time (see Link).

Accordingly, it seems appropriate to mark this driver as odd fixes.

Cc: Moon Yeounsu <yyyynoom@gmail.com>
Cc: Guo-Fu Tseng <cooldavid@cooldavid.org>
Link: https://lore.kernel.org/netdev/20240805003139.M94125@cooldavid.org/
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMAINTAINERS: Add header files to NETWORKING sections
Simon Horman [Wed, 21 Aug 2024 08:46:47 +0000 (09:46 +0100)]
MAINTAINERS: Add header files to NETWORKING sections

This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" or
"skbuff" in their name.

This patch adds a number of such files to the NETWORKING DRIVERS
and NETWORKING [GENERAL] sections.

Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMAINTAINERS: Add limited globs for Networking headers
Simon Horman [Wed, 21 Aug 2024 08:46:46 +0000 (09:46 +0100)]
MAINTAINERS: Add limited globs for Networking headers

This aims to add limited globs to improve the coverage of header files
in the NETWORKING DRIVERS and NETWORKING [GENERAL] sections.

It is done so in a minimal way to exclude overlap with other sections.
And so as not to require "X" entries to exclude files otherwise
matched by these new globs.

While imperfect, due to its limited nature, this does extend coverage
of header files by these sections. And aims to automatically cover
new files that seem very likely to belong to these sections.

The include/linux/netdev* glob (both sections)
+ Subsumes the entries for:
  - include/linux/netdevice.h
+ Extends the sections to cover
  - include/linux/netdevice_xmit.h
  - include/linux/netdev_features.h

The include/uapi/linux/netdev* glob (both sections)
+ Subsumes the entries for:
  - include/uapi/linux/netdevice.h
+ Extends the sections to cover
  - include/uapi/linux/netdev.h

The include/linux/skbuff* glob (NETWORKING [GENERAL] section only):
+ Subsumes the entry for:
  - include/linux/skbuff.h
+ Extends the section to cover
  - include/linux/skbuff_ref.h

An include/uapi/linux/net_* glob was not added to the NETWORKING [GENERAL]
section. Although it would subsume the entry for
include/uapi/linux/net_namespace.h, which is fine, it would also extend
coverage to:
- include/uapi/linux/net_dropmon.h, which belongs to the
   NETWORK DROP MONITOR section
- include/uapi/linux/net_tstamp.h which, as per an earlier patch in this
  series, belongs to the SOCKET TIMESTAMPING section
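
The coverage reasoning above can be checked with a rough sketch based on POSIX fnmatch(). This is an approximation only: MAINTAINERS patterns are interpreted by the kernel's own tooling, whose matching rules may differ from fnmatch(), and glob_covers() is a hypothetical helper:

```c
#include <fnmatch.h>
#include <stdbool.h>

/* True if path is matched by a shell-style glob pattern. */
bool glob_covers(const char *pattern, const char *path)
{
	return fnmatch(pattern, path, 0) == 0;
}
```

For example, glob_covers("include/uapi/linux/net_*", "include/uapi/linux/net_dropmon.h") is true, which is the overlap with the NETWORK DROP MONITOR section that the message gives as the reason for not adding that glob.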

Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
Simon Horman [Wed, 21 Aug 2024 08:46:45 +0000 (09:46 +0100)]
MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section

This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" in
their name.

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
Simon Horman [Wed, 21 Aug 2024 08:46:44 +0000 (09:46 +0100)]
MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS

This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" in
their name.

It seems that sonet.h is included in ATM related source files,
and thus that ATM is the most relevant section for this file.

Cc: Chas Williams <3chas3@gmail.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonfp: bpf: Use kmemdup_array instead of kmemdup for multiple allocation
Yu Jiaoliang [Wed, 21 Aug 2024 08:14:45 +0000 (16:14 +0800)]
nfp: bpf: Use kmemdup_array instead of kmemdup for multiple allocation

Let kmemdup_array() take care of the multiplication and possible
overflows.
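
What "taking care of the multiplication" means can be sketched in userspace. memdup_array() below is a hypothetical helper, not the kernel function, and it relies on the GCC/Clang __builtin_mul_overflow() builtin:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate an array of count elements of element_size bytes each.
 * Returns NULL if the total size overflows or allocation fails;
 * the caller frees the result with free(). */
void *memdup_array(const void *src, size_t count, size_t element_size)
{
	size_t total;
	void *copy;

	/* True when count * element_size does not fit in size_t. */
	if (__builtin_mul_overflow(count, element_size, &total))
		return NULL;

	copy = malloc(total);
	if (copy)
		memcpy(copy, src, total);
	return copy;
}
```

An open-coded count * size passed to a plain duplicate helper would silently wrap on overflow and copy too little; keeping the check inside the helper removes that burden from each caller.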

Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821081447.12430-1-yujiaoliang@vivo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: airoha: configure hw mac address according to the port id
Lorenzo Bianconi [Wed, 21 Aug 2024 07:30:14 +0000 (09:30 +0200)]
net: airoha: configure hw mac address according to the port id

GDM1 port on EN7581 SoC is connected to the lan dsa switch.
GDM{2,3,4} can be used as wan port connected to an external
phy module. Configure hw mac address registers according to the port id.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20240821-airoha-eth-wan-mac-addr-v2-1-8706d0cd6cd5@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoocteontx2-af: Fix CPT AF register offset calculation
Bharat Bhushan [Wed, 21 Aug 2024 07:05:58 +0000 (12:35 +0530)]
octeontx2-af: Fix CPT AF register offset calculation

Some CPT AF registers are per LF and others are global. Translation
of the PF/VF local LF slot number to the actual LF slot number is
required only for accessing per LF registers. Access to CPT AF global
registers does not require any LF slot number. Also, there is no reason
for the CPT PF/VF to know the actual LF's register offset.

Without this fix microcode loading will fail, VFs cannot be created
and hardware is not usable.

Fixes: bc35e28af789 ("octeontx2-af: replace cpt slot with lf id on reg write")
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821070558.1020101-1-bbhushan2@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
Sava Jakovljev [Wed, 21 Aug 2024 02:16:57 +0000 (04:16 +0200)]
net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F

The current implementation incorrectly sets the mode bit of the PHY chip.
Bit 15 (RTL8211F_LEDCR_MODE) should not be shifted together with the
configuration nibble of an LED; it should be set independently of the
index of the LED being configured.
As a consequence, the RTL8211F LED control is actually operating in Mode A.
Fix the error by OR-ing the final register value to be written with the
constant RTL8211F_LEDCR_MODE, thus setting the Mode bit explicitly.
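
The effect can be sketched in plain C. Only the position of the mode
bit (bit 15) comes from the description above; the field width, the
shift per LED, the LED count and all names below are assumptions made
for illustration, and this is not the driver code:

```c
#include <assert.h>
#include <stdint.h>

#define LEDCR_MODE   (1u << 15) /* bit 15, per the description above */
#define LEDCR_MASK   0x1fu      /* assumed width of one LED's field */
#define LEDCR_SHIFT  5u         /* assumed distance between LED fields */
#define LED_COUNT    3u         /* assumed number of LEDs */

/* Buggy pattern: the mode bit is shifted along with the per-LED field. */
static uint16_t ledcr_buggy(unsigned int index, uint16_t cfg)
{
	uint32_t reg;

	assert(index < LED_COUNT);
	reg = (uint32_t)(LEDCR_MODE | (cfg & LEDCR_MASK)) << (LEDCR_SHIFT * index);
	return (uint16_t)reg; /* truncated to the 16-bit register width */
}

/* Fixed pattern: shift only the LED field, then OR in the mode bit. */
static uint16_t ledcr_fixed(unsigned int index, uint16_t cfg)
{
	uint32_t reg;

	assert(index < LED_COUNT);
	reg = (uint32_t)(cfg & LEDCR_MASK) << (LEDCR_SHIFT * index);
	return (uint16_t)(reg | LEDCR_MODE);
}
```

In this model, any index above 0 pushes bit 15 out of the 16-bit
register in the buggy variant, so the mode bit is lost.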

Fixes: 17784801d888 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F")
Signed-off-by: Sava Jakovljev <savaj@meyersound.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://patch.msgid.link/PAWP192MB21287372F30C4E55B6DF6158C38E2@PAWP192MB2128.EURP192.PROD.OUTLOOK.COM
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoselftests: net: add helper for checking if nettest is available
Jakub Kicinski [Wed, 21 Aug 2024 01:22:27 +0000 (18:22 -0700)]
selftests: net: add helper for checking if nettest is available

A few tests check if nettest exists in the $PATH before adding
$PWD to $PATH and re-checking. They don't discard stderr on
the first check (and nettest is built as part of selftests,
so it's pretty normal for it to not be available in system $PATH).
This leads to output noise:

  which: no nettest in (/home/virtme/tools/fs/bin:/home/virtme/tools/fs/sbin:/home/virtme/tools/fs/usr/bin:/home/virtme/tools/fs/usr/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin)

Add a common helper for the check which does silence stderr.

There is another small functional change hiding here, because pmtu.sh
and fib_rule_tests.sh used to return from the test case rather than
completely exit. Building nettest is not hard, there should be no need
to maintain the ability to selectively skip cases in its absence.

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20240821012227.1398769-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: ngbe: Fix phy mode set to external phy
Mengyuan Lou [Tue, 20 Aug 2024 03:04:25 +0000 (11:04 +0800)]
net: ngbe: Fix phy mode set to external phy

The MAC always adds the TX delay, and this cannot be modified.
If both the MAC and the PHY add the TX delay, transmission problems result.
So disable the TX delay in the PHY: when RGMII is used to attach to an
external PHY, pass PHY_INTERFACE_MODE_RGMII_RXID to the PHY driver.
This does not matter for the internal PHY.

Fixes: bc2426d74aa3 ("net: ngbe: convert phylib to phylink")
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Cc: stable@vger.kernel.org # 6.3+
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E6759CF1387CF84C+20240820030425.93003-1-mengyuanlou@net-swift.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonetfilter: flowtable: validate vlan header
Pablo Neira Ayuso [Tue, 13 Aug 2024 10:39:46 +0000 (12:39 +0200)]
netfilter: flowtable: validate vlan header

Ensure there is sufficient room to access the protocol field of the
VLAN header; validate it once before the flowtable lookup.
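
The general pattern, checking the available length before reading a
header field, can be sketched on a raw frame buffer. This is a
userspace illustration with hypothetical names, assuming a standard
Ethernet header followed by at most one 802.1Q tag; it is not the
flowtable code, which operates on an skb:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14u     /* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN    4u      /* TCI + encapsulated EtherType */
#define ETHERTYPE_VLAN  0x8100u /* 802.1Q tag protocol identifier */

/* Read a big-endian 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the protocol carried by the frame, looking through one VLAN
 * tag, or -1 when the buffer is too short to read the field safely.
 */
static int frame_protocol(const uint8_t *frame, size_t len)
{
	uint16_t proto;

	assert(frame != NULL);

	if (len < ETH_HDR_LEN)
		return -1;

	proto = read_be16(frame + 12);
	if (proto != ETHERTYPE_VLAN)
		return proto;

	/* The inner protocol lives in the last two bytes of the tag. */
	if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
		return -1;

	return read_be16(frame + ETH_HDR_LEN + 2);
}
```

Without the second length check, a truncated tagged frame would be
read past its end, which is the kind of access KMSAN flags below.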

=====================================================
BUG: KMSAN: uninit-value in nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32
 nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32
 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
 nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626
 nf_hook_ingress include/linux/netfilter_netdev.h:34 [inline]
 nf_ingress net/core/dev.c:5440 [inline]

Fixes: 4cd91f7c290f ("netfilter: flowtable: add vlan support")
Reported-by: syzbot+8407d9bb88cd4c6bf61a@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
14 months agoMerge branch 'net-ipv6-ioam6-introduce-tunsrc'
Paolo Abeni [Thu, 22 Aug 2024 08:45:15 +0000 (10:45 +0200)]
Merge branch 'net-ipv6-ioam6-introduce-tunsrc'

Justin Iurman says:

====================
net: ipv6: ioam6: introduce tunsrc

This series introduces a new feature called "tunsrc" (just like seg6
already does).

v3:
- address Jakub's comments

v2:
- add links to performance result figures (see patch#2 description)
- move the ipv6_addr_any() check out of the datapath
====================

Link: https://patch.msgid.link/20240817131818.11834-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: ipv6: ioam6: new feature tunsrc
Justin Iurman [Sat, 17 Aug 2024 13:18:18 +0000 (15:18 +0200)]
net: ipv6: ioam6: new feature tunsrc

This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e.,
"encap") mode of ioam6. This is just like what seg6 already does, except
that it is attached to a route. The "tunsrc" is optional: when not provided
(by default), the automatic resolution is applied. Using "tunsrc" when
possible has a benefit: performance. See the comparison:
 - before (= "encap" mode): https://ibb.co/bNCzvf7
 - after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: ipv6: ioam6: code alignment
Justin Iurman [Sat, 17 Aug 2024 13:18:17 +0000 (15:18 +0200)]
net: ipv6: ioam6: code alignment

This patch prepares the next one by correcting the alignment of some
lines.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 22 Aug 2024 01:05:24 +0000 (18:05 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-08-20 (ice)

This series contains updates to ice driver only.

Maciej fixes issues with Rx data path on architectures with
PAGE_SIZE >= 8192; correcting page reuse usage and calculations for
last offset and truesize.

Michal corrects assignment of devlink port number to use PF id.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: use internal pf id instead of function number
  ice: fix truesize operations for PAGE_SIZE >= 8192
  ice: fix ICE_LAST_OFFSET formula
  ice: fix page reuse when PAGE_SIZE is over 8k
====================

Link: https://patch.msgid.link/20240820215620.1245310-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotools: ynl: lift an assumption about spec file name
Paolo Abeni [Tue, 20 Aug 2024 15:12:22 +0000 (17:12 +0200)]
tools: ynl: lift an assumption about spec file name

Currently the parsing code generator assumes that the yaml
specification file name and the main 'name' attribute carried
inside correspond, that is the field is the c-name representation
of the file basename.

The above assumption held true within the current tree, but will be
hopefully broken soon by the upcoming net shaper specification.
Additionally, it makes the field 'name' itself useless.

Lift the assumption, always computing the generated include file
name from the generated c file name.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/24da5a3596d814beeb12bd7139a6b4f89756cc19.1724165948.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'net-xilinx-axienet-add-statistics-support'
Jakub Kicinski [Thu, 22 Aug 2024 00:49:24 +0000 (17:49 -0700)]
Merge branch 'net-xilinx-axienet-add-statistics-support'

Sean Anderson says:

====================
net: xilinx: axienet: Add statistics support

Add support for hardware statistics counters (if they are enabled) in
the AXI Ethernet driver. Unfortunately, the implementation is
complicated a bit since the hardware might only support 32-bit counters.
====================

Link: https://patch.msgid.link/20240820175343.760389-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: xilinx: axienet: Add statistics support
Sean Anderson [Tue, 20 Aug 2024 17:53:42 +0000 (13:53 -0400)]
net: xilinx: axienet: Add statistics support

Add support for reading the statistics counters, if they are enabled.
The counters may be 64-bit, but we can't detect this statically as
there's no ability bit for it and the counters are read-only. Therefore,
we assume the counters are 32-bits by default. To ensure we don't miss
an overflow, we read all counters at 13-second intervals. This should be
often enough to ensure the bytes counters don't wrap at 2.5 Gbit/s.
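
The arithmetic behind the interval, and the way a 64-bit total can be
kept from a wrapping 32-bit reading, can be sketched as follows. The
struct and function names are hypothetical and this is not the driver
code:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a 32-bit byte counter wraps at the given line rate. */
static double wrap_seconds(double bits_per_second)
{
	return 4294967296.0 / (bits_per_second / 8.0);
}

/* Hypothetical accumulator: 64-bit running total plus last raw reading. */
struct stat_acc {
	uint64_t total;
	uint32_t last;
};

static void stat_update(struct stat_acc *acc, uint32_t raw)
{
	assert(acc);

	/* Unsigned subtraction is modulo 2^32, so a single wrap is absorbed. */
	acc->total += (uint32_t)(raw - acc->last);
	acc->last = raw;
}
```

At 2.5 Gbit/s the counter takes 2^32 / 312500000, or about 13.7
seconds, to wrap, so a 13-second refresh sees at most one wrap
between two readings.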

Another complication is that the counters may be reset when the device
is reset (depending on configuration). To ensure the counters persist
across link up/down (including suspend/resume), we maintain our own
versions along with the last counter value we saw. Because we might wait
up to 100 ms for the reset to complete, we use a mutex to protect
writing hw_stats. We can't sleep in ndo_get_stats64, so we use a seqlock
to protect readers.

We don't bother disabling the refresh work when we detect 64-bit
counters. This is because the reset issue requires us to read
hw_stat_base and reset_in_progress anyway, which would still require the
seqcount. And I don't think skipping the task is worth the extra
bookkeeping.

We can't use the byte counters for either get_stats64 or
get_eth_mac_stats. This is because the byte counters include everything
in the frame (destination address to FCS, inclusive). But
rtnl_link_stats64 wants bytes excluding the FCS, and
ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
length/type). It might be possible to calculate the byte values Linux
expects based on the frame counters, but I think it is simpler to use
the existing software counters.

get_ethtool_stats is implemented for nonstandard statistics. This
includes the aforementioned byte counters, VLAN and PFC frame
counters, and user-defined (e.g. with custom RTL) counters.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20240820175343.760389-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: xilinx: axienet: Report RxRject as rx_dropped
Sean Anderson [Tue, 20 Aug 2024 17:53:41 +0000 (13:53 -0400)]
net: xilinx: axienet: Report RxRject as rx_dropped

The Receive Frame Rejected interrupt is asserted whenever there was a
receive error (bad FCS, bad length, etc.) or whenever the frame was
dropped due to a mismatched address. So this is really a combination of
rx_otherhost_dropped, rx_length_errors, rx_frame_errors, and
rx_crc_errors. Mismatched addresses are common and aren't really errors
at all (much like how fragments are normal on half-duplex links). To
avoid confusion, report these events as rx_dropped. This better
reflects what's going on: the packet was received by the MAC but dropped
before being processed.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20240820175343.760389-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: repack struct netdev_queue
Jakub Kicinski [Tue, 20 Aug 2024 20:51:19 +0000 (13:51 -0700)]
net: repack struct netdev_queue

Adding the NAPI pointer to struct netdev_queue made it grow into another
cacheline, even though there was 44 bytes of padding available.

The struct was historically grouped as follows:

    /* read-mostly stuff (align) */
    /* ... random control path fields ... */
    /* write-mostly stuff (align) */
    /* ... 40 byte hole ... */
    /* struct dql (align) */

It seems that people want to add control path fields after
the read only fields. struct dql looks pretty innocent
but it forces its own alignment and nothing indicates that
there is a lot of empty space above it.

Move dql above the xmit_lock. This shifts the empty space
to the end of the struct rather than in the middle of it.
Move two example fields there to set an example.
Hopefully people will now add new fields at the end of
the struct. A lot of the read-only stuff is also control
path-only, but if we move it all we'll have another hole
in the middle.

Before:
        /* size: 384, cachelines: 6, members: 16 */
        /* sum members: 284, holes: 3, sum holes: 100 */

After:
        /* size: 320, cachelines: 5, members: 16 */
        /* sum members: 284, holes: 1, sum holes: 8 */
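
The effect of member order on padding can be reproduced with a toy
struct. The layout below is purely illustrative (field names and
sizes are made up, and a 64-byte cacheline is assumed); it is not
struct netdev_queue:

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64 /* assumed cacheline size */

/* Aligned member last: a field appended after it starts a new cacheline. */
struct hole_in_middle {
	_Alignas(CACHELINE) char read_mostly[40];
	_Alignas(CACHELINE) char write_mostly[24]; /* 40-byte hole follows */
	_Alignas(CACHELINE) char dql[64];
	void *napi;                                /* newly added field */
};

/* Aligned member moved up: the spare room sits at the end of the struct. */
struct hole_at_end {
	_Alignas(CACHELINE) char read_mostly[40];
	_Alignas(CACHELINE) char dql[64];
	_Alignas(CACHELINE) char write_mostly[24];
	void *napi;                                /* fits in the former hole */
};

static_assert(sizeof(struct hole_in_middle) == 4 * CACHELINE,
	      "new field spills into a fourth cacheline");
static_assert(sizeof(struct hole_at_end) == 3 * CACHELINE,
	      "new field reuses the trailing padding");
```

Both structs hold the same members; only the order differs, and the
second one is a full cacheline smaller.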

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240820205119.1321322-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agobnxt_en: Fix double DMA unmapping for XDP_REDIRECT
Somnath Kotur [Tue, 20 Aug 2024 20:34:15 +0000 (13:34 -0700)]
bnxt_en: Fix double DMA unmapping for XDP_REDIRECT

Remove the dma_unmap_page_attrs() call in the driver's XDP_REDIRECT
code path.  This should have been removed when we let the page pool
handle the DMA mapping.  This bug causes the warning:

WARNING: CPU: 7 PID: 59 at drivers/iommu/dma-iommu.c:1198 iommu_dma_unmap_page+0xd5/0x100
CPU: 7 PID: 59 Comm: ksoftirqd/7 Tainted: G        W          6.8.0-1010-gcp #11-Ubuntu
Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.15.2 04/02/2024
RIP: 0010:iommu_dma_unmap_page+0xd5/0x100
Code: 89 ee 48 89 df e8 cb f2 69 ff 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 e9 ab 17 71 00 <0f> 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9
RSP: 0018:ffffab1fc0597a48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff99ff838280c8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffab1fc0597a78 R08: 0000000000000002 R09: ffffab1fc0597c1c
R10: ffffab1fc0597cd3 R11: ffff99ffe375acd8 R12: 00000000e65b9000
R13: 0000000000000050 R14: 0000000000001000 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff9a06efb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000565c34c37210 CR3: 00000005c7e3e000 CR4: 0000000000350ef0
? show_regs+0x6d/0x80
? __warn+0x89/0x150
? iommu_dma_unmap_page+0xd5/0x100
? report_bug+0x16a/0x190
? handle_bug+0x51/0xa0
? exc_invalid_op+0x18/0x80
? iommu_dma_unmap_page+0xd5/0x100
? iommu_dma_unmap_page+0x35/0x100
dma_unmap_page_attrs+0x55/0x220
? bpf_prog_4d7e87c0d30db711_xdp_dispatcher+0x64/0x9f
bnxt_rx_xdp+0x237/0x520 [bnxt_en]
bnxt_rx_pkt+0x640/0xdd0 [bnxt_en]
__bnxt_poll_work+0x1a1/0x3d0 [bnxt_en]
bnxt_poll+0xaa/0x1e0 [bnxt_en]
__napi_poll+0x33/0x1e0
net_rx_action+0x18a/0x2f0

Fixes: 578fcfd26e2a ("bnxt_en: Let the page pool manage the DMA mapping")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240820203415.168178-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'ipv6-fix-possible-uaf-in-output-paths'
Jakub Kicinski [Thu, 22 Aug 2024 00:35:51 +0000 (17:35 -0700)]
Merge branch 'ipv6-fix-possible-uaf-in-output-paths'

Eric Dumazet says:

====================
ipv6: fix possible UAF in output paths

First patch fixes an issue spotted by syzbot, and the two
other patches fix error paths after skb_expand_head()
adoption.
====================

Link: https://patch.msgid.link/20240820160859.3786976-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv6: prevent possible UAF in ip6_xmit()
Eric Dumazet [Tue, 20 Aug 2024 16:08:59 +0000 (16:08 +0000)]
ipv6: prevent possible UAF in ip6_xmit()

If skb_expand_head() returns NULL, skb has been freed
and the associated dst/idev could also have been freed.

We must use rcu_read_lock() to prevent a possible UAF.

Fixes: 0c9f227bee11 ("ipv6: use skb_expand_head in ip6_xmit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240820160859.3786976-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv6: fix possible UAF in ip6_finish_output2()
Eric Dumazet [Tue, 20 Aug 2024 16:08:58 +0000 (16:08 +0000)]
ipv6: fix possible UAF in ip6_finish_output2()

If skb_expand_head() returns NULL, skb has been freed
and associated dst/idev could also have been freed.

We need to hold rcu_read_lock() to make sure the dst and
associated idev are alive.

Fixes: 5796015fa968 ("ipv6: allocate enough headroom in ip6_finish_output2()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240820160859.3786976-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv6: prevent UAF in ip6_send_skb()
Eric Dumazet [Tue, 20 Aug 2024 16:08:57 +0000 (16:08 +0000)]
ipv6: prevent UAF in ip6_send_skb()

syzbot reported an UAF in ip6_send_skb() [1]

After ip6_local_out() has returned, we no longer can safely
dereference rt, unless we hold rcu_read_lock().

A similar issue has been fixed in commit
a688caa34beb ("ipv6: take rcu lock in rawv6_send_hdrinc()")

Another potential issue in ip6_finish_output2() is handled in a
separate patch.

[1]
 BUG: KASAN: slab-use-after-free in ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964
Read of size 8 at addr ffff88806dde4858 by task syz.1.380/6530

CPU: 1 UID: 0 PID: 6530 Comm: syz.1.380 Not tainted 6.11.0-rc3-syzkaller-00306-gdf6cbc62cc9b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:93 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:488
  kasan_report+0x143/0x180 mm/kasan/report.c:601
  ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964
  rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588
  rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x1a6/0x270 net/socket.c:745
  sock_write_iter+0x2dd/0x400 net/socket.c:1160
 do_iter_readv_writev+0x60a/0x890
  vfs_writev+0x37c/0xbb0 fs/read_write.c:971
  do_writev+0x1b1/0x350 fs/read_write.c:1018
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f936bf79e79
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f936cd7f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 00007f936c115f80 RCX: 00007f936bf79e79
RDX: 0000000000000001 RSI: 0000000020000040 RDI: 0000000000000004
RBP: 00007f936bfe7916 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f936c115f80 R15: 00007fff2860a7a8
 </TASK>

Allocated by task 6530:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  unpoison_slab_object mm/kasan/common.c:312 [inline]
  __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
  kasan_slab_alloc include/linux/kasan.h:201 [inline]
  slab_post_alloc_hook mm/slub.c:3988 [inline]
  slab_alloc_node mm/slub.c:4037 [inline]
  kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044
  dst_alloc+0x12b/0x190 net/core/dst.c:89
  ip6_blackhole_route+0x59/0x340 net/ipv6/route.c:2670
  make_blackhole net/xfrm/xfrm_policy.c:3120 [inline]
  xfrm_lookup_route+0xd1/0x1c0 net/xfrm/xfrm_policy.c:3313
  ip6_dst_lookup_flow+0x13e/0x180 net/ipv6/ip6_output.c:1257
  rawv6_sendmsg+0x1283/0x23c0 net/ipv6/raw.c:898
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x1a6/0x270 net/socket.c:745
  ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
  ___sys_sendmsg net/socket.c:2651 [inline]
  __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 45:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
  poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
  __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
  kasan_slab_free include/linux/kasan.h:184 [inline]
  slab_free_hook mm/slub.c:2252 [inline]
  slab_free mm/slub.c:4473 [inline]
  kmem_cache_free+0x145/0x350 mm/slub.c:4548
  dst_destroy+0x2ac/0x460 net/core/dst.c:124
  rcu_do_batch kernel/rcu/tree.c:2569 [inline]
  rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843
  handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
  __do_softirq kernel/softirq.c:588 [inline]
  invoke_softirq kernel/softirq.c:428 [inline]
  __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
  irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702

Last potentially related work creation:
  kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
  __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
  __call_rcu_common kernel/rcu/tree.c:3106 [inline]
  call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210
  refdst_drop include/net/dst.h:263 [inline]
  skb_dst_drop include/net/dst.h:275 [inline]
  nf_ct_frag6_queue net/ipv6/netfilter/nf_conntrack_reasm.c:306 [inline]
  nf_ct_frag6_gather+0xb9a/0x2080 net/ipv6/netfilter/nf_conntrack_reasm.c:485
  ipv6_defrag+0x2c8/0x3c0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:67
  nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
  nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
  nf_hook include/linux/netfilter.h:269 [inline]
  __ip6_local_out+0x6fa/0x800 net/ipv6/output_core.c:143
  ip6_local_out+0x26/0x70 net/ipv6/output_core.c:153
  ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1959
  rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588
  rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x1a6/0x270 net/socket.c:745
  sock_write_iter+0x2dd/0x400 net/socket.c:1160
 do_iter_readv_writev+0x60a/0x890

Fixes: 0625491493d9 ("ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDS")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240820160859.3786976-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonetpoll: do not export netpoll_poll_[disable|enable]()
Eric Dumazet [Tue, 20 Aug 2024 16:20:53 +0000 (16:20 +0000)]
netpoll: do not export netpoll_poll_[disable|enable]()

netpoll_poll_disable() and netpoll_poll_enable() are only used
from core networking code, there is no need to export them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240820162053.3870927-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoice: Fix a 32bit bug
Dan Carpenter [Tue, 20 Aug 2024 13:43:46 +0000 (16:43 +0300)]
ice: Fix a 32bit bug

BIT() is unsigned long but ->pu.flg_msk and ->pu.flg_val are u64 type.
On 32 bit systems, unsigned long is a u32 and the mismatch between u32
and u64 will break things for the high 32 bits.
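
One way such a width mismatch can bite is sketched below. BIT32()
models what BIT() yields on a 32-bit build and BIT_ULL() is modelled
as a 64-bit shift; both macros and the helper names are stand-ins
written for this example, not the driver code:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins: BIT32() models BIT() on a 32-bit build. */
#define BIT32(n)    ((uint32_t)1 << (n))
#define BIT_ULL(n)  ((uint64_t)1 << (n))

/* Clearing a flag with a 32-bit mask also wipes bits 32..63. */
static uint64_t clear_flag_narrow(uint64_t flags, unsigned int n)
{
	assert(n < 32);
	/* ~mask is only 32 bits wide and is zero-extended before the AND. */
	return flags & ~BIT32(n);
}

/* With a 64-bit mask only the requested bit is cleared. */
static uint64_t clear_flag_wide(uint64_t flags, unsigned int n)
{
	assert(n < 64);
	return flags & ~BIT_ULL(n);
}
```

The narrow variant cannot address bits 32 to 63 at all, and it also
destroys them whenever a low bit is cleared.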

Fixes: 9a4c07aaa0f5 ("ice: add parser execution main loop")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/ddc231a8-89c1-4ff4-8704-9198bcb41f8d@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoipv6: remove redundant check
Xi Huang [Tue, 20 Aug 2024 11:54:42 +0000 (19:54 +0800)]
ipv6: remove redundant check

The err variable is always set, e.g. to -ENOBUFS or in the if (err < 0)
branch, when the code reaches this path. The check therefore only slows
down execution.

Signed-off-by: Xi Huang <xuiagnh@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20240820115442.49366-1-xuiagnh@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoselftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
Ido Schimmel [Tue, 20 Aug 2024 10:53:47 +0000 (12:53 +0200)]
selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path

Source the ethtool library from the correct path and avoid the following
error:

./ethtool_lanes.sh: line 14: ./../../../net/forwarding/ethtool_lib.sh: No such file or directory

Fixes: 40d269c000bd ("selftests: forwarding: Move several selftests")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/2112faff02e536e1ac14beb4c2be09c9574b90ae.1724150067.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: dsa: sja1105: Simplify with scoped for each OF child loop
Jinjie Ruan [Tue, 20 Aug 2024 07:50:47 +0000 (15:50 +0800)]
net: dsa: sja1105: Simplify with scoped for each OF child loop

Use scoped for_each_available_child_of_node_scoped() when iterating over
device nodes to make code a bit simpler.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20240820075047.681223-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: dsa: ocelot: Simplify with scoped for each OF child loop
Jinjie Ruan [Tue, 20 Aug 2024 07:48:05 +0000 (15:48 +0800)]
net: dsa: ocelot: Simplify with scoped for each OF child loop

Use scoped for_each_available_child_of_node_scoped() when iterating over
device nodes to make code a bit simpler.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20240820074805.680674-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonfc: pn533: Avoid -Wflex-array-member-not-at-end warnings
Gustavo A. R. Silva [Tue, 20 Aug 2024 01:27:11 +0000 (19:27 -0600)]
nfc: pn533: Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Remove unnecessary flex-array member `data[]`, and with this fix
the following warnings:

drivers/nfc/pn533/usb.c:268:38: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/nfc/pn533/usb.c:275:38: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/ZsPw7+6vNoS651Cb@elsanto
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoudp: fix receiving fraglist GSO packets
Felix Fietkau [Mon, 19 Aug 2024 15:06:21 +0000 (17:06 +0200)]
udp: fix receiving fraglist GSO packets

When assembling fraglist GSO packets, udp4_gro_complete does not set
skb->csum_start, which makes the extra validation in __udp_gso_segment fail.

Fixes: 89add40066f9 ("net: drop bad gso csum_start and offset in virtio_net_hdr")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240819150621.59833-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'platform-drivers-x86-v6.11-4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 21 Aug 2024 22:34:27 +0000 (06:34 +0800)]
Merge tag 'platform-drivers-x86-v6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:

 - ISST: Fix an error-handling corner case

 - platform/surface: aggregator: Minor corner case fix and new HW
   support

* tag 'platform-drivers-x86-v6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: ISST: Fix return value on last invalid resource
  platform/surface: aggregator: Fix warning when controller is destroyed in probe
  platform/surface: aggregator_registry: Add support for Surface Laptop 6
  platform/surface: aggregator_registry: Add fan and thermal sensor support for Surface Laptop 5
  platform/surface: aggregator_registry: Add support for Surface Laptop Studio 2
  platform/surface: aggregator_registry: Add support for Surface Laptop Go 3
  platform/surface: aggregator_registry: Add Support for Surface Pro 10
  platform/x86: asus-wmi: Add quirk for ROG Ally X

14 months agoMerge tag 'erofs-for-6.11-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 21 Aug 2024 22:06:09 +0000 (06:06 +0800)]
Merge tag 'erofs-for-6.11-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "As I mentioned in the merge window pull request, there is a regression
  which could cause system hang due to page migration. The corresponding
  fix landed upstream through MM tree last week (commit 2e6506e1c4ee:
  "mm/migrate: fix deadlock in migrate_pages_batch() on large folios"),
  therefore large folios can be safely allowed for compressed inodes and
  stress tests have been running on my fleet for over 20 days without
  any regression. Users have explicitly requested this for months, so
  let's allow large folios for EROFS full cases now for wider testing.

  Additionally, there is a fix which addresses invalid memory accesses
  on a failure path triggered by fault injection and two minor cleanups
  to simplify the codebase.

  Summary:

   - Allow large folios on compressed inodes

   - Fix invalid memory accesses if z_erofs_gbuf_growsize() partially
     fails

   - Two minor cleanups"

* tag 'erofs-for-6.11-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix out-of-bound access when z_erofs_gbuf_growsize() partially fails
  erofs: allow large folios for compressed files
  erofs: get rid of check_layout_compatibility()
  erofs: simplify readdir operation

14 months agonet: wwan: t7xx: PCIe reset rescan
Jinjian Song [Sat, 17 Aug 2024 08:33:55 +0000 (16:33 +0800)]
net: wwan: t7xx: PCIe reset rescan

The WWAN device is programmed to boot in normal mode or fastboot mode
when a device reset is triggered through an ACPI call or a fastboot
switch command. Maintain state machine synchronization and the reprobe
logic after a device reset.

A PCIe device reset can be triggered in several ways, e.g.:
 - fastboot: echo "fastboot_switching" > /sys/bus/pci/devices/${bdf}/t7xx_mode.
 - reset: echo "reset" > /sys/bus/pci/devices/${bdf}/t7xx_mode.
 - IRQ: the PCIe device asks the driver to reset it by raising an interrupt request.

Use pci_reset_function() as a generic way to reset the device, and save
and restore the PCIe configuration before and after the reset so that
the reprobe process works.

Suggestion from Bjorn:
Link: https://lore.kernel.org/all/20230127133034.GA1364550@bhelgaas/
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge tag '6.11-rc4-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Wed, 21 Aug 2024 02:03:07 +0000 (19:03 -0700)]
Merge tag '6.11-rc4-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - important reconnect fix

 - fix for memcpy issues on mount

 - two minor cleanup patches

* tag '6.11-rc4-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: Replace one-element arrays with flexible-array members
  ksmbd: fix spelling mistakes in documentation
  ksmbd: fix race condition between destroy_previous_session() and smb2 operations()
  ksmbd: Use unsafe_memcpy() for ntlm_negotiate

14 months agonet: dsa: b53: Use dev_err_probe()
Florian Fainelli [Tue, 20 Aug 2024 00:44:35 +0000 (17:44 -0700)]
net: dsa: b53: Use dev_err_probe()

Rather than print an error even when we get -EPROBE_DEFER, use
dev_err_probe() to filter out those messages.

Link: https://github.com/openwrt/openwrt/pull/11680
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240820004436.224603-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>