]> git.apps.os.sepia.ceph.com Git - ceph-client.git/log
ceph-client.git
2 years agonet: microchip: vcap: Extend vcap with lan966x
Horatiu Vultur [Fri, 25 Nov 2022 09:50:03 +0000 (10:50 +0100)]
net: microchip: vcap: Extend vcap with lan966x

Add the keysets, keys, actionsets and actions used by lan966x in IS2.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: microchip: vcap: Merge the vcap_ag_api_kunit.h into vcap_ag_api.h
Horatiu Vultur [Fri, 25 Nov 2022 09:50:02 +0000 (10:50 +0100)]
net: microchip: vcap: Merge the vcap_ag_api_kunit.h into vcap_ag_api.h

Currently there are 2 files that contain the keyfields, keys,
actionfields and actions. First file is used by the kunit while the
second one is used by VCAP api.
The header file that is used by kunit is just a super set of the of the
header file used by VCAP api.
Therefore not to have duplicate information in different files which is
also harder to maintain, create a single file that is used both by API
and by kunit.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'refactor-mtk_wed-code-to-introduce-ser-support'
Paolo Abeni [Tue, 29 Nov 2022 10:40:27 +0000 (11:40 +0100)]
Merge branch 'refactor-mtk_wed-code-to-introduce-ser-support'

Lorenzo Bianconi says:

====================
refactor mtk_wed code to introduce SER support

Refactor mtk_wed support in order to introduce proper integration for hw reset
between mtk_eth_soc/mtk_wed and mt76 drivers.
====================

Link: https://lore.kernel.org/r/cover.1669303154.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ethernet: mtk_wed: add reset to tx_ring_setup callback
Lorenzo Bianconi [Thu, 24 Nov 2022 15:22:55 +0000 (16:22 +0100)]
net: ethernet: mtk_wed: add reset to tx_ring_setup callback

Introduce reset parameter to mtk_wed_tx_ring_setup signature.
This is a preliminary patch to add Wireless Ethernet Dispatcher reset
support.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ethernet: mtk_wed: add mtk_wed_rx_reset routine
Lorenzo Bianconi [Thu, 24 Nov 2022 15:22:54 +0000 (16:22 +0100)]
net: ethernet: mtk_wed: add mtk_wed_rx_reset routine

Introduce mtk_wed_rx_reset routine in order to reset rx DMA for Wireless
Ethernet Dispatcher available on MT7986 SoC.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ethernet: mtk_wed: update mtk_wed_stop
Lorenzo Bianconi [Thu, 24 Nov 2022 15:22:53 +0000 (16:22 +0100)]
net: ethernet: mtk_wed: update mtk_wed_stop

Update mtk_wed_stop routine and rename old mtk_wed_stop() to
mtk_wed_deinit(). This is a preliminary patch to add Wireless Ethernet
Dispatcher reset support.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ethernet: mtk_wed: move MTK_WDMA_RESET_IDX_TX configuration in mtk_wdma_tx_reset
Lorenzo Bianconi [Thu, 24 Nov 2022 15:22:52 +0000 (16:22 +0100)]
net: ethernet: mtk_wed: move MTK_WDMA_RESET_IDX_TX configuration in mtk_wdma_tx_reset

Remove duplicated code. Increase poll timeout to 10ms in order to be
aligned with vendor sdk.
This is a preliminary patch to add Wireless Ethernet Dispatcher reset
support.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ethernet: mtk_wed: return status value in mtk_wdma_rx_reset
Lorenzo Bianconi [Thu, 24 Nov 2022 15:22:51 +0000 (16:22 +0100)]
net: ethernet: mtk_wed: return status value in mtk_wdma_rx_reset

Move MTK_WDMA_RESET_IDX configuration in mtk_wdma_rx_reset routine.
Increase poll timeout to 10ms in order to be aligned with vendor sdk.
This is a preliminary patch to add Wireless Ethernet Dispatcher reset
support.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'marvell-nvmem-mac-addresses-support'
Paolo Abeni [Tue, 29 Nov 2022 09:46:41 +0000 (10:46 +0100)]
Merge branch 'marvell-nvmem-mac-addresses-support'

Miquel Raynal says:

====================
Marvell nvmem mac addresses support

Now that we are aligned on how to make information available from static
storage media to drivers like Ethernet controller drivers or switch
drivers by using nvmem cells and going through the whole nvmem
infrastructure, here are two driver updates to reflect these changes.

Prior to the driver updates, I propose:
* Reverting binding changes which should have never been accepted like
  that.
* A conversion of the (old) Prestera and DFX server bindings (optional,
  can be dropped if not considered necessary).
* A better description of the more recent Prestera PCI switch.

Please mind that this series cannot break anything since retrieving the
MAC address Prestera driver has never worked upstream, because the (ONIE
tlv) driver supposed to export the MAC address has not been accepted in
its original form and has been updated to the nvmem-layout
infrastructure (bindings have been merged, the code remains to be
applied).
====================

Link: https://lore.kernel.org/r/20221124111556.264647-1-miquel.raynal@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mvpp2: Consider NVMEM cells as possible MAC address source
Miquel Raynal [Thu, 24 Nov 2022 11:15:56 +0000 (12:15 +0100)]
net: mvpp2: Consider NVMEM cells as possible MAC address source

The ONIE standard describes the organization of tlv (type-length-value)
arrays commonly stored within NVMEM devices on common networking
hardware.

Several drivers already make use of NVMEM cells for purposes like
retrieving a default MAC address provided by the manufacturer.

What made ONIE tables unusable so far was the fact that the information
where "dynamically" located within the table depending on the
manufacturer wishes, while Linux NVMEM support only allowed statically
defined NVMEM cells. Fortunately, this limitation was eventually tackled
with the introduction of discoverable cells through the use of NVMEM
layouts, making it possible to extract and consistently use the content
of tables like ONIE's tlv arrays.

Parsing this table at runtime in order to get various information is now
possible. So, because many Marvell networking switches already follow
this standard, let's consider using NVMEM cells as a new valid source of
information when looking for a base MAC address, which is one of the
primary uses of these new fields. Indeed, manufacturers following the
ONIE standard are encouraged to provide a default MAC address there, so
let's eventually use it if no other MAC address has been found using the
existing methods.

Link: https://opencomputeproject.github.io/onie/design-spec/hw_requirements.html
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: marvell: prestera: Avoid unnecessary DT lookups
Miquel Raynal [Thu, 24 Nov 2022 11:15:55 +0000 (12:15 +0100)]
net: marvell: prestera: Avoid unnecessary DT lookups

This driver fist makes an expensive DT lookup to retrieve its DT node
(this is a PCI driver) in order to later search for the
base-mac-provider property. This property has no reality upstream and
this code should not have been accepted like this in the first
place. Instead, there is a proper nvmem interface that should be
used. Let's avoid these extra lookups and rely on the nvmem internal
logic.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoof: net: export of_get_mac_address_nvmem()
Miquel Raynal [Thu, 24 Nov 2022 11:15:54 +0000 (12:15 +0100)]
of: net: export of_get_mac_address_nvmem()

Export

of_get_mac_addr_nvmem()

and rename it to

of_get_mac_address_nvmem()

in order to fit the convention followed by the existing exported helpers
of the same kind.

This way, OF compatible drivers using eg. fwnode_get_mac_address() can
do a direct call to it instead of calling of_get_mac_address() just for
the nvmem step, avoiding to repeat an expensive DT lookup which has
already been done once.

Eventually, fwnode_get_mac_address() should probably be updated to
perform the nvmem lookup directly, but as of today, nvmem cells seem not
to be supported by ACPI yet which would defeat this kind of extension.

Suggested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agodt-bindings: net: marvell,prestera: Describe PCI devices of the prestera family
Miquel Raynal [Thu, 24 Nov 2022 11:15:53 +0000 (12:15 +0100)]
dt-bindings: net: marvell,prestera: Describe PCI devices of the prestera family

Even though the devices have very little in common beside the name and
the main "switch" feature, Marvell Prestera switch family is also
composed of PCI-only devices which can receive additional static
properties, like nvmem cells to point at MAC addresses, for
instance. Let's describe them.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agodt-bindings: net: marvell,prestera: Convert to yaml
Miquel Raynal [Thu, 24 Nov 2022 11:15:52 +0000 (12:15 +0100)]
dt-bindings: net: marvell,prestera: Convert to yaml

The currently described switch family is named AlleyCat3, it is a memory
mapped switch found on Armada XP boards.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agodt-bindings: net: marvell,dfx-server: Convert to yaml
Miquel Raynal [Thu, 24 Nov 2022 11:15:51 +0000 (12:15 +0100)]
dt-bindings: net: marvell,dfx-server: Convert to yaml

Even though this description is not used anywhere upstream (no matching
driver), while on this file I decided I would try a conversion to yaml
in order to clarify the prestera family description.

I cannot keep the nodename dfx-server@xxxx so I switched to dfx-bus@xxxx
which matches simple-bus.yaml. Otherwise I took the example context from
the only user of this compatible: armada-xp-98dx3236.dtsi, which is a
rather old and not perfect DT.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoRevert "dt-bindings: marvell,prestera: Add description for device-tree bindings"
Miquel Raynal [Thu, 24 Nov 2022 11:15:50 +0000 (12:15 +0100)]
Revert "dt-bindings: marvell,prestera: Add description for device-tree bindings"

This reverts commit 40acc05271abc2852c32622edbebd75698736b9b.

marvell,prestera.txt is an old file describing the old Alleycat3
standalone switches. The commit mentioned above actually hacked these
bindings to add support for a device tree property for a more modern
version of the IP connected over PCI, using only the generic compatible
in order to retrieve the device node from the prestera driver to read
one static property.

The problematic property discussed here is "base-mac-provider". The
original intent was to point to a nvmem device which could produce the
relevant nvmem-cell. This property has never been acked by DT
maintainers and fails all the layering that has been brought with the nvmem
bindings by pointing at a nvmem producer, bypassing the existing nvmem
bindings, rather than a nvmem cell directly. Furthermore, the property
cannot even be used upstream because it expected the ONIE tlv driver to
produce a specific cell, driver which used nacked bindings and thus was
never merged, replaced by a more integrated concept: the nvmem-layout.

So let's forget about this temporary addition, safely avoiding the need
for any backward compatibility handling. A new (yaml) binding file will
be brought with the prestera bindings, and there we will actually
include a description of the modern IP over PCI, including the right way
to point to a nvmem cell.

Cc: Vadym Kochan <vadym.kochan@plvision.eu>
Cc: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoDaniel Borkmann says:
Jakub Kicinski [Tue, 29 Nov 2022 01:14:01 +0000 (17:14 -0800)]
Daniel Borkmann says:

====================
bpf-next 2022-11-25

We've added 101 non-merge commits during the last 11 day(s) which contain
a total of 109 files changed, 8827 insertions(+), 1129 deletions(-).

The main changes are:

1) Support for user defined BPF objects: the use case is to allocate own
   objects, build own object hierarchies and use the building blocks to
   build own data structures flexibly, for example, linked lists in BPF,
   from Kumar Kartikeya Dwivedi.

2) Add bpf_rcu_read_{,un}lock() support for sleepable programs,
   from Yonghong Song.

3) Add support storing struct task_struct objects as kptrs in maps,
   from David Vernet.

4) Batch of BPF map documentation improvements, from Maryam Tahhan
   and Donald Hunter.

5) Improve BPF verifier to propagate nullness information for branches
   of register to register comparisons, from Eduard Zingerman.

6) Fix cgroup BPF iter infra to hold reference on the start cgroup,
   from Hou Tao.

7) Fix BPF verifier to not mark fentry/fexit program arguments as trusted
   given it is not the case for them, from Alexei Starovoitov.

8) Improve BPF verifier's realloc handling to better play along with dynamic
   runtime analysis tools like KASAN and friends, from Kees Cook.

9) Remove legacy libbpf mode support from bpftool,
   from Sahid Orentino Ferdjaoui.

10) Rework zero-len skb redirection checks to avoid potentially breaking
    existing BPF test infra users, from Stanislav Fomichev.

11) Two small refactorings which are independent and have been split out
    of the XDP queueing RFC series, from Toke Høiland-Jørgensen.

12) Fix a memory leak in LSM cgroup BPF selftest, from Wang Yufen.

13) Documentation on how to run BPF CI without patch submission,
    from Daniel Müller.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================

Link: https://lore.kernel.org/r/20221125012450.441-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "net: stmmac: use sysfs_streq() instead of strncmp()"
Vladimir Oltean [Fri, 25 Nov 2022 10:53:04 +0000 (12:53 +0200)]
Revert "net: stmmac: use sysfs_streq() instead of strncmp()"

This reverts commit f72cd76b05ea1ce9258484e8127932d0ea928f22.
This patch is so broken, it hurts. Apparently no one reviewed it and it
passed the build testing (because the code was compiled out), but it was
obviously never compile-tested, since it produces the following build
error, due to an incomplete conversion where an extra argument was left,
although the function being called was left:

stmmac_main.c: In function ‘stmmac_cmdline_opt’:
stmmac_main.c:7586:28: error: too many arguments to function ‘sysfs_streq’
 7586 |                 } else if (sysfs_streq(opt, "pause:", 6)) {
      |                            ^~~~~~~~~~~
In file included from ../include/linux/bitmap.h:11,
                 from ../include/linux/cpumask.h:12,
                 from ../include/linux/smp.h:13,
                 from ../include/linux/lockdep.h:14,
                 from ../include/linux/mutex.h:17,
                 from ../include/linux/notifier.h:14,
                 from ../include/linux/clk.h:14,
                 from ../drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:17:
../include/linux/string.h:185:13: note: declared here
  185 | extern bool sysfs_streq(const char *s1, const char *s2);
      |             ^~~~~~~~~~~

What's even worse is that the patch is flat out wrong. The stmmac_cmdline_opt()
function does not parse sysfs input, but cmdline input such as
"stmmaceth=tc:1,pause:1". The pattern of using strsep() followed by
strncmp() for such strings is not unique to stmmac, it can also be found
mainly in drivers under drivers/video/fbdev/.

With strncmp("tc:", 3), the code matches on the "tc:1" token properly.
With sysfs_streq("tc:"), it doesn't.

Fixes: f72cd76b05ea ("net: stmmac: use sysfs_streq() instead of strncmp()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20221125105304.3012153-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: usb: cdc_ether: add u-blox 0x1343 composition
Davide Tronchin [Thu, 24 Nov 2022 11:28:11 +0000 (12:28 +0100)]
net: usb: cdc_ether: add u-blox 0x1343 composition

Add CDC-ECM support for LARA-L6.

LARA-L6 module can be configured (by AT interface) in three different
USB modes:
* Default mode (Vendor ID: 0x1546 Product ID: 0x1341) with 4 serial
interfaces
* RmNet mode (Vendor ID: 0x1546 Product ID: 0x1342) with 4 serial
interfaces and 1 RmNet virtual network interface
* CDC-ECM mode (Vendor ID: 0x1546 Product ID: 0x1343) with 4 serial
interface and 1 CDC-ECM virtual network interface

In CDC-ECM mode LARA-L6 exposes the following interfaces:
If 0: Diagnostic
If 1: AT parser
If 2: AT parser
If 3: AT parset/alternative functions
If 4: CDC-ECM interface

Signed-off-by: Davide Tronchin <davide.tronchin.94@gmail.com>
Link: https://lore.kernel.org/r/20221124112811.3548-1-davide.tronchin.94@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoocteontx2-pf: Add support to filter packet based on IP fragment
Suman Ghosh [Thu, 24 Nov 2022 06:35:48 +0000 (12:05 +0530)]
octeontx2-pf: Add support to filter packet based on IP fragment

1. Added support to filter packets based on IP fragment.
For IPv4 packets check for ip_flag == 0x20 (more fragment bit set).
For IPv6 packets check for next_header == 0x2c (next_header set to
'fragment header for IPv6')
2. Added configuration support from both "ethtool ntuple" and "tc flower".

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: mtk_wed: add wcid overwritten support for wed v1
Sujuan Chen [Thu, 24 Nov 2022 03:18:14 +0000 (11:18 +0800)]
net: ethernet: mtk_wed: add wcid overwritten support for wed v1

All wed versions should enable the wcid overwritten feature,
since the wcid size is controlled by the wlan driver.

Tested-by: Sujuan Chen <sujuan.chen@mediatek.com>
Co-developed-by: Bo Jiao <bo.jiao@mediatek.com>
Signed-off-by: Bo Jiao <bo.jiao@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
David S. Miller [Fri, 25 Nov 2022 10:46:22 +0000 (10:46 +0000)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-11-23 (ice)

This series contains updates to ice driver only.

Karol adjusts check of PTP hardware to wait longer but check more often.

Brett removes use of driver defined link speed; instead using the values
from ethtool.h, utilizing static tables for indexing.

Ben adds tracking of stats in order to accumulate reported statistics that
were previously reset by hardware.

Marcin fixes issues setting RXDID when queues are asymmetric.

Anatolii re-introduces use of define over magic number; ICE_RLAN_BASE_S.
---
v3:
 - Dropped, previous, patch 2
v2:
Patch 5
 - Convert some allocations to non-managed
 - Remove combined error checking; add error checks for each call
 - Remove excess NULL checks
 - Remove unnecessary NULL sets and newlines
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-remove-kmap_atomic'
David S. Miller [Fri, 25 Nov 2022 10:44:01 +0000 (10:44 +0000)]
Merge branch 'net-remove-kmap_atomic'

Anirudh Venkataramanan says:

====================
net: Remove uses of kmap_atomic()

kmap_atomic() is being deprecated. This little series replaces the last
few uses of kmap_atomic() in the networking subsystem.

This series triggered a suggestion [1] that perhaps the Sun Cassini,
LDOM Virtual Switch Driver and the LDOM virtual network drivers should be
removed completely. I plan to do this in a follow up patchset. For
completeness, this series still includes kmap_atomic() conversions that
apply to the above referenced drivers. If for some reason we choose to not
remove these drivers, at least they won't be using kmap_atomic() anymore.

Also, the following maintainer entries for the Chelsio driver seem to be
defunct:

  Vinay Kumar Yadav <vinay.yadav@chelsio.com>
  Rohit Maheshwari <rohitm@chelsio.com>

I can submit a follow up patch to remove these entries, but thought
maybe the folks over at Chelsio would want to look into this first.

Changes v1 -> v2:
  Use memcpy_from_page() in patches 2/6 and 4/6
  Add new patch for the thunderbolt driver
  Update commit messages and cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: thunderbolt: Use kmap_local_page() instead of kmap_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:19 +0000 (12:52 -0800)]
net: thunderbolt: Use kmap_local_page() instead of kmap_atomic()

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
kmap_atomic() and kunmap_atomic() with kmap_local_page() and kunmap_local()
respectively.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just kmap_local_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used as opposed to page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Yehezkel Bernat <YehezkelShB@gmail.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosunvnet: Use kmap_local_page() instead of kmap_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:18 +0000 (12:52 -0800)]
sunvnet: Use kmap_local_page() instead of kmap_atomic()

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
kmap_atomic() and kunmap_atomic() with kmap_local_page() and kunmap_local()
respectively.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just kmap_local_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used as opposed to page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agocassini: Use memcpy_from_page() instead of k[un]map_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:17 +0000 (12:52 -0800)]
cassini: Use memcpy_from_page() instead of k[un]map_atomic()

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
the map-memcpy-unmap usage pattern (done using k[un]map_atomic()) with
memcpy_from_page(), which internally uses kmap_local_page() and
kunmap_local(). This renders the variable 'vaddr' unnecessary, and so
remove this too.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just memcpy_from_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used (via memcpy_from_page()) as opposed to
page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agocassini: Use page_address() instead of kmap_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:16 +0000 (12:52 -0800)]
cassini: Use page_address() instead of kmap_atomic()

Pages for Rx buffers are allocated in cas_page_alloc() using either
GFP_ATOMIC or GFP_KERNEL. Memory allocated with GFP_KERNEL/GFP_ATOMIC can't
come from highmem and so there's no need to kmap() them. Just use
page_address() instead. This makes the variable 'addr' unnecessary, so
remove it too.

Note that kmap_atomic() disables preemption and page-fault processing,
but page_address() doesn't. When removing uses of kmap_atomic(), one has to
check if the code being executed between the map/unmap implicitly depends
on page-faults and/or preemption being disabled. If yes, then code to
disable page-faults and/or preemption should also be added for functional
correctness. That however doesn't appear to be the case here, so just
page_address() is used.

I don't have hardware, so this change has only been compile tested.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosfc: Use kmap_local_page() instead of kmap_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:15 +0000 (12:52 -0800)]
sfc: Use kmap_local_page() instead of kmap_atomic()

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
kmap_atomic() and kunmap_atomic() with kmap_local_page() and kunmap_local()
respectively.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just kmap_local_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used as opposed to page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoch_ktls: Use memcpy_from_page() instead of k[un]map_atomic()
Anirudh Venkataramanan [Wed, 23 Nov 2022 20:52:14 +0000 (12:52 -0800)]
ch_ktls: Use memcpy_from_page() instead of k[un]map_atomic()

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
the map-memcpy-unmap usage pattern (done using k[un]map_atomic()) with
memcpy_from_page(), which internally uses kmap_local_page() and
kunmap_local(). This renders the variables 'data' and 'vaddr' unnecessary,
and so remove these too.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just memcpy_from_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used (via memcpy_from_page()) as opposed to
page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'lan966x-extend-xdp-support'
David S. Miller [Fri, 25 Nov 2022 10:38:10 +0000 (10:38 +0000)]
Merge branch 'lan966x-extend-xdp-support'

Horatiu Vultur says:

====================
net: lan966x: Extend xdp support

Extend the current support of XDP in lan966x with the action XDP_TX and
XDP_REDIRECT.
The first patches just prepare the things such that it would be easier
to add XDP_TX and XDP_REDIRECT actions. Like adding XDP_PACKET_HEADROOM,
introduce helper functions, use the correct dma_dir for the page pool
The last 2 patches introduce the XDP actions XDP_TX and XDP_REDIRECT.

v4->v5:
- add iterator declaration inside for loops
- move the scope of port inside the function lan966x_fdma_rx_alloc_page_pool
- create union for skb and xdpf inside struct lan966x_tx_dcb_buf

v3->v4:
- use napi_consume_skb instead of dev_kfree_skb_any
- arrange members in struct lan966x_tx_dcb_buf not to have holes
- fix when xdp program is added the check for determining if page pool
  needs to be recreated was wrong
- change type for len in lan966x_tx_dcb_buf to u32

v2->v3:
- make sure to update rxq memory model
- update the page pool direction if there is any xdp program
- in case of action XDP_TX give back to reuse the page
- in case of action XDP_REDIRECT, remap the frame and make sure to
  unmap it when is transmitted.

v1->v2:
- use skb_reserve of using skb_put and skb_pull
- make sure that data_len doesn't include XDP_PACKET_HEADROOM
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add support for XDP_REDIRECT
Horatiu Vultur [Wed, 23 Nov 2022 20:31:39 +0000 (21:31 +0100)]
net: lan966x: Add support for XDP_REDIRECT

Extend lan966x XDP support with the action XDP_REDIRECT. This is similar
with the XDP_TX, so a lot of functionality can be reused.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add support for XDP_TX
Horatiu Vultur [Wed, 23 Nov 2022 20:31:38 +0000 (21:31 +0100)]
net: lan966x: Add support for XDP_TX

Extend lan966x XDP support with the action XDP_TX. In this case when the
received buffer needs to execute XDP_TX, the buffer will be moved to the
TX buffers. So a new RX buffer will be allocated.
When the TX finish with the frame, it would give back the buffer to the
page pool.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Update dma_dir of page_pool_params
Horatiu Vultur [Wed, 23 Nov 2022 20:31:37 +0000 (21:31 +0100)]
net: lan966x: Update dma_dir of page_pool_params

To add support for XDP_TX it is required to be able to write to the DMA
area therefore it is required that the pages will be mapped using
DMA_BIDIRECTIONAL flag.
Therefore check if there are any xdp programs on the interfaces and in
that case set DMA_BIDRECTIONAL otherwise use DMA_FROM_DEVICE.
Therefore when a new XDP program is added it is required to redo the
page_pool.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Update rxq memory model
Horatiu Vultur [Wed, 23 Nov 2022 20:31:36 +0000 (21:31 +0100)]
net: lan966x: Update rxq memory model

By default the rxq memory model is MEM_TYPE_PAGE_SHARED but to be able
to reuse pages on the TX side, when the XDP action XDP_TX it is required
to update the memory model to PAGE_POOL.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add len field to lan966x_tx_dcb_buf
Horatiu Vultur [Wed, 23 Nov 2022 20:31:35 +0000 (21:31 +0100)]
net: lan966x: Add len field to lan966x_tx_dcb_buf

Currently when a frame was transmitted, it is required to unamp the
frame that was transmitted. The length of the frame was taken from the
transmitted skb. In the future we might not have an skb, therefore store
the length skb directly in the lan966x_tx_dcb_buf and use this one to
unamp the frame.
While at this, also arrange the members in lan966x_tx_dcb_buf not to
have any holes.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Introduce helper functions
Horatiu Vultur [Wed, 23 Nov 2022 20:31:34 +0000 (21:31 +0100)]
net: lan966x: Introduce helper functions

Introduce lan966x_fdma_tx_setup_dcb and lan966x_fdma_tx_start functions
and use of them inside lan966x_fdma_xmit. There is no functional change
in here.
They are introduced to be used when XDP_TX/REDIRECT actions are
introduced.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add XDP_PACKET_HEADROOM
Horatiu Vultur [Wed, 23 Nov 2022 20:31:33 +0000 (21:31 +0100)]
net: lan966x: Add XDP_PACKET_HEADROOM

Update the page_pool params to allocate XDP_PACKET_HEADROOM space as
headroom for all received frames.
This is needed for when the XDP_TX and XDP_REDIRECT are implemented.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoptp: idt82p33: remove PEROUT_ENABLE_OUTPUT_MASK
Min Li [Wed, 23 Nov 2022 19:52:07 +0000 (14:52 -0500)]
ptp: idt82p33: remove PEROUT_ENABLE_OUTPUT_MASK

PEROUT_ENABLE_OUTPUT_MASK was there to allow us to enable/disable
all the perout pins. But it is not standard procedure, we will
have to discard it.

Signed-off-by: Min Li <min.li.xe@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoptp: idt82p33: Add PTP_CLK_REQ_EXTTS support
Min Li [Wed, 23 Nov 2022 19:52:06 +0000 (14:52 -0500)]
ptp: idt82p33: Add PTP_CLK_REQ_EXTTS support

82P33 family of chips can trigger TOD read/write by external
signal from one of the IN12/13/14 pins, which are set user
space programs by calling PTP_PIN_SETFUNC through ptp_ioctl

Signed-off-by: Min Li <min.li.xe@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'sparx5-tc-protocol-all'
David S. Miller [Fri, 25 Nov 2022 09:42:14 +0000 (09:42 +0000)]
Merge branch 'sparx5-tc-protocol-all'

Steen Hegelund says:

====================
net: TC protocol all support in Sparx5 IS2 VCAP

This provides support for the TC flower filters 'protocol all' clause in
the Sparx5 IS2 VCAP.

It builds on top of the initial IS2 VCAP support found in these series:

https://lore.kernel.org/all/20221020130904.1215072-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221109114116.3612477-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221111130519.1459549-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221117213114.699375-1-steen.hegelund@microchip.com/

Functionality:
==============

As the configuration for the Sparx5 IS2 VCAP consists of one (or more)
keyset(s) for each lookup/port per traffic classification, it is not
always possible to cover all protocols with just one ordinary VCAP rule.

To improve this situation the driver will try to find out what keysets a
rule will need to cover a TC flower "protocol all" filter and then compare
this set of keysets to what the hardware is currently configured for.

In case multiple keysets are needed then the driver can create a rule per
rule size (e.g. X6 and X12) and use a mask on the keyset type field to
allow the VCAP to match more than one keyset with just one rule.

This is possible because the keysets that have the same size typically has
many keys in common, so the VCAP rule keys can make a common match.

The result is that one TC filter command may create multiple IS2 VCAP rules
of different sizes that have a type field with a masked type id.

Delivery:
=========

This is current plan for delivering the full VCAP feature set of Sparx5:

- Sparx5 IS0 VCAP support
- TC policer and drop action support (depends on the Sparx5 QoS support
  upstreamed separately)
- Sparx5 ES0 VCAP support
- TC flower template support
- TC matchall filter support for mirroring and policing ports
- TC flower filter mirror action support
- Sparx5 ES2 VCAP support

Version History:
================
v2      Fixed a NULL return value compiler warning.
        Moved the new vcap_find_actionfield function a bit up in the file.

v1      Initial version
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP filter keys KUNIT test
Steen Hegelund [Wed, 23 Nov 2022 15:25:45 +0000 (16:25 +0100)]
net: microchip: sparx5: Add VCAP filter keys KUNIT test

This tests the filtering of keys, either dropping unsupported keys or
dropping keys specified in a list.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Support for displaying a list of keysets
Steen Hegelund [Wed, 23 Nov 2022 15:25:44 +0000 (16:25 +0100)]
net: microchip: sparx5: Support for displaying a list of keysets

This will display a list of keyset in case the type_id field in the VCAP
rule has been wildcarded.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Support for TC protocol all
Steen Hegelund [Wed, 23 Nov 2022 15:25:43 +0000 (16:25 +0100)]
net: microchip: sparx5: Support for TC protocol all

This allows support of TC protocol all for the Sparx5 IS2 VCAP.

This is done by creating multiple rules that covers the rule size and
traffic types in the IS2.
Each rule size (e.g X16 and X6) may have multiple keysets and if there are
more than one the type field in the VCAP rule will be wildcarded to support
these keysets.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Support for copying and modifying rules in the API
Steen Hegelund [Wed, 23 Nov 2022 15:25:42 +0000 (16:25 +0100)]
net: microchip: sparx5: Support for copying and modifying rules in the API

This adds support for making a copy of a rule and modify keys and actions
to differentiate the copy.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agolib/test_rhashtable: Remove set but unused variable 'insert_retries'
Jiapeng Chong [Wed, 23 Nov 2022 09:37:02 +0000 (17:37 +0800)]
lib/test_rhashtable: Remove set but unused variable 'insert_retries'

Variable 'insert_retries' is not effectively used in the function, so
delete it.

lib/test_rhashtable.c:437:18: warning: variable 'insert_retries' set but not used.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3242
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: stmmac: use sysfs_streq() instead of strncmp()
Xu Panda [Tue, 22 Nov 2022 12:09:23 +0000 (20:09 +0800)]
net: stmmac: use sysfs_streq() instead of strncmp()

Replace the open-code with sysfs_streq().

Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: add Motorcomm YT8531S phy id.
Frank [Tue, 22 Nov 2022 08:42:32 +0000 (16:42 +0800)]
net: phy: add Motorcomm YT8531S phy id.

We added patch for motorcomm.c to support YT8531S. This patch has
been tested on AM335x platform which has one YT8531S interface
card and passed all test cases.
The tested cases indluding: YT8531S UTP function with support of
10M/100M/1000M; YT8531S Fiber function with support of 100M/1000M;
and YT8531S Combo function that supports auto detection of media type.

Since most functions of YT8531S are similar to YT8521 and we reuse some
codes for YT8521 in the patch file.

Signed-off-by: Frank <Frank.Sae@motor-comm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs/bpf: Add BPF_MAP_TYPE_XSKMAP documentation
Maryam Tahhan [Wed, 23 Nov 2022 09:00:43 +0000 (09:00 +0000)]
docs/bpf: Add BPF_MAP_TYPE_XSKMAP documentation

Add documentation for BPF_MAP_TYPE_XSKMAP including kernel version introduced,
usage and examples.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221123090043.83945-1-mtahhan@redhat.com
2 years agosamples/bpf: Fix wrong allocation size in xdp_router_ipv4_user
Rong Tao [Tue, 22 Nov 2022 02:32:56 +0000 (10:32 +0800)]
samples/bpf: Fix wrong allocation size in xdp_router_ipv4_user

prefix_key->data allocates three bytes using alloca(), but four bytes are
actually accessed in the program.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/tencent_F9E2E81922B0C181D05B96DAE5AB0ACE6B06@qq.com
2 years agodocs/bpf: Update btf selftests program and add link
Rong Tao [Tue, 22 Nov 2022 00:50:42 +0000 (08:50 +0800)]
docs/bpf: Update btf selftests program and add link

Commit c64779e24e88("selftests/bpf: Merge most of test_btf into test_progs")
renamed the BTF selftest from 'test_btf.c' to 'prog_tests/btf.c'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/tencent_1FA6904156E8E599CAE4ABDBE80F22830106@qq.com
2 years agobpf: Don't mark arguments to fentry/fexit programs as trusted.
Alexei Starovoitov [Thu, 24 Nov 2022 21:53:14 +0000 (13:53 -0800)]
bpf: Don't mark arguments to fentry/fexit programs as trusted.

The PTR_TRUSTED flag should only be applied to pointers where the verifier can
guarantee that such pointers are valid.
The fentry/fexit/fmod_ret programs are not in this category.
Only arguments of SEC("tp_btf") and SEC("iter") programs are trusted
(which have BPF_TRACE_RAW_TP and BPF_TRACE_ITER attach_type correspondingly)

This bug was masked because convert_ctx_accesses() was converting trusted
loads into BPF_PROBE_MEM loads. Fix it as well.
The loads from trusted pointers don't need exception handling.

Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221124215314.55890-1-alexei.starovoitov@gmail.com
2 years agoMerge branch 'bpf: Add bpf_rcu_read_lock() support'
Alexei Starovoitov [Thu, 24 Nov 2022 20:27:13 +0000 (12:27 -0800)]
Merge branch 'bpf: Add bpf_rcu_read_lock() support'

Yonghong Song says:

====================

Currently, without rcu attribute info in BTF, the verifier treats
rcu tagged pointer as a normal pointer. This might be a problem
for sleepable program where rcu_read_lock()/unlock() is not available.
For example, for a sleepable fentry program, if rcu protected memory
access is interleaved with a sleepable helper/kfunc, it is possible
the memory access after the sleepable helper/kfunc might be invalid
since the object might have been freed then. Even without
a sleepable helper/kfunc, without rcu_read_lock() protection,
it is possible that the rcu protected object might be release
in the middle of bpf program execution which may cause incorrect
result.

To prevent above cases, enable btf_type_tag("rcu") attributes,
introduce new bpf_rcu_read_lock/unlock() kfuncs and add verifier support.

In the rest of patch set, Patch 1 enabled btf_type_tag for __rcu
attribute. Patche 2 added might_sleep in bpf_func_proto. Patch 3 added new
bpf_rcu_read_lock/unlock() kfuncs and verifier support.
Patch 4 added some tests for these two new kfuncs.

Changelogs:
  v9 -> v10:
    . if no rcu tag support in vmlinux btf, using bpf_rcu_read_lock/unlock()
      will cause verification error.
    . at bpf_rcu_read_unlock(), invalidate rcu ptr to PTR_UNTRUSTED
      instead of SCALAR_VALUE.
    . a few other comment changes and other minor changes.
  v8 -> v9:
    . remove sleepable prog check for ld_abs/ind checking in rcu read
      lock region.
    . fix a test failure with gcc-compiled kernel.
    . a couple of other minor fixes.
  v7 -> v8:
    . add might_sleep in bpf_func_proto so we can easily identify whether
      a helper is sleepable or not.
    . do not enforce rcu rules for sleepable, e.g., rcu dereference must
      be in a bpf_rcu_read_lock region. This is to keep old code working
      fine.
    . Mark 'b' in 'b = a->b' (b is tagged with __rcu) as MEM_RCU only if
      'b = a->b' in rcu read region and 'a' is trusted. This adds safety
      guarantee for 'b' inside the rcu read region.
  v6 -> v7:
    . rebase on top of bpf-next.
    . remove the patch which enables sleepable program using
      cgrp_local_storage map. This is orthogonal to this patch set
      and will be addressed separately.
    . mark the rcu pointer dereference result as UNTRUSTED if inside
      a bpf_rcu_read_lock() region.
  v5 -> v6:
    . fix selftest prog miss_unlock which tested nested locking.
    . add comments in selftest prog cgrp_succ to explain how to handle
      nested memory access after rcu memory load.
  v4 -> v5:
    . add new test to aarch64 deny list.
  v3 -> v4:
    . fix selftest failures when built with gcc. gcc doesn't support
      btf_type_tag yet and some tests relies on that. skip these
      tests if vmlinux BTF does not have btf_type_tag("rcu").
  v2 -> v3:
    . went back to MEM_RCU approach with invalidate rcu ptr registers
      at bpf_rcu_read_unlock() place.
    . remove KF_RCU_LOCK/UNLOCK flag and compare btf_id at verification
      time instead.
  v1 -> v2:
    . use kfunc instead of helper for bpf_rcu_read_lock/unlock.
    . not use MEM_RCU bpf_type_flag, instead use active_rcu_lock
      in reg state to identify rcu ptr's.
    . Add more self tests.
    . add new test to s390x deny list.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add tests for bpf_rcu_read_lock()
Yonghong Song [Thu, 24 Nov 2022 05:32:22 +0000 (21:32 -0800)]
selftests/bpf: Add tests for bpf_rcu_read_lock()

Add a few positive/negative tests to test bpf_rcu_read_lock()
and its corresponding verifier support. The new test will fail
on s390x and aarch64, so an entry is added to each of their
respective deny lists.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053222.2374650-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Add kfunc bpf_rcu_read_lock/unlock()
Yonghong Song [Thu, 24 Nov 2022 05:32:17 +0000 (21:32 -0800)]
bpf: Add kfunc bpf_rcu_read_lock/unlock()

Add two kfunc's bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two kfunc's
can be used for all program types. The following is an example about how
rcu pointer are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock().

  struct task_struct {
    ...
    struct task_struct              *last_wakee;
    struct task_struct __rcu        *real_parent;
    ...
  };

Let us say prog does 'task = bpf_get_current_task_btf()' to get a
'task' pointer. The basic rules are:
  - 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock
    region. This is to simulate rcu_dereference() operation. The
    'real_parent' is marked as MEM_RCU only if (1). task->real_parent is
    inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So
    MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region.
  - 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock
    region since it tries to access rcu protected memory.
  - the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general
    it is not clear whether the object pointed by 'last_wakee' is valid or
    not even inside bpf_rcu_read_lock region.

The verifier will reset all rcu pointer register states to untrusted
at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer
won't be trusted any more outside the bpf_rcu_read_lock() region.

The current implementation does not support nested rcu read lock
region in the prog.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Introduce might_sleep field in bpf_func_proto
Yonghong Song [Thu, 24 Nov 2022 05:32:11 +0000 (21:32 -0800)]
bpf: Introduce might_sleep field in bpf_func_proto

Introduce bpf_func_proto->might_sleep to indicate a particular helper
might sleep. This will make later check whether a helper might be
sleepable or not easier.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053211.2373553-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agocompiler_types: Define __rcu as __attribute__((btf_type_tag("rcu")))
Yonghong Song [Thu, 24 Nov 2022 05:32:06 +0000 (21:32 -0800)]
compiler_types: Define __rcu as __attribute__((btf_type_tag("rcu")))

Currently, without rcu attribute info in BTF, the verifier treats
rcu tagged pointer as a normal pointer. This might be a problem
for sleepable program where rcu_read_lock()/unlock() is not available.
For example, for a sleepable fentry program, if rcu protected memory
access is interleaved with a sleepable helper/kfunc, it is possible
the memory access after the sleepable helper/kfunc might be invalid
since the object might have been freed then. To prevent such cases,
introducing rcu tagging for memory accesses in verifier can help
to reject such programs.

To enable rcu tagging in BTF, during kernel compilation,
define __rcu as attribute btf_type_tag("rcu") so __rcu information can
be preserved in dwarf and btf, and later can be used for bpf prog verification.

Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221124053206.2373141-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agonet: use %pS for kfree_skb tracing event location
Stanislav Fomichev [Wed, 23 Nov 2022 04:09:47 +0000 (20:09 -0800)]
net: use %pS for kfree_skb tracing event location

For the cases where 'reason' doesn't give any clue, it's still
nice to be able to track the kfree_skb caller location. %p doesn't
help much so let's use %pS which prints the symbol+offset.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20221123040947.1015721-1-sdf@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'bonding-fix-bond-recovery-in-mode-2'
Jakub Kicinski [Thu, 24 Nov 2022 04:15:13 +0000 (20:15 -0800)]
Merge branch 'bonding-fix-bond-recovery-in-mode-2'

Jonathan Toppins says:

====================
bonding: fix bond recovery in mode 2

When a bond is configured with a non-zero updelay and in mode 2 the bond
never recovers after all slaves lose link. The first patch adds
selftests that demonstrate the issue and the second patch fixes the
issue by ignoring the updelay when there are no usable slaves.
====================

Link: https://lore.kernel.org/r/cover.1669147951.git.jtoppins@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobonding: fix link recovery in mode 2 when updelay is nonzero
Jonathan Toppins [Tue, 22 Nov 2022 21:24:29 +0000 (16:24 -0500)]
bonding: fix link recovery in mode 2 when updelay is nonzero

Before this change when a bond in mode 2 lost link, all of its slaves
lost link, the bonding device would never recover even after the
expiration of updelay. This change removes the updelay when the bond
currently has no usable links. Conforming to bonding.txt section 13.1
paragraph 4.

Fixes: 41f891004063 ("bonding: ignore updelay param when there is no active slave")
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: bonding: up/down delay w/ slave link flapping
Jonathan Toppins [Tue, 22 Nov 2022 20:25:04 +0000 (15:25 -0500)]
selftests: bonding: up/down delay w/ slave link flapping

Verify when a bond is configured with {up,down}delay and the link state
of slave members flaps if there are no remaining members up the bond
should immediately select a member to bring up. (from bonding.txt
section 13.1 paragraph 4)

Suggested-by: Liang Li <liali@redhat.com>
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoethtool: avoiding integer overflow in ethtool_phys_id()
Maxim Korotkov [Tue, 22 Nov 2022 12:29:01 +0000 (15:29 +0300)]
ethtool: avoiding integer overflow in ethtool_phys_id()

The value of an arithmetic expression "n * id.data" is subject
to possible overflow due to a failure to cast operands to a larger data
type before performing arithmetic. Used macro for multiplication instead
operator for avoiding overflow.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221122122901.22294-1-korotkov.maxim.s@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests/bpf: Add selftests for bpf_task_from_pid()
David Vernet [Tue, 22 Nov 2022 14:53:00 +0000 (08:53 -0600)]
selftests/bpf: Add selftests for bpf_task_from_pid()

Add some selftest testcases that validate the expected behavior of the
bpf_task_from_pid() kfunc that was added in the prior patch.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221122145300.251210-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Add bpf_task_from_pid() kfunc
David Vernet [Tue, 22 Nov 2022 14:52:59 +0000 (08:52 -0600)]
bpf: Add bpf_task_from_pid() kfunc

Callers can currently store tasks as kptrs using bpf_task_acquire(),
bpf_task_kptr_get(), and bpf_task_release(). These are useful if a
caller already has a struct task_struct *, but there may be some callers
who only have a pid, and want to look up the associated struct
task_struct * from that to e.g. find task->comm.

This patch therefore adds a new bpf_task_from_pid() kfunc which allows
BPF programs to get a struct task_struct * kptr from a pid.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221122145300.251210-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Unify and simplify btf_func_proto_check error handling
Stanislav Fomichev [Thu, 24 Nov 2022 00:28:38 +0000 (16:28 -0800)]
bpf: Unify and simplify btf_func_proto_check error handling

Replace 'err = x; break;' with 'return x;'.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221124002838.2700179-1-sdf@google.com
2 years agobpf: Update bpf_{g,s}etsockopt() documentation
Ji Rongfeng [Fri, 18 Nov 2022 08:18:18 +0000 (16:18 +0800)]
bpf: Update bpf_{g,s}etsockopt() documentation

* append missing optnames to the end
* simplify bpf_getsockopt()'s doc

Signed-off-by: Ji Rongfeng <SikoJobs@outlook.com>
Link: https://lore.kernel.org/r/DU0P192MB15479B86200B1216EC90E162D6099@DU0P192MB1547.EURP192.PROD.OUTLOOK.COM
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2 years agodocs/bpf: Fix sphinx warnings in BPF map docs
Donald Hunter [Tue, 22 Nov 2022 14:39:33 +0000 (14:39 +0000)]
docs/bpf: Fix sphinx warnings in BPF map docs

Fix duplicate C declaration warnings when using sphinx >= 3.1.

Reported-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
Link: https://lore.kernel.org/bpf/ed4dac84-1b12-5c58-e4de-93ab9ac67c09@gmail.com
Link: https://lore.kernel.org/bpf/20221122143933.91321-1-donald.hunter@gmail.com
2 years agoselftests/bpf: Add reproducer for decl_tag in func_proto argument
Stanislav Fomichev [Wed, 23 Nov 2022 03:54:21 +0000 (19:54 -0800)]
selftests/bpf: Add reproducer for decl_tag in func_proto argument

It should trigger a WARN_ON_ONCE in btf_type_id_size:

  RIP: 0010:btf_type_id_size+0x8bd/0x940 kernel/bpf/btf.c:1952
  btf_func_proto_check kernel/bpf/btf.c:4506 [inline]
  btf_check_all_types kernel/bpf/btf.c:4734 [inline]
  btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763
  btf_parse kernel/bpf/btf.c:5042 [inline]
  btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709
  bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342
  __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034
  __do_sys_bpf kernel/bpf/syscall.c:5093 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:5091 [inline]
  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091
  do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221123035422.872531-1-sdf@google.com
2 years agobpf: Prevent decl_tag from being referenced in func_proto arg
Stanislav Fomichev [Wed, 23 Nov 2022 03:54:22 +0000 (19:54 -0800)]
bpf: Prevent decl_tag from being referenced in func_proto arg

Syzkaller managed to hit another decl_tag issue:

  btf_func_proto_check kernel/bpf/btf.c:4506 [inline]
  btf_check_all_types kernel/bpf/btf.c:4734 [inline]
  btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763
  btf_parse kernel/bpf/btf.c:5042 [inline]
  btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709
  bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342
  __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034
  __do_sys_bpf kernel/bpf/syscall.c:5093 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:5091 [inline]
  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091
  do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48

This seems similar to commit ea68376c8bed ("bpf: prevent decl_tag from being
referenced in func_proto") but for the argument.

Reported-by: syzbot+8dd0551dda6020944c5d@syzkaller.appspotmail.com
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221123035422.872531-2-sdf@google.com
2 years agodocs/bpf: Document BPF_MAP_TYPE_BLOOM_FILTER
Donald Hunter [Wed, 23 Nov 2022 14:11:51 +0000 (14:11 +0000)]
docs/bpf: Document BPF_MAP_TYPE_BLOOM_FILTER

Add documentation for BPF_MAP_TYPE_BLOOM_FILTER including kernel
BPF helper usage, userspace usage and examples.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/bpf/20221123141151.54556-1-donald.hunter@gmail.com
2 years agodocs/bpf: Fix sphinx warnings for devmap
Maryam Tahhan [Wed, 23 Nov 2022 09:23:21 +0000 (09:23 +0000)]
docs/bpf: Fix sphinx warnings for devmap

Sphinx version >=3.1 warns about duplicate function declarations in the
DEVMAP documentation. This is because the function name is the same for
kernel and user space BPF progs but the parameters and return types
they take is what differs. This patch moves from using the ``c:function::``
directive to using the ``code-block:: c`` directive. The patches also fix
the indentation for the text associated with the "new" code block delcarations.
The missing support of c:namespace-push:: and c:namespace-pop:: directives by
helper scripts for kernel documentation prevents using the ``c:function::``
directive with proper namespacing.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221123092321.88558-3-mtahhan@redhat.com
2 years agodocs/bpf: Fix sphinx warnings for cpumap
Maryam Tahhan [Wed, 23 Nov 2022 09:23:20 +0000 (09:23 +0000)]
docs/bpf: Fix sphinx warnings for cpumap

Sphinx version >=3.1 warns about duplicate function declarations in the
CPUMAP documentation. This is because the function name is the same for
kernel and user space BPF progs but the parameters and return types
they take is what differs. This patch moves from using the ``c:function::``
directive to using the ``code-block:: c`` directive. The patches also fix
the indentation for the text associated with the "new" code block delcarations.
The missing support of c:namespace-push:: and c:namespace-pop:: directives by
helper scripts for kernel documentation prevents using the ``c:function::``
directive with proper namespacing.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221123092321.88558-2-mtahhan@redhat.com
2 years agodocs/bpf: Add table of BPF program types to libbpf docs
Donald Hunter [Mon, 21 Nov 2022 12:17:34 +0000 (12:17 +0000)]
docs/bpf: Add table of BPF program types to libbpf docs

Extend the libbpf documentation with a table of program types,
attach points and ELF section names.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20221121121734.98329-1-donald.hunter@gmail.com
2 years agobpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not set
Yonghong Song [Wed, 23 Nov 2022 15:57:59 +0000 (07:57 -0800)]
bpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not set

With CONFIG_DEBUG_INFO_BTF not set, we hit the following compilation error,
  /.../kernel/bpf/verifier.c:8196:23: error: array index 6 is past the end of the array
  (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
        if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
                             ^                  ~~~~~~~~~~~~~~~~~~~~~~~
  /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
  BTF_ID_LIST(special_kfunc_list)
  ^
  /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
  #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                            ^
  /.../kernel/bpf/verifier.c:8443:19: error: array index 5 is past the end of the array
  (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                 btf_id == special_kfunc_list[KF_bpf_list_pop_back];
                           ^                  ~~~~~~~~~~~~~~~~~~~~
  /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
  BTF_ID_LIST(special_kfunc_list)
  ^
  /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
  #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
  ...

Fix the problem by increase the size of BTF_ID_LIST to 16 to avoid compilation error
and also prevent potentially unintended issue due to out-of-bound access.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221123155759.2669749-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoMerge branch 'net-complete-conversion-to-i2c_probe_new'
Jakub Kicinski [Wed, 23 Nov 2022 20:50:11 +0000 (12:50 -0800)]
Merge branch 'net-complete-conversion-to-i2c_probe_new'

Jakub Kicinski says:

====================
net: Complete conversion to i2c_probe_new

Reposting for Uwe the networking slice of his mega-series:
https://lore.kernel.org/all/20221118224540.619276-1-uwe@kleine-koenig.org/
so that our build bot can confirm the obvious.

fix mlx5 -> mlxsw while at it.
====================

Link: https://lore.kernel.org/r/20221123045507.2091409-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: st21nfca: i2c: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:07 +0000 (20:55 -0800)]
nfc: st21nfca: i2c: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: st-nci: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:06 +0000 (20:55 -0800)]
nfc: st-nci: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: s3fwrn5: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:05 +0000 (20:55 -0800)]
nfc: s3fwrn5: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: pn544: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:04 +0000 (20:55 -0800)]
nfc: pn544: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: pn533: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:03 +0000 (20:55 -0800)]
nfc: pn533: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoNFC: nxp-nci: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:02 +0000 (20:55 -0800)]
NFC: nxp-nci: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: mrvl: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:01 +0000 (20:55 -0800)]
nfc: mrvl: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: microread: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:55:00 +0000 (20:55 -0800)]
nfc: microread: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlxsw: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:54:59 +0000 (20:54 -0800)]
net/mlxsw: Convert to i2c's .probe_new()

.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: xrs700x: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:54:58 +0000 (20:54 -0800)]
net: dsa: xrs700x: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: microchip: ksz9477: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:54:57 +0000 (20:54 -0800)]
net: dsa: microchip: ksz9477: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: lan9303: Convert to i2c's .probe_new()
Uwe Kleine-König [Wed, 23 Nov 2022 04:54:56 +0000 (20:54 -0800)]
net: dsa: lan9303: Convert to i2c's .probe_new()

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests/bpf: Mount debugfs in setns_by_fd
Stanislav Fomichev [Wed, 23 Nov 2022 20:08:29 +0000 (12:08 -0800)]
selftests/bpf: Mount debugfs in setns_by_fd

Jiri reports broken test_progs after recent commit 68f8e3d4b916
("selftests/bpf: Make sure zero-len skbs aren't redirectable").
Apparently we don't remount debugfs when we switch back networking namespace.
Let's explicitly mount /sys/kernel/debug.

0: https://lore.kernel.org/bpf/63b85917-a2ea-8e35-620c-808560910819@meta.com/T/#ma66ca9c92e99eee0a25e40f422489b26ee0171c1

Fixes: a30338840fa5 ("selftests/bpf: Move open_netns() and close_netns() into network_helpers.c")
Reported-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20221123200829.2226254-1-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Don't use idx variable when registering kfunc dtors
David Vernet [Wed, 23 Nov 2022 13:52:53 +0000 (07:52 -0600)]
bpf: Don't use idx variable when registering kfunc dtors

In commit fda01efc6160 ("bpf: Enable cgroups to be used as kptrs"), I
added an 'int idx' variable to kfunc_init() which was meant to
dynamically set the index of the btf id entries of the
'generic_dtor_ids' array. This was done to make the code slightly less
brittle as the struct cgroup * kptr kfuncs such as bpf_cgroup_aquire()
are compiled out if CONFIG_CGROUPS is not defined. This, however, causes
an lkp build warning:

>> kernel/bpf/helpers.c:2005:40: warning: multiple unsequenced
   modifications to 'idx' [-Wunsequenced]
.btf_id       = generic_dtor_ids[idx++],

Fix the warning by just hard-coding the indices.

Fixes: fda01efc6160 ("bpf: Enable cgroups to be used as kptrs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221123135253.637525-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoice: Use ICE_RLAN_BASE_S instead of magic number
Anatolii Gerasymenko [Thu, 3 Nov 2022 14:30:05 +0000 (15:30 +0100)]
ice: Use ICE_RLAN_BASE_S instead of magic number

Commit 72adf2421d9b ("ice: Move common functions out of ice_main.c part
2/7") moved an older version of ice_setup_rx_ctx() function with
usage of magic number 7.
Reimplement the commit 5ab522443bd1 ("ice: Cleanup magic number") to use
ICE_RLAN_BASE_S instead of magic number.

Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Fix configuring VIRTCHNL_OP_CONFIG_VSI_QUEUES with unbalanced queues
Marcin Szycik [Mon, 7 Nov 2022 16:10:38 +0000 (17:10 +0100)]
ice: Fix configuring VIRTCHNL_OP_CONFIG_VSI_QUEUES with unbalanced queues

Currently the VIRTCHNL_OP_CONFIG_VSI_QUEUES command may fail if there are
less RX queues than TX queues requested.

To fix it, only configure RXDID if RX queue exists.

Fixes: e753df8fbca5 ("ice: Add support Flex RXD")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Accumulate ring statistics over reset
Benjamin Mikailenko [Fri, 18 Nov 2022 21:20:02 +0000 (16:20 -0500)]
ice: Accumulate ring statistics over reset

Resets may occur with or without user interaction. For example, a TX hang
or reconfiguration of parameters will result in a reset. During reset, the
VSI is freed, freeing any statistics structures inside as well. This would
create an issue for the user where a reset happens in the background,
statistics set to zero, and the user checks ring statistics expecting them
to be populated.

To ensure this doesn't happen, accumulate ring statistics over reset.

Define a new ring statistics structure, ice_ring_stats. The new structure
lives in the VSI's parent, preserving ring statistics when VSI is freed.

1. Define a new structure vsi_ring_stats in the PF scope
2. Allocate/free stats only during probe, unload, or change in ring size
3. Replace previous ring statistics functionality with new structure

Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Accumulate HW and Netdev statistics over reset
Benjamin Mikailenko [Fri, 18 Nov 2022 21:20:01 +0000 (16:20 -0500)]
ice: Accumulate HW and Netdev statistics over reset

Resets happen with or without user interaction. For example, incidents
such as TX hang or a reconfiguration of parameters will result in a reset.
During reset, hardware and software statistics were set to zero. This
created an issue for the user where a reset happens in the background,
statistics set to zero, and the user checks statistics expecting them to
be populated.

To ensure this doesn't happen, keep accumulating stats over reset.

1. Remove function calls which reset hardware and netdev statistics.
2. Do not rollover statistics in ice_stat_update40 during reset.

Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Remove and replace ice speed defines with ethtool.h versions
Brett Creeley [Mon, 31 Oct 2022 17:09:12 +0000 (10:09 -0700)]
ice: Remove and replace ice speed defines with ethtool.h versions

The driver is currently using ICE_LINK_SPEED_* defines that mirror what
ethtool.h defines, with one exception ICE_LINK_SPEED_UNKNOWN.

This issue is fixed by the following changes:

1. replace ICE_LINK_SPEED_UNKNOWN with 0 because SPEED_UNKNOWN in
   ethtool.h is "-1" and that doesn't match the driver's expected behavior
2. transform ICE_LINK_SPEED_*MBPS to SPEED_* using static tables and
   fls()-1 to convert from BIT() to an index in a table.

Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoice: Check for PTP HW lock more frequently
Karol Kolacinski [Mon, 3 Oct 2022 09:55:18 +0000 (11:55 +0200)]
ice: Check for PTP HW lock more frequently

It was observed that PTP HW semaphore can be held for ~50 ms in worst
case.
SW should wait longer and check more frequently if the HW lock is held.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agosfc: ensure type is valid before updating seen_gen
Edward Cree [Mon, 21 Nov 2022 21:37:08 +0000 (21:37 +0000)]
sfc: ensure type is valid before updating seen_gen

In the case of invalid or corrupted v2 counter update packets,
 efx_tc_rx_version_2() returns EFX_TC_COUNTER_TYPE_MAX.  In this case
 we should not attempt to update generation counts as this will write
 beyond the end of the seen_gen array.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527356 ("Memory - illegal accesses")
Fixes: 25730d8be5d8 ("sfc: add extra RX channel to receive MAE counter updates on ef100")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoppp: associate skb with a device at tx
Stanislav Fomichev [Mon, 21 Nov 2022 18:29:13 +0000 (10:29 -0800)]
ppp: associate skb with a device at tx

Syzkaller triggered flow dissector warning with the following:

r0 = openat$ppp(0xffffffffffffff9c, &(0x7f0000000000), 0xc0802, 0x0)
ioctl$PPPIOCNEWUNIT(r0, 0xc004743e, &(0x7f00000000c0))
ioctl$PPPIOCSACTIVE(r0, 0x40107446, &(0x7f0000000240)={0x2, &(0x7f0000000180)=[{0x20, 0x0, 0x0, 0xfffff034}, {0x6}]})
pwritev(r0, &(0x7f0000000040)=[{&(0x7f0000000140)='\x00!', 0x2}], 0x1, 0x0, 0x0)

[    9.485814] WARNING: CPU: 3 PID: 329 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0x1ee0/0x1fa0
[    9.485929]  skb_get_poff+0x53/0xa0
[    9.485937]  bpf_skb_get_pay_offset+0xe/0x20
[    9.485944]  ? ppp_send_frame+0xc2/0x5b0
[    9.485949]  ? _raw_spin_unlock_irqrestore+0x40/0x60
[    9.485958]  ? __ppp_xmit_process+0x7a/0xe0
[    9.485968]  ? ppp_xmit_process+0x5b/0xb0
[    9.485974]  ? ppp_write+0x12a/0x190
[    9.485981]  ? do_iter_write+0x18e/0x2d0
[    9.485987]  ? __import_iovec+0x30/0x130
[    9.485997]  ? do_pwritev+0x1b6/0x240
[    9.486016]  ? trace_hardirqs_on+0x47/0x50
[    9.486023]  ? __x64_sys_pwritev+0x24/0x30
[    9.486026]  ? do_syscall_64+0x3d/0x80
[    9.486031]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd

Flow dissector tries to find skb net namespace either via device
or via socket. Neigher is set in ppp_send_frame, so let's manually
use ppp->dev.

Cc: Paul Mackerras <paulus@samba.org>
Cc: linux-ppp@vger.kernel.org
Reported-by: syzbot+41cab52ab62ee99ed24a@syzkaller.appspotmail.com
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoDocumentation: devlink: Add blank line padding on numbered lists in Devlink Port...
Bagas Sanjaya [Mon, 21 Nov 2022 03:58:55 +0000 (10:58 +0700)]
Documentation: devlink: Add blank line padding on numbered lists in Devlink Port documentation

kernel test robot reported indentation warnings:

Documentation/networking/devlink/devlink-port.rst:220: WARNING: Unexpected indentation.
Documentation/networking/devlink/devlink-port.rst:222: WARNING: Block quote ends without a blank line; unexpected unindent.

These warnings cause lists (arbitration flow for which the warnings blame to
and 3-step subfunction setup) to be rendered inline instead. Also, for the
former list, automatic list numbering is messed up.

Fix these warnings by adding missing blank line padding.

Link: https://lore.kernel.org/linux-doc/202211200926.kfOPiVti-lkp@intel.com/
Fixes: 242dd64375b80a ("Documentation: Add documentation for new devlink-rate attributes")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'revert-veth-avoid-drop-packets-when-xdp_redirect-performs-and-its-fix'
Jakub Kicinski [Wed, 23 Nov 2022 04:42:11 +0000 (20:42 -0800)]
Merge branch 'revert-veth-avoid-drop-packets-when-xdp_redirect-performs-and-its-fix'

Heng Qi says:

====================
Revert "veth: Avoid drop packets when xdp_redirect performs" and its fix

This patch 2e0de6366ac16 enables napi of the peer veth automatically when
the veth loads the xdp, but it breaks down as reported by Paolo and John.
So reverting it and its fix, we will rework the patch and make it more
robust based on comments.
====================

Link: https://lore.kernel.org/r/20221122035015.19296-1-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "veth: Avoid drop packets when xdp_redirect performs"
Heng Qi [Tue, 22 Nov 2022 03:50:15 +0000 (11:50 +0800)]
Revert "veth: Avoid drop packets when xdp_redirect performs"

This reverts commit 2e0de6366ac16ab4d0abb2aaddbc8a1eba216d11.

Based on the issues reported by John and Paolo and their comments,
this patch and the corresponding fix 5e5dc33d5da are reverted, and
we'll remake it.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "bpf: veth driver panics when xdp prog attached before veth_open"
Heng Qi [Tue, 22 Nov 2022 03:50:14 +0000 (11:50 +0800)]
Revert "bpf: veth driver panics when xdp prog attached before veth_open"

This reverts commit 5e5dc33d5dacb34b0165061bc5a10efd2fd3b66f.

This patch fixes the panic maked by 2e0de6366ac16. Now Paolo
and Toke suggest reverting the patch 2e0de6366ac16 and making
it stronger, so do this first.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>