]> git.apps.os.sepia.ceph.com Git - ceph-client.git/log
ceph-client.git
2 years agonet: dsa: kill off dsa_priv.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:55 +0000 (15:55 +0200)]
net: dsa: kill off dsa_priv.h

The last remnants in dsa_priv.h are a netlink-related definition for
which we create a new header, and DSA_MAX_NUM_OFFLOADING_BRIDGES which
is only used from dsa.c, so move it there.

Some inclusions need to be adjusted now that we no longer have headers
included transitively from dsa_priv.h.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move tag_8021q headers to their proper place
Vladimir Oltean [Mon, 21 Nov 2022 13:55:54 +0000 (15:55 +0200)]
net: dsa: move tag_8021q headers to their proper place

tag_8021q definitions are all over the place. Some are exported to
linux/dsa/8021q.h (visible by DSA core, taggers, switch drivers and
everyone else), and some are in dsa_priv.h.

Move the structures that don't need external visibility into tag_8021q.c,
and the ones which don't need the world or switch drivers to see them
into tag_8021q.h.

We also have the tag_8021q.h inclusion from switch.c, which is basically
the entire reason why tag_8021q.c was built into DSA in commit
8b6e638b4be2 ("net: dsa: build tag_8021q.c as part of DSA core").
I still don't know how to better deal with that, so leave it alone.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move definitions from dsa_priv.h to slave.c
Vladimir Oltean [Mon, 21 Nov 2022 13:55:53 +0000 (15:55 +0200)]
net: dsa: move definitions from dsa_priv.h to slave.c

There are some definitions in dsa_priv.h which are only used from
slave.c. So move them to slave.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: rename dsa2.c back into dsa.c and create its header
Vladimir Oltean [Mon, 21 Nov 2022 13:55:52 +0000 (15:55 +0200)]
net: dsa: rename dsa2.c back into dsa.c and create its header

The previous change moved the code into the larger file (dsa2.c) to
minimize the delta. Rename that now to dsa.c, and create dsa.h, where
all related definitions from dsa_priv.h go.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: merge dsa.c into dsa2.c
Vladimir Oltean [Mon, 21 Nov 2022 13:55:51 +0000 (15:55 +0200)]
net: dsa: merge dsa.c into dsa2.c

There is no longer a meaningful distinction between what goes into
dsa2.c and what goes into dsa.c. Merge the 2 into a single file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move notifier definitions to switch.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:50 +0000 (15:55 +0200)]
net: dsa: move notifier definitions to switch.h

Reduce bloat in dsa_priv.h by moving the cross-chip notifier data
structures to switch.h.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move dsa_tree_notify() and dsa_broadcast() to switch.c
Vladimir Oltean [Mon, 21 Nov 2022 13:55:49 +0000 (15:55 +0200)]
net: dsa: move dsa_tree_notify() and dsa_broadcast() to switch.c

There isn't an intuitive place for these 2 cross-chip notifier functions
according to the function-to-file classification based on names
(dsa_switch_*() goes to switch.c), but I consider these to be part of
the cross-chip notifier handling, therefore part of switch.c. Move them
there to reduce bloat in dsa2.c (the place where all code with no better
place to go goes).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move headers exported by switch.c to switch.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:48 +0000 (15:55 +0200)]
net: dsa: move headers exported by switch.c to switch.h

Reduce code bloat in dsa_priv.h by moving the prototypes exported by
switch.h into their own header file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move tagging protocol code to tag.{c,h}
Vladimir Oltean [Mon, 21 Nov 2022 13:55:47 +0000 (15:55 +0200)]
net: dsa: move tagging protocol code to tag.{c,h}

It would be nice if tagging protocol drivers could include just the
header they need, since they are (mostly) data path and isolated from
most of the other DSA core code does.

Create a tag.c and a tag.h file which are meant to support tagging
protocol drivers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move headers exported by slave.c to slave.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:46 +0000 (15:55 +0200)]
net: dsa: move headers exported by slave.c to slave.h

Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by slave.c to their own header file.

This is just approximate to get the code structure right. There are some
interdependencies with static inline code left in dsa_priv.h, so leave
slave.h included from there for now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move headers exported by master.c to master.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:45 +0000 (15:55 +0200)]
net: dsa: move headers exported by master.c to master.h

Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by master.c to their own header file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move headers exported by port.c to port.h
Vladimir Oltean [Mon, 21 Nov 2022 13:55:44 +0000 (15:55 +0200)]
net: dsa: move headers exported by port.c to port.h

Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by port.c to their own header file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move rest of devlink setup/teardown to devlink.c
Vladimir Oltean [Mon, 21 Nov 2022 13:55:43 +0000 (15:55 +0200)]
net: dsa: move rest of devlink setup/teardown to devlink.c

The code that needed further refactoring into dedicated functions in
dsa2.c was left aside. Move it now to devlink.c, and make dsa2.c stop
including net/devlink.h.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: if ds->setup is true, ds->devlink is always non-NULL
Vladimir Oltean [Mon, 21 Nov 2022 13:55:42 +0000 (15:55 +0200)]
net: dsa: if ds->setup is true, ds->devlink is always non-NULL

Simplify dsa_switch_teardown() to remove the NULL checking for
ds->devlink.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: move bulk of devlink code to devlink.{c,h}
Vladimir Oltean [Mon, 21 Nov 2022 13:55:41 +0000 (15:55 +0200)]
net: dsa: move bulk of devlink code to devlink.{c,h}

dsa.c and dsa2.c are bloated with too much off-topic code. Identify all
code related to devlink and move it to a new devlink.c file.

Steer clear of the dsa_priv.h dumping ground antipattern and create a
dedicated devlink.h for it, which will be included only by the C files
which need it. Usage of dsa_priv.h will be minimized in later patches.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: modularize DSA_TAG_PROTO_NONE
Vladimir Oltean [Mon, 21 Nov 2022 13:55:40 +0000 (15:55 +0200)]
net: dsa: modularize DSA_TAG_PROTO_NONE

There is no reason that I can see why the no-op tagging protocol should
be registered manually, so make it a module and make all drivers which
have any sort of reference to DSA_TAG_PROTO_NONE select it.

Note that I don't know if ksz_get_tag_protocol() really needs this,
or if it's just the logic which is poorly written. All switches seem to
have their own tagging protocol, and DSA_TAG_PROTO_NONE is just a
fallback that never gets used.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: unexport dsa_dev_to_net_device()
Vladimir Oltean [Mon, 21 Nov 2022 13:55:39 +0000 (15:55 +0200)]
net: dsa: unexport dsa_dev_to_net_device()

dsa.o and dsa2.o are linked into the same dsa_core.o, there is no reason
to export this symbol when its only caller is local.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: remove redundant health state set to error
Moshe Shemesh [Sun, 20 Nov 2022 08:36:52 +0000 (10:36 +0200)]
devlink: remove redundant health state set to error

Reporter health_state is set twice to error in devlink_health_report().
Remove second time as it is redundant.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1668933412-5498-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: Fix build break when CONFIG_IPV6=n
Saeed Mahameed [Tue, 22 Nov 2022 18:41:58 +0000 (10:41 -0800)]
tcp: Fix build break when CONFIG_IPV6=n

The cited commit caused the following build break when CONFIG_IPV6 was
disabled

net/ipv4/tcp_input.c: In function ‘tcp_syn_flood_action’:
include/net/sock.h:387:37: error: ‘const struct sock_common’ has no member named ‘skc_v6_rcv_saddr’; did you mean ‘skc_rcv_saddr’?

Fix by using inet6_rcv_saddr() macro which handles this situation
nicely.

Fixes: d9282e48c608 ("tcp: Add listening address to SYN flood message")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Matthieu Baerts <matthieu.baerts@tessares.net>
CC: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221122184158.170798-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'i2c/client_device_id_helper-immutable' of git://git.kernel.org/pub...
Jakub Kicinski [Wed, 23 Nov 2022 03:50:20 +0000 (19:50 -0800)]
Merge branch 'i2c/client_device_id_helper-immutable' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull in a dependency for an API cleanup:
https://lore.kernel.org/all/20221118224540.619276-1-uwe@kleine-koenig.org/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'cleanup-ocelot_stats-exposure'
Paolo Abeni [Tue, 22 Nov 2022 14:36:45 +0000 (15:36 +0100)]
Merge branch 'cleanup-ocelot_stats-exposure'

Colin Foster says:

====================
cleanup ocelot_stats exposure

The ocelot_stats structures became redundant across all users. Replace
this redundancy with a static const struct. After doing this, several
definitions inside include/soc/mscc/ocelot.h no longer needed to be
shared. Patch 2 removes them.

Checkpatch throws an error for a complicated macro not in parentheses. I
understand the reason for OCELOT_COMMON_STATS was to allow expansion, but
interestingly this patch set is essentially reverting the ability for
expansion. I'm keeping the macro in this set, but am open to remove it,
since it doesn't _actually_ provide any immediate benefits anymore.
====================

Link: https://lore.kernel.org/r/20221119231406.3167852-1-colin.foster@in-advantage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mscc: ocelot: issue a warning if stats are incorrectly ordered
Colin Foster [Sat, 19 Nov 2022 23:14:06 +0000 (15:14 -0800)]
net: mscc: ocelot: issue a warning if stats are incorrectly ordered

Ocelot uses regmap_bulk_read() operations to efficiently read stats
registers. Currently the implementation relies on the stats layout to be
ordered to be most efficient.

Issue a warning if any future implementations happen to break this pattern.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mscc: ocelot: remove unnecessary exposure of stats structures
Colin Foster [Sat, 19 Nov 2022 23:14:05 +0000 (15:14 -0800)]
net: mscc: ocelot: remove unnecessary exposure of stats structures

Since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat
definitions between all drivers") there is no longer a need to share the
stats structures to the world. Relocate these definitions to inside
ocelot_stats.c instead of a global include header.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mscc: ocelot: remove redundant stats_layout pointers
Colin Foster [Sat, 19 Nov 2022 23:14:04 +0000 (15:14 -0800)]
net: mscc: ocelot: remove redundant stats_layout pointers

Ever since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat
definitions between all drivers") the stats_layout entry in ocelot and
felix drivers have become redundant. Remove the unnecessary code.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoselftests: net: Add cross-compilation support for BPF programs
Björn Töpel [Sat, 19 Nov 2022 17:18:41 +0000 (18:18 +0100)]
selftests: net: Add cross-compilation support for BPF programs

The selftests/net does not have proper cross-compilation support, and
does not properly state libbpf as a dependency. Mimic/copy the BPF
build from selftests/bpf, which has the nice side-effect that libbpf
is built as well.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20221119171841.2014936-1-bjorn@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agosamples: pktgen: Use "grep -E" instead of "egrep"
Tiezhu Yang [Sat, 19 Nov 2022 02:55:04 +0000 (10:55 +0800)]
samples: pktgen: Use "grep -E" instead of "egrep"

The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the related file to use "grep -E" instead.

  sed -i "s/egrep/grep -E/g" `grep egrep -rwl samples/pktgen`

Here are the steps to install the latest grep:

  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1668826504-32162-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoocteontx2-pf: Add additional checks while configuring ucast/bcast/mcast rules
Suman Ghosh [Fri, 18 Nov 2022 05:33:29 +0000 (11:03 +0530)]
octeontx2-pf: Add additional checks while configuring ucast/bcast/mcast rules

1. If a profile does not support DMAC extraction then avoid installing NPC
flow rules for unicast. Similarly, if LXMB(L2 and L3) extraction is not
supported by the profile then avoid installing broadcast and multicast
rules.
2. Allow MCAM entry insertion for promiscuous mode.
3. For the profiles where DMAC is not extracted in MKEX key default
unicast entry installed by AF is not valid. Hence do not use action
from the AF installed default unicast entry for such cases.
4. Adjacent packet header fields in a packet like IP header source
and destination addresses or UDP/TCP header source port and destination
can be extracted together in MKEX profile. Therefore MKEX profile can be
configured to in two ways:
a. Total of 4 bytes from start of UDP header(src port
   + destination port)
or
b. Two bytes from start and two bytes from offset 2

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://lore.kernel.org/r/20221118053329.2288486-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: bcmgenet: Clear RGMII_LINK upon link down
Florian Fainelli [Fri, 18 Nov 2022 21:37:54 +0000 (13:37 -0800)]
net: bcmgenet: Clear RGMII_LINK upon link down

Clear the RGMII_LINK bit upon detecting link down to be consistent with
setting the bit upon link up. We also move the clearing of the
out-of-band disable to the runtime initialization rather than for each
link up/down transition.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221118213754.1383364-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: microchip: sparx5: fix uninitialized variables
Dan Carpenter [Fri, 18 Nov 2022 15:12:52 +0000 (18:12 +0300)]
net: microchip: sparx5: fix uninitialized variables

Smatch complains that "err" can be uninitialized on these paths.  Also
it's just nicer to "return 0;" instead of "return err;"

Fixes: 3a344f99bb55 ("net: microchip: sparx5: Add support for TC flower ARP dissector")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/Y3eg9Ml/LmLR3L3C@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fix __sock_gen_cookie()
Eric Dumazet [Fri, 18 Nov 2022 04:38:43 +0000 (04:38 +0000)]
net: fix __sock_gen_cookie()

I was mistaken how atomic64_try_cmpxchg(&sk_cookie, &res, new)
is working.

I was assuming @res would contain the final sk_cookie value,
regardless of the success of our cmpxchg()

We could do something like:

if (atomic64_try_cmpxchg(&sk_cookie, &res, new)
res = new;

But we can avoid a conditional and read sk_cookie again.

atomic64_cmpxchg(&sk_cookie, res, new);
res = atomic64_read(&sk_cookie);

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527347 ("Error handling issues")
Fixes: 4ebf802cf1c6 ("net: __sock_gen_cookie() cleanup")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221118043843.3703186-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mptcp-netlink'
David S. Miller [Mon, 21 Nov 2022 13:09:08 +0000 (13:09 +0000)]
Merge branch 'mptcp-netlink'

Mat Martineau says:

====================
mptcp: More specific netlink command errors

This series makes the error reporting for the MPTCP_PM_CMD_ADD_ADDR netlink
command more specific, since there are multiple reasons the command could
fail.

Note that patch 2 adds a GENL_SET_ERR_MSG_FMT() macro to genetlink.h,
which is outside the MPTCP subsystem.

Patch 1 refactors in-kernel listening socket and endpoint creation to
simplify the second patch.

Patch 2 updates the error values returned by the in-kernel path manager
when it fails to create a local endpoint.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomptcp: more detailed error reporting on endpoint creation
Paolo Abeni [Fri, 18 Nov 2022 18:46:08 +0000 (10:46 -0800)]
mptcp: more detailed error reporting on endpoint creation

Endpoint creation can fail for a number of reasons; in case of failure
append the error number to the extended ack message, using a newly
introduced generic helper.

Additionally let mptcp_pm_nl_append_new_local_addr() report different
error reasons.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomptcp: deduplicate error paths on endpoint creation
Paolo Abeni [Fri, 18 Nov 2022 18:46:07 +0000 (10:46 -0800)]
mptcp: deduplicate error paths on endpoint creation

When endpoint creation fails, we need to free the newly allocated
entry and eventually destroy the paired mptcp listener socket.

Consolidate such action in a single point let all the errors path
reach it.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: Return errno in sk->sk_prot->get_port().
Kuniyuki Iwashima [Fri, 18 Nov 2022 18:25:06 +0000 (10:25 -0800)]
net: Return errno in sk->sk_prot->get_port().

We assume the correct errno is -EADDRINUSE when sk->sk_prot->get_port()
fails, so some ->get_port() functions return just 1 on failure and the
callers return -EADDRINUSE instead.

However, mptcp_get_port() can return -EINVAL.  Let's not ignore the error.

Note the only exception is inet_autobind(), all of whose callers return
-EAGAIN instead.

Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: renesas: rswitch: Fix MAC address info
Yoshihiro Shimoda [Fri, 18 Nov 2022 00:27:24 +0000 (09:27 +0900)]
net: ethernet: renesas: rswitch: Fix MAC address info

Smatch detected the following warning.

    drivers/net/ethernet/renesas/rswitch.c:1717 rswitch_init() warn:
    '%pM' cannot be followed by 'n'

The 'n' should be '\n'.

Reported-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'sarx5-VCAP-debugfs'
David S. Miller [Mon, 21 Nov 2022 11:33:02 +0000 (11:33 +0000)]
Merge branch 'sarx5-VCAP-debugfs'

netdev.vger.kernel.org archive mirror
Steen Hegelund says:

====================
net: Add support for VCAP debugFS in Sparx5

This provides support for getting VCAP instance, VCAP rule and VCAP port
keyset configuration information via the debug file system.

It builds on top of the initial IS2 VCAP support found in these series:

https://lore.kernel.org/all/20221020130904.1215072-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221109114116.3612477-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221111130519.1459549-1-steen.hegelund@microchip.com/

Functionality:
==============

The VCAP API exposes a /sys/kernel/debug/sparx5/vcaps folder containing
the following entries:

- raw_<vcap>_<instance>
    This is a raw dump of the VCAP instance with a line for each available
    VCAP rule.  This information is limited to the VCAP rule address, the
    rule size and the rule keyset name as this requires very little
    information from the VCAP cache.

    This can be used to detect if a valid rule is stored at the correct
    address.

- <vcap>_<instance>
    This dumps the VCAP instance configuration: address ranges, chain id
    ranges, word size of keys and actions etc, and for each VCAP rule the
    details of keys (values and masks) and actions are shown.

    This is useful when discovering if the expected rule is present and in
    which order it will be matched.

- <interface>
    This shows the keyset configuration per lookup and traffic type and the
    set of sticky bits (common for all interfaces). This is cleared when
    shown, so it is possible to sample over a period of time.

    It also shows if this port/lookup is enabled for matching in the VCAP.

    This can be used to find out which keyset the traffic being sent to a
    port, will be matched against, and if such traffic has been seen by one
    of the ports.

Delivery:
=========

This is current plan for delivering the full VCAP feature set of Sparx5:

- TC protocol all support for IS2 VCAP
- Sparx5 IS0 VCAP support
- TC policer and drop action support (depends on the Sparx5 QoS support
  upstreamed separately)
- Sparx5 ES0 VCAP support
- TC flower template support
- TC matchall filter support for mirroring and policing ports
- TC flower filter mirror action support
- Sparx5 ES2 VCAP support

Version History:
================
v2      Removed a 'support' folder (used for integration testing) that had
        been added in patch 6/8 by a mistake.
        Wrapped long lines.

v1      Initial version
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP debugfs KUNIT test
Steen Hegelund [Thu, 17 Nov 2022 21:31:14 +0000 (22:31 +0100)]
net: microchip: sparx5: Add VCAP debugfs KUNIT test

This tests the functionality of the debugFS support:

- finding valid keyset on an address
- raw VCAP output
- full rule VCAP output

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP locking to protect rules
Steen Hegelund [Thu, 17 Nov 2022 21:31:13 +0000 (22:31 +0100)]
net: microchip: sparx5: Add VCAP locking to protect rules

This ensures that the VCAP cache and the lists maintained in the VCAP
instance is protected when accessed by different clients.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP debugFS key/action support for the VCAP API
Steen Hegelund [Thu, 17 Nov 2022 21:31:12 +0000 (22:31 +0100)]
net: microchip: sparx5: Add VCAP debugFS key/action support for the VCAP API

This add support for displaying the keys and actions in a rule.
The keys and action display format will be determined by the size and the
type of the key or action. The longer keys will typically be displayed as a
hexadecimal byte array.

The actionset is not decoded in full as the Sparx5 IS2 only has one
supported action, so this will be added later with other VCAP types.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP rule debugFS support for the VCAP API
Steen Hegelund [Thu, 17 Nov 2022 21:31:11 +0000 (22:31 +0100)]
net: microchip: sparx5: Add VCAP rule debugFS support for the VCAP API

This add support to show all rules in a VCAP instance. The information
shown is:

 - rule id
 - address range
 - size
 - chain id
 - keyset name, subword size, register span
 - actionset name, subword size, register span
 - counter value
 - sticky bit (one bit width counter)

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add raw VCAP debugFS support for the VCAP API
Steen Hegelund [Thu, 17 Nov 2022 21:31:10 +0000 (22:31 +0100)]
net: microchip: sparx5: Add raw VCAP debugFS support for the VCAP API

This adds support for decoding VCAP rules with a minimum number of
attributes: address, rule size and keyset.

This allows for a quick inspection of a VCAP instance to determine if the
rule are present and in the correct order.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Add VCAP debugFS support
Steen Hegelund [Thu, 17 Nov 2022 21:31:09 +0000 (22:31 +0100)]
net: microchip: sparx5: Add VCAP debugFS support

Add a debugFS root folder for Sparx5 and add a vcap folder underneath with
the VCAP instances and the ports

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Ensure VCAP last_used_addr is set back to default
Steen Hegelund [Thu, 17 Nov 2022 21:31:08 +0000 (22:31 +0100)]
net: microchip: sparx5: Ensure VCAP last_used_addr is set back to default

This ensures that the last_used_addr in a VCAP instance is returned to the
default value when all rules have been deleted.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: Ensure L3 protocol has a default value
Steen Hegelund [Thu, 17 Nov 2022 21:31:07 +0000 (22:31 +0100)]
net: microchip: sparx5: Ensure L3 protocol has a default value

This ensures that the l3_proto always have a valid value and that any
dissector parsing errors causes the flower rule to be discarded.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'gve-alternate-missed-completions'
David S. Miller [Mon, 21 Nov 2022 10:52:14 +0000 (10:52 +0000)]
Merge branch 'gve-alternate-missed-completions'

Jeroen de Borst says:

====================
gve: Handle alternate miss-completions

Some versions of the virtual NIC present miss-completions in
an alternative way. Let the diver handle these alternate completions
and announce this capability to the device.

The capability is announced uing a new AdminQ command that sends
driver information to the device. The device can refuse a driver
if it is lacking support for a capability, or it can adopt it's
behavior to work around OS specific issues.

Changed in v5:
- Removed comments in fucntion calls
- Switched ENOTSUPP back to EOPNOTSUPP and made sure it gets passed
Changed in v4:
- Clarified new AdminQ command in cover letter
- Changed EOPNOTSUPP to ENOTSUPP to match device's response
Changed in v3:
- Rewording cover letter
- Added 'Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>'
Changes in v2:
- Changed the subject to include 'gve:'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agogve: Handle alternate miss completions
Jeroen de Borst [Thu, 17 Nov 2022 16:27:01 +0000 (08:27 -0800)]
gve: Handle alternate miss completions

The virtual NIC has 2 ways of indicating a miss-path
completion. This handles the alternate.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agogve: Adding a new AdminQ command to verify driver
Jeroen de Borst [Thu, 17 Nov 2022 16:27:00 +0000 (08:27 -0800)]
gve: Adding a new AdminQ command to verify driver

Check whether the driver is compatible with the device
presented.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoNFC: nci: Extend virtual NCI deinit test
Dmitry Vyukov [Thu, 17 Nov 2022 16:21:01 +0000 (17:21 +0100)]
NFC: nci: Extend virtual NCI deinit test

Extend the test to check the scenario when NCI core tries to send data
to already closed device to ensure that nothing bad happens.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Bongsu Jeon <bongsu.jeon@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'axiennet-mdio-bus-freq'
David S. Miller [Mon, 21 Nov 2022 10:36:04 +0000 (10:36 +0000)]
Merge branch 'axiennet-mdio-bus-freq'

Andy Chiu says:

====================
net: axienet: Use a DT property to configure frequency of the MDIO bus

Some FPGA platforms have to set frequency of the MDIO bus lower than 2.5
MHz. Thus, we use a DT property, which is "clock-frequency", to work
with it at boot time. The default 2.5 MHz would be set if the property
is not pressent. Also, factor out mdio enable/disable functions due to
the api change since 253761a0e61b7.

Changelog:
--- v5 ---
1. Make dt-binding patch prior to the implementation patch.
2. Disable mdio bus in error path.
3. Update description of some functions.
--- v4 ---
1. change MAX_MDIO_FREQ to DEFAULT_MDIO_FREQ as suggested by Andrew.
--- v3 RESEND ---
1. Repost the exact same patch again
--- v3 ---
1. Fix coding style, and make probing of the driver fail if MDC overflow
--- v2 ---
1. Use clock-frequency, as defined in mdio.yaml, to configure MDIO
   clock.
2. Only print out frequency if it is set to a non-standard value.
3. Reduce the scope of axienet_mdio_enable and remove
   axienet_mdio_disable because no one really uses it anymore.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: axienet: set mdio clock according to bus-frequency
Andy Chiu [Thu, 17 Nov 2022 15:40:14 +0000 (23:40 +0800)]
net: axienet: set mdio clock according to bus-frequency

Some FPGA platforms have 80KHz MDIO bus frequency constraint when
connecting Ethernet to its on-board external Marvell PHY. Thus, we may
have to set MDIO clock according to the DT. Otherwise, use the default
2.5 MHz, as specified by 802.3, if the entry is not present.

Also, change MAX_MDIO_FREQ to DEFAULT_MDIO_FREQ because we may actually
set MDIO bus frequency higher than 2.5MHz if undelying devices support
it. And properly disable the mdio bus clock in error path.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: describe the support of "clock-frequency" in mdio
Andy Chiu [Thu, 17 Nov 2022 15:40:13 +0000 (23:40 +0800)]
dt-bindings: describe the support of "clock-frequency" in mdio

mdio bus frequency is going to be configurable at boottime by a property
in DT now, so add a description to it.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: axienet: Unexport and remove unused mdio functions
Andy Chiu [Thu, 17 Nov 2022 15:40:12 +0000 (23:40 +0800)]
net: axienet: Unexport and remove unused mdio functions

Both axienet_mdio_{enable/disable} functions are no longer used in
xilinx_axienet_main.c due to 253761a0e61b7. And axienet_mdio_disable is
not even used in the mdio.c. So unexport and remove them.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: prevent uninitialized variable
Dan Carpenter [Thu, 17 Nov 2022 15:29:05 +0000 (18:29 +0300)]
net: microchip: sparx5: prevent uninitialized variable

Smatch complains that:

    drivers/net/ethernet/microchip/sparx5/sparx5_dcb.c:112
    sparx5_dcb_apptrust_validate() error: uninitialized symbol 'match'.

This would only happen if the:

if (sparx5_dcb_apptrust_policies[i].nselectors != nselectors)

condition is always true (they are not equal).  The "nselectors"
variable comes from dcbnl_ieee_set() and it is a number between 0-256.
This seems like a probably a real bug.

Fixes: 23f8382cd95d ("net: microchip: sparx5: add support for apptrust")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: mtk_eth_soc: fix RSTCTRL_PPE{0,1} definitions
Lorenzo Bianconi [Thu, 17 Nov 2022 14:29:53 +0000 (15:29 +0100)]
net: ethernet: mtk_eth_soc: fix RSTCTRL_PPE{0,1} definitions

Fix RSTCTRL_PPE0 and RSTCTRL_PPE1 register mask definitions for
MTK_NETSYS_V2.
Remove duplicated definitions.

Fixes: 160d3a9b1929 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: microchip: sparx5: kunit test: Fix compile warnings.
Horatiu Vultur [Thu, 17 Nov 2022 13:28:12 +0000 (14:28 +0100)]
net: microchip: sparx5: kunit test: Fix compile warnings.

When VCAP_KUNIT_TEST is enabled the following warnings are generated:

drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:257:34: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:258:41: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:342:23: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:359:23: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1327:34: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1328:41: warning: Using plain integer as NULL pointer

Therefore fix this.

Fixes: dccc30cc4906 ("net: microchip: sparx5: Add KUNIT test of counters and sorted rules")
Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Fixes: 67d637516fa9 ("net: microchip: sparx5: Adding KUNIT test for the VCAP API")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'nfp-ipsec-offload'
David S. Miller [Mon, 21 Nov 2022 08:51:36 +0000 (08:51 +0000)]
Merge branch 'nfp-ipsec-offload'

Simon Horman says:

====================
nfp: IPsec offload support

Huanhuan Wang says:

this series adds support for IPsec offload to the NFP driver.

It covers three enhancements:

1. Patches 1/3:
   - Extend the capability word and control word to to support
     new features.

2. Patch 2/3:
   - Add framework to support IPsec offloading for NFP driver,
     but IPsec offload control plane interface xfrm callbacks which
     interact with upper layer are not implemented in this patch.

3. Patch 3/3:
   - IPsec control plane interface xfrm callbacks are implemented
     in this patch.

Changes since v3
* Remove structure fields that describe firmware but
  are not used for Kernel offload
* Add WARN_ON(!xa_empty()) before call to xa_destroy()
* Added helpers for hash methods

Changes since v2
* OFFLOAD_HANDLE_ERROR macro and the associated code removed
* Unnecessary logging removed
* Hook function xdo_dev_state_free in struct xfrmdev_ops removed
* Use Xarray to maintain SA entries

Changes since v1
* Explicitly return failure when XFRM_STATE_ESN is set
* Fix the issue that AEAD algorithm is not correctly offloaded
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: implement xfrm callbacks and expose ipsec offload feature to upper layer
Huanhuan Wang [Thu, 17 Nov 2022 13:21:02 +0000 (14:21 +0100)]
nfp: implement xfrm callbacks and expose ipsec offload feature to upper layer

Xfrm callbacks are implemented to offload SA info into firmware
by mailbox. It supports 16K SA info in total.

Expose ipsec offload feature to upper layer, this feature will
signal the availability of the offload.

Based on initial work of Norm Bagley <norman.bagley@netronome.com>.

Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: add framework to support ipsec offloading
Huanhuan Wang [Thu, 17 Nov 2022 13:21:01 +0000 (14:21 +0100)]
nfp: add framework to support ipsec offloading

A new metadata type and config structure are introduced to
interact with firmware to support ipsec offloading. This
feature relies on specific firmware that supports ipsec
encrypt/decrypt by advertising related capability bit.

The xfrm callbacks which interact with upper layer are
implemented in the following patch.

Based on initial work of Norm Bagley <norman.bagley@netronome.com>.

Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: extend capability and control words
Yinjun Zhang [Thu, 17 Nov 2022 13:21:00 +0000 (14:21 +0100)]
nfp: extend capability and control words

Currently the 32-bit capability word is almost exhausted, now
allocate some more words to support new features, and control
word is also extended accordingly. Packet-type offloading is
implemented in NIC application firmware, but it's not used in
kernel driver, so reserve this bit here in case it's redefined
for other use.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobna: Avoid clashing function prototypes
Gustavo A. R. Silva [Wed, 16 Nov 2022 16:59:44 +0000 (10:59 -0600)]
bna: Avoid clashing function prototypes

When built with Control Flow Integrity, function prototypes between
caller and function declaration must match. These mismatches are visible
at compile time with the new -Wcast-function-type-strict in Clang[1].

Fix a total of 227 warnings like these:

drivers/net/ethernet/brocade/bna/bna_enet.c:519:3: warning: cast from 'void (*)(struct bna_ethport *, enum bna_ethport_event)' to 'bfa_fsm_t' (aka 'void (*)(void *, int)') converts to incompatible function type [-Wcast-function-type-strict]
                bfa_fsm_set_state(ethport, bna_ethport_sm_down);
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The bna state machine code heavily overloads its state machine functions,
so these have been separated into their own sets of structs, enums,
typedefs, and helper functions. There are almost zero binary code changes,
all seem to be related to header file line numbers changing, or the
addition of the new stats helper.

Important to mention is that while I was manually implementing this changes
I was staring at this[2] patch from Kees Cook. Thanks, Kees. :)

Link: https://github.com/KSPP/linux/issues/240
[1] https://reviews.llvm.org/D134831
[2] https://lore.kernel.org/linux-hardening/20220929230334.2109344-1-keescook@chromium.org/
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: mediatek: ppe: assign per-port queues for offloaded traffic
Felix Fietkau [Wed, 16 Nov 2022 08:07:34 +0000 (09:07 +0100)]
net: ethernet: mediatek: ppe: assign per-port queues for offloaded traffic

Keeps traffic sent to the switch within link speed limits

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221116080734.44013-7-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: tag_mtk: assign per-port queues
Felix Fietkau [Wed, 16 Nov 2022 08:07:33 +0000 (09:07 +0100)]
net: dsa: tag_mtk: assign per-port queues

Keeps traffic sent to the switch within link speed limits

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221116080734.44013-6-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues
Felix Fietkau [Wed, 16 Nov 2022 08:07:32 +0000 (09:07 +0100)]
net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues

When sending traffic to multiple ports with different link speeds, queued
packets to one port can drown out tx to other ports.
In order to better handle transmission to multiple ports, use the hardware
shaper feature to implement weighted fair queueing between ports.
Weight and maximum rate are automatically adjusted based on the link speed
of the port.
The first 3 queues are unrestricted and reserved for non-DSA direct tx on
GMAC ports. The following queues are automatically assigned by the MTK DSA
tag driver based on the target port number.
The PPE offload code configures the queues for offloaded traffic in the same
way.
This feature is only supported on devices supporting QDMA. All queues still
share the same DMA ring and descriptor pool.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221116080734.44013-5-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: avoid port_mg assignment on MT7622 and newer
Felix Fietkau [Wed, 16 Nov 2022 08:07:31 +0000 (09:07 +0100)]
net: ethernet: mtk_eth_soc: avoid port_mg assignment on MT7622 and newer

On newer chips, this field is unused and contains some bits related to queue
assignment. Initialize it to 0 in those cases.
Fix offload_version on MT7621 and MT7623, which still need the previous value.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221116080734.44013-4-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: drop packets to WDMA if the ring is full
Felix Fietkau [Wed, 16 Nov 2022 08:07:30 +0000 (09:07 +0100)]
net: ethernet: mtk_eth_soc: drop packets to WDMA if the ring is full

Improves handling of DMA ring overflow.
Clarify other WDMA drop related comment.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221116080734.44013-3-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: increase tx ring size for QDMA devices
Felix Fietkau [Wed, 16 Nov 2022 08:07:29 +0000 (09:07 +0100)]
net: ethernet: mtk_eth_soc: increase tx ring size for QDMA devices

In order to use the hardware traffic shaper feature, a larger tx ring is
needed, especially for the scratch ring, which the hardware shaper uses to
reorder packets.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221116080734.44013-2-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: remove reference to non-existing config PCS
Lukas Bulwahn [Wed, 16 Nov 2022 10:24:50 +0000 (11:24 +0100)]
net: fman: remove reference to non-existing config PCS

Commit a7c2a32e7f22 ("net: fman: memac: Use lynx pcs driver") makes the
Freescale Data-Path Acceleration Architecture Frame Manager use lynx pcs
driver by selecting PCS_LYNX.

It also selects the non-existing config PCS as well, which has no effect.

Remove this select to a non-existing config.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20221116102450.13928-1-lukas.bulwahn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonetlink: remove the flex array from struct nlmsghdr
Jakub Kicinski [Fri, 18 Nov 2022 03:39:03 +0000 (19:39 -0800)]
netlink: remove the flex array from struct nlmsghdr

I've added a flex array to struct nlmsghdr in
commit 738136a0e375 ("netlink: split up copies in the ack construction")
to allow accessing the data easily. It leads to warnings with clang,
if user space wraps this structure into another struct and the flex
array is not at the end of the container.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/all/20221114023927.GA685@u2004-local/
Link: https://lore.kernel.org/r/20221118033903.1651026-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomrp: introduce active flags to prevent UAF when applicant uninit
Schspa Shi [Wed, 16 Nov 2022 11:45:11 +0000 (19:45 +0800)]
mrp: introduce active flags to prevent UAF when applicant uninit

The caller of del_timer_sync must prevent restarting of the timer, If
we have no this synchronization, there is a small probability that the
cancellation will not be successful.

And syzbot report the fellowing crash:
==================================================================
BUG: KASAN: use-after-free in hlist_add_head include/linux/list.h:929 [inline]
BUG: KASAN: use-after-free in enqueue_timer+0x18/0xa4 kernel/time/timer.c:605
Write at addr f9ff000024df6058 by task syz-fuzzer/2256
Pointer tag: [f9], memory tag: [fe]

CPU: 1 PID: 2256 Comm: syz-fuzzer Not tainted 6.1.0-rc5-syzkaller-00008-
ge01d50cbd6ee #0
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace.part.0+0xe0/0xf0 arch/arm64/kernel/stacktrace.c:156
 dump_backtrace arch/arm64/kernel/stacktrace.c:162 [inline]
 show_stack+0x18/0x40 arch/arm64/kernel/stacktrace.c:163
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x68/0x84 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:284 [inline]
 print_report+0x1a8/0x4a0 mm/kasan/report.c:395
 kasan_report+0x94/0xb4 mm/kasan/report.c:495
 __do_kernel_fault+0x164/0x1e0 arch/arm64/mm/fault.c:320
 do_bad_area arch/arm64/mm/fault.c:473 [inline]
 do_tag_check_fault+0x78/0x8c arch/arm64/mm/fault.c:749
 do_mem_abort+0x44/0x94 arch/arm64/mm/fault.c:825
 el1_abort+0x40/0x60 arch/arm64/kernel/entry-common.c:367
 el1h_64_sync_handler+0xd8/0xe4 arch/arm64/kernel/entry-common.c:427
 el1h_64_sync+0x64/0x68 arch/arm64/kernel/entry.S:576
 hlist_add_head include/linux/list.h:929 [inline]
 enqueue_timer+0x18/0xa4 kernel/time/timer.c:605
 mod_timer+0x14/0x20 kernel/time/timer.c:1161
 mrp_periodic_timer_arm net/802/mrp.c:614 [inline]
 mrp_periodic_timer+0xa0/0xc0 net/802/mrp.c:627
 call_timer_fn.constprop.0+0x24/0x80 kernel/time/timer.c:1474
 expire_timers+0x98/0xc4 kernel/time/timer.c:1519

To fix it, we can introduce a new active flags to make sure the timer will
not restart.

Reported-by: syzbot+6fd64001c20aa99e34a4@syzkaller.appspotmail.com
Signed-off-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'rxrpc-next-20221116' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Fri, 18 Nov 2022 12:09:20 +0000 (12:09 +0000)]
Merge tag 'rxrpc-next-20221116' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fix oops and missing config conditionals

The patches that were pulled into net-next previously[1] had some issues
that this patchset fixes:

 (1) Fix missing IPV6 config conditionals.

 (2) Fix an oops caused by calling udpv6_sendmsg() directly on an AF_INET
     socket.

 (3) Fix the validation of network addresses on entry to socket functions
     so that we don't allow an AF_INET6 address if we've selected an
     AF_INET transport socket.

Link: https://lore.kernel.org/r/166794587113.2389296.16484814996876530222.stgit@warthog.procyon.org.uk/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: fix napi_disable() logic error
Eric Dumazet [Thu, 17 Nov 2022 09:26:41 +0000 (09:26 +0000)]
net: fix napi_disable() logic error

Dan reported a new warning after my recent patch:

New smatch warnings:
net/core/dev.c:6409 napi_disable() error: uninitialized symbol 'new'.

Indeed, we must first wait for STATE_SCHED and STATE_NPSVC to be cleared,
to make sure @new variable has been initialized properly.

Fixes: 4ffa1d1c6842 ("net: adopt try_cmpxchg() in napi_{enable|disable}()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agorxrpc: uninitialized variable in rxrpc_send_ack_packet()
Dan Carpenter [Thu, 17 Nov 2022 07:44:02 +0000 (10:44 +0300)]
rxrpc: uninitialized variable in rxrpc_send_ack_packet()

The "pkt" was supposed to have been deleted in a previous patch.  It
leads to an uninitialized variable bug.

Fixes: 72f0c6fb0579 ("rxrpc: Allocate ACK records at proposal and queue for transmission")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agorxrpc: fix rxkad_verify_response()
Dan Carpenter [Thu, 17 Nov 2022 07:43:38 +0000 (10:43 +0300)]
rxrpc: fix rxkad_verify_response()

The error handling for if skb_copy_bits() fails was accidentally deleted
so the rxkad_decrypt_ticket() function is not called.

Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: mtk_eth_soc: remove cpu_relax in mtk_pending_work
Lorenzo Bianconi [Wed, 16 Nov 2022 23:58:46 +0000 (00:58 +0100)]
net: ethernet: mtk_eth_soc: remove cpu_relax in mtk_pending_work

Get rid of cpu_relax in mtk_pending_work routine since MTK_RESETTING is
set only in mtk_pending_work() and it runs holding rtnl lock

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: mtk_eth_soc: do not overwrite mtu configuration running reset routine
Lorenzo Bianconi [Wed, 16 Nov 2022 23:35:04 +0000 (00:35 +0100)]
net: ethernet: mtk_eth_soc: do not overwrite mtu configuration running reset routine

Restore user configured MTU running mtk_hw_init() during tx timeout routine
since it will be overwritten after a hw reset.

Reported-by: Felix Fietkau <nbd@nbd.name>
Fixes: 9ea4d311509f ("net: ethernet: mediatek: add the whole ethernet reset into the reset process")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipa: avoid a null pointer dereference
Alex Elder [Wed, 16 Nov 2022 22:37:18 +0000 (16:37 -0600)]
net: ipa: avoid a null pointer dereference

Dan Carpenter reported that Smatch found an instance where a pointer
which had previously been assumed could be null (as indicated by a
null check) was later dereferenced without a similar check.

In practice this doesn't lead to a problem because currently the
pointers used are all non-null.  Nevertheless this patch addresses
the reported problem.

In addition, I spotted another bug that arose in the same commit.
When the command to initialize a routing table memory region was
added, the number of entries computed for the non-hashed table
was wrong (it ended up being a Boolean rather than the count
intended).  This bug is fixed here as well.

Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/kernel-janitors/Y3OOP9dXK6oEydkf@kili
Tested-by: Caleb Connolly <caleb.connolly@linaro.com>
Fixes: 5cb76899fb47 ("net: ipa: reduce arguments to ipa_table_init_add()")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'wireless-next-2022-11-18' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Fri, 18 Nov 2022 11:44:36 +0000 (11:44 +0000)]
Merge tag 'wireless-next-2022-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.2

Second set of patches for v6.2. Only driver patches this time, nothing
really special. Unused platform data support was removed from wl1251
and rtw89 got WoWLAN support.

Major changes:

ath11k

* support configuring channel dwell time during scan

rtw89

* new dynamic header firmware format support

* Wake-over-WLAN support

rtl8xxxu

* enable IEEE80211_HW_SUPPORT_FAST_XMIT
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'sctp-vrf'
David S. Miller [Fri, 18 Nov 2022 11:42:54 +0000 (11:42 +0000)]
Merge branch 'sctp-vrf'

Xin Long says:

====================
sctp: support vrf processing

This patchset adds the VRF processing in SCTP. Simliar to TCP/UDP,
it includes socket bind and socket/association lookup changes.

For socket bind change, it allows sockets to bind to a VRF device
and allows multiple sockets with the same IP and PORT to bind to
different interfaces in patch 1-3.

For socket/association lookup change, it adds dif and sdif check
in both asoc and ep lookup in patch 4 and 5, and when binding to
nodev, users can decide if accept the packets received from one
l3mdev by setup a sysctl option in patch 6.

Note with VRF support, in a netns, an association will be decided
by src ip + src port + dst ip + dst port + bound_dev_if, and it's
possible for ss to have:

  State       Local Address:Port      Peer Address:Port
   ESTAB     192.168.1.2%vrf-s1:1234
   `- ESTAB   192.168.1.2%veth1:1234   192.168.1.1:1234
   ESTAB     192.168.1.2%vrf-s2:1234
   `- ESTAB   192.168.1.2%veth2:1234   192.168.1.1:1234

See the selftest in patch 7 for more usage.

Also, thanks Carlo for testing this patch series on their use.

v1->v2:
  - In Patch 5, move sctp_sk_bound_dev_eq() definition to net/sctp/
    input.c to avoid a build error when IP_SCTP is disabled, as Paolo
    suggested.
  - In Patch 7, avoid one sleep by disabling the IPv6 dad, and remove
    another sleep by using ss to check if the server's ready, and also
    delete two unncessary sleeps in sctp_hello.c, as Paolo suggested.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoselftests: add a selftest for sctp vrf
Xin Long [Wed, 16 Nov 2022 20:01:22 +0000 (15:01 -0500)]
selftests: add a selftest for sctp vrf

This patch adds 12 small test cases: 01-04 test for the sysctl
net.sctp.l3mdev_accept. 05-10 test for only binding to a right
l3mdev device, the connection can be created. 11-12 test for
two socks binding to different l3mdev devices at the same time,
each of them can process the packets from the corresponding
peer. The tests run for both IPv4 and IPv6 SCTP.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: add sysctl net.sctp.l3mdev_accept
Xin Long [Wed, 16 Nov 2022 20:01:21 +0000 (15:01 -0500)]
sctp: add sysctl net.sctp.l3mdev_accept

This patch is to add sysctl net.sctp.l3mdev_accept to allow
users to change the pernet global l3mdev_accept.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: add dif and sdif check in asoc and ep lookup
Xin Long [Wed, 16 Nov 2022 20:01:20 +0000 (15:01 -0500)]
sctp: add dif and sdif check in asoc and ep lookup

This patch at first adds a pernet global l3mdev_accept to decide if it
accepts the packets from a l3mdev when a SCTP socket doesn't bind to
any interface. It's set to 1 to avoid any possible incompatible issue,
and in next patch, a sysctl will be introduced to allow to change it.

Then similar to inet/udp_sk_bound_dev_eq(), sctp_sk_bound_dev_eq() is
added to check either dif or sdif is equal to sk_bound_dev_if, and to
check sid is 0 or l3mdev_accept is 1 if sk_bound_dev_if is not set.
This function is used to match a association or a endpoint, namely
called by sctp_addrs_lookup_transport() and sctp_endpoint_is_match().
All functions that needs updating are:

sctp_rcv():
  asoc:
  __sctp_rcv_lookup()
    __sctp_lookup_association() -> sctp_addrs_lookup_transport()
    __sctp_rcv_lookup_harder()
      __sctp_rcv_init_lookup()
         __sctp_lookup_association() -> sctp_addrs_lookup_transport()
      __sctp_rcv_walk_lookup()
         __sctp_rcv_asconf_lookup()
           __sctp_lookup_association() -> sctp_addrs_lookup_transport()

  ep:
  __sctp_rcv_lookup_endpoint() -> sctp_endpoint_is_match()

sctp_connect():
  sctp_endpoint_is_peeled_off()
    __sctp_lookup_association()
      sctp_has_association()
        sctp_lookup_association()
          __sctp_lookup_association() -> sctp_addrs_lookup_transport()

sctp_diag_dump_one():
  sctp_transport_lookup_process() -> sctp_addrs_lookup_transport()

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: add skb_sdif in struct sctp_af
Xin Long [Wed, 16 Nov 2022 20:01:19 +0000 (15:01 -0500)]
sctp: add skb_sdif in struct sctp_af

Add skb_sdif function in struct sctp_af to get the enslaved device
for both ipv4 and ipv6 when adding SCTP VRF support in sctp_rcv in
the next patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: check sk_bound_dev_if when matching ep in get_port
Xin Long [Wed, 16 Nov 2022 20:01:18 +0000 (15:01 -0500)]
sctp: check sk_bound_dev_if when matching ep in get_port

In sctp_get_port_local(), when binding to IP and PORT, it should
also check sk_bound_dev_if to match listening sk if it's set by
SO_BINDTOIFINDEX, so that multiple sockets with the same IP and
PORT, but different sk_bound_dev_if can be listened at the same
time.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: check ipv6 addr with sk_bound_dev if set
Xin Long [Wed, 16 Nov 2022 20:01:17 +0000 (15:01 -0500)]
sctp: check ipv6 addr with sk_bound_dev if set

When binding to an ipv6 address, it calls ipv6_chk_addr() to check if
this address is on any dev. If a socket binds to a l3mdev but no dev
is passed to do this check, all l3mdev and slaves will be skipped and
the check will fail.

This patch is to pass the bound_dev to make sure the devices under the
same l3mdev can be returned in ipv6_chk_addr(). When the bound_dev is
not a l3mdev or l3slave, l3mdev_master_dev_rcu() will return NULL in
__ipv6_chk_addr_and_flags(), it will keep compitable with before when
NULL dev was passed.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosctp: verify the bind address with the tb_id from l3mdev
Xin Long [Wed, 16 Nov 2022 20:01:16 +0000 (15:01 -0500)]
sctp: verify the bind address with the tb_id from l3mdev

After binding to a l3mdev, it should use the route table from the
corresponding VRF to verify the addr when binding to an address.

Note ipv6 doesn't need it, as binding to ipv6 address does not
verify the addr with route lookup.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: libwx: Fix dead code for duplicate check
Jiawen Wu [Wed, 16 Nov 2022 01:58:35 +0000 (09:58 +0800)]
net: libwx: Fix dead code for duplicate check

Fix duplicate check on polling timeout.

Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: mscc: macsec: do not copy encryption keys
Antoine Tenart [Tue, 15 Nov 2022 15:44:51 +0000 (16:44 +0100)]
net: phy: mscc: macsec: do not copy encryption keys

Following 1b16b3fdf675 ("net: phy: mscc: macsec: clear encryption keys when freeing a flow"),
go one step further and instead of calling memzero_explicit on the key
when freeing a flow, simply not copy the key in the first place as it's
only used when a new flow is set up.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-ipa-change-gsi-firmware-load-specification'
Jakub Kicinski [Fri, 18 Nov 2022 05:46:58 +0000 (21:46 -0800)]
Merge branch 'net-ipa-change-gsi-firmware-load-specification'

Alex Elder says:

====================
net: ipa: change GSI firmware load specification

Currently, GSI firmware must be loaded for IPA before it can be
used--either by the modem, or by the AP.  New hardware supports a
third option, with the bootloader taking responsibility for loading
GSI firmware.  In that case, neither the AP nor the modem needs to
do that.

The first patch in this series deprecates the "modem-init" Device
Tree property in the IPA binding, using a new "qcom,gsi-loader"
property instead.  The second and third implement logic in the code
to support either the "old" or the "new" way of specifying how GSI
firmware is loaded.

The last two patches implement a new value for the "qcom,gsi-loader"
property.  If the value is "skip", neither the AP nor modem needs to
load the GSI firmware.  The first of these patches implements the
change in the IPA binding; the second implements it in the code.
====================

Link: https://lore.kernel.org/r/20221116073257.34010-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: permit GSI firmware loading to be skipped
Alex Elder [Wed, 16 Nov 2022 07:32:56 +0000 (01:32 -0600)]
net: ipa: permit GSI firmware loading to be skipped

Define a new value "skip" for the "qcom,gsi-loader" Device Tree
property.  If used, it indicates that neither the AP nor the modem
need to load GSI firmware (because it has already been loaded--for
example by the boot loader).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: qcom,ipa: support skipping GSI firmware load
Alex Elder [Wed, 16 Nov 2022 07:32:55 +0000 (01:32 -0600)]
dt-bindings: net: qcom,ipa: support skipping GSI firmware load

Add a new enumerated value to those defined for the qcom,gsi-loader
property.  If the qcom,gsi-loader is "skip", the GSI firmware will
already be loaded, so neither the AP nor modem is required to load
GSI firmware.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: introduce "qcom,gsi-loader" property
Alex Elder [Wed, 16 Nov 2022 07:32:54 +0000 (01:32 -0600)]
net: ipa: introduce "qcom,gsi-loader" property

Introduce a new way of specifying how the GSI firmware gets loaded
for IPA.  Currently, this is indicated by the presence or absence of
the Boolean "modem-init" Device Tree property.  The new property
must have a value--either "self" or "modem"--which indicates whether
the AP or modem is the GSI firmware loader, respectively.

For legacy systems, the new property will not exist, and the
"modem-init" property will be used.  For newer systems, the
"qcom,gsi-loader" property *must* exist, and must have one of the
two prescribed values.  It is an error to have both properties
defined, and it is an error for the new property to have an
unrecognized value.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipa: encapsulate decision about firmware load
Alex Elder [Wed, 16 Nov 2022 07:32:53 +0000 (01:32 -0600)]
net: ipa: encapsulate decision about firmware load

The GSI layer used for IPA requires firmware to be loaded.

Currently either the AP or the modem loads the firmware,
distinguished by whether the "modem-init" Device Tree
property is defined.

Some newer systems implement a third option.  In preparation for
that, encapsulate the code that determines how the GSI firmware
gets loaded in a new function, ipa_firmware_loader().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: qcom,ipa: deprecate modem-init
Alex Elder [Wed, 16 Nov 2022 07:32:52 +0000 (01:32 -0600)]
dt-bindings: net: qcom,ipa: deprecate modem-init

GSI firmware for IPA must be loaded during initialization, either by
the AP or by the modem.  The loader is currently specified based on
whether the Boolean modem-init property is present.

Instead, use a new property with an enumerated value to indicate
explicitly how GSI firmware gets loaded.  With this in place, a
third approach can be added in an upcoming patch.

The new qcom,gsi-loader property has two defined values:
  - self:   The AP loads GSI firmware
  - modem:  The modem loads GSI firmware
The modem-init property must still be supported, but is now marked
deprecated.

Update the example so it represents the SC7180 SoC, and provide
examples for the qcom,gsi-loader, memory-region, and firmware-name
properties.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: move SCTP_PAD4 and SCTP_TRUNC4 to linux/sctp.h
Xin Long [Tue, 15 Nov 2022 15:40:21 +0000 (10:40 -0500)]
sctp: move SCTP_PAD4 and SCTP_TRUNC4 to linux/sctp.h

Move these two macros from net/sctp/sctp.h to linux/sctp.h, so that
it will be enough to include only linux/sctp.h in nft_exthdr.c and
xt_sctp.c. It should not include "net/sctp/sctp.h" if a module does
not have a dependence on SCTP module.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Link: https://lore.kernel.org/r/ef6468a687f36da06f575c2131cd4612f6b7be88.1668526821.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: change to include linux/sctp.h in net/sctp/checksum.h
Xin Long [Tue, 15 Nov 2022 15:39:53 +0000 (10:39 -0500)]
sctp: change to include linux/sctp.h in net/sctp/checksum.h

Currently "net/sctp/checksum.h" including "net/sctp/sctp.h" is
included in quite some places in netfilter and openswitch and
net/sched. It's not necessary to include "net/sctp/sctp.h" if
a module does not have dependence on SCTP, "linux/sctp.h" is
the right one to include.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Link: https://lore.kernel.org/r/ca7ea96d62a26732f0491153c3979dc1c0d8d34a.1668526793.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'implement-devlink-rate-api-and-extend-it'
Jakub Kicinski [Fri, 18 Nov 2022 05:41:41 +0000 (21:41 -0800)]
Merge branch 'implement-devlink-rate-api-and-extend-it'

Michal Wilczynski says:

====================
Implement devlink-rate API and extend it

This patch series implements devlink-rate for ice driver. Unfortunately
current API isn't flexible enough for our use case, so there is a need to
extend it. Some functions have been introduced to enable the driver to
export current Tx scheduling configuration.

Pasting justification for this series from commit implementing devlink-rate
in ice driver(that is a part of this series):

There is a need to support modification of Tx scheduler tree, in the
ice driver. This will allow user to control Tx settings of each node in
the internal hierarchy of nodes. As a result user will be able to use
Hierarchy QoS implemented entirely in the hardware.

This patch implemenents devlink-rate API. It also exports initial
default hierarchy. It's mostly dictated by the fact that the tree
can't be removed entirely, all we can do is enable the user to modify
it. For example root node shouldn't ever be removed, also nodes that
have children are off-limits.

Example initial tree with 2 VF's:

[root@fedora ~]# devlink port function rate show
pci/0000:4b:00.0/node_27: type node parent node_26
pci/0000:4b:00.0/node_26: type node parent node_0
pci/0000:4b:00.0/node_34: type node parent node_33
pci/0000:4b:00.0/node_33: type node parent node_32
pci/0000:4b:00.0/node_32: type node parent node_16
pci/0000:4b:00.0/node_19: type node parent node_18
pci/0000:4b:00.0/node_18: type node parent node_17
pci/0000:4b:00.0/node_17: type node parent node_16
pci/0000:4b:00.0/node_21: type node parent node_20
pci/0000:4b:00.0/node_20: type node parent node_3
pci/0000:4b:00.0/node_14: type node parent node_5
pci/0000:4b:00.0/node_5: type node parent node_3
pci/0000:4b:00.0/node_13: type node parent node_4
pci/0000:4b:00.0/node_12: type node parent node_4
pci/0000:4b:00.0/node_11: type node parent node_4
pci/0000:4b:00.0/node_10: type node parent node_4
pci/0000:4b:00.0/node_9: type node parent node_4
pci/0000:4b:00.0/node_8: type node parent node_4
pci/0000:4b:00.0/node_7: type node parent node_4
pci/0000:4b:00.0/node_6: type node parent node_4
pci/0000:4b:00.0/node_4: type node parent node_3
pci/0000:4b:00.0/node_3: type node parent node_16
pci/0000:4b:00.0/node_16: type node parent node_15
pci/0000:4b:00.0/node_15: type node parent node_0
pci/0000:4b:00.0/node_2: type node parent node_1
pci/0000:4b:00.0/node_1: type node parent node_0
pci/0000:4b:00.0/node_0: type node
pci/0000:4b:00.0/1: type leaf parent node_27
pci/0000:4b:00.0/2: type leaf parent node_27

Let me visualize part of the tree:

                        +---------+
                        |  node_0 |
                        +---------+
                             |
                        +----v----+
                        | node_26 |
                        +----+----+
                             |
                        +----v----+
                        | node_27 |
                        +----+----+
                             |
                    |-----------------|
               +----v----+       +----v----+
               |   VF 1  |       |   VF 2  |
               +----+----+       +----+----+

So at this point there is a couple things that can be done.
For example we could only assign parameters to VF's.

[root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \
                 tx_max 5Gbps

This would cap the VF 1 BW to 5Gbps.

But let's say you would like to create a completely new branch.
This can be done like this:

[root@fedora ~]# devlink port function rate add \
                 pci/0000:4b:00.0/node_custom parent node_0
[root@fedora ~]# devlink port function rate add \
                 pci/0000:4b:00.0/node_custom_1 parent node_custom
[root@fedora ~]# devlink port function rate set \
                 pci/0000:4b:00.0/1 parent node_custom_1

This creates a completely new branch and reassigns VF 1 to it.

A number of parameters is supported per each node: tx_max, tx_share,
tx_priority and tx_weight.
====================

Link: https://lore.kernel.org/r/20221115104825.172668-1-michal.wilczynski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: Add documentation for new devlink-rate attributes
Michal Wilczynski [Tue, 15 Nov 2022 10:48:25 +0000 (11:48 +0100)]
Documentation: Add documentation for new devlink-rate attributes

Provide documentation for newly introduced netlink attributes for
devlink-rate: tx_priority and tx_weight.

Mention the possibility to export tree from the driver.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Add documentation for devlink-rate implementation
Michal Wilczynski [Tue, 15 Nov 2022 10:48:24 +0000 (11:48 +0100)]
ice: Add documentation for devlink-rate implementation

Add documentation to a newly added devlink-rate feature. Provide some
examples on how to use the commands, which netlink attributes are
supported and descriptions of the attributes.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Prevent ADQ, DCB coexistence with Custom Tx scheduler
Michal Wilczynski [Tue, 15 Nov 2022 10:48:23 +0000 (11:48 +0100)]
ice: Prevent ADQ, DCB coexistence with Custom Tx scheduler

ADQ, DCB might interfere with Custom Tx Scheduler changes that user
might introduce using devlink-rate API.

Check if ADQ, DCB is active, when user tries to change any setting
in exported Tx scheduler tree. If any of those are active block the user
from doing so, and log an appropriate message.

Remove the exported hierarchy if user enable ADQ or DCB.
Prevent ADQ or DCB from getting configured if user already made some
changes using devlink-rate API.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Implement devlink-rate API
Michal Wilczynski [Tue, 15 Nov 2022 10:48:22 +0000 (11:48 +0100)]
ice: Implement devlink-rate API

There is a need to support modification of Tx scheduler tree, in the
ice driver. This will allow user to control Tx settings of each node in
the internal hierarchy of nodes. As a result user will be able to use
Hierarchy QoS implemented entirely in the hardware.

This patch implemenents devlink-rate API. It also exports initial
default hierarchy. It's mostly dictated by the fact that the tree
can't be removed entirely, all we can do is enable the user to modify
it. For example root node shouldn't ever be removed, also nodes that
have children are off-limits.

Example initial tree with 2 VF's:

[root@fedora ~]# devlink port function rate show

pci/0000:4b:00.0/node_27: type node parent node_26
pci/0000:4b:00.0/node_26: type node parent node_0
pci/0000:4b:00.0/node_34: type node parent node_33
pci/0000:4b:00.0/node_33: type node parent node_32
pci/0000:4b:00.0/node_32: type node parent node_16
pci/0000:4b:00.0/node_19: type node parent node_18
pci/0000:4b:00.0/node_18: type node parent node_17
pci/0000:4b:00.0/node_17: type node parent node_16
pci/0000:4b:00.0/node_21: type node parent node_20
pci/0000:4b:00.0/node_20: type node parent node_3
pci/0000:4b:00.0/node_14: type node parent node_5
pci/0000:4b:00.0/node_5: type node parent node_3
pci/0000:4b:00.0/node_13: type node parent node_4
pci/0000:4b:00.0/node_12: type node parent node_4
pci/0000:4b:00.0/node_11: type node parent node_4
pci/0000:4b:00.0/node_10: type node parent node_4
pci/0000:4b:00.0/node_9: type node parent node_4
pci/0000:4b:00.0/node_8: type node parent node_4
pci/0000:4b:00.0/node_7: type node parent node_4
pci/0000:4b:00.0/node_6: type node parent node_4
pci/0000:4b:00.0/node_4: type node parent node_3
pci/0000:4b:00.0/node_3: type node parent node_16
pci/0000:4b:00.0/node_16: type node parent node_15
pci/0000:4b:00.0/node_15: type node parent node_0
pci/0000:4b:00.0/node_2: type node parent node_1
pci/0000:4b:00.0/node_1: type node parent node_0
pci/0000:4b:00.0/node_0: type node
pci/0000:4b:00.0/1: type leaf parent node_27
pci/0000:4b:00.0/2: type leaf parent node_27

Let me visualize part of the tree:

                    +---------+
                    |  node_0 |
                    +---------+
                         |
                    +----v----+
                    | node_26 |
                    +----+----+
                         |
                    +----v----+
                    | node_27 |
                    +----+----+
                         |
                |-----------------|
           +----v----+       +----v----+
           |   VF 1  |       |   VF 2  |
           +----+----+       +----+----+

So at this point there is a couple things that can be done.
For example we could only assign parameters to VF's.

[root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \
                 tx_max 5Gbps

This would cap the VF 1 BW to 5Gbps.

But let's say you would like to create a completely new branch.
This can be done like this:

[root@fedora ~]# devlink port function rate add \
                 pci/0000:4b:00.0/node_custom parent node_0
[root@fedora ~]# devlink port function rate add \
                 pci/0000:4b:00.0/node_custom_1 parent node_custom
[root@fedora ~]# devlink port function rate set \
                 pci/0000:4b:00.0/1 parent node_custom_1

This creates a completely new branch and reassigns VF 1 to it.

A number of parameters is supported per each node: tx_max, tx_share,
tx_priority and tx_weight.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>