]> git.apps.os.sepia.ceph.com Git - ceph-client.git/log
ceph-client.git
3 years agoperf c2c: Sort on peer snooping for load operations
Leo Yan [Thu, 11 Aug 2022 06:24:49 +0000 (14:24 +0800)]
perf c2c: Sort on peer snooping for load operations

This patch adds a new option 'peer' so can sort on the cache hit for
peer snooping.

For displaying with option 'peer', the "Shared Data Cache Line Table"
and "Shared Cache Line Distribution Pareto" both sort with the metrics
"tot_peer".

As result, we can get the 'peer' display:

  # perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio

  =================================================
             Shared Data Cache Line Table
  =================================================
  #
  #        ----------- Cacheline ----------     Peer  ------- Load Peer -------    Total    Total    Total  --------- Stores --------  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
  # Index             Address  Node  PA cnt    Snoop    Total    Local   Remote  records    Loads   Stores    L1Hit   L1Miss      N/A       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
  #
        0      0xaaaac17d6000   N/A       0  100.00%       99       99        0    18851    18851        0        0        0        0        0    18752        0        99        0         0        0         0         0

  =================================================
        Shared Cache Line Distribution Pareto
  =================================================
  #
  #        -- Peer Snoop --  ------- Store Refs ------  --------- Data address ---------                                                  ---------- cycles ----------    Total       cpu                                    Shared
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss      N/A              Offset  Node  PA cnt      Pid                Tid        Code address  rmt peer  lcl peer      load  records       cnt                  Symbol            Object      Source:Line  Node{cpus %peers %stores}
  # .....  .......  .......  .......  .......  .......  ..................  ....  ......  .......  .................  ..................  ........  ........  ........  .......  ........  ......................  ................  ...............  ....
  #
    ----------------------------------------------------------------------
        0        0       99        0        0        0      0xaaaac17d6000
    ----------------------------------------------------------------------
             0.00%    3.03%    0.00%    0.00%    0.00%                0x20   N/A       0     3603     3603:memstress      0xaaaac17c25ac         0       376        41     9314         2  [.] 0x00000000000025ac  memstress         memstress[25ac]   0{ 2 100.0%    n/a}
             0.00%    3.03%    0.00%    0.00%    0.00%                0x20   N/A       0     3603     3606:memstress      0xaaaac17c25ac         0       375        44     9155         1  [.] 0x00000000000025ac  memstress         memstress[25ac]   0{ 1 100.0%    n/a}
             0.00%   48.48%    0.00%    0.00%    0.00%                0x29   N/A       0     3603     3606:memstress      0xaaaac17c3e88         0       180       170       65         1  [.] 0x0000000000003e88  memstress         memstress[3e88]   0{ 1 100.0%    n/a}
             0.00%   45.45%    0.00%    0.00%    0.00%                0x29   N/A       0     3603     3603:memstress      0xaaaac17c3e88         0       180       175       70         2  [.] 0x0000000000003e88  memstress         memstress[3e88]   0{ 2 100.0%    n/a}

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-14-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Refactor display string
Leo Yan [Thu, 11 Aug 2022 06:24:48 +0000 (14:24 +0800)]
perf c2c: Refactor display string

The display type is shown by combination the display string array and a
suffix string "HITMs", which is not friendly to extend display for other
sorting type (e.g. extension for peer operations).

This patch moves the suffix string "HITMs" into display string array for
HITM types, so it can allow us to not necessarily to output string
"HITMs" for new incoming display type.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-13-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Refactor node header
Leo Yan [Thu, 11 Aug 2022 06:24:47 +0000 (14:24 +0800)]
perf c2c: Refactor node header

The node header array contains 3 items, each item is used for one of
the 3 flavors for node accessing info.  To extend sorting on other
snooping type and not always stick to HITMs, the second header string
"Node{cpus %hitms %stores}" should be adjusted (e.g. it's changed as
"Node{cpus %peer %stores}").

For this reason, this patch changes the node header array to three
flat variables and uses switch-case in function setup_nodes_header(),
thus it is easier for altering the header string.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-12-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'
Leo Yan [Thu, 11 Aug 2022 06:24:46 +0000 (14:24 +0800)]
perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'

Use more general naming for the main sort dimension, this can allow us
not to sort only on HITM snoop type, so it can be extended to support
other costly snooping operations.  So rename the dimension to the prefix
'percent_costly_".

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-11-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Use explicit names for display macros
Leo Yan [Thu, 11 Aug 2022 06:24:45 +0000 (14:24 +0800)]
perf c2c: Use explicit names for display macros

Perf c2c tool has an assumption that it heavily depends on HITM snoop
type to detect cache false sharing, unfortunately, HITM is not supported
on some architectures.

Essentially, perf c2c tool wants to find some very costly snooping
operations for false cache sharing, this means it's not necessarily
to stick using HITM tags and we can explore other snooping types
(e.g. SNOOPX_PEER).

For this reason, this patch renames HITM related display macros with
suffix '_HITM', so it can be distinct if later add more display types
for on other snooping type.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-10-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Add mean dimensions for peer operations
Leo Yan [Thu, 11 Aug 2022 06:24:44 +0000 (14:24 +0800)]
perf c2c: Add mean dimensions for peer operations

This patch adds two dimensions for the mean value of peer operations.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Add dimensions of peer metrics for cache line view
Leo Yan [Thu, 11 Aug 2022 06:24:43 +0000 (14:24 +0800)]
perf c2c: Add dimensions of peer metrics for cache line view

This patch adds dimensions of peer ops, which will be used for Shared
cache line distribution pareto.

It adds the percentage dimensions for local and remote peer operations,
and the dimensions for accounting operation numbers which is used for
stdio mode.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Add dimensions for peer load operations
Leo Yan [Thu, 11 Aug 2022 06:24:42 +0000 (14:24 +0800)]
perf c2c: Add dimensions for peer load operations

This patch adds three dimensions for peer load operations of 'lcl_peer',
'rmt_peer' and 'tot_peer'.  These three dimensions will be used in the
shared data cache line table.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf c2c: Output statistics for peer snooping
Leo Yan [Thu, 11 Aug 2022 06:24:41 +0000 (14:24 +0800)]
perf c2c: Output statistics for peer snooping

This patch outputs statistics for peer snooping for whole trace events
and global shared cache line.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf mem: Add statistics for peer snooping
Leo Yan [Thu, 11 Aug 2022 06:24:40 +0000 (14:24 +0800)]
perf mem: Add statistics for peer snooping

Since the flag PERF_MEM_SNOOPX_PEER is added to support cache snooping
from peer cache line, it can come from a peer core, a peer cluster, or
a remote NUMA node.

This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER.  Note, we
take PERF_MEM_SNOOPX_PEER as an affiliated info, it needs to cooperate
with cache level statistics.  Therefore, we account the load operations
for both the cache level's metrics (e.g. ld_l2hit, ld_llchit, etc.) and
peer related metrics when flag PERF_MEM_SNOOPX_PEER is set.

So three new metrics are introduced: 'lcl_peer' is for local cache
access, the metric 'rmt_peer' is for remote access (includes remote DRAM
and any caches in remote node), and the metric 'tot_peer' is accounting
the sum value of 'lcl_peer' and 'rmt_peer'.

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf arm-spe: Use SPE data source for neoverse cores
Ali Saidi [Thu, 11 Aug 2022 06:24:39 +0000 (14:24 +0800)]
perf arm-spe: Use SPE data source for neoverse cores

When synthesizing data from SPE, augment the type with source information
for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
the same encoding. I can't find encoding information for any other SPE
implementations to unify their choices with Arm's thus that is left for
future work.

This change populates the mem_lvl_num for Neoverse cores as well as the
deprecated mem_lvl namespace.

Reviewed-by: German Gomez <german.gomez@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-4-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf mem: Print snoop peer flag
Leo Yan [Thu, 11 Aug 2022 06:24:38 +0000 (14:24 +0800)]
perf mem: Print snoop peer flag

Since PERF_MEM_SNOOPX_PEER flag is a new snoop type, print this flag if
it is set.

Before:
       memstress  3603 [020]   122.463754:          1            l1d-miss:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          l1d-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1            llc-miss:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          llc-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          tlb-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1              memory:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK  N/A               aaaac17c3e88 [unknown] (/home/ubuntu/memstress)

After:

       memstress  3603 [020]   122.463754:          1            l1d-miss:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          l1d-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1            llc-miss:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          llc-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1          tlb-access:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)
       memstress  3603 [020]   122.463754:          1              memory:       8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK  N/A              aaaac17c3e88 [unknown] (/home/ubuntu/memstress)

Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Sync addition of PERF_MEM_SNOOPX_PEER
Ali Saidi [Thu, 11 Aug 2022 06:24:37 +0000 (14:24 +0800)]
perf tools: Sync addition of PERF_MEM_SNOOPX_PEER

Add a flag to the 'perf mem' data struct to signal that a request caused
a cache-to-cache transfer of a line from a peer of the requestor and
wasn't sourced from a lower cache level.

The line being moved from one peer cache to another has latency and
performance implications.

On Arm64 Neoverse systems the data source can indicate a cache-to-cache
transfer but not if the line is dirty or clean, so instead of
overloading HITM define a new flag that indicates this type of transfer.

Committer notes:

This really is not syncing with the kernel since the patch to the kernel
wasn't merged.

But we're going ahead of this as it seems trivial and is just a matter
of the perf kernel maintainers to give their ack or for us to find
another way of expressing this in the perf records synthesized in
userspace from the ARM64 hardware traces.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-2-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf arm64: Add missing -I for tools/arch/arm64/include/ to find asm/sysreg.h when...
Leo Yan [Thu, 11 Aug 2022 22:06:38 +0000 (19:06 -0300)]
perf arm64: Add missing -I for tools/arch/arm64/include/ to find asm/sysreg.h when building arm_spe.h

This cures a current problem where tools/perf/util/arm-spe.c isn't
finding a ARM64 specific asm header, so lets add it for now to make
progress.

Adding a .o specific rule seems clunky, lets try and find if this is
really the right solution.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/lkml/20220811124825.GA868014@leoy-huanghe.lan
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Tidy guest option documentation
Adrian Hunter [Thu, 11 Aug 2022 17:04:11 +0000 (20:04 +0300)]
perf tools: Tidy guest option documentation

Move common guest options into include files. Use attribute substitution to
customize an example, using "[verse]" to define the block instead of a
"literal" block which does not permit substitution.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf inject: Fix missing guestmount option documentation
Adrian Hunter [Thu, 11 Aug 2022 17:04:10 +0000 (20:04 +0300)]
perf inject: Fix missing guestmount option documentation

The 'perf inject' documentation is missing the guestmount option. Add it.

Fixes: 97406a7e4fa6e5ca ("perf inject: Add support for injecting guest sideband events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf script: Fix missing guest option documentation
Adrian Hunter [Thu, 11 Aug 2022 17:04:09 +0000 (20:04 +0300)]
perf script: Fix missing guest option documentation

The 'perf script' documentation is missing several options relating to
guests.  Add them.

Fixes: 15a108af1a18b597 ("perf script: Allow specifying the files to process guest samples")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220811170411.84154-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf offcpu: Update offcpu test for child process
Namhyung Kim [Thu, 11 Aug 2022 18:54:56 +0000 (11:54 -0700)]
perf offcpu: Update offcpu test for child process

Record off-cpu data with perf bench sched messaging workload and count
the number of offcpu-time events.  Also update the test script not to
run next tests if failed already and revise the error messages.

  $ sudo ./perf test offcpu -v
   88: perf record offcpu profiling tests                              :
  --- start ---
  test child forked, pid 344780
  Checking off-cpu privilege
  Basic off-cpu test
  Basic off-cpu test [Success]
  Child task off-cpu test
  Child task off-cpu test [Success]
  test child finished with 0
  ---- end ----
  perf record offcpu profiling tests: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf offcpu: Track child processes
Namhyung Kim [Thu, 11 Aug 2022 18:54:55 +0000 (11:54 -0700)]
perf offcpu: Track child processes

When -p option used or a workload is given, it needs to handle child
processes.  The perf_event can inherit those task events
automatically.  We can add a new BPF program in task_newtask
tracepoint to track child processes.

Before:
  $ sudo perf record --off-cpu -- perf bench sched messaging
  $ sudo perf report --stat | grep -A1 offcpu
  offcpu-time stats:
            SAMPLE events:        1

After:
  $ sudo perf record -a --off-cpu -- perf bench sched messaging
  $ sudo perf report --stat | grep -A1 offcpu
  offcpu-time stats:
            SAMPLE events:      856

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf offcpu: Parse process id separately
Namhyung Kim [Thu, 11 Aug 2022 18:54:54 +0000 (11:54 -0700)]
perf offcpu: Parse process id separately

The current target code uses thread id for tracking tasks because
perf_events need to be opened for each task.  But we can use tgid in
BPF maps and check it easily.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf offcpu: Check process id for the given workload
Namhyung Kim [Thu, 11 Aug 2022 18:54:53 +0000 (11:54 -0700)]
perf offcpu: Check process id for the given workload

Current task filter checks task->pid which is different for each
thread.  But we want to profile all the threads in the process.  So
let's compare process id (or thread-group id: tgid) instead.

Before:
  $ sudo perf record --off-cpu -- perf bench sched messaging -t
  $ sudo perf report --stat | grep -A1 offcpu
  offcpu-time stats:
            SAMPLE events:        2

After:
  $ sudo perf record --off-cpu -- perf bench sched messaging -t
  $ sudo perf report --stat | grep -A1 offcpu
  offcpu-time stats:
            SAMPLE events:      850

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Do not pass NULL to parse_events()
Adrian Hunter [Tue, 9 Aug 2022 08:07:02 +0000 (11:07 +0300)]
perf tools: Do not pass NULL to parse_events()

Many cases do not use the extra error information provided by
parse_events and instead pass NULL as the struct parse_events_error
pointer. Add a wrapper for those cases so that the pointer is never
NULL.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809080702.6921-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Fix Track with sched_switch test for hybrid case
Adrian Hunter [Tue, 9 Aug 2022 08:07:01 +0000 (11:07 +0300)]
perf tests: Fix Track with sched_switch test for hybrid case

If cpu_core PMU event fails to parse, try also cpu_atom PMU event when
parsing cycles event.

Fixes: 43eb05d066795bdf ("perf tests: Support 'Track with sched_switch' test for hybrid")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809080702.6921-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Fix segfault when event parser gets an error
Adrian Hunter [Tue, 9 Aug 2022 08:07:00 +0000 (11:07 +0300)]
perf parse-events: Fix segfault when event parser gets an error

parse_events() is often called with parse_events_error set to NULL.
Make parse_events_error__handle() not segfault in that case.

A subsequent patch changes to avoid passing NULL in the first place.

Fixes: 43eb05d066795bdf ("perf tests: Support 'Track with sched_switch' test for hybrid")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809080702.6921-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf machine: Fix missing free of machine->kallsyms_filename
Adrian Hunter [Tue, 9 Aug 2022 13:07:58 +0000 (16:07 +0300)]
perf machine: Fix missing free of machine->kallsyms_filename

Add missing free of machine->kallsyms_filename to machine__exit().

Fixes: a5367ecb5353fbf2 ("perf tools: Automatically use guest kcore_dir if present")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809130758.12800-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf script: Fix reference to perf insert instead of perf inject
Adrian Hunter [Tue, 9 Aug 2022 12:32:58 +0000 (15:32 +0300)]
perf script: Fix reference to perf insert instead of perf inject

Amend "perf insert" to "perf inject".

Fixes: e28fb159f1163e76 ("perf script: Add machine_pid and vcpu")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809123258.9086-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf sched latency: Fix subcommand matching error
Yang Jihong [Mon, 8 Aug 2022 09:24:08 +0000 (17:24 +0800)]
perf sched latency: Fix subcommand matching error

perf sched latency use strncmp to match subcommands which matching does not
meet expectation.

Before:

  # perf sched lat1234 >/dev/null
  # echo $?
  0
  #

Solution: Use strstarts to match subcommand.

After:

   # perf sched lat1234

   Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -i, --input <file>    input file name
      -v, --verbose         be more verbose (show symbol address, etc)

  # echo $?
  129
  #
  # perf sched lat >/dev/null
  # echo $?
  0
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220808092408.107399-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf kvm: Fix subcommand matching error
Yang Jihong [Mon, 8 Aug 2022 09:24:07 +0000 (17:24 +0800)]
perf kvm: Fix subcommand matching error

Currently the 'diff', 'top', 'buildid-list' and 'stat' perf commands use
strncmp() to match subcommands.  As a result, matching does not meet
expectation.

For example:
  # perf kvm diff1234
  # Event 'cycles'
  #
  # Baseline  Delta Abs  Shared Object  Symbol
  # ........  .........  .............  ......
  #

  # Event 'dummy:HG'
  #
  # Baseline  Delta Abs  Shared Object  Symbol
  # ........  .........  .............  ......
  #
  # echo $?
  0
  #

Invalid information should be returned, but success is actually returned.

Solution: Use strstarts() to match subcommands.

After:
  # perf kvm diff1234

   Usage: perf kvm [<options>] {top|record|report|diff|buildid-list|stat}

      -i, --input <file>    Input file name
      -o, --output <file>   Output file name
      -v, --verbose         be more verbose (show counter open errors, etc)
          --guest           Collect guest os data
          --guest-code      Guest code can be found in hypervisor process
          --guestkallsyms <file>
                            file saving guest os /proc/kallsyms
          --guestmodules <file>
                            file saving guest os /proc/modules
          --guestmount <directory>
                            guest mount directory under which every guest os instance has a subdir
          --guestvmlinux <file>
                            file saving guest os vmlinux
          --host            Collect host os data

  # echo $?
  129
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220808092408.107399-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf probe: Fix an error handling path in 'parse_perf_probe_command()'
Christophe JAILLET [Sat, 6 Aug 2022 14:51:26 +0000 (16:51 +0200)]
perf probe: Fix an error handling path in 'parse_perf_probe_command()'

If a memory allocation fail, we should branch to the error handling path
in order to free some resources allocated a few lines above.

Fixes: 15354d54698648e2 ("perf probe: Generate event name with line number")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-janitors@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/b71bcb01fa0c7b9778647235c3ab490f699ba278.1659797452.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf inject jit: Ignore memfd and anonymous mmap events if jitdump present
Brian Robbins [Fri, 5 Aug 2022 22:06:45 +0000 (15:06 -0700)]
perf inject jit: Ignore memfd and anonymous mmap events if jitdump present

Some processes store jitted code in memfd mappings to avoid having rwx
mappings.  These processes map the code with a writeable mapping and a
read-execute mapping.  They write the code using the writeable mapping
and then unmap the writeable mapping.  All subsequent execution is
through the read-execute mapping.

perf inject --jit ignores //anon* mappings for each process where a
jitdump is present because it expects to inject mmap events for each
jitted code range, and said jitted code ranges will overlap with the
//anon* mappings.

Ignore /memfd: and [anon:* mappings so that jitted code contained in
/memfd: and [anon:* mappings is treated the same way as jitted code
contained in //anon* mappings.

Signed-off-by: Brian Robbins <brianrob@linux.microsoft.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220805220645.95855-1-brianrob@linux.microsoft.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf list: Add PMU pai_crypto event description for IBM z16
Thomas Richter [Thu, 4 Aug 2022 07:52:21 +0000 (09:52 +0200)]
perf list: Add PMU pai_crypto event description for IBM z16

Add the event description for the IBM z16 pai_crypto PMU released with
commit 1bf54f32f525 ("s390/pai: Add support for cryptography counters")

The document SA22-7832-13 "z/Architecture Principles of Operation",
published May, 2022, contains the description of the
Processor Activity Instrumentation Facility and the cryptography
counter set., See Pages 5-110 to 5-113.

Patch reworked to fit for the converted jevents processing.

Committer notes:

Couldn't find 1bf54f32f525 ("s390/pai: Add support for cryptography
counters") in torvalds/master, in what tree is that cset?

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220804075221.1132849-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events: Remove bad jaketown uncore events
Ian Rogers [Fri, 5 Aug 2022 01:38:56 +0000 (18:38 -0700)]
perf vendor events: Remove bad jaketown uncore events

The event converter scripts at:

  https://github.com/intel/event-converter-for-linux-perf

passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:

  https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d

Fixes: 376d8b581b7639c9 ("perf vendor events: Update Intel jaketown")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events: Remove bad ivytown uncore events
Ian Rogers [Fri, 5 Aug 2022 01:38:55 +0000 (18:38 -0700)]
perf vendor events: Remove bad ivytown uncore events

The event converter scripts at:

  https://github.com/intel/event-converter-for-linux-perf

passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:

  https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d

Fixes: 6220136831e34615 ("perf vendor events: Update Intel ivytown")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events: Remove bad broadwellde uncore events
Ian Rogers [Fri, 5 Aug 2022 01:38:54 +0000 (18:38 -0700)]
perf vendor events: Remove bad broadwellde uncore events

The event converter scripts at:

  https://github.com/intel/event-converter-for-linux-perf

passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:

  https://github.com/intel/event-converter-for-linux-perf/pull/15/commits/afd779df99ee41aac646eae1ae5ae651cda3394d

Fixes: ef908a192512bf45 ("perf vendor events: Update Intel broadwellde")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf jevents: Add JEVENTS_ARCH make option
Ian Rogers [Thu, 4 Aug 2022 22:18:02 +0000 (15:18 -0700)]
perf jevents: Add JEVENTS_ARCH make option

Allow the architecture built into pmu-events.c to be set on the make
command line with JEVENTS_ARCH.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf jevents: Simplify generation of C-string
Ian Rogers [Thu, 4 Aug 2022 22:18:01 +0000 (15:18 -0700)]
perf jevents: Simplify generation of C-string

Previous implementation wanted variable order and '(null)' string output
to match the C implementation. The '(null)' string output was a
quirk/bug and so there is no need to carry it forward.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf jevents: Clean up pytype warnings
Ian Rogers [Thu, 4 Aug 2022 22:18:00 +0000 (15:18 -0700)]
perf jevents: Clean up pytype warnings

Improve type hints to clean up pytype warnings.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools build: Switch to new openssl API for test-libcrypto
Roberto Sassu [Tue, 19 Jul 2022 17:05:55 +0000 (19:05 +0200)]
tools build: Switch to new openssl API for test-libcrypto

Switch to new EVP API for detecting libcrypto, as Fedora 36 returns an
error when it encounters the deprecated function MD5_Init() and the others.

The error would be interpreted as missing libcrypto, while in reality it is
not.

Fixes: 6e8ccb4f624a73c5 ("tools/bpf: properly account for libbfd variations")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220719170555.2576993-4-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoRevert "perf build: Suppress openssl v3 deprecation warnings in libcrypto feature...
Arnaldo Carvalho de Melo [Tue, 9 Aug 2022 19:23:53 +0000 (16:23 -0300)]
Revert "perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test"

This reverts commit 10fef869a58e37ec649b61eddab545f2da57a79b.

Because a proper fix was submitted.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf build: Remove FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting
Roberto Sassu [Tue, 19 Jul 2022 17:05:54 +0000 (19:05 +0200)]
perf build: Remove FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting

As the building mechanism is now able to retry detection with different
combinations of linking flags, setting
FEATURE_CHECK_LDFLAGS-disassembler-four-args and
FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore,
so remove it.

Committer notes:

Use the same technique to find the set of bfd-related libraries to link as in:

  3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features")

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220719170555.2576993-3-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agobpftool: Complete libbfd feature detection
Roberto Sassu [Tue, 19 Jul 2022 17:05:53 +0000 (19:05 +0200)]
bpftool: Complete libbfd feature detection

Commit 6e8ccb4f624a7 ("tools/bpf: properly account for libbfd variations")
sets the linking flags depending on which flavor of the libbfd feature was
detected.

However, the flavors except libbfd cannot be detected, as they are not in
the feature list.

Complete the list of features to detect by adding libbfd-liberty and
libbfd-liberty-z.

Committer notes:

Adjust conflict with with:

  1e1613f64cc8a09d ("tools bpftool: Don't display disassembler-four-args feature test")
  600b7b26c07a070d ("tools bpftool: Fix compilation error with new binutils")

Fixes: 6e8ccb4f624a73c5 ("tools/bpf: properly account for libbfd variations")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220719170555.2576993-2-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools, build: Retry detection of bfd-related features
Roberto Sassu [Tue, 19 Jul 2022 17:05:52 +0000 (19:05 +0200)]
tools, build: Retry detection of bfd-related features

While separate features have been defined to determine which linking flags
are required to use libbfd depending on the distribution (libbfd,
libbfd-liberty and libbfd-liberty-z), the same has not been done for other
features requiring linking to libbfd.

For example, disassembler-four-args requires linking to libbfd too, but it
should use the right linking flags. If not all the required ones are
specified, e.g. -liberty, detection will always fail even if the feature is
available.

Instead of creating new features, similarly to libbfd, simply retry
detection with the different set of flags until detection succeeds (or
fails, if the libraries are missing). In this way, feature detection is
transparent for the users of this building mechanism (e.g. perf), and those
users don't have for example to set an appropriate value for the
FEATURE_CHECK_LDFLAGS-disassembler-four-args variable.

The number of retries and features for which the retry mechanism is
implemented is low enough to make the increase in the complexity of
Makefile negligible.

Tested with perf and bpftool on Ubuntu 20.04.4 LTS, Fedora 36 and openSUSE
Tumbleweed.

Committer notes:

Do the retry for disassembler-init-styled as well.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220719170555.2576993-1-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf test: JSON format checking
Claire Jensen [Fri, 5 Aug 2022 20:01:05 +0000 (13:01 -0700)]
perf test: JSON format checking

Add field checking tests for perf stat JSON output.

Sanity checks the expected number of fields are present, that the
expected keys are present and they have the correct values.

Committer notes:

Had to fix this:

  -               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
  +               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \

Committer testing:

  [root@quaco ~]# perf test json
   90: perf stat JSON output linter                                    : Ok
  [root@quaco ~]# set -o vi
  [root@quaco ~]# perf test -v json
   90: perf stat JSON output linter                                    :
  --- start ---
  test child forked, pid 560794
  Checking json output: no args [Success]
  Checking json output: system wide [Success]
  Checking json output: system wide Checking json output: system wide no aggregation [Success]
  Checking json output: interval [Success]
  Checking json output: event [Success]
  Checking json output: per core [Success]
  Checking json output: per thread [Success]
  Checking json output: per die [Success]
  Checking json output: per node [Success]
  Checking json output: per socket [Success]
  test child finished with 0
  ---- end ----
  perf stat JSON output linter: Ok
  [root@quaco ~]#

Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Add JSON output option
Claire Jensen [Fri, 5 Aug 2022 20:01:04 +0000 (13:01 -0700)]
perf stat: Add JSON output option

CSV output is tricky to format and column layout changes are susceptible
to breaking parsers. New JSON-formatted output has variable names to
identify fields that are consistent and informative, making the output
parseable.

CSV output example:

  1.20,msec,task-clock:u,1204272,100.00,0.697,CPUs utilized
  0,,context-switches:u,1204272,100.00,0.000,/sec
  0,,cpu-migrations:u,1204272,100.00,0.000,/sec
  70,,page-faults:u,1204272,100.00,58.126,K/sec

JSON output example:

  {"counter-value" : "3805.723968", "unit" : "msec", "event" :
  "cpu-clock", "event-runtime" : 3805731510100.00, "pcnt-running"
  : 100.00, "metric-value" : 4.007571, "metric-unit" : "CPUs utilized"}
  {"counter-value" : "6166.000000", "unit" : "", "event" :
  "context-switches", "event-runtime" : 3805723045100.00, "pcnt-running"
  : 100.00, "metric-value" : 1.620191, "metric-unit" : "K/sec"}
  {"counter-value" : "466.000000", "unit" : "", "event" :
  "cpu-migrations", "event-runtime" : 3805727613100.00, "pcnt-running"
  : 100.00, "metric-value" : 122.447136, "metric-unit" : "/sec"}
  {"counter-value" : "208.000000", "unit" : "", "event" :
  "page-faults", "event-runtime" : 3805726799100.00, "pcnt-running"
  : 100.00, "metric-value" : 54.654516, "metric-unit" : "/sec"}

Also added documentation for JSON option.

There is some tidy up of CSV code including a potential memory over run
in the os.nfields set up. To facilitate this an AGGR_MAX value is added.

Committer notes:

Fixed up using PRIu64 to format u64 values, not %lu.

Committer testing:

  ⬢[acme@toolbox perf]$ perf stat -j sleep 1
  {"counter-value" : "0.731750", "unit" : "msec", "event" : "task-clock:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000731, "metric-unit" : "CPUs utilized"}
  {"counter-value" : "0.000000", "unit" : "", "event" : "context-switches:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
  {"counter-value" : "0.000000", "unit" : "", "event" : "cpu-migrations:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
  {"counter-value" : "75.000000", "unit" : "", "event" : "page-faults:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 102.494021, "metric-unit" : "K/sec"}
  {"counter-value" : "578765.000000", "unit" : "", "event" : "cycles:u", "event-runtime" : 379366, "pcnt-running" : 49.00, "metric-value" : 0.790933, "metric-unit" : "GHz"}
  {"counter-value" : "1298.000000", "unit" : "", "event" : "stalled-cycles-frontend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.224271, "metric-unit" : "frontend cycles idle"}
  {"counter-value" : "21984.000000", "unit" : "", "event" : "stalled-cycles-backend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 3.798433, "metric-unit" : "backend cycles idle"}
  {"counter-value" : "468197.000000", "unit" : "", "event" : "instructions:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.808959, "metric-unit" : "insn per cycle"}
  {"metric-value" : 0.046955, "metric-unit" : "stalled cycles per insn"}
  {"counter-value" : "103335.000000", "unit" : "", "event" : "branches:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 141.216262, "metric-unit" : "M/sec"}
  {"counter-value" : "2381.000000", "unit" : "", "event" : "branch-misses:u", "event-runtime" : 388654, "pcnt-running" : 50.00, "metric-value" : 2.304156, "metric-unit" : "of all branches"}
  ⬢[acme@toolbox perf]$

Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Tue, 9 Aug 2022 03:15:13 +0000 (20:15 -0700)]
Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull ksmbd updates from Steve French:

 - fixes for memory access bugs (out of bounds access, oops, leak)

 - multichannel fixes

 - session disconnect performance improvement, and session register
   improvement

 - cleanup

* tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix heap-based overflow in set_ntacl_dacl()
  ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
  ksmbd: prevent out of bound read for SMB2_WRITE
  ksmbd: fix use-after-free bug in smb2_tree_disconect
  ksmbd: fix memory leak in smb2_handle_negotiate
  ksmbd: fix racy issue while destroying session on multichannel
  ksmbd: use wait_event instead of schedule_timeout()
  ksmbd: fix kernel oops from idr_remove()
  ksmbd: add channel rwlock
  ksmbd: replace sessions list in connection with xarray
  MAINTAINERS: ksmbd: add entry for documentation
  ksmbd: remove unused ksmbd_share_configs_cleanup function

3 years agoMerge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 9 Aug 2022 03:04:35 +0000 (20:04 -0700)]
Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more iov_iter updates from Al Viro:

 - more new_sync_{read,write}() speedups - ITER_UBUF introduction

 - ITER_PIPE cleanups

 - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
   switching them to advancing semantics

 - making ITER_PIPE take high-order pages without splitting them

 - handling copy_page_from_iter() for high-order pages properly

* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
  fix copy_page_from_iter() for compound destinations
  hugetlbfs: copy_page_to_iter() can deal with compound pages
  copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
  expand those iov_iter_advance()...
  pipe_get_pages(): switch to append_pipe()
  get rid of non-advancing variants
  ceph: switch the last caller of iov_iter_get_pages_alloc()
  9p: convert to advancing variant of iov_iter_get_pages_alloc()
  af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
  iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
  block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: saner helper for page array allocation
  fold __pipe_get_pages() into pipe_get_pages()
  ITER_XARRAY: don't open-code DIV_ROUND_UP()
  unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
  unify xarray_get_pages() and xarray_get_pages_alloc()
  unify pipe_get_pages() and pipe_get_pages_alloc()
  iov_iter_get_pages(): sanity-check arguments
  iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
  ...

3 years agofix copy_page_from_iter() for compound destinations
Al Viro [Fri, 29 Jul 2022 16:54:53 +0000 (12:54 -0400)]
fix copy_page_from_iter() for compound destinations

had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when
ITER_BVEC had first appeared)...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agohugetlbfs: copy_page_to_iter() can deal with compound pages
Al Viro [Thu, 23 Jun 2022 21:24:09 +0000 (17:24 -0400)]
hugetlbfs: copy_page_to_iter() can deal with compound pages

... since April 2021

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agocopy_page_to_iter(): don't split high-order page in case of ITER_PIPE
Al Viro [Thu, 23 Jun 2022 21:21:37 +0000 (17:21 -0400)]
copy_page_to_iter(): don't split high-order page in case of ITER_PIPE

... just shove it into one pipe_buffer.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoexpand those iov_iter_advance()...
Al Viro [Sat, 11 Jun 2022 08:04:33 +0000 (04:04 -0400)]
expand those iov_iter_advance()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agopipe_get_pages(): switch to append_pipe()
Al Viro [Tue, 14 Jun 2022 20:38:53 +0000 (16:38 -0400)]
pipe_get_pages(): switch to append_pipe()

now that we are advancing the iterator, there's no need to
treat the first page separately - just call append_pipe()
in a loop.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoget rid of non-advancing variants
Al Viro [Fri, 10 Jun 2022 17:05:12 +0000 (13:05 -0400)]
get rid of non-advancing variants

mechanical change; will be further massaged in subsequent commits

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoceph: switch the last caller of iov_iter_get_pages_alloc()
Al Viro [Fri, 10 Jun 2022 15:43:27 +0000 (11:43 -0400)]
ceph: switch the last caller of iov_iter_get_pages_alloc()

here nothing even looks at the iov_iter after the call, so we couldn't
care less whether it advances or not.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years ago9p: convert to advancing variant of iov_iter_get_pages_alloc()
Al Viro [Fri, 10 Jun 2022 15:42:02 +0000 (11:42 -0400)]
9p: convert to advancing variant of iov_iter_get_pages_alloc()

that one is somewhat clumsier than usual and needs serious testing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoaf_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
Al Viro [Thu, 9 Jun 2022 15:14:04 +0000 (11:14 -0400)]
af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()

... and adjust the callers

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
Al Viro [Thu, 9 Jun 2022 15:07:52 +0000 (11:07 -0400)]
iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()

... and untangle the cleanup on failure to add into pipe.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoblock: convert to advancing variants of iov_iter_get_pages{,_alloc}()
Al Viro [Thu, 9 Jun 2022 14:37:57 +0000 (10:37 -0400)]
block: convert to advancing variants of iov_iter_get_pages{,_alloc}()

... doing revert if we end up not using some pages

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
Al Viro [Thu, 9 Jun 2022 14:28:36 +0000 (10:28 -0400)]
iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()

Most of the users immediately follow successful iov_iter_get_pages()
with advancing by the amount it had returned.

Provide inline wrappers doing that, convert trivial open-coded
uses of those.

BTW, iov_iter_get_pages() never returns more than it had been asked
to; such checks in cifs ought to be removed someday...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: saner helper for page array allocation
Al Viro [Fri, 17 Jun 2022 18:45:41 +0000 (14:45 -0400)]
iov_iter: saner helper for page array allocation

All call sites of get_pages_array() are essenitally identical now.
Replace with common helper...

Returns number of slots available in resulting array or 0 on OOM;
it's up to the caller to make sure it doesn't ask to zero-entry
array (i.e. neither maxpages nor size are allowed to be zero).

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofold __pipe_get_pages() into pipe_get_pages()
Al Viro [Fri, 17 Jun 2022 18:30:39 +0000 (14:30 -0400)]
fold __pipe_get_pages() into pipe_get_pages()

... and don't mangle maxsize there - turn the loop into counting
one instead.  Easier to see that we won't run out of array that
way.  Note that special treatment of the partial buffer in that
thing is an artifact of the non-advancing semantics of
iov_iter_get_pages() - if not for that, it would be append_pipe(),
same as the body of the loop that follows it.  IOW, once we make
iov_iter_get_pages() advancing, the whole thing will turn into
calculate how many pages do we want
allocate an array (if needed)
call append_pipe() that many times.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_XARRAY: don't open-code DIV_ROUND_UP()
Al Viro [Sat, 11 Jun 2022 00:30:35 +0000 (20:30 -0400)]
ITER_XARRAY: don't open-code DIV_ROUND_UP()

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
Al Viro [Fri, 17 Jun 2022 17:54:15 +0000 (13:54 -0400)]
unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts

same as for pipes and xarrays; after that iov_iter_get_pages() becomes
a wrapper for __iov_iter_get_pages_alloc().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify xarray_get_pages() and xarray_get_pages_alloc()
Al Viro [Fri, 17 Jun 2022 17:48:03 +0000 (13:48 -0400)]
unify xarray_get_pages() and xarray_get_pages_alloc()

same as for pipes

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify pipe_get_pages() and pipe_get_pages_alloc()
Al Viro [Fri, 17 Jun 2022 17:35:35 +0000 (13:35 -0400)]
unify pipe_get_pages() and pipe_get_pages_alloc()

The differences between those two are
* pipe_get_pages() gets a non-NULL struct page ** value pointing to
preallocated array + array size.
* pipe_get_pages_alloc() gets an address of struct page ** variable that
contains NULL, allocates the array and (on success) stores its address in
that variable.

Not hard to combine - always pass struct page ***, have
the previous pipe_get_pages_alloc() caller pass ~0U as cap for
array size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_get_pages(): sanity-check arguments
Al Viro [Fri, 17 Jun 2022 19:15:14 +0000 (15:15 -0400)]
iov_iter_get_pages(): sanity-check arguments

zero maxpages is bogus, but best treated as "just return 0";
NULL pages, OTOH, should be treated as a hard bug.

get rid of now completely useless checks in xarray_get_pages{,_alloc}().

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
Al Viro [Sat, 11 Jun 2022 00:38:20 +0000 (20:38 -0400)]
iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper

Incidentally, ITER_XARRAY did *not* free the sucker in case when
iter_xarray_populate_pages() returned 0...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: fold data_start() and pipe_space_for_user() together
Al Viro [Wed, 15 Jun 2022 13:44:38 +0000 (09:44 -0400)]
ITER_PIPE: fold data_start() and pipe_space_for_user() together

All their callers are next to each other; all of them
want the total amount of pages and, possibly, the
offset in the partial final buffer.

Combine into a new helper (pipe_npages()), fix the
bogosity in pipe_space_for_user(), while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: cache the type of last buffer
Al Viro [Wed, 15 Jun 2022 06:02:51 +0000 (02:02 -0400)]
ITER_PIPE: cache the type of last buffer

We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
check if ->iov_offset is non-zero (i.e. that pipe is not empty)
if so, get the corresponding pipe_buffer and check its ->ops
if it's &default_pipe_buf_ops, we have an anon buffer.

Let's replace the use of ->iov_offset (which is nowhere near similar to
its role for other flavours) with signed field (->last_offset), with
the following rules:
empty, no buffers occupied: 0
anon, with bytes up to N-1 filled: N
zero-copy, with bytes up to N-1 filled: -N

That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.

Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.

Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator.  About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe.  There are only two cases where we leave the
sane state:
1) iov_iter_get_pages()/iov_iter_get_pages_alloc().  Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
2) iov_iter copied, then something is put into the copy.  Since
they share the underlying pipe, the original gets behind.  When we
decide that we are done with the copy (original is not usable until then)
we advance the original.  direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data.  At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: clean iov_iter_revert()
Al Viro [Sun, 12 Jun 2022 21:54:35 +0000 (17:54 -0400)]
ITER_PIPE: clean iov_iter_revert()

Fold pipe_truncate() into it, clean up.  We can release buffers
in the same loop where we walk backwards to the iterator beginning
looking for the place where the new position will be.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: clean pipe_advance() up
Al Viro [Wed, 15 Jun 2022 20:03:25 +0000 (16:03 -0400)]
ITER_PIPE: clean pipe_advance() up

instead of setting ->iov_offset for new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard
everything after it, adjust ->len at the same time we set ->iov_offset
and use pipe_discard_from() to deal with buffers past that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: lose iter_head argument of __pipe_get_pages()
Al Viro [Thu, 16 Jun 2022 18:26:23 +0000 (14:26 -0400)]
ITER_PIPE: lose iter_head argument of __pipe_get_pages()

it's only used to get to the partial buffer we can add to,
and that's always the last one, i.e. pipe->head - 1.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: fold push_pipe() into __pipe_get_pages()
Al Viro [Sat, 11 Jun 2022 06:52:03 +0000 (02:52 -0400)]
ITER_PIPE: fold push_pipe() into __pipe_get_pages()

Expand the only remaining call of push_pipe() (in
__pipe_get_pages()), combine it with the page-collecting loop there.

Note that the only reason it's not a loop doing append_pipe() is
that append_pipe() is advancing, while iov_iter_get_pages() is not.
As soon as it switches to saner semantics, this thing will switch
to using append_pipe().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: allocate buffers as we go in copy-to-pipe primitives
Al Viro [Tue, 14 Jun 2022 17:53:53 +0000 (13:53 -0400)]
ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives

New helper: append_pipe().  Extends the last buffer if possible,
allocates a new one otherwise.  Returns page and offset in it
on success, NULL on failure.  iov_iter is advanced past the
data we've got.

Use that instead of push_pipe() in copy-to-pipe primitives;
they get simpler that way.  Handling of short copy (in "mc" one)
is done simply by iov_iter_revert() - iov_iter is in consistent
state after that one, so we can use that.

[Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in]
[another braino fix, this time in copy_pipe_to_iter() and pipe_zero();
caught by testcase from Hugh Dickins]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: helpers for adding pipe buffers
Al Viro [Mon, 13 Jun 2022 18:30:15 +0000 (14:30 -0400)]
ITER_PIPE: helpers for adding pipe buffers

There are only two kinds of pipe_buffer in the area used by ITER_PIPE.

1) anonymous - copy_to_iter() et.al. end up creating those and copying
data there.  They have zero ->offset, and their ->ops points to
default_pipe_page_ops.

2) zero-copy ones - those come from copy_page_to_iter(), and page
comes from caller.  ->offset is also caller-supplied - it might be
non-zero.  ->ops points to page_cache_pipe_buf_ops.

Move creation and insertion of those into helpers - push_anon(pipe, size)
and push_page(pipe, page, offset, size) resp., separating them from
the "could we avoid creating a new buffer by merging with the current
head?" logics.

Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: helper for getting pipe buffer by index
Al Viro [Tue, 14 Jun 2022 14:24:37 +0000 (10:24 -0400)]
ITER_PIPE: helper for getting pipe buffer by index

pipe_buffer instances of a pipe are organized as a ring buffer,
with power-of-2 size.  Indices are kept *not* reduced modulo ring
size, so the buffer refered to by index N is
pipe->bufs[N & (pipe->ring_size - 1)].

Ring size can change over the lifetime of a pipe, but not while
the pipe is locked.  So for any iov_iter primitives it's a constant.
Original conversion of pipes to this layout went overboard trying
to microoptimize that - calculating pipe->ring_size - 1, storing
it in a local variable and using through the function.  In some
cases it might be warranted, but most of the times it only
obfuscates what's going on in there.

Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
that and use it in the obvious cases.  More will follow...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agosplice: stop abusing iov_iter_advance() to flush a pipe
Al Viro [Sun, 12 Jun 2022 20:07:49 +0000 (16:07 -0400)]
splice: stop abusing iov_iter_advance() to flush a pipe

Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother
with rather non-obvious use of iov_iter_advance() in there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoswitch new_sync_{read,write}() to ITER_UBUF
Al Viro [Sun, 22 May 2022 20:55:40 +0000 (16:55 -0400)]
switch new_sync_{read,write}() to ITER_UBUF

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonew iov_iter flavour - ITER_UBUF
Al Viro [Sun, 22 May 2022 18:59:25 +0000 (14:59 -0400)]
new iov_iter flavour - ITER_UBUF

Equivalent of single-segment iovec.  Initialized by iov_iter_ubuf(),
checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
ones.

We are going to expose the things like ->write_iter() et.al. to those
in subsequent commits.

New predicate (user_backed_iter()) that is true for ITER_IOVEC and
ITER_UBUF; places like direct-IO handling should use that for
checking that pages we modify after getting them from iov_iter_get_pages()
would need to be dirtied.

DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
will solve all problems - there's code that uses iter_is_iovec() to
decide how to poke around in iov_iter guts and for that the predicate
replacement obviously won't suffice.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoMerge tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 8 Aug 2022 22:16:29 +0000 (15:16 -0700)]
Merge tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:
 "This introduces support for the remoteproc on Mediatek MT8188, and
  enables caches for MT8186 SCP. It adds support for PRU cores found on
  the TI K3 AM62x SoCs.

  It moves the recovery work after a firmware crash to an unbound
  workqueue, to allow recovery to happen in parallel.

  A new DMA API is introduced to release dma_mem for a device.

  It adds support a panic handler for the Qualcomm modem remoteproc,
  with the goal of having caches flushed in memory dumps for post-mortem
  debugging and it introduces a mechanism to wait for the modem firmware
  on SM8450 to decrypt part of its memory for post-mortem debugging.

  Qualcomm sysmon is restricted to only inform remote processors about
  peers that are actually running, to avoid a race where Linux tries to
  notify a recovering remote processor about its peers new state. A
  mechanism for waiting for the sysmon connection to be established is
  also introduced, to avoid out-of-sync updates for rapidly restarting
  remote processors.

  A number of Devicetree binding cleanups and conversions to YAML are
  introduced, to facilitate Devicetree validation. Lastly it introduces
  a number of smaller fixes and cleanups in the core and a few different
  drivers"

* tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (42 commits)
  remoteproc: qcom_q6v5_pas: Do not fail if regulators are not found
  drivers/remoteproc: fix repeated words in comments
  remoteproc: Directly use ida_alloc()/free()
  remoteproc: Use unbounded workqueue for recovery work
  remoteproc: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
  remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators
  remoteproc: sysmon: Send sysmon state only for running rprocs
  remoteproc: sysmon: Wait for SSCTL service to come up
  remoteproc: qcom: q6v5: Set q6 state to offline on receiving wdog irq
  remoteproc: qcom: pas: Check if coredump is enabled
  remoteproc: qcom: pas: Mark devices as wakeup capable
  remoteproc: qcom: pas: Mark va as io memory
  remoteproc: qcom: pas: Add decrypt shutdown support for modem
  remoteproc: qcom: q6v5-mss: add powerdomains to MSM8996 config
  remoteproc: qcom_q6v5: Introduce panic handler for MSS
  remoteproc: qcom_q6v5_mss: Update MBA log info
  remoteproc: qcom: correct kerneldoc
  remoteproc: qcom_q6v5_mss: map/unmap metadata region before/after use
  remoteproc: qcom: using pm_runtime_resume_and_get to simplify the code
  remoteproc: mediatek: Support MT8188 SCP
  ...

3 years agoMerge tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 8 Aug 2022 22:14:43 +0000 (15:14 -0700)]
Merge tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:
 "This contains fixes and cleanups in the rpmsg core, Qualcomm SMD and
  GLINK drivers, a circular lock dependency in the Mediatek driver and
  a possible race condition in the rpmsg_char driver is resolved"

* tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: convert sysfs snprintf to sysfs_emit
  rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge
  rpmsg: qcom: correct kerneldoc
  rpmsg: qcom: glink: remove unused name
  rpmsg: qcom: glink: replace strncpy() with strscpy_pad()
  rpmsg: Strcpy is not safe, use strscpy_pad() instead
  rpmsg: Fix possible refcount leak in rpmsg_register_device_override()
  rpmsg: Fix parameter naming for announce_create/destroy ops
  rpmsg: mtk_rpmsg: Fix circular locking dependency
  rpmsg: char: Add mutex protection for rpmsg_eptdev_open()

3 years agoMerge tag 'linux-watchdog-5.20-rc1' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Mon, 8 Aug 2022 22:04:04 +0000 (15:04 -0700)]
Merge tag 'linux-watchdog-5.20-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - add RTL9310 support

 - sp805_wdt: add arm cmsdk apb wdt support

 - Remove #ifdef guards for PM related functions for several watchdog
   device drivers

 - pm8916_wdt reboot improvements

 - Several other fixes and improvements

* tag 'linux-watchdog-5.20-rc1' of git://www.linux-watchdog.org/linux-watchdog: (24 commits)
  watchdog: armada_37xx_wdt: check the return value of devm_ioremap() in armada_37xx_wdt_probe()
  watchdog: dw_wdt: Fix comment typo
  watchdog: Fix comment typo
  dt-bindings: watchdog: Add fsl,scu-wdt yaml file
  watchdog:Fix typo in comment
  watchdog: pm8916_wdt: Handle watchdog enabled by bootloader
  watchdog: pm8916_wdt: Report reboot reason
  watchdog: pm8916_wdt: Avoid read of write-only PET register
  watchdog: wdat_wdt: Remove #ifdef guards for PM related functions
  watchdog: tegra_wdt: Remove #ifdef guards for PM related functions
  watchdog: st_lpc_wdt: Remove #ifdef guards for PM related functions
  watchdog: sama5d4_wdt: Remove #ifdef guards for PM related functions
  watchdog: s3c2410_wdt: Remove #ifdef guards for PM related functions
  watchdog: mtk_wdt: Remove #ifdef guards for PM related functions
  watchdog: dw_wdt: Remove #ifdef guards for PM related functions
  watchdog: bcm7038_wdt: Remove #ifdef guards for PM related functions
  watchdog: realtek-otto: add RTL9310 support
  dt-bindings: watchdog: realtek,otto-wdt: add RTL9310
  watchdog: sp805_wdt: add arm cmsdk apb wdt support
  watchdog: sp5100_tco: Fix a memory leak of EFCH MMIO resource
  ...

3 years agoMerge tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 8 Aug 2022 21:29:00 +0000 (14:29 -0700)]
Merge tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These are ARM cpufreq updates and operating performance points (OPP)
  updates plus one cpuidle update adding a new trace point.

  Specifics:

   - Fix return error code in mtk_cpu_dvfs_info_init (Yang Yingliang).

   - Minor cleanups and support for new boards for Qcom cpufreq drivers
     (Bryan O'Donoghue, Konrad Dybcio, Pierre Gondois, and Yicong Yang).

   - Fix sparse warnings for Tegra cpufreq driver (Viresh Kumar).

   - Make dev_pm_opp_set_regulators() accept NULL terminated list
     (Viresh Kumar).

   - Add dev_pm_opp_set_config() and friends and migrate other users and
     helpers to using them (Viresh Kumar).

   - Add support for multiple clocks for a device (Viresh Kumar and
     Krzysztof Kozlowski).

   - Configure resources before adding OPP table for Venus (Stanimir
     Varbanov).

   - Keep reference count up for opp->np and opp_table->np while they
     are still in use (Liang He).

   - Minor OPP cleanups (Viresh Kumar and Yang Li).

   - Add a trace event for cpuidle to track missed (too deep or too
     shallow) wakeups (Kajetan Puchalski)"

* tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits)
  cpuidle: Add cpu_idle_miss trace event
  venus: pm_helpers: Fix warning in OPP during probe
  OPP: Don't drop opp->np reference while it is still in use
  OPP: Don't drop opp_table->np reference while it is still in use
  cpufreq: tegra194: Staticize struct tegra_cpufreq_soc instances
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM6375 compatible
  dt-bindings: opp: Add msm8939 to the compatible list
  dt-bindings: opp: Add missing compat devices
  dt-bindings: opp: opp-v2-kryo-cpu: Fix example binding checks
  cpufreq: Change order of online() CB and policy->cpus modification
  cpufreq: qcom-hw: Remove deprecated irq_set_affinity_hint() call
  cpufreq: qcom-hw: Disable LMH irq when disabling policy
  cpufreq: qcom-hw: Reset cancel_throttle when policy is re-enabled
  cpufreq: qcom-cpufreq-hw: use HZ_PER_KHZ macro in units.h
  cpufreq: mediatek: fix error return code in mtk_cpu_dvfs_info_init()
  OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
  PM / devfreq: tegra30: Register config_clks helper
  OPP: Allow config_clks helper for single clk case
  OPP: Provide a simple implementation to configure multiple clocks
  OPP: Assert clk_count == 1 for single clk helpers
  ...

3 years agoMerge tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafae...
Linus Torvalds [Mon, 8 Aug 2022 21:23:37 +0000 (14:23 -0700)]
Merge tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more thermal control updates from Rafael Wysocki:
 "These fix an error code path issue leading to a NULL pointer
  dereference, drop Kconfig dependencies that are not needed any more
  after recent changes, add CPU IDs for new chips to a driver and fix up
  the tmon utility.

  Specifics:

   - Fix NULL pointer dereference in the thermal sysfs interface that
     results from an error code path mishandling (Rafael Wysocki).

   - Drop COMPILE_TEST dependency that's not needed any more from two
     thermal Kconfig entries (Jean Delvare).

   - Make the Intel TCC cooling driver support Alder Lake-N and Raptor
     Lake-P (Sumeet Pawnikar).

   - Fix possible path truncations in the tmon utility (Florian
     Fainelli)"

* tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  tools/thermal: Fix possible path truncations
  thermal: Drop obsolete dependency on COMPILE_TEST
  thermal: sysfs: Fix cooling_device_stats_setup() error code path
  thermal: intel: Add TCC cooling support for Alder Lake-N and Raptor Lake-P

3 years agoMerge tag 'sysctl-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof...
Linus Torvalds [Mon, 8 Aug 2022 21:17:46 +0000 (14:17 -0700)]
Merge tag 'sysctl-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "There isn't much for 6.0 for sysctl stuff, most of the stuff went
  through the networking subsystem (Kuniyuki Iwashima's trove of fixes
  using READ_ONCE/WRITE_ONCE helpers) as most of the issues there have
  been identified on networking side. So it is good we don't have much
  updates as we would have ended up with tons of conflicts. I rebased my
  delta just now to your tree so to avoid conflicts with that stuff.
  This merge request is just minor fluff cleanups then. Perhaps for 6.1
  kernel/sysctl.c will get more love than this release"

* tag 'sysctl-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kernel/sysctl.c: Remove trailing white space
  kernel/sysctl.c: Clean up indentation, replace spaces with tab.
  sysctl: Merge adjacent CONFIG_TREE_RCU blocks

3 years agoMerge tag 'modules-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof...
Linus Torvalds [Mon, 8 Aug 2022 21:12:19 +0000 (14:12 -0700)]
Merge tag 'modules-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "For the 6.0 merge window the modules code shifts to cleanup and minor
  fixes effort. This becomes much easier to do and review now due to the
  code split to its own directory from effort on the last kernel
  release. I expect to see more of this with time and as we expand on
  test coverage in the future. The cleanups and fixes come from usual
  suspects such as Christophe Leroy and Aaron Tomlin but there are also
  some other contributors.

  One particular minor fix worth mentioning is from Helge Deller, where
  he spotted a *forever* incorrect natural alignment on both ELF section
  header tables:

    * .altinstructions
    * __bug_table sections

  A lot of back and forth went on in trying to determine the ill effects
  of this misalignment being present for years and it has been
  determined there should be no real ill effects unless you have a buggy
  exception handler. Helge actually hit one of these buggy exception
  handlers on parisc which is how he ended up spotting this issue. When
  implemented correctly these paths with incorrect misalignment would
  just mean a performance penalty, but given that we are dealing with
  alternatives on modules and with the __bug_table (where info regardign
  BUG()/WARN() file/line information associated with it is stored) this
  really shouldn't be a big deal.

  The only other change with mentioning is the kmap() with
  kmap_local_page() and my only concern with that was on what is done
  after preemption, but the virtual addresses are restored after
  preemption. This is only used on module decompression.

  This all has sit on linux-next for a while except the kmap stuff which
  has been there for 3 weeks"

* tag 'modules-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: Replace kmap() with kmap_local_page()
  module: Show the last unloaded module's taint flag(s)
  module: Use strscpy() for last_unloaded_module
  module: Modify module_flags() to accept show_state argument
  module: Move module's Kconfig items in kernel/module/
  MAINTAINERS: Update file list for module maintainers
  module: Use vzalloc() instead of vmalloc()/memset(0)
  modules: Ensure natural alignment for .altinstructions and __bug_table sections
  module: Increase readability of module_kallsyms_lookup_name()
  module: Fix ERRORs reported by checkpatch.pl
  module: Add support for default value for module async_probe

3 years agoMerge tag 'leds-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel...
Linus Torvalds [Mon, 8 Aug 2022 18:36:21 +0000 (11:36 -0700)]
Merge tag 'leds-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds

Pull LED updates from Pavel Machek:
 "A new driver for bcm63138, is31fl319x updates, fixups for multicolor.

  The clevo-mail driver got disabled, it needs an API fix"

* tag 'leds-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: (23 commits)
  leds: is31fl319x: use simple i2c probe function
  leds: is31fl319x: Fix devm vs. non-devm ordering
  leds: is31fl319x: Make use of dev_err_probe()
  leds: is31fl319x: Make use of device properties
  leds: is31fl319x: Cleanup formatting and dev_dbg calls
  leds: is31fl319x: Add support for is31fl319{0,1,3} chips
  leds: is31fl319x: Move chipset-specific values in chipdef struct
  leds: is31fl319x: Use non-wildcard names for vars, structs and defines
  leds: is31fl319x: Add missing si-en compatibles
  dt-bindings: leds: pwm-multicolor: document max-brigthness
  leds: turris-omnia: convert to use dev_groups
  leds: leds-bcm63138: get rid of LED_OFF
  leds: add help info about BCM63138 module name
  dt-bindings: leds: leds-bcm63138: unify full stops in descriptions
  dt-bindings: leds: lp50xx: fix LED children names
  dt-bindings: leds: class-multicolor: reference class directly in multi-led node
  leds: bcm63138: add support for BCM63138 controller
  dt-bindings: leds: add Broadcom's BCM63138 controller
  leds: clevo-mail: Mark as broken pending interface fix
  leds: pwm-multicolor: Support active-low LEDs
  ...

3 years agoMerge tag 'tty-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Mon, 8 Aug 2022 18:31:40 +0000 (11:31 -0700)]
Merge tag 'tty-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial driver updates from Greg KH:
 "Here is the big set of tty and serial driver changes for 6.0-rc1.

  It was delayed from last week as I wanted to make sure the last commit
  here got some good testing in linux-next and elsewhere as it seemed to
  show up only late in testing for some reason.

  Nothing major here, just lots of cleanups from Jiri and Ilpo to make
  the tty core cleaner (Jiri) and the rs485 code simpler to use (Ilpo).

  Also included in here is the obligatory n_gsm updates from Daniel
  Starke and lots of tiny driver updates and minor fixes and tweaks for
  other smaller serial drivers.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'tty-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (186 commits)
  tty: serial: qcom-geni-serial: Fix %lu -> %u in print statements
  tty: amiserial: Fix comment typo
  tty: serial: document uart_get_console()
  tty: serial: serial_core, reformat kernel-doc for functions
  Documentation: serial: link uart_ops properly
  Documentation: serial: move GPIO kernel-doc to the functions
  Documentation: serial: dedup kernel-doc for uart functions
  Documentation: serial: move uart_ops documentation to the struct
  dt-bindings: serial: snps-dw-apb-uart: Document Rockchip RV1126
  serial: mvebu-uart: uart2 error bits clearing
  tty: serial: fsl_lpuart: correct the count of break characters
  serial: stm32: make info structs static to avoid sparse warnings
  serial: fsl_lpuart: zero out parity bit in CS7 mode
  tty: serial: qcom-geni-serial: Fix get_clk_div_rate() which otherwise could return a sub-optimal clock rate.
  serial: 8250_bcm2835aux: Add missing clk_disable_unprepare()
  tty: vt: initialize unicode screen buffer
  serial: remove VR41XX serial driver
  serial: 8250: lpc18xx: Remove redundant sanity check for RS485 flags
  serial: 8250_dwlib: remove redundant sanity check for RS485 flags
  dt_bindings: rs485: Correct delay values
  ...

3 years agoMerge tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk...
Linus Torvalds [Mon, 8 Aug 2022 18:18:31 +0000 (11:18 -0700)]
Merge tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this cycle, we mainly fixed some corner cases that manipulate a
  per-file compression flag inappropriately. And, we found f2fs counted
  valid blocks in a section incorrectly when zone capacity is set, and
  thus, fixed it with additional sysfs entry to check it easily.

  Lastly, this series includes several patches with respect to the new
  atomic write support such as a couple of bug fixes and re-adding
  atomic_write_abort support that we removed by mistake in the previous
  release.

  Enhancements:
   - add sysfs entries to understand atomic write operations and zone
     capacity
   - introduce memory mode to get a hint for low-memory devices
   - adjust the waiting time of foreground GC
   - decompress clusters under softirq to avoid non-deterministic
     latency
   - do not skip updating inode when retrying to flush node page
   - enforce single zone capacity

  Bug fixes:
   - set the compression/no-compression flags correctly
   - revive F2FS_IOC_ABORT_VOLATILE_WRITE
   - check inline_data during compressed inode conversion
   - understand zone capacity when calculating valid block count

  As usual, the series includes several minor clean-ups and sanity
  checks"

* tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
  f2fs: use onstack pages instead of pvec
  f2fs: intorduce f2fs_all_cluster_page_ready
  f2fs: clean up f2fs_abort_atomic_write()
  f2fs: handle decompress only post processing in softirq
  f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED
  f2fs: do not set compression bit if kernel doesn't support
  f2fs: remove device type check for direct IO
  f2fs: fix null-ptr-deref in f2fs_get_dnode_of_data
  f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITE
  f2fs: fix to do sanity check on segment type in build_sit_entries()
  f2fs: obsolete unused MAX_DISCARD_BLOCKS
  f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()
  f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
  f2fs: introduce sysfs atomic write statistics
  f2fs: don't bother wait_ms by foreground gc
  f2fs: invalidate meta pages only for post_read required inode
  f2fs: allow compression of files without blocks
  f2fs: fix to check inline_data during compressed inode conversion
  f2fs: Delete f2fs_copy_page() and replace with memcpy_page()
  f2fs: fix to invalidate META_MAPPING before DIO write
  ...

3 years agoMerge tag 'fuse-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Mon, 8 Aug 2022 18:10:02 +0000 (11:10 -0700)]
Merge tag 'fuse-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

 - Fix an issue with reusing the bdi in case of block based filesystems

 - Allow root (in init namespace) to access fuse filesystems in user
   namespaces if expicitly enabled with a module param

 - Misc fixes

* tag 'fuse-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: retire block-device-based superblock on force unmount
  vfs: function to prevent re-use of block-device-based superblocks
  virtio_fs: Modify format for virtio_fs_direct_access
  virtiofs: delete unused parameter for virtio_fs_cleanup_vqs
  fuse: Add module param for CAP_SYS_ADMIN access bypassing allow_other
  fuse: Remove the control interface for virtio-fs
  fuse: ioctl: translate ENOSYS
  fuse: limit nsec
  fuse: avoid unnecessary spinlock bump
  fuse: fix deadlock between atomic O_TRUNC and page invalidation
  fuse: write inode in fuse_release()

3 years agoMerge tag 'ovl-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Linus Torvalds [Mon, 8 Aug 2022 18:03:11 +0000 (11:03 -0700)]
Merge tag 'ovl-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs update from Miklos Szeredi:
 "Just a small update"

* tag 'ovl-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix spelling mistakes
  ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh()
  ovl: improve ovl_get_acl() if POSIX ACL support is off
  ovl: fix some kernel-doc comments
  ovl: warn if trusted xattr creation fails

3 years agoMerge tag 'exfat-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linki...
Linus Torvalds [Mon, 8 Aug 2022 17:57:09 +0000 (10:57 -0700)]
Merge tag 'exfat-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat updates from Namjae Jeon:

 - fix the error code of rename syscall

 - cleanup and suppress the superfluous error messages

 - remove duplicate directory entry update

 - add exfat git tree in MAINTAINERS

* tag 'exfat-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  MAINTAINERS: Add Namjae's exfat git tree
  exfat: Drop superfluous new line for error messages
  exfat: Downgrade ENAMETOOLONG error message to debug messages
  exfat: Expand exfat_err() and co directly to pr_*() macro
  exfat: Define NLS_NAME_* as bit flags explicitly
  exfat: Return ENAMETOOLONG consistently for oversized paths
  exfat: remove duplicate write inode for extending dir/file
  exfat: remove duplicate write inode for truncating file
  exfat: reuse __exfat_write_inode() to update directory entry

3 years agovfs: Check the truncate maximum size in inode_newsize_ok()
David Howells [Mon, 8 Aug 2022 08:52:35 +0000 (09:52 +0100)]
vfs: Check the truncate maximum size in inode_newsize_ok()

If something manages to set the maximum file size to MAX_OFFSET+1, this
can cause the xfs and ext4 filesystems at least to become corrupt.

Ordinarily, the kernel protects against userspace trying this by
checking the value early in the truncate() and ftruncate() system calls
calls - but there are at least two places that this check is bypassed:

 (1) Cachefiles will round up the EOF of the backing file to DIO block
     size so as to allow DIO on the final block - but this might push
     the offset negative. It then calls notify_change(), but this
     inadvertently bypasses the checking. This can be triggered if
     someone puts an 8EiB-1 file on a server for someone else to try and
     access by, say, nfs.

 (2) ksmbd doesn't check the value it is given in set_end_of_file_info()
     and then calls vfs_truncate() directly - which also bypasses the
     check.

In both cases, it is potentially possible for a network filesystem to
cause a disk filesystem to be corrupted: cachefiles in the client's
cache filesystem; ksmbd in the server's filesystem.

nfsd is okay as it checks the value, but we can then remove this check
too.

Fix this by adding a check to inode_newsize_ok(), as called from
setattr_prepare(), thereby catching the issue as filesystems set up to
perform the truncate with minimal opportunity for bypassing the new
check.

Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
Fixes: f44158485826 ("cifsd: add file operations")
Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: stable@kernel.org
Acked-by: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Steve French <sfrench@samba.org>
cc: Hyunchul Lee <hyc.lee@gmail.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge branches 'thermal-core' and 'thermal-tools'
Rafael J. Wysocki [Mon, 8 Aug 2022 17:39:09 +0000 (19:39 +0200)]
Merge branches 'thermal-core' and 'thermal-tools'

Merge additional changes in the thermal core and thermal tools updates
for 5.20-rc1:

 - Fix NULL poiter dereference in the thermal sysfs interface that
   results from an error code path mishandling (Rafael Wysocki).

 - Drop COMPILE_TEST dependency that's not needed any more from two
   thermal Kconfig entries (Jean Delvare).

 - Fix possible path truncations in the tmon utility (Florian Fainelli).

* thermal-core:
  thermal: Drop obsolete dependency on COMPILE_TEST
  thermal: sysfs: Fix cooling_device_stats_setup() error code path

* thermal-tools:
  tools/thermal: Fix possible path truncations

3 years agoMerge branch 'pm-cpufreq'
Rafael J. Wysocki [Mon, 8 Aug 2022 17:35:33 +0000 (19:35 +0200)]
Merge branch 'pm-cpufreq'

Merge ARM cpufreq updates for 5.20-rc1.

* pm-cpufreq:
  cpufreq: tegra194: Staticize struct tegra_cpufreq_soc instances
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM6375 compatible
  dt-bindings: opp: Add msm8939 to the compatible list
  dt-bindings: opp: Add missing compat devices
  dt-bindings: opp: opp-v2-kryo-cpu: Fix example binding checks
  cpufreq: Change order of online() CB and policy->cpus modification
  cpufreq: qcom-hw: Remove deprecated irq_set_affinity_hint() call
  cpufreq: qcom-hw: Disable LMH irq when disabling policy
  cpufreq: qcom-hw: Reset cancel_throttle when policy is re-enabled
  cpufreq: qcom-cpufreq-hw: use HZ_PER_KHZ macro in units.h
  cpufreq: mediatek: fix error return code in mtk_cpu_dvfs_info_init()

3 years agoMerge branch 'pm-opp'
Rafael J. Wysocki [Mon, 8 Aug 2022 17:32:33 +0000 (19:32 +0200)]
Merge branch 'pm-opp'

Merge operating performance points (OPP) updates for 5.20-rc1.

* pm-opp: (43 commits)
  venus: pm_helpers: Fix warning in OPP during probe
  OPP: Don't drop opp->np reference while it is still in use
  OPP: Don't drop opp_table->np reference while it is still in use
  OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
  PM / devfreq: tegra30: Register config_clks helper
  OPP: Allow config_clks helper for single clk case
  OPP: Provide a simple implementation to configure multiple clocks
  OPP: Assert clk_count == 1 for single clk helpers
  OPP: Add key specific assert() method to key finding helpers
  OPP: Compare bandwidths for all paths in _opp_compare_key()
  OPP: Allow multiple clocks for a device
  dt-bindings: opp: accept array of frequencies
  OPP: Make dev_pm_opp_set_opp() independent of frequency
  OPP: Reuse _opp_compare_key() in _opp_add_static_v2()
  OPP: Remove rate_not_available parameter to _opp_add()
  OPP: Use consistent names for OPP table instances
  OPP: Use generic key finding helpers for bandwidth key
  OPP: Use generic key finding helpers for level key
  OPP: Add generic key finding helpers and use them for freq APIs
  OPP: Remove dev_pm_opp_find_freq_ceil_by_volt()
  ...

3 years agoMerge tag 'hwlock-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 8 Aug 2022 17:27:51 +0000 (10:27 -0700)]
Merge tag 'hwlock-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull hwspinlock updates from Bjorn Andersson:
 "This removes the need for representing the Qualcomm SFPB mutex using
  an intermediate syscon node and it clean up the pm_runtime_get_sync()
  usage in the OMAP hwspinlock driver"

* tag 'hwlock-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  hwspinlock: qcom: Add support for mmio usage to sfpb-mutex
  hwspinlock: using pm_runtime_resume_and_get instead of pm_runtime_get_sync

3 years agoMerge tag 'mailbox-v5.20' of git://git.linaro.org/landing-teams/working/fujitsu/integ...
Linus Torvalds [Mon, 8 Aug 2022 17:19:40 +0000 (10:19 -0700)]
Merge tag 'mailbox-v5.20' of git://git.linaro.org/landing-teams/working/fujitsu/integration

Pull mailbox updates from Jassi Brar:

 - mtk:
     - use rx_callback instead of cmdq_task_cb

 - qcom:
     - add syscon const
     - add SM6375 compatible

 - imx:
     - enable RST channel
     - clear pending irqs

* tag 'mailbox-v5.20' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: imx: clear pending interrupts
  dt-bindings: mailbox: qcom-ipcc: Add SM6375 compatible
  mailbox: imx: support RST channel
  dt-bindings: mailbox: imx-mu: add RST channel
  dt-bindings: mailbox: qcom,apcs-kpss-global: Add syscon const for relevant entries
  mailbox: mtk-cmdq: Remove proprietary cmdq_task_cb

3 years agoMerge tag 'hyperv-next-signed-20220807' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 8 Aug 2022 17:05:19 +0000 (10:05 -0700)]
Merge tag 'hyperv-next-signed-20220807' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:
 "A few miscellaneous patches. There is no large patch series for this
  merge window"

* tag 'hyperv-next-signed-20220807' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: Create debugfs file with hyper-v balloon usage information
  drm/hyperv : Removing the restruction of VRAM allocation with PCI bar size
  PCI: hv: Take a const cpumask in hv_compose_msi_req_get_cpu()
  Drivers: hv: vm_bus: Handle vmbus rescind calls after vmbus is suspended

3 years agoMerge tag 'coccinelle-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 8 Aug 2022 16:42:04 +0000 (09:42 -0700)]
Merge tag 'coccinelle-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull coccinelle semantic patch updates from Julia Lawall:

 - Update the semantic patches in the kernel that contain a URL for
   Coccinelle with a URL that is currently valid (from myself).

 - Add a semantic patch checking for unnecessary NULL tests on dev_{put,
   hold} functions (from Ziyang Xuan, followed bt a modification from
   myself).

 - Drop a semantic patch that replaces 0/1 by booleans, as this change
   was considered to be not worthwhile by some maintainers (from Steve
   Rostedt).

 - Extend an existing semantic patch with more checks for useless tests
   on variables addresses (from Jérémy Lefaure).

* tag 'coccinelle-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  update Coccinelle URL
  coccinelle: free: add version constraint
  scripts/coccinelle/free: add NULL test before dev_{put, hold} functions
  coccinelle: Remove script that checks replacing 0/1 with false/true in functions returning bool
  coccinelle: Extend address test from ifaddr semantic patch to test expressions

3 years agokernel/sysctl.c: Remove trailing white space
Fanjun Kong [Mon, 16 May 2022 09:07:53 +0000 (17:07 +0800)]
kernel/sysctl.c: Remove trailing white space

This patch removes the trailing white space in kernel/sysysctl.c

Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
[mcgrof: fix commit message subject]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>