git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
3 years agodoc/mgr/prometheus: correct metric name 44666/head
Tatjana Dehler [Wed, 19 Jan 2022 14:15:15 +0000 (15:15 +0100)]
doc/mgr/prometheus: correct metric name

Replace the metric name `node_disk_bytes_written` by
`node_disk_written_bytes_total` to reflect changes made in node exporter
version 0.16.0
https://github.com/prometheus/node_exporter/releases/tag/v0.16.0 /
https://github.com/prometheus/node_exporter/blob/v0.16.0/docs/example-16-compatibility-rules.yml .
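
As a rough illustration of the rename, the sketch below rewrites the old
metric name inside a query string. Only the two metric names come from the
commit message; the helper `rename_metric` and the sample query are
hypothetical and not part of Ceph or node exporter.

```python
import re

# Old -> new metric name, as given in the commit message above.
RENAMED_METRICS = {
    "node_disk_bytes_written": "node_disk_written_bytes_total",
}


def rename_metric(expr: str) -> str:
    """Replace old metric names in a query string (hypothetical helper)."""
    for old, new in RENAMED_METRICS.items():
        # \b leaves longer identifiers that merely contain the old name intact.
        expr = re.sub(rf"\b{re.escape(old)}\b", new, expr)
    return expr


# Sample query, made up for illustration only.
print(rename_metric("irate(node_disk_bytes_written[1m])"))
```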

Fixes: https://tracker.ceph.com/issues/53932
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
3 years agoMerge pull request #44591 from athanatos/sjust/wip-seastore-flush
Samuel Just [Tue, 18 Jan 2022 03:36:10 +0000 (19:36 -0800)]
Merge pull request #44591 from athanatos/sjust/wip-seastore-flush

crimson/os/seastore: avoid empty Transactions by adding explicit flush() call

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agoMerge pull request #44556 from cyx1231st/wip-crimson-improve-log-journal
Samuel Just [Mon, 17 Jan 2022 21:19:02 +0000 (13:19 -0800)]
Merge pull request #44556 from cyx1231st/wip-crimson-improve-log-journal

crimson/os/seastore: consolidate seastore_journal logs with cleanup and validations

Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/os/seastore: implement FuturizedStore::flush 44591/head
Samuel Just [Fri, 14 Jan 2022 06:53:17 +0000 (06:53 +0000)]
crimson/os/seastore: implement FuturizedStore::flush

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agoMerge pull request #44566 from falcon78921/minor-messaging-nit
Sebastian Wagner [Mon, 17 Jan 2022 15:47:53 +0000 (16:47 +0100)]
Merge pull request #44566 from falcon78921/minor-messaging-nit

mgr/cephadm: fix minor grammar nit in Dry-Runs message

3 years agoMerge pull request #44510 from rzarzynski/wip-cephadm-docfix
Sebastian Wagner [Mon, 17 Jan 2022 09:21:45 +0000 (10:21 +0100)]
Merge pull request #44510 from rzarzynski/wip-cephadm-docfix

doc/cephadm: improve the development doc a bit

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agoMerge pull request #44485 from adk3798/agent-permissions
Sebastian Wagner [Mon, 17 Jan 2022 08:40:13 +0000 (09:40 +0100)]
Merge pull request #44485 from adk3798/agent-permissions

cephadm: fix permissions on agent files

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agoMerge pull request #44506 from sebastian-philipp/orch-suite-add-scsi
Sebastian Wagner [Mon, 17 Jan 2022 08:39:50 +0000 (09:39 +0100)]
Merge pull request #44506 from sebastian-philipp/orch-suite-add-scsi

qa/suites/orch/cephadm: Also run the rbd/iscsi suite

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Melissa Li <mingkli@redhat.com>
3 years agoMerge pull request #44603 from cbodley/wip-cmake-parquet
Casey Bodley [Fri, 14 Jan 2022 22:48:07 +0000 (17:48 -0500)]
Merge pull request #44603 from cbodley/wip-cmake-parquet

rgw: disable parquet by default

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
3 years agobuild: revert arrow package dependency 44603/head
Casey Bodley [Fri, 14 Jan 2022 19:54:09 +0000 (14:54 -0500)]
build: revert arrow package dependency

Signed-off-by: Casey Bodley <cbodley@redhat.com>
3 years agocmake: disable parquet by default
Casey Bodley [Fri, 14 Jan 2022 19:50:47 +0000 (14:50 -0500)]
cmake: disable parquet by default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #44523 from ljflores/wip-telemetry-dashboard
Ernesto Puerta [Fri, 14 Jan 2022 19:11:15 +0000 (20:11 +0100)]
Merge pull request #44523 from ljflores/wip-telemetry-dashboard

mgr/dashboard/telemetry: reduce telemetry dashboard preview size

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: neha-ojha <NOT@FOUND>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
3 years agoMerge pull request #44550 from jdurgin/wip-pool-get-quota
Yuri Weinstein [Fri, 14 Jan 2022 18:46:49 +0000 (10:46 -0800)]
Merge pull request #44550 from jdurgin/wip-pool-get-quota

mon/OSDMonitor: avoid null dereference if stats are not available

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #42735 from amathuria/wip-amathuria-scrub-stats
Yuri Weinstein [Fri, 14 Jan 2022 18:46:28 +0000 (10:46 -0800)]
Merge pull request #42735 from amathuria/wip-amathuria-scrub-stats

osd/scrub: Add stats to PG dump for number of objects scrubbed

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
3 years agoMerge pull request #43667 from ifed01/wip-ifed-fix-ram-gridy-fsck
Neha Ojha [Fri, 14 Jan 2022 18:27:31 +0000 (10:27 -0800)]
Merge pull request #43667 from ifed01/wip-ifed-fix-ram-gridy-fsck

os/bluestore: make shared blob fsck much less RAM-greedy.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #44440 from soumyakoduri/wip-skoduri-dbstore-fixes
Soumya Koduri [Fri, 14 Jan 2022 18:08:22 +0000 (23:38 +0530)]
Merge pull request #44440 from soumyakoduri/wip-skoduri-dbstore-fixes

rgw/dbstore: Misc fixes

3 years agoMerge pull request #44552 from jdurgin/wip-releases-doc
Neha Ojha [Fri, 14 Jan 2022 17:42:08 +0000 (09:42 -0800)]
Merge pull request #44552 from jdurgin/wip-releases-doc

doc/releases: remove outdated info and versions; mark nautilus eol

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #44370 from benhanokh/NCB_expand_device_fix
Yuri Weinstein [Fri, 14 Jan 2022 17:06:41 +0000 (09:06 -0800)]
Merge pull request #44370 from benhanokh/NCB_expand_device_fix

NCB code doesn't update allocation file when we expand-device

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #44251 from yaarith/telemetry-opt-in
Yuri Weinstein [Fri, 14 Jan 2022 17:06:11 +0000 (09:06 -0800)]
Merge pull request #44251 from yaarith/telemetry-opt-in

mgr/telemetry: introduce new design for varying report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years agoMerge pull request #43849 from rzarzynski/wip-bs-lucky-buffers
Yuri Weinstein [Fri, 14 Jan 2022 16:44:06 +0000 (08:44 -0800)]
Merge pull request #43849 from rzarzynski/wip-bs-lucky-buffers

blk, os/bluestore: introduce huge page-based read buffers

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years agoMerge pull request #42576 from AmnonHanuhov/wip-port_rgw_classes
Radoslaw Zarzynski [Fri, 14 Jan 2022 15:48:42 +0000 (16:48 +0100)]
Merge pull request #42576 from AmnonHanuhov/wip-port_rgw_classes

crimson/osd: Port rgw object classes to run in crimson

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #44518 from gregsfortytwo/wip-fix-53824
Yuri Weinstein [Fri, 14 Jan 2022 15:47:00 +0000 (07:47 -0800)]
Merge pull request #44518 from gregsfortytwo/wip-fix-53824

osd: PeeringState: fix selection order in calc_replicated_acting_stretch

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/os/seastore: consolidate seastore_journal logs with structured level and... 44556/head
Yingxin Cheng [Wed, 12 Jan 2022 05:42:11 +0000 (13:42 +0800)]
crimson/os/seastore: consolidate seastore_journal logs with structured level and format

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore/journal: validate segments before replay
Yingxin Cheng [Wed, 12 Jan 2022 05:32:34 +0000 (13:32 +0800)]
crimson/os/seastore/journal: validate segments before replay

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore/seastore_types: pretty print data structures
Yingxin Cheng [Wed, 12 Jan 2022 05:04:14 +0000 (13:04 +0800)]
crimson/os/seastore/seastore_types: pretty print data structures

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore: count consumed records in cursor with cleanups
Yingxin Cheng [Wed, 12 Jan 2022 04:59:07 +0000 (12:59 +0800)]
crimson/os/seastore: count consumed records in cursor with cleanups

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore: drop duplicated record_group_t::current_dlength
Yingxin Cheng [Tue, 11 Jan 2022 12:48:38 +0000 (20:48 +0800)]
crimson/os/seastore: drop duplicated record_group_t::current_dlength

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore: classify journal related logs in seastore_types.cc
Yingxin Cheng [Mon, 10 Jan 2022 16:15:51 +0000 (00:15 +0800)]
crimson/os/seastore: classify journal related logs in seastore_types.cc

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore: convert ExtentReader to seastore logging
Yingxin Cheng [Mon, 10 Jan 2022 15:20:53 +0000 (23:20 +0800)]
crimson/os/seastore: convert ExtentReader to seastore logging

Also set the logger to seastore_journal as the component works at
the journal layer.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agomgr/dashboard/telemetry: add test for formatReport() 44523/head
Laura Flores [Fri, 14 Jan 2022 14:37:10 +0000 (14:37 +0000)]
mgr/dashboard/telemetry: add test for formatReport()

Tests a scenario where all keys are removed, and one
where a key is ignored.

Signed-off-by: Laura Flores <lflores@redhat.com>
3 years agocrimson/os/seastore/journal: convert to seastore logging
Yingxin Cheng [Fri, 7 Jan 2022 06:57:56 +0000 (14:57 +0800)]
crimson/os/seastore/journal: convert to seastore logging

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agodoc: document new OBJECTS_SCRUBBED column in pg dump 42735/head
Aishwarya Mathuria [Fri, 14 Jan 2022 14:10:33 +0000 (19:40 +0530)]
doc: document new OBJECTS_SCRUBBED column in pg dump

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
3 years agocrimson/osd: Implement missing objclass functions used by cls_rgw 42576/head
Amnon Hanuhov [Thu, 29 Jul 2021 13:19:48 +0000 (16:19 +0300)]
crimson/osd: Implement missing objclass functions used by cls_rgw

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agocrimson/osd: Add support for CEPH_OSD_OP_OMAPRMKEYS
Amnon Hanuhov [Thu, 29 Jul 2021 13:11:27 +0000 (16:11 +0300)]
crimson/osd: Add support for CEPH_OSD_OP_OMAPRMKEYS

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agocrimson/osd: Add a getter for last_user_version
Amnon Hanuhov [Thu, 29 Jul 2021 12:36:18 +0000 (15:36 +0300)]
crimson/osd: Add a getter for last_user_version

last_user_version is the last user object version applied to store

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agocrimson/osd: drop PGBackend& from OpsExecuter ctor
Amnon Hanuhov [Wed, 11 Aug 2021 16:49:44 +0000 (19:49 +0300)]
crimson/osd: drop PGBackend& from OpsExecuter ctor

OpsExecuter holds a Ref<PG> so the PGBackend can be extracted from it
using get_backend()

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agocrimson/osd: drop pg_pool_t from OpsExecuter ctor
Amnon Hanuhov [Wed, 11 Aug 2021 16:34:55 +0000 (19:34 +0300)]
crimson/osd: drop pg_pool_t from OpsExecuter ctor

OpsExecuter now holds a Ref<PG> so the pool info can be extracted from it
using get_pool().info

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agocrimson/osd: Store a reference to PG inside OpsExecuter
Amnon Hanuhov [Thu, 24 Jun 2021 15:59:53 +0000 (18:59 +0300)]
crimson/osd: Store a reference to PG inside OpsExecuter

This is needed as some ObjClass methods make use of pg information related to the given cls_method_context_t

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years agoMerge pull request #44507 from votdev/issue_53813_nfs_page_not_found
Ernesto Puerta [Fri, 14 Jan 2022 11:56:55 +0000 (12:56 +0100)]
Merge pull request #44507 from votdev/issue_53813_nfs_page_not_found

mgr/dashboard: NFS pages shows 'Page not found'

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
3 years agoMerge pull request #43685 from p-se/fix-grafana-graphs-ceph_daemon
Ernesto Puerta [Fri, 14 Jan 2022 11:50:13 +0000 (12:50 +0100)]
Merge pull request #43685 from p-se/fix-grafana-graphs-ceph_daemon

mgr/dashboard: fix Grafana OSD/host panels

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agoMerge pull request #44573 from rhcs-dashboard/53858-fix-smart-data-single-daemon
Ernesto Puerta [Fri, 14 Jan 2022 11:48:52 +0000 (12:48 +0100)]
Merge pull request #44573 from rhcs-dashboard/53858-fix-smart-data-single-daemon

mgr/dashboard: fix: get SMART data from single-daemon device

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agoMerge pull request #44559 from ideepika/wip-iscsi-53830
Ilya Dryomov [Fri, 14 Jan 2022 09:30:27 +0000 (10:30 +0100)]
Merge pull request #44559 from ideepika/wip-iscsi-53830

test/rbd/iscsi: correct the hostname in gwcli_create.t to match hostname -f

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #44571 from idryomov/wip-xfstests-qemu-cert
Ilya Dryomov [Fri, 14 Jan 2022 09:28:06 +0000 (10:28 +0100)]
Merge pull request #44571 from idryomov/wip-xfstests-qemu-cert

qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agocrimson: add and use FuturizedStore::flush() interface
Samuel Just [Fri, 14 Jan 2022 04:58:16 +0000 (04:58 +0000)]
crimson: add and use FuturizedStore::flush() interface

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agoMerge pull request #44570 from vshankar/wip-53857
Venky Shankar [Fri, 14 Jan 2022 03:12:20 +0000 (08:42 +0530)]
Merge pull request #44570 from vshankar/wip-53857

qa: adjust for MDSs to get deployed before verifying their availability

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44555 from cyx1231st/wip-fix-seastore-jounral-fast-submit
Samuel Just [Fri, 14 Jan 2022 01:23:37 +0000 (17:23 -0800)]
Merge pull request #44555 from cyx1231st/wip-fix-seastore-jounral-fast-submit

crimson/os/seastore/journal: fast submit if RecordSubmitter is IDLE and no pending

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agodoc/releases: remove dev and pre-nautilus releases from timeline 44552/head
Josh Durgin [Wed, 12 Jan 2022 02:15:34 +0000 (21:15 -0500)]
doc/releases: remove dev and pre-nautilus releases from timeline

Improve readability of the table - all this information is still
preserved in older branches.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
3 years agoMerge pull request #44583 from mgfritch/fixup-44306-docker-count
Adam King [Fri, 14 Jan 2022 00:18:35 +0000 (19:18 -0500)]
Merge pull request #44583 from mgfritch/fixup-44306-docker-count

cephadm: increase number of docker.io occurances

Reviewed-by: Adam King <adking@redhat.com>
3 years agocephadm: increase number of docker.io occurances 44583/head
Michael Fritch [Thu, 13 Jan 2022 22:22:40 +0000 (15:22 -0700)]
cephadm: increase number of docker.io occurances

fixup for 0fe2e54db774271e4fc18b45aba36b66cbc71779

Signed-off-by: Michael Fritch <mfritch@suse.com>
3 years agomgr/telemetry: revise format_perf_histogram 44251/head
Yaarit Hatuka [Wed, 12 Jan 2022 23:33:08 +0000 (23:33 +0000)]
mgr/telemetry: revise format_perf_histogram

osd_perf_histograms now include only separated stats; remove the
aggregated formatting; we can revert this in case we ever add aggregated
histograms.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agoPendingReleaseNotes: add a note about telemetry
Yaarit Hatuka [Wed, 12 Jan 2022 06:34:25 +0000 (06:34 +0000)]
PendingReleaseNotes: add a note about telemetry

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add `enable / disable channel all`
Yaarit Hatuka [Wed, 12 Jan 2022 05:57:21 +0000 (05:57 +0000)]
mgr/telemetry: add `enable / disable channel all`

Enable or disable all telemetry channels at once with:
    ceph telemetry enable channel all
    ceph telemetry disable channel all
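
The behaviour of the `all` keyword could be sketched as follows. This is
not the telemetry module's code: `expand_channels` and `set_channels` are
hypothetical helpers, and the channel list is simply the one that appears
in the `ceph telemetry channel ls` output elsewhere in this log.

```python
# Channel names as they appear in the telemetry commit messages in this log.
CHANNELS = ["basic", "ident", "crash", "device", "perf"]


def expand_channels(names):
    """Expand the keyword 'all' to every known channel (hypothetical helper)."""
    if names == ["all"]:
        return list(CHANNELS)
    unknown = [n for n in names if n not in CHANNELS]
    if unknown:
        raise ValueError(f"unknown channel(s): {unknown}")
    return list(names)


def set_channels(state, names, enabled):
    """Return a copy of `state` with the given channels switched on or off."""
    new_state = dict(state)
    for name in expand_channels(names):
        new_state[name] = enabled
    return new_state


state = {name: False for name in CHANNELS}
state = set_channels(state, ["all"], True)
print(state)
```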

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: do not restore channels default when opting-out
Yaarit Hatuka [Wed, 12 Jan 2022 05:32:01 +0000 (05:32 +0000)]
mgr/telemetry: do not restore channels default when opting-out

Other modules do not reset their configuration; keep telemetry module
consistent with this behavior.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: verify there are new collections when nagging due to a major
Yaarit Hatuka [Wed, 12 Jan 2022 05:01:48 +0000 (05:01 +0000)]
mgr/telemetry: verify there are new collections when nagging due to a major
upgrade

When adding a new collection we define whether to nag the user about it.
We may add many collections and nag about none of them. However, in case
of a major upgrade, we wish to notify the user about these new
collections. This commit verifies there are indeed new collections when
nagging due to a major upgrade.
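
One way to read the rule described above, as a minimal sketch: the
function name, its arguments, and the per-collection `nag` flag layout are
assumptions made for illustration, not the module's actual data model.

```python
def should_nag(collections, opted_in, is_major_upgrade):
    """Decide whether to notify the user about new collections.

    collections: dict mapping collection name -> nag flag (assumed layout).
    opted_in: set of collection names the user already opted in to.
    """
    new = [name for name in collections if name not in opted_in]
    if not new:
        # Nothing new: never nag, even after a major upgrade.
        return False
    if is_major_upgrade:
        # After a major upgrade, any new collection is worth a notice.
        return True
    # Otherwise nag only for collections that were flagged for it.
    return any(collections[name] for name in new)


print(should_nag({"basic_base": False}, {"basic_base"}, True))
```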

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: improve output of `ceph telemetry collection ls`
Yaarit Hatuka [Wed, 12 Jan 2022 04:36:27 +0000 (04:36 +0000)]
mgr/telemetry: improve output of `ceph telemetry collection ls`

STATUS column now indicates whether a collection is being reported, and
the reasons why it's not (either the user is not opted-in to this
collection, or its channel is off).

Also, removed the ENROLLED and DEFAULT columns due to potential
confusion they may cause.

In case a user is not opted-in to certain collections, a message will
appear above the table with the missing collections:

    New collections are available:
    ['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
    'ident_base', 'perf_perf']
    Run `ceph telemetry on` to opt-in to these collections.
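
The notice above amounts to a set difference between the available
collections and those the user opted in to. A minimal sketch, assuming a
hypothetical helper `new_collections_notice` (the real module may build
the message differently):

```python
def new_collections_notice(available, opted_in):
    """Return the notice text, or an empty string if nothing is missing."""
    missing = sorted(set(available) - set(opted_in))
    if not missing:
        return ""
    return (
        "New collections are available:\n"
        f"{missing}\n"
        "Run `ceph telemetry on` to opt-in to these collections."
    )


print(new_collections_notice(["basic_base", "perf_perf"], ["basic_base"]))
```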

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: use dict lookup when traversing MODULE_COLLECTION
Yaarit Hatuka [Wed, 12 Jan 2022 02:08:52 +0000 (02:08 +0000)]
mgr/telemetry: use dict lookup when traversing MODULE_COLLECTION

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add test coverage for telemetry upgrade
Yaarit Hatuka [Tue, 7 Dec 2021 23:17:13 +0000 (23:17 +0000)]
mgr/telemetry: add test coverage for telemetry upgrade

Test the behavior of the module after an upgrade, as we shift from our
revision design to Collections.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agodoc/mgr/telemetry: document new commands
Yaarit Hatuka [Tue, 7 Dec 2021 22:16:28 +0000 (22:16 +0000)]
doc/mgr/telemetry: document new commands

New commands:

  ceph telemetry enable channel <channel_name>
  ceph telemetry disable channel <channel_name>
  ceph telemetry channel ls
  ceph telemetry collection ls
  ceph telemetry collection diff
  ceph telemetry preview
  ceph telemetry preview-device
  ceph telemetry preview-all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add command to list all collections
Yaarit Hatuka [Tue, 7 Dec 2021 18:30:56 +0000 (18:30 +0000)]
mgr/telemetry: add command to list all collections

List all collections, their current enrollment state, status, default,
and description, with:

$ ceph telemetry collection ls

NAME                  ENROLLED    STATUS    DEFAULT    DESC
basic_base            TRUE        ON        ON         Basic information about the cluster (capacity, number and type of daemons, version, etc.)
basic_mds_metadata    TRUE        ON        ON         MDS metadata
crash_base            TRUE        ON        ON         Information about daemon crashes (daemon type and version, backtrace, etc.)
device_base           TRUE        ON        ON         Information about device health metrics
ident_base            TRUE        OFF       OFF        User-provided identifying information about the cluster
perf_perf             TRUE        OFF       OFF        Information about performance counters of the cluster

Please note:

NAME:
=====
Collection name; prefix indicates the channel the collection belongs to.

ENROLLED:
=========
Signifies the collections that were available in the module when the
user last opted-in to telemetry. Please note: Even if a collection is
'enrolled', its metrics will be reported only if its channel is enabled.

STATUS:
=======
Indicates whether the collection metrics are reported; this is
determined by the status (enabled / disabled) of the channel the
collection belongs to, along with the enrollment status of the
collection.

DEFAULT:
========
The default status (enabled / disabled) of the channel the collection
belongs to.

DESC:
=====
Collection description.
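
The relationship between NAME, ENROLLED and STATUS described above can be
sketched like this. The helper names are hypothetical; the only facts
taken from the commit message are that the name prefix identifies the
channel and that a collection is reported only when it is enrolled and its
channel is enabled.

```python
def channel_of(collection):
    """The prefix before the first underscore names the channel."""
    return collection.split("_", 1)[0]


def collection_status(collection, enrolled, channel_enabled):
    """Return 'ON' only if the collection is enrolled and its channel is on.

    enrolled: set of enrolled collection names.
    channel_enabled: dict mapping channel name -> bool.
    """
    channel = channel_of(collection)
    reported = collection in enrolled and channel_enabled.get(channel, False)
    return "ON" if reported else "OFF"


# Mirrors the ident_base row above: enrolled, but its channel is off.
print(collection_status("ident_base", {"ident_base"}, {"ident": False}))
```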

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: fix missing type annotations
Yaarit Hatuka [Tue, 30 Nov 2021 04:32:24 +0000 (04:32 +0000)]
mgr/telemetry: fix missing type annotations

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add preview-device and preview-all commands
Yaarit Hatuka [Tue, 23 Nov 2021 21:28:47 +0000 (21:28 +0000)]
mgr/telemetry: add preview-device and preview-all commands

`ceph telemetry show` will show a sample cluster report if the user is
opted-in to telemetry. The report will be compiled from the collections
the user is opted-in to. To preview a report compiled from the most recent
collection available, use `ceph telemetry preview`.

The device channel is not included in the cluster report, since it's
being sent to a different endpoint, thus we use
`ceph telemetry show-device` in case the user is opted-in to telemetry
and the device channel is enabled. If not, it can also be previewed with
`ceph telemetry preview-device`.

If telemetry is on, and device channel is enabled, both reports can be
reviewed with `ceph telemetry show-all`, otherwise use
`ceph telemetry preview-all`.
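
The choice between the `show*` and `preview*` commands described above can
be summarised as a small decision function. This is only a reading of the
commit message, not code from the module; `report_command` and its
arguments are hypothetical.

```python
def report_command(want, telemetry_on, device_enabled):
    """Pick the command to run for a given report.

    want: 'cluster', 'device', or 'all'.
    """
    if want == "cluster":
        sub = "show" if telemetry_on else "preview"
    elif want == "device":
        sub = "show-device" if telemetry_on and device_enabled else "preview-device"
    elif want == "all":
        sub = "show-all" if telemetry_on and device_enabled else "preview-all"
    else:
        raise ValueError(f"unknown report type: {want}")
    return f"ceph telemetry {sub}"


print(report_command("device", telemetry_on=True, device_enabled=False))
```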

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add command to list all channels
Yaarit Hatuka [Tue, 23 Nov 2021 17:11:38 +0000 (17:11 +0000)]
mgr/telemetry: add command to list all channels

List all channels, their current state, default, and description, with:

$ ceph telemetry channel ls

NAME      ENABLED    DEFAULT    DESC
basic     ON         ON         Share basic cluster information (size, version)
ident     OFF        OFF        Share a user-provided description and/or contact email for the cluster
crash     ON         ON         Share metadata about Ceph daemon crashes (version, stack straces, etc)
device    ON         ON         Share device health metrics (e.g., SMART data, minus potentially identifying info like serial numbers)
perf      ON         OFF        Share perf counter metrics summed across the whole cluster

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add commands to enable/disable channels
Yaarit Hatuka [Tue, 23 Nov 2021 00:12:10 +0000 (00:12 +0000)]
mgr/telemetry: add commands to enable/disable channels

Currently we enable/disable a telemetry channel via CLI with:
  `ceph config set mgr mgr/telemetry/channel_basic true`
  `ceph config set mgr mgr/telemetry/channel_crash false`

We can now do this with:
  `ceph telemetry enable channel basic`
  `ceph telemetry disable channel crash`

We allow enabling / disabling lists of channels:
  `ceph telemetry enable channel basic device crash perf`
  `ceph telemetry disable channel basic device crash perf`

Please note, telemetry should be on for these commands to take effect.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: introduce new design for adding new data
Yaarit Hatuka [Mon, 15 Nov 2021 16:53:59 +0000 (16:53 +0000)]
mgr/telemetry: introduce new design for adding new data

The current design requires increasing the telemetry revision each time
we add new data to the report. As a result, users need to re-opt-in to
telemetry. This new design allows for adding new data to the report,
while allowing users to keep sending only what they already opted-in to,
hence no re-opt-in is required. In case users wish to report the new
data as well, they need to re-opt-in and enable any new channels.

Also, move formatting perf histograms to a function, so we can use it
both in `show` and `preview` commands.

Fix get_report call in dashboard to use get_report_locked.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agoMerge pull request #44554 from jdurgin/wip-rbd-qos-docs
Josh Durgin [Thu, 13 Jan 2022 20:02:03 +0000 (12:02 -0800)]
Merge pull request #44554 from jdurgin/wip-rbd-qos-docs

doc/rbd: clarify and add more detail to librbd QoS docs

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #40802 from galsalomon66/wip-s3select-parquet-object-processing-2
Casey Bodley [Thu, 13 Jan 2022 17:53:33 +0000 (12:53 -0500)]
Merge pull request #40802 from galsalomon66/wip-s3select-parquet-object-processing-2

RGW/s3select : parquet implementation:

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agotest/rbd/iscsi: correct the HOST name provided. 44559/head
Deepika Upadhyay [Wed, 12 Jan 2022 09:56:04 +0000 (15:26 +0530)]
test/rbd/iscsi: correct the HOST name provided.

The output of hostname -f and the hostname generated from gwcli_create
differed, which gave rise to the error:

The first gateway defined must be the local machine

Fixes: https://tracker.ceph.com/issues/53830
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #44577 from clementperon/master
Kefu Chai [Thu, 13 Jan 2022 17:15:07 +0000 (01:15 +0800)]
Merge pull request #44577 from clementperon/master

cmake: Fix Finddpdk cmake module

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
3 years agoMerge pull request #44498 from phlogistonjohn/jjm-root-check-later
Adam King [Thu, 13 Jan 2022 17:10:13 +0000 (12:10 -0500)]
Merge pull request #44498 from phlogistonjohn/jjm-root-check-later

cephadm: check if cephadm is root after cli is parsed

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44394 from melissa-kun-li/enable-autotune
Adam King [Thu, 13 Jan 2022 17:06:46 +0000 (12:06 -0500)]
Merge pull request #44394 from melissa-kun-li/enable-autotune

Enable autotune for osd_memory_target on bootstrap

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
3 years agoMerge pull request #44306 from sebastian-philipp/normalize_image_digest-ambiguity
Adam King [Thu, 13 Jan 2022 17:03:50 +0000 (12:03 -0500)]
Merge pull request #44306 from sebastian-philipp/normalize_image_digest-ambiguity

cephadm: deal with ambiguity within normalize_image_digest

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sage Weil <sage@newdream.net>
3 years agodoc/rbd/rbd-config-ref: add more detail on QoS settings 44554/head
Josh Durgin [Wed, 12 Jan 2022 03:17:15 +0000 (22:17 -0500)]
doc/rbd/rbd-config-ref: add more detail on QoS settings

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
3 years agohandling arm64(arrow installation) 40802/head
gal salomon [Thu, 13 Jan 2022 15:47:23 +0000 (17:47 +0200)]
handling arm64(arrow installation)

Signed-off-by: gal salomon <gal.salomon@gmail.com>
3 years agoMerge pull request #44427 from lxbsz/client_cleanup
Venky Shankar [Thu, 13 Jan 2022 15:04:54 +0000 (20:34 +0530)]
Merge pull request #44427 from lxbsz/client_cleanup

client: remove useless Lx cap check

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44451 from lxbsz/wip-53750
Venky Shankar [Thu, 13 Jan 2022 15:03:58 +0000 (20:33 +0530)]
Merge pull request #44451 from lxbsz/wip-53750

mds: directly return just after responding the link request

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44561 from cbodley/wip-51727
Casey Bodley [Thu, 13 Jan 2022 14:38:49 +0000 (09:38 -0500)]
Merge pull request #44561 from cbodley/wip-51727

qa/rgw: add PG_DEGRADED cluster warnings to log-ignorelist

Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years agomgr/dashboard: fix: get SMART data from single-daemon device 44573/head
Alfonso Martínez [Thu, 13 Jan 2022 14:20:48 +0000 (15:20 +0100)]
mgr/dashboard: fix: get SMART data from single-daemon device

Return SMART data even when a device is only associated with a single daemon.

Fixes: https://tracker.ceph.com/issues/53858
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
3 years agoMerge pull request #44538 from dang/wip-dang-zipper-perf
Daniel Gryniewicz [Thu, 13 Jan 2022 14:09:33 +0000 (09:09 -0500)]
Merge pull request #44538 from dang/wip-dang-zipper-perf

RGW Zipper - don't load stats for every bucket load

Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #44002 from JoshSalomon/wip-primary-balancer
Laura Flores [Thu, 13 Jan 2022 13:45:54 +0000 (07:45 -0600)]
Merge pull request #44002 from JoshSalomon/wip-primary-balancer

3 years agocmake: dpdk: only append common dir if it has been found 44577/head
Clément Péron [Thu, 13 Jan 2022 13:32:20 +0000 (14:32 +0100)]
cmake: dpdk: only append common dir if it has been found

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years agocmake: dpdk: use STREQUAL and not EQUAL when comparing strings
Clément Péron [Thu, 13 Jan 2022 13:27:33 +0000 (14:27 +0100)]
cmake: dpdk: use STREQUAL and not EQUAL when comparing strings

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years agocmake: dpdk: fix typo in HINTS when looking for DPDK
Clément Péron [Thu, 13 Jan 2022 13:26:29 +0000 (14:26 +0100)]
cmake: dpdk: fix typo in HINTS when looking for DPDK

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years agoqa: adjust for MDSs to get deployed before verifying their availability 44570/head
Venky Shankar [Tue, 11 Jan 2022 09:05:03 +0000 (14:35 +0530)]
qa: adjust for MDSs to get deployed before verifying their availability

The check happens when some MDSs have *just* been deployed by cephadm,
causing jobs to fail with:

     Command failed on smithi016 with status 1: 'sudo /home/ubuntu/cephtest/cephadm \
     --image docker.io/ceph/ceph:v16.2.4 shell -c /etc/ceph/ceph.conf -k \
     /etc/ceph/ceph.client.admin.keyring --fsid 403bfcae-706b-11ec-8c32-001a4aab830c \
     -- bash -c \'ceph --format=json mds versions | jq -e ". | add == 4"\''
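
The failing check sums the per-version daemon counts and compares the
total with the expected number of MDSs. A retrying variant might look like
the sketch below. It is an assumption about the general approach, not the
actual qa change: `mds_count`, `wait_for_mds`, and the `get_versions`
callable (which stands in for running `ceph --format=json mds versions`)
are hypothetical.

```python
import json
import time


def mds_count(versions_json):
    """Sum the per-version daemon counts, like jq's `. | add`."""
    return sum(json.loads(versions_json).values())


def wait_for_mds(get_versions, expected, retries=10, delay=0.0):
    """Poll until the expected number of MDSs is reported, or give up."""
    for _ in range(retries):
        if mds_count(get_versions()) == expected:
            return True
        time.sleep(delay)
    return False


# Simulated outputs: two MDSs up on the first poll, four on the second.
outputs = iter(['{"ceph version 16.2.4": 2}', '{"ceph version 16.2.4": 4}'])
print(wait_for_mds(lambda: next(outputs), expected=4))
```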

Fixes: http://tracker.ceph.com/issues/53857
Signed-off-by: Venky Shankar <vshankar@redhat.com>
3 years agomds: directly return just after responding the link request 44451/head
Xiubo Li [Tue, 4 Jan 2022 03:18:53 +0000 (11:18 +0800)]
mds: directly return just after responding the link request

Fixes: https://tracker.ceph.com/issues/53750
Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agoMerge pull request #43286 from lxbsz/improve_setattr
Venky Shankar [Thu, 13 Jan 2022 12:53:27 +0000 (18:23 +0530)]
Merge pull request #43286 from lxbsz/improve_setattr

client: buffer the truncate if we have the Fx caps

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agosrc/osd: reset objects_scrubbed count at the beginning of a new scrub
Aishwarya Mathuria [Thu, 13 Jan 2022 12:47:59 +0000 (18:17 +0530)]
src/osd: reset objects_scrubbed count at the beginning of a new scrub

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
3 years agoclient: remove useless Lx cap check 44427/head
Xiubo Li [Thu, 30 Dec 2021 07:03:35 +0000 (15:03 +0800)]
client: remove useless Lx cap check

By this point new_caps must already have the 'Ls' caps, so the extra
check for 'Lsx' makes no sense.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agoMerge pull request #44229 from lxbsz/mds-buffix
Venky Shankar [Thu, 13 Jan 2022 12:46:13 +0000 (18:16 +0530)]
Merge pull request #44229 from lxbsz/mds-buffix

mds: remove the duplicated or incorrect respond

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44397 from lxbsz/wip-53726
Venky Shankar [Thu, 13 Jan 2022 12:45:24 +0000 (18:15 +0530)]
Merge pull request #44397 from lxbsz/wip-53726

mds: dump tree '/' when the path is empty

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44422 from lxbsz/wip-51705
Venky Shankar [Thu, 13 Jan 2022 12:44:14 +0000 (18:14 +0530)]
Merge pull request #44422 from lxbsz/wip-51705

qa: do not use any time related suffix for *_op_timeouts

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agomonitoring: Add unit tests for OSD panels in ceph-cluster dashboard 43685/head
Patrick Seidensal [Thu, 9 Dec 2021 14:01:54 +0000 (15:01 +0100)]
monitoring: Add unit tests for OSD panels in ceph-cluster dashboard

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years agomonitoring: fix display ceph_osd_in in Grafana panel
Patrick Seidensal [Thu, 9 Dec 2021 13:59:49 +0000 (14:59 +0100)]
monitoring: fix display ceph_osd_in in Grafana panel

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years agomgr/prometheus: Fix regression with OSD/host details/overview dashboards
Patrick Seidensal [Mon, 25 Oct 2021 13:00:14 +0000 (15:00 +0200)]
mgr/prometheus: Fix regression with OSD/host details/overview dashboards

Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.

As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk.  This leads to issues which cannot be
resolved by PromQL alone (many-to-many PromQL errors).  In these rare
cases the data simply differs from what we expected.

I have not found a PromQL-only solution to this issue. What we basically
need is the following.

1. Match on labels `host` and `instance` to get one or more OSD names
   from a metadata metric (`ceph_disk_occupation`) to let a user know
   about which OSDs belong to which disk.

2. Match on the label `ceph_daemon` of the `ceph_disk_occupation` metric,
   in which case the value of `ceph_daemon` must not refer to more than
   a single OSD. This is the exact opposite of requirement 1.

As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.

Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk).  This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.

`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.

    foo * on(ceph_daemon) group_left ceph_disk_occupation

`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).

    foo * on(device,instance)
    group_left(ceph_daemon) ceph_disk_occupation_human
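
The difference between the two metrics can be illustrated with a small
Python sketch. It is not the code from the prometheus module:
`build_occupation_series` and the sample host and device names are
hypothetical, and only the `+` joining of OSD names (`osd.1+osd.2`)
is taken from the description above.

```python
from collections import defaultdict


def build_occupation_series(osd_devices):
    """Build label sets for both metrics from (ceph_daemon, instance, device).

    per_daemon: one label set per OSD, so `ceph_daemon` names a single OSD
                (the shape needed for matching on `ceph_daemon`).
    human:      one label set per (instance, device), with all OSD names on
                that disk joined by '+' (the shape meant for display).
    """
    per_daemon = []
    grouped = defaultdict(list)
    for daemon, instance, device in osd_devices:
        per_daemon.append(
            {"ceph_daemon": daemon, "instance": instance, "device": device}
        )
        grouped[(instance, device)].append(daemon)
    human = [
        {"ceph_daemon": "+".join(sorted(daemons)),
         "instance": instance,
         "device": device}
        for (instance, device), daemons in sorted(grouped.items())
    ]
    return per_daemon, human


# Hypothetical layout: two OSDs share sda, one OSD owns sdb.
per_daemon, human = build_occupation_series([
    ("osd.1", "node1", "sda"),
    ("osd.2", "node1", "sda"),
    ("osd.3", "node1", "sdb"),
])
```

Here `per_daemon` holds three series with a unique `ceph_daemon` each,
while `human` holds two series, one per disk, the first labelled
`osd.1+osd.2`.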

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years agorgw/dbstore: GetUser should return ENOENT if no user found 44440/head
Soumya Koduri [Mon, 3 Jan 2022 12:29:17 +0000 (17:59 +0530)]
rgw/dbstore: GetUser should return ENOENT if no user found

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Use mutex to protect DB objectmap and prepared stmt
Soumya Koduri [Mon, 3 Jan 2022 12:11:38 +0000 (17:41 +0530)]
rgw/dbstore: Use mutex to protect DB objectmap and prepared stmt

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Fix null ptr reference
Soumya Koduri [Mon, 3 Jan 2022 07:26:31 +0000 (12:56 +0530)]
rgw/dbstore: Fix null ptr reference

Initialize Object state once and use the same state for all its
references. Also fix a bug in SQLGetLC::prepare().

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Fixing s3 test 'test_bucket_delete_nonempty'
Soumya Koduri [Tue, 14 Dec 2021 06:07:14 +0000 (11:37 +0530)]
rgw/dbstore: Fixing s3 test 'test_bucket_delete_nonempty'

If delete_children is not set to 'true', delete bucket should
fail with ENOTEMPTY.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agoMerge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup
Yuval Lifshitz [Thu, 13 Jan 2022 09:57:03 +0000 (11:57 +0200)]
Merge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup

qa/tasks: Checking for kafka cleanup

3 years agomgr/prometheus: Refactoring: Introduce type aliases
Patrick Seidensal [Mon, 25 Oct 2021 08:51:35 +0000 (10:51 +0200)]
mgr/prometheus: Refactoring: Introduce type aliases

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years agomgr/cephadm: fixes minor grammar nit in Dry-Runs message 44566/head
James McClune [Thu, 13 Jan 2022 03:46:42 +0000 (22:46 -0500)]
mgr/cephadm: fixes minor grammar nit in Dry-Runs message

Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>