git.apps.os.sepia.ceph.com Git - ceph.git/log
3 years ago bdev: fix FTBFS on FreeBSD, keep the huge paged read buffers. 44612/head
Radoslaw Zarzynski [Mon, 17 Jan 2022 14:55:05 +0000 (14:55 +0000)]
bdev: fix FTBFS on FreeBSD, keep the huge paged read buffers.

Special thanks to Willem Jan Withagen!

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago Merge pull request #44510 from rzarzynski/wip-cephadm-docfix
Sebastian Wagner [Mon, 17 Jan 2022 09:21:45 +0000 (10:21 +0100)]
Merge pull request #44510 from rzarzynski/wip-cephadm-docfix

doc/cephadm: improve the development doc a bit

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years ago Merge pull request #44485 from adk3798/agent-permissions
Sebastian Wagner [Mon, 17 Jan 2022 08:40:13 +0000 (09:40 +0100)]
Merge pull request #44485 from adk3798/agent-permissions

cephadm: fix permissions on agent files

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years ago Merge pull request #44506 from sebastian-philipp/orch-suite-add-scsi
Sebastian Wagner [Mon, 17 Jan 2022 08:39:50 +0000 (09:39 +0100)]
Merge pull request #44506 from sebastian-philipp/orch-suite-add-scsi

qa/suites/orch/cephadm: Also run the rbd/iscsi suite

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Melissa Li <mingkli@redhat.com>
3 years ago Merge pull request #44603 from cbodley/wip-cmake-parquet
Casey Bodley [Fri, 14 Jan 2022 22:48:07 +0000 (17:48 -0500)]
Merge pull request #44603 from cbodley/wip-cmake-parquet

rgw: disable parquet by default

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago build: revert arrow package dependency 44603/head
Casey Bodley [Fri, 14 Jan 2022 19:54:09 +0000 (14:54 -0500)]
build: revert arrow package dependency

Signed-off-by: Casey Bodley <cbodley@redhat.com>
3 years ago cmake: disable parquet by default
Casey Bodley [Fri, 14 Jan 2022 19:50:47 +0000 (14:50 -0500)]
cmake: disable parquet by default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #44523 from ljflores/wip-telemetry-dashboard
Ernesto Puerta [Fri, 14 Jan 2022 19:11:15 +0000 (20:11 +0100)]
Merge pull request #44523 from ljflores/wip-telemetry-dashboard

mgr/dashboard/telemetry: reduce telemetry dashboard preview size

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: neha-ojha <NOT@FOUND>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
3 years ago Merge pull request #44550 from jdurgin/wip-pool-get-quota
Yuri Weinstein [Fri, 14 Jan 2022 18:46:49 +0000 (10:46 -0800)]
Merge pull request #44550 from jdurgin/wip-pool-get-quota

mon/OSDMonitor: avoid null dereference if stats are not available

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #42735 from amathuria/wip-amathuria-scrub-stats
Yuri Weinstein [Fri, 14 Jan 2022 18:46:28 +0000 (10:46 -0800)]
Merge pull request #42735 from amathuria/wip-amathuria-scrub-stats

osd/scrub: Add stats to PG dump for number of objects scrubbed

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
3 years ago Merge pull request #43667 from ifed01/wip-ifed-fix-ram-gridy-fsck
Neha Ojha [Fri, 14 Jan 2022 18:27:31 +0000 (10:27 -0800)]
Merge pull request #43667 from ifed01/wip-ifed-fix-ram-gridy-fsck

os/bluestore: make shared blob fsck much less RAM-greedy.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years ago Merge pull request #44440 from soumyakoduri/wip-skoduri-dbstore-fixes
Soumya Koduri [Fri, 14 Jan 2022 18:08:22 +0000 (23:38 +0530)]
Merge pull request #44440 from soumyakoduri/wip-skoduri-dbstore-fixes

rgw/dbstore: Misc fixes

3 years ago Merge pull request #44552 from jdurgin/wip-releases-doc
Neha Ojha [Fri, 14 Jan 2022 17:42:08 +0000 (09:42 -0800)]
Merge pull request #44552 from jdurgin/wip-releases-doc

doc/releases: remove outdated info and versions; mark nautilus eol

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44370 from benhanokh/NCB_expand_device_fix
Yuri Weinstein [Fri, 14 Jan 2022 17:06:41 +0000 (09:06 -0800)]
Merge pull request #44370 from benhanokh/NCB_expand_device_fix

NCB code doesn't update allocation file when we expand-device

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years ago Merge pull request #44251 from yaarith/telemetry-opt-in
Yuri Weinstein [Fri, 14 Jan 2022 17:06:11 +0000 (09:06 -0800)]
Merge pull request #44251 from yaarith/telemetry-opt-in

mgr/telemetry: introduce new design for varying report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years ago Merge pull request #43849 from rzarzynski/wip-bs-lucky-buffers
Yuri Weinstein [Fri, 14 Jan 2022 16:44:06 +0000 (08:44 -0800)]
Merge pull request #43849 from rzarzynski/wip-bs-lucky-buffers

blk, os/bluestore: introduce huge page-based read buffers

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years ago Merge pull request #42576 from AmnonHanuhov/wip-port_rgw_classes
Radoslaw Zarzynski [Fri, 14 Jan 2022 15:48:42 +0000 (16:48 +0100)]
Merge pull request #42576 from AmnonHanuhov/wip-port_rgw_classes

crimson/osd: Port rgw object classes to run in crimson

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago Merge pull request #44518 from gregsfortytwo/wip-fix-53824
Yuri Weinstein [Fri, 14 Jan 2022 15:47:00 +0000 (07:47 -0800)]
Merge pull request #44518 from gregsfortytwo/wip-fix-53824

osd: PeeringState: fix selection order in calc_replicated_acting_stretch

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
3 years ago mgr/dashboard/telemetry: add test for formatReport() 44523/head
Laura Flores [Fri, 14 Jan 2022 14:37:10 +0000 (14:37 +0000)]
mgr/dashboard/telemetry: add test for formatReport()

Tests a scenario where all keys are removed, and one
where a key is ignored.

Signed-off-by: Laura Flores <lflores@redhat.com>
3 years ago doc: document new OBJECTS_SCRUBBED column in pg dump 42735/head
Aishwarya Mathuria [Fri, 14 Jan 2022 14:10:33 +0000 (19:40 +0530)]
doc: document new OBJECTS_SCRUBBED column in pg dump

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
3 years ago crimson/osd: Implement missing objclass functions used by cls_rgw 42576/head
Amnon Hanuhov [Thu, 29 Jul 2021 13:19:48 +0000 (16:19 +0300)]
crimson/osd: Implement missing objclass functions used by cls_rgw

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago crimson/osd: Add support for CEPH_OSD_OP_OMAPRMKEYS
Amnon Hanuhov [Thu, 29 Jul 2021 13:11:27 +0000 (16:11 +0300)]
crimson/osd: Add support for CEPH_OSD_OP_OMAPRMKEYS

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago crimson/osd: Add a getter for last_user_version
Amnon Hanuhov [Thu, 29 Jul 2021 12:36:18 +0000 (15:36 +0300)]
crimson/osd: Add a getter for last_user_version

last_user_version is the last user object version applied to store

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago crimson/osd: drop PGBackend& from OpsExecuter ctor
Amnon Hanuhov [Wed, 11 Aug 2021 16:49:44 +0000 (19:49 +0300)]
crimson/osd: drop PGBackend& from OpsExecuter ctor

OpsExecuter holds a Ref<PG> so the PGBackend can be extracted from it
using get_backend()

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago crimson/osd: drop pg_pool_t from OpsExecuter ctor
Amnon Hanuhov [Wed, 11 Aug 2021 16:34:55 +0000 (19:34 +0300)]
crimson/osd: drop pg_pool_t from OpsExecuter ctor

OpsExecuter now holds a Ref<PG> so the pool info can be extracted from it
using get_pool().info

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago crimson/osd: Store a reference to PG inside OpsExecuter
Amnon Hanuhov [Thu, 24 Jun 2021 15:59:53 +0000 (18:59 +0300)]
crimson/osd: Store a reference to PG inside OpsExecuter

This is needed because some ObjClass methods use PG information related to the given cls_method_context_t.

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
3 years ago Merge pull request #44507 from votdev/issue_53813_nfs_page_not_found
Ernesto Puerta [Fri, 14 Jan 2022 11:56:55 +0000 (12:56 +0100)]
Merge pull request #44507 from votdev/issue_53813_nfs_page_not_found

mgr/dashboard: NFS pages shows 'Page not found'

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
3 years ago Merge pull request #43685 from p-se/fix-grafana-graphs-ceph_daemon
Ernesto Puerta [Fri, 14 Jan 2022 11:50:13 +0000 (12:50 +0100)]
Merge pull request #43685 from p-se/fix-grafana-graphs-ceph_daemon

mgr/dashboard: fix Grafana OSD/host panels

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #44573 from rhcs-dashboard/53858-fix-smart-data-single-daemon
Ernesto Puerta [Fri, 14 Jan 2022 11:48:52 +0000 (12:48 +0100)]
Merge pull request #44573 from rhcs-dashboard/53858-fix-smart-data-single-daemon

mgr/dashboard: fix: get SMART data from single-daemon device

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #44559 from ideepika/wip-iscsi-53830
Ilya Dryomov [Fri, 14 Jan 2022 09:30:27 +0000 (10:30 +0100)]
Merge pull request #44559 from ideepika/wip-iscsi-53830

test/rbd/iscsi: correct the hostname in gwcli_create.t to match hostname -f

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago Merge pull request #44571 from idryomov/wip-xfstests-qemu-cert
Ilya Dryomov [Fri, 14 Jan 2022 09:28:06 +0000 (10:28 +0100)]
Merge pull request #44571 from idryomov/wip-xfstests-qemu-cert

qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago Merge pull request #44570 from vshankar/wip-53857
Venky Shankar [Fri, 14 Jan 2022 03:12:20 +0000 (08:42 +0530)]
Merge pull request #44570 from vshankar/wip-53857

qa: adjust for MDSs to get deployed before verifying their availability

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #44555 from cyx1231st/wip-fix-seastore-jounral-fast-submit
Samuel Just [Fri, 14 Jan 2022 01:23:37 +0000 (17:23 -0800)]
Merge pull request #44555 from cyx1231st/wip-fix-seastore-jounral-fast-submit

crimson/os/seastore/journal: fast submit if RecordSubmitter is IDLE and no pending

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
3 years ago doc/releases: remove dev and pre-nautilus releases from timeline 44552/head
Josh Durgin [Wed, 12 Jan 2022 02:15:34 +0000 (21:15 -0500)]
doc/releases: remove dev and pre-nautilus releases from timeline

Improve readability of the table - all this information is still
preserved in older branches.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
3 years ago Merge pull request #44583 from mgfritch/fixup-44306-docker-count
Adam King [Fri, 14 Jan 2022 00:18:35 +0000 (19:18 -0500)]
Merge pull request #44583 from mgfritch/fixup-44306-docker-count

cephadm: increase number of docker.io occurances

Reviewed-by: Adam King <adking@redhat.com>
3 years ago cephadm: increase number of docker.io occurances 44583/head
Michael Fritch [Thu, 13 Jan 2022 22:22:40 +0000 (15:22 -0700)]
cephadm: increase number of docker.io occurances

fixup for 0fe2e54db774271e4fc18b45aba36b66cbc71779

Signed-off-by: Michael Fritch <mfritch@suse.com>
3 years ago mgr/telemetry: revise format_perf_histogram 44251/head
Yaarit Hatuka [Wed, 12 Jan 2022 23:33:08 +0000 (23:33 +0000)]
mgr/telemetry: revise format_perf_histogram

osd_perf_histograms now include only separated stats; remove the
aggregated formatting; we can revert this in case we ever add aggregated
histograms.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago PendingReleaseNotes: add a note about telemetry
Yaarit Hatuka [Wed, 12 Jan 2022 06:34:25 +0000 (06:34 +0000)]
PendingReleaseNotes: add a note about telemetry

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add `enable / disable channel all`
Yaarit Hatuka [Wed, 12 Jan 2022 05:57:21 +0000 (05:57 +0000)]
mgr/telemetry: add `enable / disable channel all`

Enable or disable all telemetry channels at once with:
    ceph telemetry enable channel all
    ceph telemetry disable channel all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: do not restore channels default when opting-out
Yaarit Hatuka [Wed, 12 Jan 2022 05:32:01 +0000 (05:32 +0000)]
mgr/telemetry: do not restore channels default when opting-out

Other modules do not reset their configuration; keep telemetry module
consistent with this behavior.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: verify there are new collections when nagging due to a major
Yaarit Hatuka [Wed, 12 Jan 2022 05:01:48 +0000 (05:01 +0000)]
mgr/telemetry: verify there are new collections when nagging due to a major
upgrade

When adding a new collection we define whether to nag the user about it.
We may add many collections and nag about none of them. However, in case
of a major upgrade, we wish to notify the user about these new
collections. This commit verifies there are indeed new collections when
nagging due to a major upgrade.
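The check described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name `should_nag_on_major_upgrade`, its parameters, and the use of plain integers for major versions are all assumptions made for the example.

```python
def should_nag_on_major_upgrade(last_opt_in_major, current_major,
                                available_collections, opted_in_collections):
    """Return True only if a major upgrade happened AND new collections exist.

    All names here are hypothetical; major versions are modelled as integers.
    """
    upgraded = last_opt_in_major is not None and current_major > last_opt_in_major
    # Collections the user has not opted in to yet.
    new_collections = set(available_collections) - set(opted_in_collections)
    return upgraded and bool(new_collections)
```

With this shape, an upgrade that introduces no new collections does not trigger a nag, which is the behavior the commit message describes.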

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: improve output of `ceph telemetry collection ls`
Yaarit Hatuka [Wed, 12 Jan 2022 04:36:27 +0000 (04:36 +0000)]
mgr/telemetry: improve output of `ceph telemetry collection ls`

STATUS column now indicates whether a collection is being reported, and
the reasons why it's not (either the user is not opted-in to this
collection, or its channel is off).

Also, removed the ENROLLED and DEFAULT columns due to potential
confusion they may cause.

In case a user is not opted-in to certain collections, a message will
appear above the table with the missing collections:

    New collections are available:
    ['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
    'ident_base', 'perf_perf']
    Run `ceph telemetry on` to opt-in to these collections.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: use dict lookup when traversing MODULE_COLLECTION
Yaarit Hatuka [Wed, 12 Jan 2022 02:08:52 +0000 (02:08 +0000)]
mgr/telemetry: use dict lookup when traversing MODULE_COLLECTION

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add test coverage for telemetry upgrade
Yaarit Hatuka [Tue, 7 Dec 2021 23:17:13 +0000 (23:17 +0000)]
mgr/telemetry: add test coverage for telemetry upgrade

Test the behavior of the module after an upgrade, as we shift from our
revision design to Collections.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago doc/mgr/telemetry: document new commands
Yaarit Hatuka [Tue, 7 Dec 2021 22:16:28 +0000 (22:16 +0000)]
doc/mgr/telemetry: document new commands

New commands:

  ceph telemetry enable channel <channel_name>
  ceph telemetry disable channel <channel_name>
  ceph telemetry channel ls
  ceph telemetry collection ls
  ceph telemetry collection diff
  ceph telemetry preview
  ceph telemetry preview-device
  ceph telemetry preview-all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add command to list all collections
Yaarit Hatuka [Tue, 7 Dec 2021 18:30:56 +0000 (18:30 +0000)]
mgr/telemetry: add command to list all collections

List all collections, their current enrollment state, status, default,
and description, with:

$ ceph telemetry collection ls

NAME                  ENROLLED    STATUS    DEFAULT    DESC
basic_base            TRUE        ON        ON         Basic information about the cluster (capacity, number and type of daemons, version, etc.)
basic_mds_metadata    TRUE        ON        ON         MDS metadata
crash_base            TRUE        ON        ON         Information about daemon crashes (daemon type and version, backtrace, etc.)
device_base           TRUE        ON        ON         Information about device health metrics
ident_base            TRUE        OFF       OFF        User-provided identifying information about the cluster
perf_perf             TRUE        OFF       OFF        Information about performance counters of the cluster

Please note:

NAME:
=====
Collection name; prefix indicates the channel the collection belongs to.

ENROLLED:
=========
Signifies the collections that were available in the module when the
user last opted-in to telemetry. Please note: Even if a collection is
'enrolled', its metrics will be reported only if its channel is enabled.

STATUS:
=======
Indicates whether the collection metrics are reported; this is
determined by the status (enabled / disabled) of the channel the
collection belongs to, along with the enrollment status of the
collection.

DEFAULT:
========
The default status (enabled / disabled) of the channel the collection
belongs to.

DESC:
=====
Collection description.
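The NAME and STATUS definitions above suggest a simple rule, sketched here under stated assumptions. The function name `collection_status` and the set-based inputs are hypothetical; only the rule itself (channel taken from the name prefix, reported only if enrolled and the channel is enabled) comes from the text.

```python
def collection_status(name, enrolled, enabled_channels):
    """Return 'ON' if the collection's metrics would be reported, else 'OFF'.

    Hypothetical helper: the channel is taken to be the part of the
    collection name before the first underscore.
    """
    channel = name.split("_", 1)[0]
    reported = name in enrolled and channel in enabled_channels
    return "ON" if reported else "OFF"
```

For example, with every collection enrolled but only the `basic`, `crash`, and `device` channels enabled, `ident_base` and `perf_perf` come out as `OFF`, matching the sample table.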

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: fix missing type annotations
Yaarit Hatuka [Tue, 30 Nov 2021 04:32:24 +0000 (04:32 +0000)]
mgr/telemetry: fix missing type annotations

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add preview-device and preview-all commands
Yaarit Hatuka [Tue, 23 Nov 2021 21:28:47 +0000 (21:28 +0000)]
mgr/telemetry: add preview-device and preview-all commands

`ceph telemetry show` will show a sample cluster report if the user is
opted-in to telemetry. The report is compiled from the collections
the user is opted-in to. To preview a report compiled from the most recent
collections available, use `ceph telemetry preview`.

The device channel is not included in the cluster report because it is
sent to a different endpoint. Use `ceph telemetry show-device` when the
user is opted-in to telemetry and the device channel is enabled.
Otherwise, the device report can be previewed with
`ceph telemetry preview-device`.

If telemetry is on, and device channel is enabled, both reports can be
reviewed with `ceph telemetry show-all`, otherwise use
`ceph telemetry preview-all`.
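The choice between the `show*` and `preview*` commands described above can be summarized in a small sketch. The helper `report_command` and its `scope` argument are hypothetical; the command names and the conditions are taken from the commit message.

```python
def report_command(opted_in, device_channel_enabled, scope="cluster"):
    """Pick the telemetry command to run for a given report scope.

    Hypothetical helper; scope is one of 'cluster', 'device', 'all'.
    """
    if scope == "cluster":
        # The cluster report only needs the user to be opted in.
        return "ceph telemetry show" if opted_in else "ceph telemetry preview"
    if scope == "device":
        live = opted_in and device_channel_enabled
        return "ceph telemetry show-device" if live else "ceph telemetry preview-device"
    if scope == "all":
        live = opted_in and device_channel_enabled
        return "ceph telemetry show-all" if live else "ceph telemetry preview-all"
    raise ValueError(f"unknown scope: {scope!r}")
```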

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add command to list all channels
Yaarit Hatuka [Tue, 23 Nov 2021 17:11:38 +0000 (17:11 +0000)]
mgr/telemetry: add command to list all channels

List all channels, their current state, default, and description, with:

$ ceph telemetry channel ls

NAME      ENABLED    DEFAULT    DESC
basic     ON         ON         Share basic cluster information (size, version)
ident     OFF        OFF        Share a user-provided description and/or contact email for the cluster
crash     ON         ON         Share metadata about Ceph daemon crashes (version, stack straces, etc)
device    ON         ON         Share device health metrics (e.g., SMART data, minus potentially identifying info like serial numbers)
perf      ON         OFF        Share perf counter metrics summed across the whole cluster

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: add commands to enable/disable channels
Yaarit Hatuka [Tue, 23 Nov 2021 00:12:10 +0000 (00:12 +0000)]
mgr/telemetry: add commands to enable/disable channels

Currently we enable/disable a telemetry channel via CLI with:
  `ceph config set mgr mgr/telemetry/channel_basic true`
  `ceph config set mgr mgr/telemetry/channel_crash false`

We can now do this with:
  `ceph telemetry enable channel basic`
  `ceph telemetry disable channel crash`

We allow enabling / disabling lists of channels:
  `ceph telemetry enable channel basic device crash perf`
  `ceph telemetry disable channel basic device crash perf`

Please note, telemetry should be on for these commands to take effect.
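As a rough illustration of how the new commands relate to the older `ceph config set` form, the sketch below expands an enable/disable request for a list of channels into the corresponding config commands. The function `channel_config_commands` is hypothetical and only builds strings; it is not how the module implements the commands, and it does not model the requirement that telemetry be on.

```python
def channel_config_commands(action, channels):
    """Expand 'enable'/'disable' for several channels into config commands.

    Hypothetical helper: the key pattern mgr/telemetry/channel_<name> is
    taken from the examples in the commit message.
    """
    if action not in ("enable", "disable"):
        raise ValueError(f"unknown action: {action!r}")
    value = "true" if action == "enable" else "false"
    return [
        f"ceph config set mgr mgr/telemetry/channel_{name} {value}"
        for name in channels
    ]
```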

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago mgr/telemetry: introduce new design for adding new data
Yaarit Hatuka [Mon, 15 Nov 2021 16:53:59 +0000 (16:53 +0000)]
mgr/telemetry: introduce new design for adding new data

The current design requires increasing the telemetry revision each time
we add new data to the report. As a result, users need to re-opt-in to
telemetry. This new design allows for adding new data to the report,
while allowing users to keep sending only what they already opted-in to,
hence no re-opt-in is required. In case users wish to report the new
data as well, they need to re-opt-in and enable any new channels.
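The opt-in behavior described above can be modelled in a few lines. This is a sketch under assumptions, not the module's implementation: the class `TelemetryOptIn` and its methods are hypothetical, and channel state is deliberately left out.

```python
class TelemetryOptIn:
    """Hypothetical model of opt-in state under the collection-based design."""

    def __init__(self):
        self.opted_in = set()

    def opt_in(self, available):
        # Opting in records every collection available at that moment.
        self.opted_in = set(available)

    def reported(self, available):
        # Collections added after the last opt-in are not reported,
        # so existing users keep sending only what they agreed to.
        return sorted(set(available) & self.opted_in)
```

Adding a collection later changes `available` but not `opted_in`, so nothing new is sent until the user opts in again.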

Also, move formatting perf histograms to a function, so we can use it
both in `show` and `preview` commands.

Fix get_report call in dashboard to use get_report_locked.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years ago Merge pull request #44554 from jdurgin/wip-rbd-qos-docs
Josh Durgin [Thu, 13 Jan 2022 20:02:03 +0000 (12:02 -0800)]
Merge pull request #44554 from jdurgin/wip-rbd-qos-docs

doc/rbd: clarify and add more detail to librbd QoS docs

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago Merge pull request #40802 from galsalomon66/wip-s3select-parquet-object-processing-2
Casey Bodley [Thu, 13 Jan 2022 17:53:33 +0000 (12:53 -0500)]
Merge pull request #40802 from galsalomon66/wip-s3select-parquet-object-processing-2

RGW/s3select : parquet implementation:

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago test/rbd/iscsi: correct the HOST name provided. 44559/head
Deepika Upadhyay [Wed, 12 Jan 2022 09:56:04 +0000 (15:26 +0530)]
test/rbd/iscsi: correct the HOST name provided.

The output of `hostname -f` differed from the hostname generated by
gwcli_create, which caused the error:

The first gateway defined must be the local machine

Fixes: https://tracker.ceph.com/issues/53830
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago Merge pull request #44577 from clementperon/master
Kefu Chai [Thu, 13 Jan 2022 17:15:07 +0000 (01:15 +0800)]
Merge pull request #44577 from clementperon/master

cmake: Fix Finddpdk cmake module

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
3 years ago Merge pull request #44498 from phlogistonjohn/jjm-root-check-later
Adam King [Thu, 13 Jan 2022 17:10:13 +0000 (12:10 -0500)]
Merge pull request #44498 from phlogistonjohn/jjm-root-check-later

cephadm: check if cephadm is root after cli is parsed

Reviewed-by: Adam King <adking@redhat.com>
3 years ago Merge pull request #44394 from melissa-kun-li/enable-autotune
Adam King [Thu, 13 Jan 2022 17:06:46 +0000 (12:06 -0500)]
Merge pull request #44394 from melissa-kun-li/enable-autotune

Enable autotune for osd_memory_target on bootstrap

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
3 years ago Merge pull request #44306 from sebastian-philipp/normalize_image_digest-ambiguity
Adam King [Thu, 13 Jan 2022 17:03:50 +0000 (12:03 -0500)]
Merge pull request #44306 from sebastian-philipp/normalize_image_digest-ambiguity

cephadm: deal with ambiguity within normalize_image_digest

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sage Weil <sage@newdream.net>
3 years ago doc/rbd/rbd-config-ref: add more detail on QoS settings 44554/head
Josh Durgin [Wed, 12 Jan 2022 03:17:15 +0000 (22:17 -0500)]
doc/rbd/rbd-config-ref: add more detail on QoS settings

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
3 years ago handling arm64(arrow installation) 40802/head
gal salomon [Thu, 13 Jan 2022 15:47:23 +0000 (17:47 +0200)]
handling arm64(arrow installation)

Signed-off-by: gal salomon <gal.salomon@gmail.com>
3 years ago Merge pull request #44427 from lxbsz/client_cleanup
Venky Shankar [Thu, 13 Jan 2022 15:04:54 +0000 (20:34 +0530)]
Merge pull request #44427 from lxbsz/client_cleanup

client: remove useless Lx cap check

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #44451 from lxbsz/wip-53750
Venky Shankar [Thu, 13 Jan 2022 15:03:58 +0000 (20:33 +0530)]
Merge pull request #44451 from lxbsz/wip-53750

mds: directly return just after responding the link request

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #44561 from cbodley/wip-51727
Casey Bodley [Thu, 13 Jan 2022 14:38:49 +0000 (09:38 -0500)]
Merge pull request #44561 from cbodley/wip-51727

qa/rgw: add PG_DEGRADED cluster warnings to log-ignorelist

Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years ago mgr/dashboard: fix: get SMART data from single-daemon device 44573/head
Alfonso Martínez [Thu, 13 Jan 2022 14:20:48 +0000 (15:20 +0100)]
mgr/dashboard: fix: get SMART data from single-daemon device

Return SMART data even when a device is only associated with a single daemon.

Fixes: https://tracker.ceph.com/issues/53858
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
3 years ago Merge pull request #44538 from dang/wip-dang-zipper-perf
Daniel Gryniewicz [Thu, 13 Jan 2022 14:09:33 +0000 (09:09 -0500)]
Merge pull request #44538 from dang/wip-dang-zipper-perf

RGW Zipper - don't load stats for every bucket load

Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #44002 from JoshSalomon/wip-primary-balancer
Laura Flores [Thu, 13 Jan 2022 13:45:54 +0000 (07:45 -0600)]
Merge pull request #44002 from JoshSalomon/wip-primary-balancer

3 years ago cmake: dpdk: only append common dir if it has been found 44577/head
Clément Péron [Thu, 13 Jan 2022 13:32:20 +0000 (14:32 +0100)]
cmake: dpdk: only append common dir if it has been found

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years ago cmake: dpdk: use STREQUAL and not EQUAL when comparing strings
Clément Péron [Thu, 13 Jan 2022 13:27:33 +0000 (14:27 +0100)]
cmake: dpdk: use STREQUAL and not EQUAL when comparing strings

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years ago cmake: dpdk: fix typo in HINTS when looking for DPDK
Clément Péron [Thu, 13 Jan 2022 13:26:29 +0000 (14:26 +0100)]
cmake: dpdk: fix typo in HINTS when looking for DPDK

Signed-off-by: Clément Péron <peron.clem@gmail.com>
3 years ago qa: adjust for MDSs to get deployed before verifying their availability 44570/head
Venky Shankar [Tue, 11 Jan 2022 09:05:03 +0000 (14:35 +0530)]
qa: adjust for MDSs to get deployed before verifying their availability

The check runs while some MDSs have only *just* been deployed by cephadm,
causing jobs to fail with:

     Command failed on smithi016 with status 1: 'sudo /home/ubuntu/cephtest/cephadm \
     --image docker.io/ceph/ceph:v16.2.4 shell -c /etc/ceph/ceph.conf -k \
     /etc/ceph/ceph.client.admin.keyring --fsid 403bfcae-706b-11ec-8c32-001a4aab830c \
     -- bash -c \'ceph --format=json mds versions | jq -e ". | add == 4"\''
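The `jq -e ". | add == 4"` check in the failing command sums the per-version daemon counts reported by `ceph --format=json mds versions`. A polling variant of that check might look like the sketch below. It is only an illustration of waiting for deployment, not the fix applied in the QA suite: the helpers `mds_count` and `wait_for_mds`, the `fetch` callback, and the retry parameters are all hypothetical.

```python
import json
import time


def mds_count(versions_json):
    """Sum the daemon counts in a {version: count} JSON map."""
    return sum(json.loads(versions_json).values())


def wait_for_mds(fetch, expected, attempts=10, delay=0.0):
    """Poll fetch() until the MDS count matches or attempts run out.

    fetch is a hypothetical callable returning the JSON text of
    `ceph --format=json mds versions`.
    """
    for _ in range(attempts):
        if mds_count(fetch()) == expected:
            return True
        time.sleep(delay)
    return False
```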

Fixes: http://tracker.ceph.com/issues/53857
Signed-off-by: Venky Shankar <vshankar@redhat.com>
3 years ago mds: directly return just after responding the link request 44451/head
Xiubo Li [Tue, 4 Jan 2022 03:18:53 +0000 (11:18 +0800)]
mds: directly return just after responding the link request

Fixes: https://tracker.ceph.com/issues/53750
Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years ago Merge pull request #43286 from lxbsz/improve_setattr
Venky Shankar [Thu, 13 Jan 2022 12:53:27 +0000 (18:23 +0530)]
Merge pull request #43286 from lxbsz/improve_setattr

client: buffer the truncate if we have the Fx caps

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago src/osd: reset objects_scrubbed count at the beginning of a new scrub
Aishwarya Mathuria [Thu, 13 Jan 2022 12:47:59 +0000 (18:17 +0530)]
src/osd: reset objects_scrubbed count at the beginning of a new scrub

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
3 years ago client: remove useless Lx cap check 44427/head
Xiubo Li [Thu, 30 Dec 2021 07:03:35 +0000 (15:03 +0800)]
client: remove useless Lx cap check

At this point new_caps must already include the 'Ls' caps, so the extra
check for 'Lsx' is redundant.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years ago Merge pull request #44229 from lxbsz/mds-buffix
Venky Shankar [Thu, 13 Jan 2022 12:46:13 +0000 (18:16 +0530)]
Merge pull request #44229 from lxbsz/mds-buffix

mds: remove the duplicated or incorrect respond

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #44397 from lxbsz/wip-53726
Venky Shankar [Thu, 13 Jan 2022 12:45:24 +0000 (18:15 +0530)]
Merge pull request #44397 from lxbsz/wip-53726

mds: dump tree '/' when the path is empty

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #44422 from lxbsz/wip-51705
Venky Shankar [Thu, 13 Jan 2022 12:44:14 +0000 (18:14 +0530)]
Merge pull request #44422 from lxbsz/wip-51705

qa: do not use any time related suffix for *_op_timeouts

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago monitoring: Add unit tests for OSD panels in ceph-cluster dashboard 43685/head
Patrick Seidensal [Thu, 9 Dec 2021 14:01:54 +0000 (15:01 +0100)]
monitoring: Add unit tests for OSD panels in ceph-cluster dashboard

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years ago monitoring: fix display ceph_osd_in in Grafana panel
Patrick Seidensal [Thu, 9 Dec 2021 13:59:49 +0000 (14:59 +0100)]
monitoring: fix display ceph_osd_in in Grafana panel

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years ago mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Patrick Seidensal [Mon, 25 Oct 2021 13:00:14 +0000 (15:00 +0200)]
mgr/prometheus: Fix regression with OSD/host details/overview dashboards

Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.

As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, because there are edge cases for users that have
several OSDs on a single disk.  This leads to issues which cannot be
solved with PromQL alone (many-to-many PromQL errors).  In some rare
cases the data simply differs from what we expected.

I have not found a PromQL-only solution to this issue. What we basically
need is the following.

1. Match on labels `host` and `instance` to get one or more OSD names
   from a metadata metric (`ceph_disk_occupation`) to let a user know
   about which OSDs belong to which disk.

2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
   in which case the value of `ceph_daemon` must not refer to more than
   a single OSD. This is the exact opposite of requirement 1.

As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.

Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk).  This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.

`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.

    foo * on(ceph_daemon) group_left ceph_disk_occupation

`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).

    foo * on(device,instance)
    group_left(ceph_daemon) ceph_disk_occupation_human
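To make the difference between the two metrics concrete, the sketch below derives a human-readable series from per-OSD samples by joining the daemons that share a disk. It is an illustration only: the function `disk_occupation_human` and the dict-based sample format are assumptions, and the mgr/prometheus module may build the metric differently. The label names and the `osd.1+osd.2` form come from the commit message.

```python
from collections import defaultdict


def disk_occupation_human(samples):
    """Collapse per-OSD samples into one entry per (device, instance).

    Each sample is a dict with 'device', 'instance' and a single-OSD
    'ceph_daemon' label (the form suited to matching on ceph_daemon).
    """
    grouped = defaultdict(list)
    for sample in samples:
        grouped[(sample["device"], sample["instance"])].append(sample["ceph_daemon"])
    return [
        {"device": device, "instance": instance,
         "ceph_daemon": "+".join(sorted(daemons))}
        for (device, instance), daemons in sorted(grouped.items())
    ]
```

Keeping one label set per `(device, instance)` pair is what lets the `on(device,instance) group_left(ceph_daemon)` join stay many-to-one, while the per-OSD samples remain usable for `on(ceph_daemon)` matching.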

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years ago rgw/dbstore: GetUser should return ENOENT if no user found 44440/head
Soumya Koduri [Mon, 3 Jan 2022 12:29:17 +0000 (17:59 +0530)]
rgw/dbstore: GetUser should return ENOENT if no user found

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Use mutex to protect DB objectmap and prepared stmt
Soumya Koduri [Mon, 3 Jan 2022 12:11:38 +0000 (17:41 +0530)]
rgw/dbstore: Use mutex to protect DB objectmap and prepared stmt

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Fix null ptr reference
Soumya Koduri [Mon, 3 Jan 2022 07:26:31 +0000 (12:56 +0530)]
rgw/dbstore: Fix null ptr reference

Initialize Object state once and use the same for all its
references. Also fixed a bug in SQLGetLC::prepare()

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agorgw/dbstore: Fixing s3 test 'test_bucket_delete_nonempty'
Soumya Koduri [Tue, 14 Dec 2021 06:07:14 +0000 (11:37 +0530)]
rgw/dbstore: Fixing s3 test 'test_bucket_delete_nonempty'

If delete_children is not set to 'true', delete bucket should
fail with ENOTEMPTY.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
3 years agoMerge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup
Yuval Lifshitz [Thu, 13 Jan 2022 09:57:03 +0000 (11:57 +0200)]
Merge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup

qa/tasks: Checking for kafka cleanup

3 years agomgr/prometheus: Refactoring: Introduce type aliases
Patrick Seidensal [Mon, 25 Oct 2021 08:51:35 +0000 (10:51 +0200)]
mgr/prometheus: Refactoring: Introduce type aliases

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
3 years agoosd, tools: refactor OSDMap::calc_pg_upmaps (simplify the code) 44002/head
Josh Salomon [Thu, 13 Jan 2022 02:23:07 +0000 (02:23 +0000)]
osd, tools: refactor OSDMap::calc_pg_upmaps (simplify the code)

This is the first commit in a series of commits that aims at adding a primary balancer to Ceph and improving the current upmap balancer functionality. This first commit focuses on simplifying (refactoring) the code of `calc_pg_upmaps` so it is easier to change in the future. This PR keeps the existing functionality as-is and does not change anything but the code structure.

Because part of the work is a major refactoring of OSDMap::calc_pg_upmaps, the first step is to add an --upmap-seed param to osdmaptool so test results can be compared without the random factor.
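
Why a fixed seed makes before/after comparisons possible can be shown with a small Python sketch. It is not the osdmaptool code: `pick_candidates` and its arguments are hypothetical stand-ins for any step that makes random choices.

```python
import random

def pick_candidates(osds, count, seed=None):
    """Pick `count` OSD ids at random; a fixed seed makes the pick repeatable."""
    rng = random.Random(seed)  # seed=None falls back to a non-deterministic seed
    return rng.sample(osds, count)

osds = list(range(20))
run_a = pick_candidates(osds, 5, seed=42)
run_b = pick_candidates(osds, 5, seed=42)
# run_a and run_b are identical, so any difference between two program
# versions run with the same seed comes from the code, not from chance.
```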

Other changes made:
    - Divided sections of `OSDMap::calc_pg_upmaps` into their own separate functions
    - Renamed tmp to tmp_osd_map
    - Changed all the occurrences of 'first' and 'second' in the function to more meaningful names.

Signed-off-by: Josh Salomon <josh.salomon@gmail.com>
3 years agoMerge pull request #43299 from markhpc/wip-age-binning-rebase-20210923
Yuri Weinstein [Thu, 13 Jan 2022 00:54:23 +0000 (16:54 -0800)]
Merge pull request #43299 from markhpc/wip-age-binning-rebase-20210923

common/PriorityCache: Updated Implementation of Cache Age Binning

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoparquet implementation:
gal salomon [Mon, 12 Apr 2021 05:54:37 +0000 (08:54 +0300)]
parquet implementation:
(1) adding arrow/parquet to make(install is missing)
(2) s3select-operation contains 2 flows CSV and Parquet
(3) in the parquet flow, the s3select processing engine calls back (via callback) to get-size and range-request; the range-requests are async, so the caller waits until it is notified.
(4) flow : execute --> s3select --(arrow layer)--> range-request --> GetObj::execute --> send_response_data --> notify-range-request --> (back-to) --> s3select
(5) on parquet flow the s3select is handling the response (using call-backs) because of aws-response-limitation (16mb)

add unique pointer (rgw_api); verify magic number for parquet objects; s3select module update
fix buffer-over-flow (copy range request)
change the range-request flow. now, it needs to use the callback parameters (ofs & len) and not the element length
refactoring.  separate the CSV flow from the parquet flow, a phase before adding conditional build (depends on arrow package installation)
adding arrow/parquet installation to debian/control
align s3select repo with RGW (missing APIs, such as get_error_description)
undefined reference to arrow symbol
fix comment: using optional_yield by value
fix comments; remove future/promise
s3select: a leak fix
s3select: fixing result production
s3select,s3tests : parquet alignments
typo: git-remote --> git_remote
s3select: remove redundant comma(end of projections); bug fix in parquet flow upon aggregation queries
adding arrow/parquet
editorial. remove blank lines
s3select: merged with master(output serialization,presto alignments)
merging (not rebase) master functionalities into parquet branch

(*) a dedicated source-files for s3select operation.
(*) s3select-engine: fix leaks on parquet flows, enabling allocate csv_object and parquet_object on stack
(*) the csv_object and parquet object allocated on stack (no heap allocation)

move data-members from heap to stack allocation, refactoring, separate flows for CSV and parquet. s3select: bug fix

conditional build: when the arrow package is installed, the parquet flow becomes available, which enables processing of parquet objects. if the package is not installed, only CSV is usable

remove redundant try/catch, s3select: fix compile warning

arrow-devel version should be higher than 4.0.0, where arrow::io::AsyncContext became deprecated

missing sudo; wrong url;move the rm -f arrow.list

replace codename with $(lsb_release -sc)

arrow version should be >= 4.0.0; iocontext does not exist in the namespace on lower versions

RGW points to s3select/master

s3select submodule

sudo --> $SUDO

Signed-off-by: gal salomon <gal.salomon@gmail.com>
3 years agoqa/rgw: add PG_DEGRADED cluster warnings to log-ignorelist 44561/head
Casey Bodley [Wed, 12 Jan 2022 19:07:26 +0000 (14:07 -0500)]
qa/rgw: add PG_DEGRADED cluster warnings to log-ignorelist

and cover rgw/singleton suite

Fixes: https://tracker.ceph.com/issues/51727
Signed-off-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #43494 from majianpeng/enable-test-librbd-BlockGuard
Ilya Dryomov [Wed, 12 Jan 2022 20:50:00 +0000 (21:50 +0100)]
Merge pull request #43494 from majianpeng/enable-test-librbd-BlockGuard

test/librbd: re-enable BlockGuard test

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agotest/objectstore: verify the huge page-backed reading of BlueStore. 43849/head
Radoslaw Zarzynski [Mon, 10 Jan 2022 23:41:35 +0000 (23:41 +0000)]
test/objectstore: verify the huge page-backed reading of BlueStore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agocommon: introduce instrumented_raw to buffer_instrumentation
Radoslaw Zarzynski [Mon, 10 Jan 2022 23:31:27 +0000 (23:31 +0000)]
common: introduce instrumented_raw to buffer_instrumentation

Its initial user will be a unit test for BlueStore's huge
page-backed reading.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agocommon, test: move instrumented_bptr to a dedicated header.
Radoslaw Zarzynski [Mon, 10 Jan 2022 23:19:59 +0000 (23:19 +0000)]
common, test: move instrumented_bptr to a dedicated header.

We're going to reuse it outside `test/bufferlist.cc`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk: don't cache the huge page-based buffers of KernelDevice.
Radoslaw Zarzynski [Mon, 8 Nov 2021 20:09:19 +0000 (20:09 +0000)]
blk: don't cache the huge page-based buffers of KernelDevice.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk: introduce multi-size huge page pools to KernelDevice.
Radoslaw Zarzynski [Mon, 8 Nov 2021 16:32:04 +0000 (16:32 +0000)]
blk: introduce multi-size huge page pools to KernelDevice.

When testing, keep `bluestore_max_blob_size` in mind: it is
only 64 KB by default, while the entire huge page-based pool
machinery targets far bigger scenarios (initially 4 MB!).

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk: move the buffer size of ExplicitHugePagePool to run-time.
Radoslaw Zarzynski [Mon, 8 Nov 2021 14:11:05 +0000 (14:11 +0000)]
blk: move the buffer size of ExplicitHugePagePool to run-time.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk: bring MAP_HUGETLB-based buffer pool to KernelDevice.
Radoslaw Zarzynski [Thu, 4 Nov 2021 20:50:17 +0000 (20:50 +0000)]
blk: bring MAP_HUGETLB-based buffer pool to KernelDevice.

The idea here is to bring a pool of `mmap`-allocated,
fixed-size buffers which take precedence over the
2 MB-aligned, THP-based mechanism. On the first
attempt to acquire a 4 MB buffer, KernelDevice mmaps
`bdev_read_preallocated_huge_buffer_num` (default 128)
memory regions using the MAP_HUGETLB option. If this
fails, the entire process is aborted. Once a buffer's
lifetime ends, it is recycled through a lock-free
queue shared across the entire process.
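
The populate-on-first-acquire and recycle pattern described above can
be sketched in Python. This is not the KernelDevice code: the class
`PreallocatedPool` and its methods are hypothetical, the mappings are
plain anonymous ones rather than MAP_HUGETLB regions, the queue is
thread-safe but not lock-free, and returning `None` from an exhausted
pool is an assumption of this sketch.

```python
import mmap
import queue

class PreallocatedPool:
    """Fixed-size buffers, mapped on first acquire and recycled afterwards."""

    def __init__(self, buffer_size, buffer_count):
        self.buffer_size = buffer_size
        self.buffer_count = buffer_count
        self._free = queue.SimpleQueue()
        self._populated = False

    def _populate(self):
        # Anonymous mappings only; a failure here raises and is not handled,
        # loosely mirroring the "abort on failure" behaviour in the message.
        for _ in range(self.buffer_count):
            self._free.put(mmap.mmap(-1, self.buffer_size))
        self._populated = True

    def acquire(self):
        if not self._populated:  # first acquire triggers the mapping
            self._populate()
        try:
            return self._free.get_nowait()
        except queue.Empty:
            return None  # assumption: caller falls back to another allocator

    def release(self, buf):
        self._free.put(buf)  # recycle instead of unmapping

# Small sizes so the sketch runs anywhere.
pool = PreallocatedPool(buffer_size=64 * 1024, buffer_count=2)
first = pool.acquire()
second = pool.acquire()
exhausted = pool.acquire()   # pool is empty at this point
pool.release(first)
recycled = pool.acquire()    # the released buffer comes back
```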

Remember to allocate the appropriate number of
huge pages in the system! For instance:

```
echo 256 | sudo tee /proc/sys/vm/nr_hugepages
```
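
The value 256 is consistent with the defaults named above if the
system huge page size is 2 MB, which is an assumption of this sketch
(the actual size is reported as `Hugepagesize` in `/proc/meminfo`).
The helper `huge_pages_needed` is hypothetical:

```python
MIB = 1024 * 1024

def huge_pages_needed(buffer_count, buffer_size, huge_page_size):
    """Number of huge pages needed to back all buffers, rounded up."""
    total = buffer_count * buffer_size
    return -(-total // huge_page_size)  # ceiling division

pages = huge_pages_needed(
    buffer_count=128,         # default of bdev_read_preallocated_huge_buffer_num
    buffer_size=4 * MIB,      # buffer size named in the message
    huge_page_size=2 * MIB,   # assumption: 2 MB huge pages
)
print(pages)  # 256
```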

This commit is based on / cherry-picks with changes
897a4932bee5cba3641c18619cccd0ee945bfcf8.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk: make the buffer alignment configurable in KernelDevice.
Radoslaw Zarzynski [Thu, 28 Jan 2021 15:42:34 +0000 (16:42 +0100)]
blk: make the buffer alignment configurable in KernelDevice.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoblk, os/bluestore: introduce a cache bypassing to IOContext and BlueStore.
Radoslaw Zarzynski [Wed, 3 Nov 2021 18:13:49 +0000 (18:13 +0000)]
blk, os/bluestore: introduce a cache bypassing to IOContext and BlueStore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>