git.apps.os.sepia.ceph.com Git - ceph.git/log
4 years ago ceph-volume: show devices with GPT headers as not available 40315/head
Andrew Schoen [Wed, 17 Mar 2021 20:19:08 +0000 (15:19 -0500)]
ceph-volume: show devices with GPT headers as not available

This patch ensures that if a device has GPT headers it will
not show up in `ceph-volume inventory` as available.

Fixes: https://tracker.ceph.com/issues/48697
Resolves: rhbz#1908065

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1347243242fc55c904c4d94fb43bdf0bcfc23ab0)
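
The rule described in this commit can be pictured as one more rejection reason in an availability check. The following is a minimal sketch only: the dictionary fields, the helper names, and the reason string are assumptions for illustration, not ceph-volume's actual code.

```python
# Hypothetical sketch: field names and the reason string are assumptions.
def rejected_reasons(device):
    """Collect the reasons a device should not be offered for use."""
    reasons = []
    if device.get("has_gpt_headers", False):
        reasons.append("Has GPT headers")
    return reasons


def is_available(device):
    """A device is available only if nothing rejects it."""
    return not rejected_reasons(device)


devices = [
    {"path": "/dev/sdb", "has_gpt_headers": True},
    {"path": "/dev/sdc", "has_gpt_headers": False},
]
for dev in devices:
    print(dev["path"], "available" if is_available(dev) else "not available")
```

Keeping the reasons in a list rather than a single boolean lets an inventory report say why a device was skipped.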

4 years ago Merge pull request #40254 from singuliere/wip-49767-pacific
Yuri Weinstein [Mon, 22 Mar 2021 15:22:43 +0000 (08:22 -0700)]
Merge pull request #40254 from singuliere/wip-49767-pacific

pacific: librbd: allow interrupted trash move request to be restarted

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #40253 from singuliere/wip-49773-pacific
Yuri Weinstein [Mon, 22 Mar 2021 15:22:13 +0000 (08:22 -0700)]
Merge pull request #40253 from singuliere/wip-49773-pacific

pacific: librbd/io: send alloc_hint when compression hint is set

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge PR #40247 into pacific
Sage Weil [Sun, 21 Mar 2021 18:25:06 +0000 (13:25 -0500)]
Merge PR #40247 into pacific

* refs/pull/40247/head:
common: reset last_log_sent when clog_to_monitors is updated
logclient: move LogChannel::set_log_to_monitors(bool v) to LogClient.cc

Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge PR #40246 into pacific
Sage Weil [Sun, 21 Mar 2021 18:24:25 +0000 (13:24 -0500)]
Merge PR #40246 into pacific

* refs/pull/40246/head:
osd: fix potential null pointer dereference when sending ping

Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge PR #40126 into pacific
Sage Weil [Sun, 21 Mar 2021 18:23:41 +0000 (13:23 -0500)]
Merge PR #40126 into pacific

* refs/pull/40126/head:
pybind/mgr/balancer/module.py: assign weight-sets to all buckets before balancing

Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge PR #40249 into pacific
Sage Weil [Sun, 21 Mar 2021 18:22:56 +0000 (13:22 -0500)]
Merge PR #40249 into pacific

* refs/pull/40249/head:
osd: ignore already dumped osd in dump_item()

Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge PR #40248 into pacific
Sage Weil [Sun, 21 Mar 2021 18:22:08 +0000 (13:22 -0500)]
Merge PR #40248 into pacific

* refs/pull/40248/head:
debian/ceph-common.postinst: do not chown cephadm log dirs

Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge pull request #40285 from tchaikov/pacific-pr-40272
Kefu Chai [Sun, 21 Mar 2021 17:19:49 +0000 (01:19 +0800)]
Merge pull request #40285 from tchaikov/pacific-pr-40272

pacific: install-deps.sh: remove existing ceph-libboost of different version

Reviewed-by: David Galloway <dgallowa@redhat.com>
4 years ago Merge PR #40231 into pacific
Sage Weil [Sun, 21 Mar 2021 14:39:20 +0000 (09:39 -0500)]
Merge PR #40231 into pacific

* refs/pull/40231/head:
mgr/dashboard: check .badge instead of text for expected label
mgr/dashboard: Add badge to the Label column in Host List

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
4 years ago Merge PR #40209 into pacific
Sage Weil [Sun, 21 Mar 2021 14:38:58 +0000 (09:38 -0500)]
Merge PR #40209 into pacific

* refs/pull/40209/head:
mgr/dashboard: select any object gateway on local cluster.

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
4 years ago Merge PR #40129 into pacific
Sage Weil [Sun, 21 Mar 2021 14:38:49 +0000 (09:38 -0500)]
Merge PR #40129 into pacific

* refs/pull/40129/head:
osd: PeeringState: implement an acting_set_writeable() function
osd: PeeringState: fix a boolean conditional direction
osd: PeeringState: fix stretch peering so PGs can go peered but not active
osd: PeeringState: don't add acting-set OSDs to candidate set in stretch mode
osd: PeeringState: fix calc_replicated_acting_stretch() syntax/logic
osd: PeeringState: respect stretch peering constraints for async recovery
osd: PeeringState: add a comment about using size as a proxy for activateable
osd: check for is_stretch_pool() in stretch_set_can_peer()
scripts: some additions to help with local testing
script: set_up_stretch_mode: include OSDs in root=default so pg creation works

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago install-deps.sh: remove existing ceph-libboost of different version 40269/head 40285/head
Kefu Chai [Sat, 20 Mar 2021 05:00:01 +0000 (13:00 +0800)]
install-deps.sh: remove existing ceph-libboost of different version

we install different versions of precompiled ceph-libboost packages
for different branches when building and testing them on ubuntu test
nodes. for instance,

- nautilus: v1.72
- octopus, pacific: v1.73

they share the same set of test nodes, and these ceph-libboost packages
conflict with each other because they install files to the same places.

to avoid the conflict, we should uninstall the existing packages
before installing a different version of the ceph-libboost packages.

ceph-libboost${version}-dev is a package providing the shared headers of
boost library, so, in this change we check if it is installed before
returning or removing the existing packages.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 939b147a55192c21e98d21cb380d0ec0b2ca84d5)

Conflicts:
install-deps.sh: trivial resolution
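
The decision described in this commit can be sketched as a small planning function. This is an illustration only, written in Python rather than shell: the function name and the package names in the example are assumptions, and the real logic lives in install-deps.sh.

```python
# Hypothetical sketch of the "return early or remove" decision.
def packages_to_remove(installed, wanted_version):
    """Return the ceph-libboost packages to uninstall before installing
    wanted_version; an empty list means the wanted version is present."""
    dev_pkg = "ceph-libboost{}-dev".format(wanted_version)
    if dev_pkg in installed:
        # The shared headers of the wanted version are already installed.
        return []
    # Otherwise every existing ceph-libboost package would conflict.
    return sorted(p for p in installed if p.startswith("ceph-libboost"))


# Illustrative package names, assumed for this example only.
installed = {"ceph-libboost1.72-dev", "ceph-libboost-thread1.72", "curl"}
print(packages_to_remove(installed, "1.72"))
print(packages_to_remove(installed, "1.73"))
```

Checking only the -dev package keeps the test cheap, because that package is the one every boost build depends on.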

4 years ago Merge pull request #40273 from singuliere/wip-49907-pacific
Kefu Chai [Sun, 21 Mar 2021 05:45:47 +0000 (13:45 +0800)]
Merge pull request #40273 from singuliere/wip-49907-pacific

pacific: pybind/mgr/dashboard: bump flake8 to 3.9.0

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago pybind/mgr/dashboard: remove "python_version >= 3' 40273/head
Kefu Chai [Fri, 19 Mar 2021 04:24:28 +0000 (12:24 +0800)]
pybind/mgr/dashboard: remove "python_version >= 3'

remove "python_version >= '3'" from requirements-lint.txt, as we've
dropped the Python2 support.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit de9a6a4d6c6e20f6ba6ee7798e0a29431d04def9)

4 years ago pybind/mgr/dashboard: bump flake8 to 3.9.0
Kefu Chai [Fri, 19 Mar 2021 04:05:45 +0000 (12:05 +0800)]
pybind/mgr/dashboard: bump flake8 to 3.9.0

to address the failure of

ERROR: Cannot install -r requirements-lint.txt (line 2) and -r requirements-lint.txt (line 8) because these package versions have conflicting dependencies.

The conflict is caused by:
    flake8 3.8.4 depends on pycodestyle<2.7.0 and >=2.6.0a1
    autopep8 1.5.6 depends on pycodestyle>=2.7.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 152964ca360293d9accd18f435efcd66d145063e)
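
The conflict pip reports in this commit message can be checked by hand: the two pycodestyle ranges do not overlap. The sketch below is a simplified illustration, not pip's resolver. The helper names are hypothetical, versions are reduced to integer tuples (so the pre-release bound `>=2.6.0a1` is approximated as `>=2.6.0`), and the candidate version list is made up for the example.

```python
# Simplified range check; helper names and candidates are assumptions.
def parse(version):
    """Turn '2.7.0' into (2, 7, 0) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower=None, upper=None):
    """True if lower <= version < upper; either bound may be None."""
    v = parse(version)
    if lower is not None and v < parse(lower):
        return False
    if upper is not None and v >= parse(upper):
        return False
    return True


def flake8_384_accepts(version):
    # flake8 3.8.4 depends on pycodestyle<2.7.0 and >=2.6.0a1
    return satisfies(version, lower="2.6.0", upper="2.7.0")


def autopep8_156_accepts(version):
    # autopep8 1.5.6 depends on pycodestyle>=2.7.0
    return satisfies(version, lower="2.7.0")


candidates = ["2.6.0", "2.6.1", "2.7.0", "2.8.0"]
compatible = [v for v in candidates
              if flake8_384_accepts(v) and autopep8_156_accepts(v)]
print(compatible)  # empty: no candidate satisfies both ranges
```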

4 years ago Merge pull request #40226 from neha-ojha/wip-49895-pacific
Yuri Weinstein [Fri, 19 Mar 2021 21:13:45 +0000 (14:13 -0700)]
Merge pull request #40226 from neha-ojha/wip-49895-pacific

pacific: osd: remove a ceph_assert() from a legitimate path

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago Merge pull request #40221 from sseshasa/wip-49886-pacific
Yuri Weinstein [Fri, 19 Mar 2021 21:13:14 +0000 (14:13 -0700)]
Merge pull request #40221 from sseshasa/wip-49886-pacific

pacific: qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks.

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago Merge pull request #40197 from neha-ojha/wip-39757-pacific
Yuri Weinstein [Fri, 19 Mar 2021 21:12:51 +0000 (14:12 -0700)]
Merge pull request #40197 from neha-ojha/wip-39757-pacific

pacific: qa: Add bluestore resharding test

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago Merge pull request #39997 from sseshasa/wip-49699-pacific
Yuri Weinstein [Fri, 19 Mar 2021 21:12:19 +0000 (14:12 -0700)]
Merge pull request #39997 from sseshasa/wip-49699-pacific

pacific: osd: Refinements to mclock built-in profiles implementation.

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago rbd: clarify trash remove error code from interrupted move 40254/head
Jason Dillaman [Wed, 10 Mar 2021 20:31:22 +0000 (15:31 -0500)]
rbd: clarify trash remove error code from interrupted move

Fixes: https://tracker.ceph.com/issues/49716
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 138d71fb0635682510cadda8e4ad5aaab3f39e44)

4 years ago librbd/trash: don't return -ENOENT error from move state machine
Jason Dillaman [Wed, 10 Mar 2021 20:37:39 +0000 (15:37 -0500)]
librbd/trash: don't return -ENOENT error from move state machine

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f6ed98d682e562de1cad301696e918c52a4dba5d)

4 years ago librbd/api: trash remove/purge should indicate interrupted move
Jason Dillaman [Wed, 10 Mar 2021 20:29:11 +0000 (15:29 -0500)]
librbd/api: trash remove/purge should indicate interrupted move

This will help the user self-diagnose that a trash move operation
was interrupted and therefore the state is invalid.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c808abea64f00e25c6fd3bcaa7ebf9bc763e7ca0)

4 years ago librbd/api: allow an interrupted trash move to be restarted
Jason Dillaman [Wed, 10 Mar 2021 20:15:26 +0000 (15:15 -0500)]
librbd/api: allow an interrupted trash move to be restarted

Search the trash entries for a matching image name that is
still in the moving state and allow the operation to be
restarted.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ed2d696e1eafaa59d29ce6fac952e4e5f4f1e920)
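
The search described in this commit can be sketched as a lookup over the trash entries. This is a toy model under stated assumptions: the dictionary layout, the field names, and the state strings are hypothetical and do not reflect librbd's actual types.

```python
# Hypothetical sketch: entry layout and state names are assumptions.
def find_interrupted_move(trash_entries, image_name):
    """Return the id of a trash entry for image_name that is still in
    the moving state, or None if no move needs to be restarted."""
    for image_id, spec in trash_entries.items():
        if spec["name"] == image_name and spec["state"] == "moving":
            return image_id
    return None


trash_entries = {
    "10a1": {"name": "vm-disk", "state": "normal"},
    "10b2": {"name": "scratch", "state": "moving"},
}
print(find_interrupted_move(trash_entries, "scratch"))
print(find_interrupted_move(trash_entries, "vm-disk"))
```

Matching on both the name and the state avoids restarting a move for an entry that already completed.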

4 years ago librbd/api: helper method for natively listing the trash
Jason Dillaman [Wed, 10 Mar 2021 19:44:36 +0000 (14:44 -0500)]
librbd/api: helper method for natively listing the trash

The existing list method converts the native TrashImageSpec to the
API's rbd_trash_image_info_t which is missing the source field.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 21adc927fe50ae37069d77482edd4c4e098433c9)

4 years ago librbd/io: send alloc_hint when compression hint is set 40253/head
Jason Dillaman [Fri, 12 Mar 2021 00:44:15 +0000 (19:44 -0500)]
librbd/io: send alloc_hint when compression hint is set

Previously the hint would not be set if the object map indicated the
object may exist.

Fixes: https://tracker.ceph.com/issues/49690
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit b52b5fe06d1f88130b72b8357dbf5630c7cf1cbd)

4 years ago osd: ignore already dumped osd in dump_item() 40249/head
jhonxue [Fri, 5 Mar 2021 15:33:10 +0000 (23:33 +0800)]
osd: ignore already dumped osd in dump_item()

Fixes: https://tracker.ceph.com/issues/49627
Signed-off-by: Xue Yantao <jhonxue@tencent.com>
(cherry picked from commit 7813819445e73d1e7f333bd9aaaf42624cd781ec)

4 years ago debian/ceph-common.postinst: do not chown cephadm log dirs 40248/head
Sage Weil [Tue, 9 Mar 2021 17:56:42 +0000 (11:56 -0600)]
debian/ceph-common.postinst: do not chown cephadm log dirs

The container uid/gid is different than the debian uid/gid (because the
container is centos-based and we got a different uid/gid allocation there).

Fixes: https://tracker.ceph.com/issues/49677
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit b89ffdcae51303f185e1b423a948df353497250f)

4 years ago common: reset last_log_sent when clog_to_monitors is updated 40247/head
Gerald Yang [Wed, 3 Mar 2021 04:37:15 +0000 (04:37 +0000)]
common: reset last_log_sent when clog_to_monitors is updated

When clog_to_monitors is disabled, "last_log" still keeps increasing via
get_next_seq() whenever the OSD writes to clog.

But "last_log_sent" doesn't increase. If we disable clog_to_monitors for
a while and then re-enable it, num_unsent can be bigger than
log_queue_size(), which triggers an assertion in _get_mon_log_message.

We need to reset last_log_sent to last_log before updating clog_to_monitors.

Signed-off-by: Gerald Yang <gerald.yang@canonical.com>
(cherry picked from commit 294ddf9ba779d40b0bc859e55f5287379c75624f)
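
The counter drift described in this commit can be reproduced with a toy model. The class below is an assumption-laden sketch, not Ceph's LogClient: only the names last_log, last_log_sent and num_unsent come from the commit message, and everything else is hypothetical.

```python
# Toy model of the counters; not Ceph's LogClient implementation.
class LogChannelModel:
    def __init__(self):
        self.queue = []            # entries waiting to go to the monitors
        self.last_log = 0          # sequence number of the newest entry
        self.last_log_sent = 0     # sequence number of the newest sent entry
        self.log_to_monitors = True

    def log(self, message):
        self.last_log += 1         # grows even while sending is disabled
        if self.log_to_monitors:
            self.queue.append((self.last_log, message))

    def num_unsent(self):
        return self.last_log - self.last_log_sent

    def set_log_to_monitors(self, enabled, reset):
        if reset:                  # the fix: resync before flipping the flag
            self.last_log_sent = self.last_log
        self.log_to_monitors = enabled


def counters_consistent(reset):
    """Disable sending, log a few entries, re-enable, log once more."""
    channel = LogChannelModel()
    channel.set_log_to_monitors(False, reset)
    for i in range(5):
        channel.log("entry {}".format(i))
    channel.set_log_to_monitors(True, reset)
    channel.log("after re-enable")
    return channel.num_unsent() <= len(channel.queue)


print(counters_consistent(reset=False))  # False: 6 unsent, 1 queued
print(counters_consistent(reset=True))   # True: 1 unsent, 1 queued
```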

4 years ago logclient: move LogChannel::set_log_to_monitors(bool v) to LogClient.cc
Gerald Yang [Thu, 21 Jan 2021 08:16:48 +0000 (08:16 +0000)]
logclient: move LogChannel::set_log_to_monitors(bool v) to LogClient.cc

Signed-off-by: Gerald Yang <gerald.yang@canonical.com>
(cherry picked from commit faf2e099ca58868e0b35e5b6f9639c1ecabb4e16)

4 years ago osd: fix potential null pointer dereference when sending ping 40246/head
Mykola Golub [Sat, 16 Jan 2021 05:00:09 +0000 (05:00 +0000)]
osd: fix potential null pointer dereference when sending ping

Fixes: https://tracker.ceph.com/issues/48821
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 86576b09973b857ec2fe8195069e21812992db26)

4 years ago 16.1.0
Jenkins Build Slave User [Fri, 19 Mar 2021 16:54:22 +0000 (16:54 +0000)]
16.1.0

4 years ago Merge pull request #40165 from dillaman/wip-librbd-backports-pacific-9
Jason Dillaman [Fri, 19 Mar 2021 12:40:43 +0000 (08:40 -0400)]
Merge pull request #40165 from dillaman/wip-librbd-backports-pacific-9

pacific: librbd: miscellaneous backports

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
4 years ago mgr/dashboard: check .badge instead of text for expected label 40231/head
Nizamudeen A [Mon, 8 Feb 2021 20:21:25 +0000 (01:51 +0530)]
mgr/dashboard: check .badge instead of text for expected label

this change fixes a regression introduced by
8c5e31ec1a13bc53394eb2cb6880d74db169fac4 which broke the 01-hosts.e2e-spec.ts test
driven by test_dashboard_e2e.sh

Fixes: https://tracker.ceph.com/issues/49205
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 6156055a78e63cef0eede0670816a24c3a097b4c)

4 years ago mgr/dashboard: Add badge to the Label column in Host List
Nizamudeen A [Tue, 2 Feb 2021 15:12:02 +0000 (20:42 +0530)]
mgr/dashboard: Add badge to the Label column in Host List

Fixes: https://tracker.ceph.com/issues/49105
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 8c5e31ec1a13bc53394eb2cb6880d74db169fac4)

4 years ago Merge pull request #40228 from neha-ojha/wip-revert-39637
Neha Ojha [Fri, 19 Mar 2021 01:40:04 +0000 (18:40 -0700)]
Merge pull request #40228 from neha-ojha/wip-revert-39637

pacific: Revert "PendingReleaseNotes: mgr/pg_autoscaler"

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
4 years ago Revert "PendingReleaseNotes: mgr/pg_autoscaler" 40228/head
Neha Ojha [Thu, 18 Mar 2021 23:50:46 +0000 (23:50 +0000)]
Revert "PendingReleaseNotes: mgr/pg_autoscaler"

This reverts commit ce45584800f81d1d70d39a76d78778f0ccd73bb2.

Needs reverting since the corresponding code changes were reverted in
https://github.com/ceph/ceph/pull/39921.

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years ago osd: remove a ceph_assert() from a legitimate path 40226/head
Ronen Friedman [Wed, 17 Mar 2021 15:21:10 +0000 (17:21 +0200)]
osd: remove a ceph_assert() from a legitimate path

on_replica_init() might be legitimately called twice,
if the replica was waiting for updates to complete
before servicing the request.

Fixes: https://tracker.ceph.com/issues/49867
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 437456ecf9429dd5623cda105e1399234fcc86de)

4 years ago Merge pull request #40180 from linuxbox2/wip-pacific-lcloop
Matt Benjamin [Thu, 18 Mar 2021 20:04:40 +0000 (16:04 -0400)]
Merge pull request #40180 from linuxbox2/wip-pacific-lcloop

rgw: lc: fix infinite loop in bucket_lc_prepare

4 years ago test: ignore failures to force-enable lockdep 40165/head
Jason Dillaman [Wed, 17 Mar 2021 19:29:37 +0000 (15:29 -0400)]
test: ignore failures to force-enable lockdep

PR #40062 tweaked the behavior of lockdep to compile it out
of the code entirely for release builds. This fixes several
gtests where lockdep was force-enabled.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit bdc1178bd8a722233743a1b6ad63f79dccb3f8f8)

4 years ago test/pybind/rbd: fixed functional change in encryption API
Jason Dillaman [Wed, 17 Mar 2021 18:14:48 +0000 (14:14 -0400)]
test/pybind/rbd: fixed functional change in encryption API

The encryption format API now also implicitly loads the encryption
layer. This tweaks the tests to account for this functional
difference.

Fixes: https://tracker.ceph.com/issues/49848
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 625244f999a5ecaf908220d7bc68c81bab01cc6a)

4 years ago rbd/cache/pwl: update wait_buffer state and add wake_up
Yin Congmin [Mon, 15 Mar 2021 07:34:35 +0000 (15:34 +0800)]
rbd/cache/pwl: update wait_buffer state and add wake_up

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit 21cc46bb3aaf3315ceeef786710f6874c1ab6e86)

4 years ago librbd/cache/pwl: set max size of continuous data
Yin Congmin [Mon, 8 Mar 2021 16:26:04 +0000 (00:26 +0800)]
librbd/cache/pwl: set max size of continuous data

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit bcad92c126526be7ba249322ac3ead0d83b4d188)

4 years ago qa: krbd_blkroset.t: update for separate hw and user read-only flags
Ilya Dryomov [Wed, 17 Mar 2021 10:00:33 +0000 (11:00 +0100)]
qa: krbd_blkroset.t: update for separate hw and user read-only flags

Since kernel 5.12, hardware read-only state and user read-only
policy (BLKROGET/SET ioctls) are tracked separately in the block
layer.  As the purpose of our ->set_read_only() method was exactly
that, it was removed.

As a side effect, BLKROSET no longer returns EROFS on an attempt
to make a read-only mapping read-write with "blockdev --setrw".
The policy gets updated, but the device remains read-only as before
because the hardware (== mapping) state is controlled by the driver.

Fixes: https://tracker.ceph.com/issues/49858
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d72fca26edcff49d203ed6fb940e0cf331e943dd)

4 years ago krbd: check device node accessibility only if we actually mapped
Ilya Dryomov [Mon, 15 Mar 2021 19:30:07 +0000 (20:30 +0100)]
krbd: check device node accessibility only if we actually mapped

Fix a braino that came with commit f6854ac65d2a ("krbd: make sure the
device node is accessible after the mapping").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8330c9fa4e27204c768777afe45af0eeb273c835)

4 years ago mgr: enhance the rados service
Xiubo Li [Thu, 4 Feb 2021 06:14:13 +0000 (14:14 +0800)]
mgr: enhance the rados service

For some use cases, like tcmu-runner, there may be hundreds or
thousands of LUNs, and each LUN registers one service daemon, so the
`ceph -s` output fills up with useless info.

This allows the service daemons to be classified in one
specified format by adding two pairs to the metadata:

  "daemon_type"   : "${TYPE}"
  "daemon_prefix" : "${PREFIX}"

TYPE: used to replace the default "daemon(s)"
shown in `ceph -s`. If absent, "daemon" is used.
PREFIX: if present, the active members are classified
by the prefix instead of "daemon_name".

For example, for iSCSI gateways it will be something like:
  "daemon_type"   : "portal"
  "daemon_prefix" : "gw${N}"

Then the `ceph -s` output will be:

  ...
  services:
    mon:   3 daemons, quorum a,b,c (age 50m)
    mgr:   x(active, since 49m)
    mds:   a:1 {0=c=up:active} 2 up:standby
    osd:   3 osds: 3 up (since 49m), 3 in (since 49m)
    iscsi: 8 portals active (gw0, gw1, gw2, gw3, gw4, gw5, gw6, gw7)
  ...

Fixes: https://tracker.ceph.com/issues/49057
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit a968f65d784b3d6c6a172929aa293f09e6917fa6)
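
The grouping described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the mgr implementation: the function name and the input layout are hypothetical, and it assumes that daemons sharing a "daemon_prefix" collapse into a single active member.

```python
# Hypothetical sketch of classifying service daemons by metadata.
def summarize_service(service, daemons):
    """daemons is a list of (daemon_name, metadata) pairs."""
    daemon_type = "daemon"         # default when "daemon_type" is absent
    members = []
    for name, metadata in daemons:
        daemon_type = metadata.get("daemon_type", daemon_type)
        # Fall back to the daemon name when no prefix is given.
        member = metadata.get("daemon_prefix", name)
        if member not in members:
            members.append(member)
    plural = "" if len(members) == 1 else "s"
    return "{}: {} {}{} active ({})".format(
        service, len(members), daemon_type, plural, ", ".join(members))


# Eight gateways, each registering two LUN daemons (made-up names).
daemons = [
    ("lun{}".format(gw * 2 + lun),
     {"daemon_type": "portal", "daemon_prefix": "gw{}".format(gw)})
    for gw in range(8) for lun in range(2)
]
line = summarize_service("iscsi", daemons)
print(line)
```

With this input the sixteen registered daemons collapse into the eight gateway prefixes, which is the shape of the `ceph -s` line shown in the commit message.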

4 years ago doc/RBD:fixes for ceph-immutable-object-cache daemon enable command
Rachanaben Patel [Tue, 16 Mar 2021 22:37:46 +0000 (15:37 -0700)]
doc/RBD:fixes for ceph-immutable-object-cache daemon enable command

The rbd-persistent-read-only-cache document shows how to manage the
ceph-immutable-object-cache daemon using systemd.
The command example needs fixing. It should be:

systemctl enable ceph-immutable-object-cache@ceph-immutable-object-cache.{unique id}

Fixes: https://tracker.ceph.com/issues/49849
Signed-off-by: Rachanaben Patel <racpatel@redhat.com>
(cherry picked from commit f000ecb64e6e10c9525cc303e15df477b5670570)

4 years ago osd: Disable sleep times for all best effort clients of mclock 39997/head
Sridhar Seshasayee [Thu, 4 Mar 2021 13:02:01 +0000 (18:32 +0530)]
osd: Disable sleep times for all best effort clients of mclock

If mClockScheduler is scheduling IOs then the various sleep options
for the best effort clients of mclock viz. pg_delete, snaptrim and
scrub are disabled so as to not affect the QoS being applied.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 18fab9054ae730ce68dfad1a7e1f4f7da3eb5e01)

4 years ago osd: handle config change for cost per io and cost per byte options
Sridhar Seshasayee [Thu, 4 Mar 2021 11:50:27 +0000 (17:20 +0530)]
osd: handle config change for cost per io and cost per byte options

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 33c258a973c9b284194678c5332f918e2ea827b4)

4 years ago osd: Add config options for cost per io & byte for the mclock scheduler
Sridhar Seshasayee [Thu, 4 Mar 2021 11:38:58 +0000 (17:08 +0530)]
osd: Add config options for cost per io & byte for the mclock scheduler

The cost per io and cost per byte options for hdd and ssd are specified
and set to default values determined using experiments on hdds and ssds
using a cost model. The values are used in calc_scaled_cost() to
determine the scaled cost for every OpSchedulerItem that is enqueued
within the mClockScheduler.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 2da091229bd3a9c4d81fecacb60b918a614aeb84)

4 years ago qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks. 40221/head
Sridhar Seshasayee [Tue, 16 Mar 2021 19:48:40 +0000 (01:18 +0530)]
qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks.

At the end of the lost_unfound tests add an additional wait_for_clean()
check to ensure that recoveries get enough time to complete before
proceeding and avoid failures down the line. For example, a failure like
"Scrubbing terminated -- not all pgs were active and clean." is because
recoveries on the PGs did not get sufficient time to complete even though
they were bound to eventually complete.

Fixes: https://tracker.ceph.com/issues/49844
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 88df47230b5ad85e95b0be2eca6f5763914b175c)

4 years ago Merge PR #40119 into pacific
Sage Weil [Thu, 18 Mar 2021 16:47:14 +0000 (11:47 -0500)]
Merge PR #40119 into pacific

* refs/pull/40119/head:
osd: propagate base pool application_metadata to tiers

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge PR #40195 into pacific
Sage Weil [Thu, 18 Mar 2021 15:30:58 +0000 (10:30 -0500)]
Merge PR #40195 into pacific

* refs/pull/40195/head:
Revert "osd: Try other PGs when reservation failures occur"
Revert "test: Add test for scrub parallelism"

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago Merge PR #40156 into pacific
Sage Weil [Thu, 18 Mar 2021 15:16:37 +0000 (10:16 -0500)]
Merge PR #40156 into pacific

* refs/pull/40156/head:
qa/tests: changed image path to 'quay.ceph.io/ceph-ci/ceph:octopus'

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge PR #40181 into pacific
Sage Weil [Thu, 18 Mar 2021 15:16:22 +0000 (10:16 -0500)]
Merge PR #40181 into pacific

* refs/pull/40181/head:
mgr/prometheus: fix typo in get_collect_time_metrics

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago mgr/dashboard: select any object gateway on local cluster. 40209/head
Alfonso Martínez [Wed, 24 Feb 2021 07:20:53 +0000 (08:20 +0100)]
mgr/dashboard: select any object gateway on local cluster.

Dashboard backend settings:
- Refactoring: now accepting more than 1 type of value.
- RGW_API_ACCESS_KEY & RGW_API_SECRET_KEY accept string (backward compatibility: legacy behavior) as well as dictionary of strings for connecting multiple daemons.
- Ease of use: deprecated: mgr/dashboard/RGW_API_USER_ID: not useful anymore (kept for backward compatibility).

UI/UX:
- Created context component (to be shown only on rgw-related routes) for selecting operating daemon.
- Daemon selector only shown if there is more than 1 daemon running on a local cluster (to reduce cognitive load).

Fixes: https://tracker.ceph.com/issues/47375
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 94fe271b06f1e87d37850ac20dd31fa2314e8dfe)
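
The backward-compatible setting described in this commit, a plain string or a dictionary of strings, can be handled with a small resolver. This is a sketch only: the function name is hypothetical, and it assumes the dictionary is keyed by daemon name, which the commit message does not state.

```python
# Hypothetical sketch: assumes the dictionary form is keyed by daemon name.
def resolve_key(setting, daemon_name):
    """Return the key to use for daemon_name.

    A plain string is the legacy form and applies to every daemon;
    a dictionary holds one key per daemon.
    """
    if isinstance(setting, str):
        return setting
    if isinstance(setting, dict):
        if daemon_name not in setting:
            raise LookupError("no key configured for " + daemon_name)
        return setting[daemon_name]
    raise TypeError("setting must be a string or a dictionary of strings")


legacy = "ACCESSKEY"
per_daemon = {"rgw.a": "KEY_A", "rgw.b": "KEY_B"}
print(resolve_key(legacy, "rgw.a"))
print(resolve_key(per_daemon, "rgw.b"))
```

Accepting both shapes in one place means callers do not need to know which form the administrator configured.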

4 years ago Merge PR #40137 into pacific
Sage Weil [Wed, 17 Mar 2021 21:18:53 +0000 (16:18 -0500)]
Merge PR #40137 into pacific

* refs/pull/40137/head:
qa/suites/rados/singletone: whitelist MON_DOWN when injecting msgr errors

Reviewed-by: Yuri Weinstein <yweins@redhat.com>
4 years ago Merge PR #40132 into pacific
Sage Weil [Wed, 17 Mar 2021 21:18:39 +0000 (16:18 -0500)]
Merge PR #40132 into pacific

* refs/pull/40132/head:
mgr: wait for ~3 beacons on startup if mons are pre-pacific
mon/MgrMonitor: populate available_modules from promote_standby()

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago qa: Add bluestore resharing test 40197/head
Adam Kupczyk [Fri, 19 Feb 2021 18:09:48 +0000 (19:09 +0100)]
qa: Add bluestore resharing test

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit a84820b7432926617d710cf05f0e93d0e7151b49)

4 years ago Revert "osd: Try other PGs when reservation failures occur" 40195/head
Neha Ojha [Wed, 17 Mar 2021 16:26:44 +0000 (16:26 +0000)]
Revert "osd: Try other PGs when reservation failures occur"

This reverts commit e0ed0122526791547a317c6ca19ed081a92dfe69.

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years ago Revert "test: Add test for scrub parallelism"
Neha Ojha [Wed, 17 Mar 2021 16:26:31 +0000 (16:26 +0000)]
Revert "test: Add test for scrub parallelism"

This reverts commit 6f6553939a20ac01d6ce7daaa2a79e5f333c4311.

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years ago mgr/prometheus: fix typo in get_collect_time_metrics 40181/head
Sage Weil [Tue, 16 Mar 2021 20:10:42 +0000 (15:10 -0500)]
mgr/prometheus: fix typo in get_collect_time_metrics

This causes a failure the first time through this function, but
subsequent calls succeed, making it a bit hard to notice.

Fixes: 58fd057e2c8799fa000b9937aa992e13cbbd485f
Fixes: https://tracker.ceph.com/issues/49846
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit c80b944fd4f32dea6a153375d9bdeb4a6e1d0b4c)

4 years ago Merge pull request #39973 from singuliere/wip-49686-pacific
Venky Shankar [Wed, 17 Mar 2021 13:47:47 +0000 (19:17 +0530)]
Merge pull request #39973 from singuliere/wip-49686-pacific

pacific: cephfs-mirror: register mirror daemon as service daemon

4 years ago Merge pull request #39810 from vshankar/wip-49432
Venky Shankar [Wed, 17 Mar 2021 13:46:27 +0000 (19:16 +0530)]
Merge pull request #39810 from vshankar/wip-49432

pacific: tools/cephfs-mirror: fix a dangling pointer

4 years ago Merge PR #40107 into pacific
Patrick Donnelly [Wed, 17 Mar 2021 13:44:13 +0000 (06:44 -0700)]
Merge PR #40107 into pacific

* refs/pull/40107/head:
qa: use tcmalloc with valgrind in fs:valgrind

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge pull request #40131 from cbodley/wip-qa-rgw-ignore-pg-avail-pacific
Casey Bodley [Wed, 17 Mar 2021 13:35:06 +0000 (09:35 -0400)]
Merge pull request #40131 from cbodley/wip-qa-rgw-ignore-pg-avail-pacific

pacific: qa/rgw: put PG_AVAILABILITY ignorelist override in its own file

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago Merge pull request #40134 from cbodley/wip-49814
Casey Bodley [Wed, 17 Mar 2021 13:34:23 +0000 (09:34 -0400)]
Merge pull request #40134 from cbodley/wip-49814

pacific: rgw: rgw::sal::RGWBucket initializes creation_time

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
4 years ago rgw: lc: fix infinite loop in bucket_lc_prepare 40180/head
Ilsoo Byun [Mon, 7 Dec 2020 06:20:53 +0000 (15:20 +0900)]
rgw: lc: fix infinite loop in bucket_lc_prepare

Fixes: https://tracker.ceph.com/issues/49862
Signed-off-by: Ilsoo Byun <ilsoobyun@linecorp.com>
(cherry picked from commit bc8f304a51afc1398a54cf254e65fd217af00c8a)

4 years ago Merge PR #40135 into pacific
Sage Weil [Tue, 16 Mar 2021 20:14:19 +0000 (15:14 -0500)]
Merge PR #40135 into pacific

* refs/pull/40135/head:
pybind/mgr: correct a MgrModule annotation
mgr/ceph_module: add type annotation to BaseMgrModule
mgr/prometheus: fix warning of possibly unbound variables
mgr/prometheus: flake8 cleanups
mgr/prometheus: fix import failure (flake8)
mgr/prometheus: add type annotations
mgr/prometheus: raise at seeing unknown status
mgr/prometheus: implement command using CLIReadCommand
mgr/{prometheus,telemetry}: appease mypy
mgr/prometheus: add prometheus to flake8 test
mgr/prometheus: escape special chars using r-string
pybind/mgr/prometheus: PEP8 cleanups
pybind/mgr/prometheus: add typing annotations
mgr/prometheus: introduce metric for collection time
mgr/cephadm: fix 'auth caps' fallback
mgr/cephadm: ensure mgr metadata is not none
qa/suites/rados/cephadm: add back centos+rhel with kubic podman
qa/suites/rados/cephadm/upgrade: deploy a legacy r.z-style rgw
qa/suites/rados/cephadm/upgrade: start at 15.2.9 to test iscsi upgrade
qa/tasks/cephadm.py: don't set mgr count to +1
doc/cephadm: add note about deprecation of NFSv3
doc/cephadm: remove step to restart the mgr
doc/cephadm: use `reconfig` instead of `redeploy`
doc/cephadm: update custom j2 config-key name
doc/cephadm: use 'apt' to install cephadm on Ubuntu
mgr/cephadm: remove duplicate labels when adding a host
mgr/cephadm: tolerate failure to update daemon caps
mgr/cephadm: fix get_keyring_with_caps
python-common: fix PlacementSpec target size method
python-common: count-per-host must be combined with label or hosts or host_pattern
mgr/cephadm: handle bare 'count-per-host:NNN', fix comments
mgr/cephadm/schedule: remove Scheduler abstraction (for now at least)
mgr/cephadm/schedule: calculate additions/removals in place()
mgr/cephadm/schedule: allow colocation of certain daemon types
mgr/cephadm/schedule: shuffle candidates, not final placements
mgr/cephadm/schedule: pass per-type allow_colo to the scheduler
mgr/cephadm/services/cephadmservice: fix typo
mgr/cephadm/schedule: pass daemons, not get_daemons_func
mgr/cephadm: use local var
mgr/cephadm/schedule: move host filtering into get_candidates()
python-common/ceph/deployment/service_spec: disallow max-per-host + explicit placement
mgr/cephadm/schedule: respect count-per-host
mgr/cephadm: adjust deployment logic to allow multiple daemons per host
python-common: add count-per-host to PlacementSpec
mgr/cephadm: do not worry about even # of monitors
mgr/cephadm: add iscsi and nfs to upgrade
mgr/cephadm: update caps if necessary when getting keyring
mgr/cephadm: add cephfs-mirror to CEPH_UPGRADE_ORDER
cephadm: Add cephfs-mirror
qa/cephadm: Add cephfs-mirror test
qa/tasks: some type annotations
mgr/orch: Add cephfs-mirror to enum
mgr/cephadm: Add CephfsMirrorService
mgr/orch: replace def add_{type}(...) with generic add_daemon()
mgr/cephadm: drop `create_func` arg from _add_daemon
mgr/cephadm: move CephadmExporter to new module
mgr/cephadm: fix CephadmExporter deployment
cephadm: exporter: use os.path.realpath(__file__)
mgr/cephadm: root mode: call (and deploy) cephadm binary
cephadm: Get rid of injected_argv
cephadm: Make path to cephadm binary unique
python-common: continue to allow RGWSpec(realm=r,zone=z)
PendingReleaseNodes: note changes in cephadm rgw behavior
qa/tasks/cephadm: drop realm.zone convention for rgw
doc: update docs
doc/cephadm: rewrite "adoption process"
doc/cephadm: rewrite "preparation" in adoption.rst
doc/cephadm: add prompts to adoption.rst
doc/cephadm: rewrite part of adoption.rst
python-common/ceph/deployment: RGWSpec: accept (and drop) subcluster arg
mgr/orchestrator: drop $realm.$zone naming convention
mgr/cephadm: rgw: do not mess with realm configuration
mgr/cephadm:Document the cephadm config-check feature
mgr/cephadm:fix to resolve mypy issue
mgr/cephadm:add unit test for the lookup_check helper
mgr/cephadm:Drop active healthcheck during a disable request
mgr/cephadm:Added helper function to return a specific healthcheck
mgr/cephadm:unit test added for nics better than most
mgr/cephadm:skip an alert if the linkspeed is better than most
mgr/cephadm:fix mypy warning
mgr/cephadm:Remove check from ceph metadata gathering
mgr/cephadm:Add unit test for hosts without public network NIC
mgr/cephadm:Minor updates to address review comments
mgr/cephadm:Added CLI interface for the configuration checker
mgr/cephadm:Multiple updates related to the addition of the CLI
mgr/cephadm:Moved 'ownership' of the checker to cephadm
mgr/cephadm:Unit tests updated to account for upgrades
mgr/cephadm:Updates to CephadmConfigChecks class
mgr/cephadm:Adds unit tests for the CephadmConfigChecks class
mgr/cephadm:add module option to enable configuration checks
mgr/cephadm:added ceph version consistency check
mgr/cephadm: added config checker to main serve loop
mgr/cephadm: adding check logic
mgr/cephadm: resolve rebase conflicts
mgr/cephadm:Document the integration with libstoragemgmt
mgr/cephadm:Enable cephadm device scan to use LSM
mgr/cephadm: prevent traceback when invalid osd id passed to 'orch osd rm stop'
mgr/cephadm: do not prime service cache on reconfig
mgr/cephadm/osd: PEP-8 fix
mgr/cephadm: Activate existing OSDs
mgr/cephadm: osd: Use _run_cephadm_json()
mgr/cephadm: document ok_to_stop output argument for clarity
mgr/DaemonServer: make warning language a bit friendlier
mgr/cephadm/upgrade: improve language a bit
mgr/cephadm/upgrade: restart multiple osds at once
mgr/cephadm: gather other osds that are safe to stop
mgr/cephadm: optional pass 'known' through to ok_to_stop
mgr/cephadm/upgrade: log start/stop/pause/resume
mgr/cephadm: add CEPHADM_STRAY_DAEMON unittest
mgr/cephadm: alias rgw-nfs -> nfs
qa/tasks/cephadm: remove mirror code
cephadm: fixup `alrady` -> `already`
cephadm: Change outer quotes to avoid escaping inner quotes (Q003)
cephadm: Remove bad quotes from multiline string (Q001)
cephadm: Remove bad quotes (Q000)
cephadm: introduce flake8-quotes
cephadm: line break after binary operator (W504)
cephadm: blank line contains whitespace (W293)
cephadm: trailing whitespace (W291)
cephadm: local variable 'e' is assigned to but never used (F841)
cephadm: 'select' imported but unused (F401)
cephadm: ambiguous variable name 'l' (E741)
cephadm: do not use bare 'except' (E722)
cephadm: statement ends with a semicolon (E703)
cephadm: module level import not at top of file (E402)
cephadm: expected 1 blank line before a nested definition (E306)
cephadm: expected 2 blank lines after end of function or class (E305)
cephadm: too many blank lines (E303)
cephadm: expected 2 blank lines, found 1 (E302)
cephadm: expected 1 blank line, found 0 (E301)
cephadm: too many leading '#' for block comment (E266)
cephadm: block comment should start with '# ' (E265)
cephadm: at least two spaces before inline comment (E261)
cephadm: unexpected spaces around keyword / parameter equals (E251)
cephadm: multiple spaces after ',' (E241)
cephadm: missing whitespace after ':' (E231)
cephadm: missing whitespace around arithmetic operator (E226)
cephadm: missing whitespace around operator (E225)
cephadm: whitespace before ':' (E203)
cephadm: whitespace after '{' (E201)
cephadm: continuation line unaligned for hanging indent (E131)
cephadm: continuation line under-indented for visual indent (E128)
cephadm: continuation line over-indented for visual indent (E127)
cephadm: continuation line over-indented for hanging indent (E126)
cephadm: continuation line with same indent as next logical line (E125)
cephadm: closing bracket does not match visual indentation (E124)
cephadm: ... does not match indentation of opening bracket's line (E123)
cephadm: continuation line missing indentation or outdented (E122)
cephadm: continuation line under-indented for hanging indent (E121)
cephadm: over-indented (E117)
cephadm: introduce flake8

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
4 years ago qa/tests: changed image path to 'quay.ceph.io/ceph-ci/ceph:octopus' 40156/head
Yuri Weinstein [Tue, 16 Mar 2021 16:16:28 +0000 (09:16 -0700)]
qa/tests: changed image path to 'quay.ceph.io/ceph-ci/ceph:octopus'

Fixes: https://tracker.ceph.com/issues/49790
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
4 years ago pybind/mgr: correct a MgrModule annotation 40135/head
Kefu Chai [Fri, 29 Jan 2021 03:25:31 +0000 (11:25 +0800)]
pybind/mgr: correct a MgrModule annotation

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 3649ecf64ccb111c80218477f24021d70a78485f)

4 years ago mgr/ceph_module: add type annotation to BaseMgrModule
Kefu Chai [Mon, 22 Feb 2021 05:45:31 +0000 (13:45 +0800)]
mgr/ceph_module: add type annotation to BaseMgrModule

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit b67e4341a3c3f4ceb7d03a731dcda31f18237eb2)

4 years ago mgr/prometheus: fix warning of possibly unbound variables
Patrick Seidensal [Mon, 22 Feb 2021 14:52:56 +0000 (15:52 +0100)]
mgr/prometheus: fix warning of possibly unbound variables

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 0fdbdb45dcec57e6e108edfdbea6e6a661f34a7a)

4 years ago mgr/prometheus: flake8 cleanups
Kefu Chai [Fri, 26 Feb 2021 09:57:32 +0000 (17:57 +0800)]
mgr/prometheus: flake8 cleanups

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 48b70f28df652d61b035690329fb1a2611d79786)

4 years ago mgr/prometheus: fix import failure (flake8)
Patrick Seidensal [Mon, 22 Feb 2021 15:45:40 +0000 (16:45 +0100)]
mgr/prometheus: fix import failure (flake8)

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 806ef8a1c513a52f50604ba223434d36a654eb81)

4 years ago mgr/prometheus: add type annotations
Sage Weil [Tue, 16 Mar 2021 13:05:51 +0000 (08:05 -0500)]
mgr/prometheus: add type annotations

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7f438440f91dea588603b18cbfc00340cd535703)

# Conflicts:
# src/mypy.ini
  - surrounding modules are in master but not pacific

4 years ago mgr/prometheus: raise at seeing unknown status
Kefu Chai [Fri, 26 Feb 2021 04:15:03 +0000 (12:15 +0800)]
mgr/prometheus: raise at seeing unknown status

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f92aa2f48a8801d31dac092ca53b48d07315bed4)

4 years ago mgr/prometheus: implement command using CLIReadCommand
Kefu Chai [Fri, 26 Feb 2021 03:47:11 +0000 (11:47 +0800)]
mgr/prometheus: implement command using CLIReadCommand

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 3b9ecd14d7d714be7a7c72fa3d44d7781d360135)

4 years ago mgr/{prometheus,telemetry}: appease mypy
Sage Weil [Tue, 16 Mar 2021 13:01:46 +0000 (08:01 -0500)]
mgr/{prometheus,telemetry}: appease mypy

update to adapt the type annotation of MgrModule.list_servers()

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 19abd1e1f47da14dbd2be89e15b04d483e75a105)

# Conflicts:
# src/pybind/mgr/telemetry/module.py
  - drop telemetry portion

4 years ago mgr/prometheus: add prometheus to flake8 test
Kefu Chai [Wed, 10 Feb 2021 07:49:03 +0000 (15:49 +0800)]
mgr/prometheus: add prometheus to flake8 test

For an explanation of why a line break should go before a binary
operator, see
https://www.python.org/dev/peps/pep-0008/#should-a-line-break-before-or-after-a-binary-operator

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5dd9db970994530002df3da6be075974bfef2767)
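
The style that PEP 8 section recommends can be shown with a minimal sketch. The function and parameter names below are hypothetical and are not taken from the prometheus module:

```python
def total_bytes(used: int, available: int, reserved: int) -> int:
    # Each operator starts its continuation line, so the operator and the
    # operand it applies to are read together.
    return (used
            + available
            + reserved)


result = total_bytes(10, 20, 5)
```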

4 years ago mgr/prometheus: escape special chars using r-string
Kefu Chai [Wed, 10 Feb 2021 07:47:31 +0000 (15:47 +0800)]
mgr/prometheus: escape special chars using r-string

so we don't need to worry about escaping the backslash anymore.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 734dd75f35fb24d319ee1ddbff09da53dfac8c72)
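
A small illustration of the point, assuming a made-up pattern (it is not one of the module's actual expressions): a raw string lets each backslash be written once, and the resulting pattern is the same.

```python
import re

# Hypothetical pattern: a literal dot followed by one or more digits.
escaped = re.compile('\\.\\d+')  # ordinary string: every backslash doubled
raw = re.compile(r'\.\d+')       # raw string: backslashes written once

# Both spellings produce the same pattern text.
same_pattern = escaped.pattern == raw.pattern
match = raw.search('osd.12')
```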

4 years ago pybind/mgr/prometheus: PEP8 cleanups
Sage Weil [Tue, 16 Mar 2021 13:00:42 +0000 (08:00 -0500)]
pybind/mgr/prometheus: PEP8 cleanups

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e69aad6c6191d0ff3288ae666a306d1c66f1039a)

# Conflicts:
# src/pybind/mgr/tox.ini
  - pacific telemetry not in tox.ini

4 years ago pybind/mgr/prometheus: add typing annotations
Kefu Chai [Mon, 15 Mar 2021 11:35:16 +0000 (19:35 +0800)]
pybind/mgr/prometheus: add typing annotations

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 58fd057e2c8799fa000b9937aa992e13cbbd485f)

4 years ago mgr/prometheus: introduce metric for collection time
Patrick Seidensal [Fri, 24 Jul 2020 17:11:35 +0000 (19:11 +0200)]
mgr/prometheus: introduce metric for collection time

Introduces metric `prometheus_collect_duration_seconds` for the time it
takes the Prometheus manager module to collect all the metrics.

```
ceph_prometheus_collect_duration_seconds_sum{method="get_health"} 0.0002613067626953125
ceph_prometheus_collect_duration_seconds_sum{method="get_pool_stats"} 0.0018298625946044922
ceph_prometheus_collect_duration_seconds_sum{method="get_df"} 0.0005767345428466797
ceph_prometheus_collect_duration_seconds_sum{method="get_fs"} 0.0010402202606201172
ceph_prometheus_collect_duration_seconds_sum{method="get_quorum_status"} 0.0007524490356445312
ceph_prometheus_collect_duration_seconds_sum{method="get_mgr_status"} 0.0035364627838134766
ceph_prometheus_collect_duration_seconds_sum{method="get_pg_status"} 0.00021266937255859375
ceph_prometheus_collect_duration_seconds_sum{method="get_osd_stats"} 0.0018737316131591797
ceph_prometheus_collect_duration_seconds_sum{method="get_metadata_and_osd_status"} 0.0032796859741210938
ceph_prometheus_collect_duration_seconds_sum{method="get_num_objects"} 0.00011086463928222656
ceph_prometheus_collect_duration_seconds_sum{method="get_rbd_stats"} 0.00036144256591796875
ceph_prometheus_collect_duration_seconds_count{method="get_health"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_pool_stats"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_df"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_fs"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_quorum_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_mgr_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_pg_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_osd_stats"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_metadata_and_osd_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_num_objects"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_rbd_stats"} 1.0
```

Fixes: https://tracker.ceph.com/issues/46703
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 801d3f670330499fb9cd5f8674678908f2115fe8)
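
One way such per-method `_sum` and `_count` series could be accumulated is sketched below. This is an illustration only, not the module's code: `profile_method`, `duration_sum`, `duration_count` and `render` are hypothetical names, and plain dictionaries stand in for the module's metric objects.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Any, Callable, Dict

# Hypothetical in-memory stores, keyed by the name of the timed method.
duration_sum: Dict[str, float] = defaultdict(float)
duration_count: Dict[str, int] = defaultdict(int)


def profile_method(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record how long each call takes and how many calls were made."""
    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if the collector raises, so every call is counted.
            duration_sum[func.__name__] += time.perf_counter() - start
            duration_count[func.__name__] += 1
    return wrapper


@profile_method
def get_health() -> str:
    # Placeholder collector; a real one would query cluster state.
    return 'HEALTH_OK'


def render() -> str:
    """Format the stores in the text layout shown in the commit message."""
    name = 'ceph_prometheus_collect_duration_seconds'
    lines = [f'{name}_sum{{method="{m}"}} {v}' for m, v in duration_sum.items()]
    lines += [f'{name}_count{{method="{m}"}} {float(c)}'
              for m, c in duration_count.items()]
    return '\n'.join(lines)


get_health()
output = render()
```

Keeping a sum and a count per method lets a consumer derive the average duration per call.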

4 years ago mgr/cephadm: fix 'auth caps' fallback
Sage Weil [Mon, 15 Mar 2021 22:34:57 +0000 (17:34 -0500)]
mgr/cephadm: fix 'auth caps' fallback

The first get-or-create attempt also needs to tolerate failure.

Fixes: 8ceea1961f818dc2d07edf9c256ebe5150b6b133
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 19c75234433c4ce5073a13dbada88f437c2dd0ad)

4 years ago mgr/cephadm: ensure mgr metadata is not none
Sage Weil [Mon, 15 Mar 2021 22:20:25 +0000 (17:20 -0500)]
mgr/cephadm: ensure mgr metadata is not none

This hunk is from aca45d7d08fd8c3f32849331eba4620e2726282a, a much
larger change in master that added type annotations all over the place.
It just brings src/pybind/mgr/cephadm fully in sync with master.

Signed-off-by: Sage Weil <sage@newdream.net>
4 years ago qa/suites/rados/cephadm: add back centos+rhel with kubic podman
Sage Weil [Thu, 11 Mar 2021 19:46:23 +0000 (13:46 -0600)]
qa/suites/rados/cephadm: add back centos+rhel with kubic podman

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit dbdd4d46e35d2fcf80a1b2cd9da77b6832c45aa3)

4 years ago qa/suites/rados/cephadm/upgrade: deploy a legacy r.z-style rgw
Sage Weil [Wed, 10 Mar 2021 13:20:45 +0000 (08:20 -0500)]
qa/suites/rados/cephadm/upgrade: deploy a legacy r.z-style rgw

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 04a3d4c927e714aed43a58fb57209bd5154240b2)

4 years ago qa/suites/rados/cephadm/upgrade: start at 15.2.9 to test iscsi upgrade
Sage Weil [Thu, 11 Mar 2021 03:58:33 +0000 (22:58 -0500)]
qa/suites/rados/cephadm/upgrade: start at 15.2.9 to test iscsi upgrade

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 6ccc3d83c004483f455690ec493c7ac7de483587)

4 years ago qa/tasks/cephadm.py: don't set mgr count to +1
Sage Weil [Thu, 11 Mar 2021 16:58:15 +0000 (11:58 -0500)]
qa/tasks/cephadm.py: don't set mgr count to +1

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 0a139c1ffc4a6c92d2f79e278955c62b2ceb36ba)

4 years ago doc/cephadm: add note about deprecation of NFSv3
Michael Fritch [Wed, 10 Mar 2021 17:28:11 +0000 (10:28 -0700)]
doc/cephadm: add note about deprecation of NFSv3

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit a52bd99a60c1d212bdcaa58250f8a4fdf88fbdf3)

4 years ago doc/cephadm: remove step to restart the mgr
Michael Fritch [Wed, 10 Mar 2021 04:06:27 +0000 (21:06 -0700)]
doc/cephadm: remove step to restart the mgr

a restart of the mgr does not appear to be necessary
after a `ceph config-key set ...`

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 9441d66318d243789af6d22cae1a778b35694419)

4 years ago doc/cephadm: use `reconfig` instead of `redeploy`
Michael Fritch [Wed, 10 Mar 2021 04:06:20 +0000 (21:06 -0700)]
doc/cephadm: use `reconfig` instead of `redeploy`

`reconfig` can be used to apply a change to either
the tls/ssl cert or a custom configuration file (j2)

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit b4b6f359dfbbe17288457066f3f182e3095ea81d)

4 years ago doc/cephadm: update custom j2 config-key name
Michael Fritch [Wed, 10 Mar 2021 04:06:12 +0000 (21:06 -0700)]
doc/cephadm: update custom j2 config-key name

introduced by:
cd79c9912ab35ee6296d613edc7830410a141e05

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit b58b0de77710fd511d30b79a9cc8f32b19ff29b0)

4 years ago doc/cephadm: use 'apt' to install cephadm on Ubuntu
Josh [Sun, 7 Mar 2021 03:59:46 +0000 (21:59 -0600)]
doc/cephadm: use 'apt' to install cephadm on Ubuntu

Adjusted so Ubuntu command uses 'apt' and added Fedora since that uses 'dnf'.

(cherry picked from commit ffc08b930b32fa34e4d22164feda04719a34dd6b)

4 years ago mgr/cephadm: remove duplicate labels when adding a host
Adam King [Fri, 5 Mar 2021 15:10:25 +0000 (10:10 -0500)]
mgr/cephadm: remove duplicate labels when adding a host

Fixes: https://tracker.ceph.com/issues/49626
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 030fb9d30fbd0b6914ee1ec8283fe7618ed1b8a5)

4 years ago mgr/cephadm: tolerate failure to update daemon caps
Sage Weil [Mon, 15 Mar 2021 16:55:36 +0000 (11:55 -0500)]
mgr/cephadm: tolerate failure to update daemon caps

If we're upgrading from 15.2.0, we may fail to update caps.  Instead of
failing the upgrade hard, warn to the log and continue.  This is less
than ideal, but the caps will get corrected the next time the daemon is
redeployed on the next upgrade, and most likely the previous caps will
continue to work (given they were presumably working before the upgrade).

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 8ceea1961f818dc2d07edf9c256ebe5150b6b133)

4 years ago mgr/cephadm: fix get_keyring_with_caps
Sage Weil [Fri, 12 Mar 2021 16:15:35 +0000 (10:15 -0600)]
mgr/cephadm: fix get_keyring_with_caps

1- Pass caps to 'auth get-or-create'
2- Only try 'auth caps' if the get-or-create failed

Note that the 'auth caps' step can fail if upgrading from 15.2.0 since
'profile mgr' didn't include 'auth caps' until 15.2.1.  We're not
addressing that for now...

Fixes: 7c0d532f3a4839f4199a13773fb5fa8b6fb3f183
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 6127d7f20bc8a6ad02d8ea144584eaf2bfc9590e)
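
The two numbered steps can be sketched as follows. This is a simplified illustration, not the cephadm implementation: `mon_command` is a hypothetical callable returning `(return code, stdout, stderr)`, the final `auth get` re-fetch is an assumption, and `fake_mon` only imitates a monitor that rejects `auth get-or-create` when the stored caps differ.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the mgr's mon command call.
MonCommand = Callable[[Dict[str, object]], Tuple[int, str, str]]


def get_keyring_with_caps(mon_command: MonCommand, entity: str,
                          caps: List[str]) -> str:
    # 1- Pass caps to 'auth get-or-create'.
    ret, keyring, err = mon_command({
        'prefix': 'auth get-or-create', 'entity': entity, 'caps': caps,
    })
    if ret == 0:
        return keyring
    # 2- Only try 'auth caps' if the get-or-create failed.
    ret, _, err = mon_command({
        'prefix': 'auth caps', 'entity': entity, 'caps': caps,
    })
    if ret != 0:
        raise RuntimeError(f'unable to update caps for {entity}: {err}')
    # Assumed final step: fetch the keyring now that the caps match.
    ret, keyring, err = mon_command({'prefix': 'auth get', 'entity': entity})
    if ret != 0:
        raise RuntimeError(f'unable to fetch keyring for {entity}: {err}')
    return keyring


def fake_mon(existing_caps: List[str]) -> Tuple[MonCommand, List[str]]:
    """Build a fake monitor that records which commands were issued."""
    calls: List[str] = []

    def mon_command(cmd: Dict[str, object]) -> Tuple[int, str, str]:
        calls.append(str(cmd['prefix']))
        if cmd['prefix'] == 'auth get-or-create' and cmd['caps'] != existing_caps:
            return -22, '', 'caps do not match'
        return 0, '[client.demo]\n\tkey = <redacted>\n', ''

    return mon_command, calls


mon, calls = fake_mon(['mon', 'allow r'])
keyring = get_keyring_with_caps(mon, 'client.demo', ['mon', 'profile mgr'])
```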

4 years ago python-common: fix PlacementSpec target size method
Sage Weil [Wed, 10 Mar 2021 22:27:28 +0000 (17:27 -0500)]
python-common: fix PlacementSpec target size method

- Rename get_host_selection_size() to get_target_size() since the host
  part of the name was a bit misleading
- Take count-per-host into consideration.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 2904c5ece0e6d3cabaf16b2e977bdcdf5d8f68dd)
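
A rough sketch of what taking count-per-host into consideration could look like. The free function and its parameters are hypothetical simplifications, not the actual PlacementSpec method, and the rule that an explicit count takes precedence is an assumption.

```python
from typing import List, Optional


def get_target_size(candidate_hosts: List[str],
                    count: Optional[int] = None,
                    count_per_host: Optional[int] = None) -> int:
    """Return how many daemons a placement is aiming for (simplified)."""
    # Assumption: an explicit total count takes precedence.
    if count is not None:
        return count
    # Otherwise every candidate host gets count_per_host daemons,
    # defaulting to one per host.
    return len(candidate_hosts) * (count_per_host or 1)


hosts = ['host1', 'host2', 'host3']
one_each = get_target_size(hosts)
two_each = get_target_size(hosts, count_per_host=2)
explicit = get_target_size(hosts, count=2)
```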

4 years ago python-common: count-per-host must be combined with label or hosts or host_pattern
Sage Weil [Wed, 10 Mar 2021 22:31:31 +0000 (17:31 -0500)]
python-common: count-per-host must be combined with label or hosts or host_pattern

I think this is better for the same reason we made PlacementSpec() not
mean 'all hosts' by default.  If you really want N daemons for every host
in the cluster, be specific with 'count-per-host:2 *'.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit c7e0fb1e8e7cb06097c23d9e1643b6ba852f0eb0)
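
The rule in the title can be illustrated with a minimal validation sketch. The function name, its keyword arguments and the error type are assumptions made for this example, not the actual python-common code.

```python
from typing import List, Optional


def validate_count_per_host(count_per_host: Optional[int] = None,
                            label: Optional[str] = None,
                            hosts: Optional[List[str]] = None,
                            host_pattern: Optional[str] = None) -> None:
    """Reject a bare count-per-host that names no set of hosts."""
    if count_per_host is not None and not (label or hosts or host_pattern):
        raise ValueError(
            'count-per-host must be combined with label or hosts or host_pattern')


# 'count-per-host:2 *' names every host explicitly, so it is accepted.
validate_count_per_host(count_per_host=2, host_pattern='*')

# A bare 'count-per-host:2' is rejected.
try:
    validate_count_per_host(count_per_host=2)
    bare_rejected = False
except ValueError:
    bare_rejected = True
```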