git.apps.os.sepia.ceph.com Git - ceph.git/log
3 years agomgr/dashboard: change the readFile to readFileSync 44935/head
Nizamudeen A [Wed, 9 Feb 2022 15:36:16 +0000 (21:06 +0530)]
mgr/dashboard: change the readFile to readFileSync

Apparently the readFile I added in #44934 is async, which is not what
we want, so change it to the synchronous call, readFileSync.

Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit cbfdd551d9c1e67c2757056ac1119c058f4aa704)

3 years agomgr/dashboard: set appropriate baseline branch for applitools
Nizamudeen A [Tue, 8 Feb 2022 06:20:29 +0000 (11:50 +0530)]
mgr/dashboard: set appropriate baseline branch for applitools

All dashboard PRs are checked against a baseline branch called
'default' in the visual regression testing. This causes issues when
testing PRs in different branches. For example, master and pacific
currently have to save two different screenshots since the two of them
differ slightly.

Also disable the applitools logs because they are too 'noisy'.

Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 40c902ac59b758a314f6a123d71cb59342523dac)

3 years agoMerge pull request #43897 from k0ste/wip-53234-pacific
Ernesto Puerta [Fri, 4 Feb 2022 16:39:05 +0000 (17:39 +0100)]
Merge pull request #43897 from k0ste/wip-53234-pacific

pacific: mgr/prometheus: Make prometheus standby behaviour configurable

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: k0ste <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: rsommer <NOT@FOUND>
3 years agoMerge pull request #44775 from p-se/wip-53882-pacific
Ernesto Puerta [Fri, 4 Feb 2022 16:36:43 +0000 (17:36 +0100)]
Merge pull request #44775 from p-se/wip-53882-pacific

pacific: mgr/dashboard: fix Grafana OSD/host panels

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
3 years agoMerge pull request #44672 from kamoltat/wip-ksirivad-pacific-backport-44553
Kamoltat Sirivadhna [Wed, 2 Feb 2022 22:10:42 +0000 (17:10 -0500)]
Merge pull request #44672 from kamoltat/wip-ksirivad-pacific-backport-44553

pacific: pybind/mgr/progress: enforced try and except on accessing event dictionary
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agoMerge pull request #44840 from mchangir/pacific-avoid-mon-sanity-assertion-on-startup
Yuri Weinstein [Wed, 2 Feb 2022 17:26:51 +0000 (09:26 -0800)]
Merge pull request #44840 from mchangir/pacific-avoid-mon-sanity-assertion-on-startup

qa: skip sanity check during upgrade

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44259 from sseshasa/wip-53551-pacific
Yuri Weinstein [Wed, 2 Feb 2022 17:25:24 +0000 (09:25 -0800)]
Merge pull request #44259 from sseshasa/wip-53551-pacific

pacific: osd/OSDMap: Add health warning if 'require-osd-release' != current release

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoPendingReleaseNotes: Release note 'require-osd-release' health warning 44259/head
Sridhar Seshasayee [Fri, 3 Dec 2021 09:55:32 +0000 (15:25 +0530)]
PendingReleaseNotes: Release note 'require-osd-release' health warning

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 34f18fa45f3e98de4f35959bdeb1d11730f3f291)

Conflicts:
    PendingReleaseNotes
- Add the release note under the correct release heading.

3 years agoqa: set pacific require-osd-release to avoid health warning
Patrick Donnelly [Wed, 15 Dec 2021 17:57:00 +0000 (12:57 -0500)]
qa: set pacific require-osd-release to avoid health warning

Fixes: https://tracker.ceph.com/issues/53615
Fixes: bd815bd9d6ecdecaab3d2dd9e0f5a18aa795d749
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit bc2eaba8c6616c3469ac85d36ee21f3e9765cf42)

Conflicts:
- Changed release name in commit message from 'quincy' to 'pacific'.

  ../fs/upgrade/featureful_client/old_client/tasks/2-upgrade.yaml
  ../fs/upgrade/featureful_client/upgraded_client/tasks/2-upgrade.yaml
  ../fs/upgrade/nofs/tasks/1-upgrade.yaml
- Changed release name from 'quincy' to 'pacific' when setting the
  'require-osd-release' flag in the above files.

  ../fs/upgrade/volumes/import-legacy/tasks/2-upgrade.yaml
- Changed release name from 'octopus' to 'pacific' when setting the
  'require-osd-release' flag in the above file.

3 years agoqa/suites/upgrade/octopus-x/stress-split-no-cephadm: remove msgr2
Neha Ojha [Wed, 1 Dec 2021 01:22:46 +0000 (01:22 +0000)]
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: remove msgr2

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 6ad7a8a597e6314cf9310f9f7c2f01ef82bc8fa3)

3 years agoqa: test upgrades with hybrid allocator
Neha Ojha [Wed, 1 Dec 2021 01:15:14 +0000 (01:15 +0000)]
qa: test upgrades with hybrid allocator

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit df67040a4c34593adb4585086671543233d90f5a)

3 years agoqa: rename octopus install correctly
Neha Ojha [Wed, 1 Dec 2021 01:12:15 +0000 (01:12 +0000)]
qa: rename octopus install correctly

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 3b15a044550903d4074fc421a2ef1f24fc7e4023)

3 years agoqa: remove leftovers from nautilus
Neha Ojha [Wed, 1 Dec 2021 00:39:57 +0000 (00:39 +0000)]
qa: remove leftovers from nautilus

pglog_hardlimit and msgr2

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit ed4bb05bd945c5d30cb70b88b1f8db0eb64a6ab1)

3 years agoqa/suites/upgrade: Fix/Modify upgrade tests to work with 'pacific' release.
Sridhar Seshasayee [Thu, 13 Jan 2022 13:27:09 +0000 (18:57 +0530)]
qa/suites/upgrade: Fix/Modify upgrade tests to work with 'pacific' release.

This commit is not a cherry-pick and fixes the following issues unique to
the pacific release:
1. Fixes the nautilus-x and octopus-x upgrade tests to work with the
   pacific release by updating the post upgrade step to set the
   'require-osd-release' flag to 'pacific'. This is done by using
   '.qa/releases/pacific.yaml' for all the tests.
2. Fixes an issue in 'upgrade-mon-osd-mds.yaml' under both the
   nautilus-x/parallel and octopus-x/parallel-no-cephadm tests. The
   'wait-for-healthy' check should not be performed after all the osds
   are upgraded since the 'require-osd-release' warning comes into
   effect. This check is delayed until after the 'require-osd-release'
   flag is set to 'pacific'.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
3 years agoosd/OSDMap: Add health warning if 'require-osd-release' != current release
Sridhar Seshasayee [Mon, 22 Nov 2021 15:16:02 +0000 (20:46 +0530)]
osd/OSDMap: Add health warning if 'require-osd-release' != current release

After all OSDs are upgraded to a new release, generate a health warning if
the 'require-osd-release' flag doesn't match the new release version.
This will result in the cluster showing a warning in the health state until
the flag is set properly.

Fixes: https://tracker.ceph.com/issues/51984
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit bd815bd9d6ecdecaab3d2dd9e0f5a18aa795d749)

Conflicts:
    src/osd/OSDMap.cc
- Removed checks for non-existent ceph_release_t 'quincy' flag from
  OSDMap::pending_require_osd_release().
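
The check described in this commit can be sketched roughly as follows. This is a minimal illustration in Python, not the C++ code in src/osd/OSDMap.cc; the function name, its arguments, and the assumption that "all OSDs report the same release" means the upgrade is complete are all hypothetical.

```python
# Hypothetical sketch: function and argument names are illustrative, not Ceph's API.
def require_osd_release_warning(osd_releases, require_osd_release):
    """Return a warning string, or None when no warning applies.

    osd_releases: release name reported by each OSD, e.g. ["pacific", "pacific"].
    require_osd_release: current value of the 'require-osd-release' flag.
    """
    # An empty cluster or a partially upgraded one does not trigger the warning
    # (assumption: a uniform release across OSDs means the upgrade has finished).
    if not osd_releases or len(set(osd_releases)) != 1:
        return None
    current = osd_releases[0]
    if require_osd_release != current:
        return (f"all OSDs are running {current} but "
                f"require-osd-release is {require_osd_release}")
    return None
```

Under these assumptions the warning stays in place until the flag is set to the release the OSDs are running.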

3 years agoMerge pull request #44387 from trociny/wip-53702-pacific
Yuri Weinstein [Wed, 2 Feb 2022 00:04:42 +0000 (16:04 -0800)]
Merge pull request #44387 from trociny/wip-53702-pacific

pacific: qa/tasks: improve backfill_toofull test

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #44325 from k0ste/wip-53621-pacific
Yuri Weinstein [Wed, 2 Feb 2022 00:04:08 +0000 (16:04 -0800)]
Merge pull request #44325 from k0ste/wip-53621-pacific

pacific: mgr/devicehealth: fix missing timezone from time delta calculation

Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agoMerge pull request #44205 from k0ste/wip-53488-pacific
Yuri Weinstein [Wed, 2 Feb 2022 00:03:32 +0000 (16:03 -0800)]
Merge pull request #44205 from k0ste/wip-53488-pacific

pacific: mgr/prometheus: define module options for standby

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44175 from cfsnyder/wip-51172-pacific
Yuri Weinstein [Wed, 2 Feb 2022 00:02:34 +0000 (16:02 -0800)]
Merge pull request #44175 from cfsnyder/wip-51172-pacific

pacific: common/PriorityCache: low perf counters priorities for submodules.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years agoMerge pull request #44181 from myoungwon/pacific-50192
Yuri Weinstein [Tue, 1 Feb 2022 22:12:17 +0000 (14:12 -0800)]
Merge pull request #44181 from myoungwon/pacific-50192

pacific: osd: recover unreadable snapshot before reading ref. count info

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #43955 from cfsnyder/wip-53201-pacific
Yuri Weinstein [Tue, 1 Feb 2022 22:10:21 +0000 (14:10 -0800)]
Merge pull request #43955 from cfsnyder/wip-53201-pacific

pacific: osd: fix 'ceph osd stop <osd.nnn>' doesn't take effect

Reviewed-by: Laura Flores <lflores@redhat.com>
3 years agoMerge pull request #44212 from k0ste/wip-53494-pacific
Yuri Weinstein [Tue, 1 Feb 2022 20:42:40 +0000 (12:42 -0800)]
Merge pull request #44212 from k0ste/wip-53494-pacific

pacific: mgr: fix locking for MetadataUpdate::finish

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #44202 from myoungwon/pacific-53486
Yuri Weinstein [Tue, 1 Feb 2022 20:41:52 +0000 (12:41 -0800)]
Merge pull request #44202 from myoungwon/pacific-53486

pacific: test: increase retry duration when calculating manifest ref. count

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #44173 from cfsnyder/wip-51150-pacific
Yuri Weinstein [Tue, 1 Feb 2022 20:40:49 +0000 (12:40 -0800)]
Merge pull request #44173 from cfsnyder/wip-51150-pacific

pacific: osd: set r only if succeed in FillInVerifyExtent

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #44096 from cfsnyder/wip-53388-pacific
Yuri Weinstein [Tue, 1 Feb 2022 20:40:09 +0000 (12:40 -0800)]
Merge pull request #44096 from cfsnyder/wip-53388-pacific

pacific: osd/OSDMap.cc: clean up pg_temp for nonexistent pgs

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
3 years agoMerge pull request #43882 from ifed01/wip-ifed-fix-53011-pac
Yuri Weinstein [Tue, 1 Feb 2022 20:39:04 +0000 (12:39 -0800)]
Merge pull request #43882 from ifed01/wip-ifed-fix-53011-pac

pacific: os/bluestore: use proper prefix when removing undecodable Share Blob.

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoqa: skip sanity check during upgrade 44840/head
Milind Changire [Mon, 31 Jan 2022 11:22:45 +0000 (16:52 +0530)]
qa: skip sanity check during upgrade

Fixes: https://tracker.ceph.com/issues/54064
Signed-off-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44727 from cfsnyder/wip-51825-pacific
Ernesto Puerta [Thu, 27 Jan 2022 10:27:06 +0000 (11:27 +0100)]
Merge pull request #44727 from cfsnyder/wip-51825-pacific

pacific: qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password…

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Cory Snyder <csnyder@iland.com>
Reviewed-by: kevinzs2048 <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #44540 from kamoltat/wip-ksirivad-backport-pacific-43716
Yuri Weinstein [Wed, 26 Jan 2022 23:39:24 +0000 (15:39 -0800)]
Merge pull request #44540 from kamoltat/wip-ksirivad-backport-pacific-43716

pacific: mgr/autoscaler: Introduce noautoscale flag

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
3 years agoMerge pull request #44660 from sebastian-philipp/pacific-backport-44647
Adam King [Wed, 26 Jan 2022 14:50:37 +0000 (09:50 -0500)]
Merge pull request #44660 from sebastian-philipp/pacific-backport-44647

pacific: doc/cephadm: remove duplicate deployment scenario section

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44636 from sebastian-philipp/pacific-backport-44510
Adam King [Wed, 26 Jan 2022 14:46:08 +0000 (09:46 -0500)]
Merge pull request #44636 from sebastian-philipp/pacific-backport-44510

pacific: doc/cephadm: improve the development doc a bit

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44584 from vumrao/wip-vumrao-53876
Yuri Weinstein [Wed, 26 Jan 2022 00:27:11 +0000 (16:27 -0800)]
Merge pull request #44584 from vumrao/wip-vumrao-53876

pacific: osd/PeeringState: separate history's pruub from pg's

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agodocs: Added noautoscale to docs + release notes 44540/head
Kamoltat [Wed, 22 Dec 2021 21:42:52 +0000 (21:42 +0000)]
docs: Added noautoscale to docs + release notes

Updated the docs in
https://docs.ceph.com/en/latest/rados/operations/placement-groups/
and updated the release notes to reflect the noautoscale flag.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 9baed0394e03de41f1921693bb33badd1922fa97)

Conflicts:
        PendingReleaseNotes - trivial fix

3 years agoMerge pull request #44513 from batrick/i53714
Yuri Weinstein [Tue, 25 Jan 2022 19:53:58 +0000 (11:53 -0800)]
Merge pull request #44513 from batrick/i53714

pacific: mds: fails to reintegrate strays if destdn's directory is full (ENOSPC)

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoqa: Added workunit test for noautoscale flag
Kamoltat [Wed, 8 Dec 2021 15:15:50 +0000 (15:15 +0000)]
qa: Added workunit test for noautoscale flag

Set and unset the noautoscale flag and
check that the results are what
we expect. Also check that
the flag is correct when we
create new pools.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit bb42c71e7e059be2cc4d1d4408e475b15b1c6340)

Conflicts:
        test-noautoscale-flag.yaml
- modified pre-mgr-command to not create
  device health monitor

3 years agopybind/mgr/autoscaler: Introduce noautoscale flag
Kamoltat [Wed, 8 Dec 2021 15:13:38 +0000 (15:13 +0000)]
pybind/mgr/autoscaler: Introduce noautoscale flag

The `noautoscale` flag lets the user
switch autoscale `on` or `off` for all
pools with a single command.

`osd pool set noautoscale` will turn
autoscale mode `off` for all pools.

`osd pool unset noautoscale` will turn
autoscale mode `on` for all pools.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit be17f041bab90d8f93c3e52df74cdf6c28b44ef2)

Conflicts:
src/pybind/mgr/pg_autoscaler/module.py - trivial fix
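
The behaviour of the flag can be pictured with a small sketch. This is a hypothetical illustration only: the pool dictionaries, the `pg_autoscale_mode` key, and the `apply_noautoscale` helper are assumptions made for the example and are not taken from src/pybind/mgr/pg_autoscaler/module.py.

```python
# Hypothetical sketch: the pool dictionaries and helper name are illustrative.
def apply_noautoscale(pools, noautoscale):
    """Set every pool's autoscale mode from a single global flag.

    pools: dict mapping pool name -> dict of pool properties.
    noautoscale: True after 'set noautoscale', False after 'unset noautoscale'.
    """
    mode = "off" if noautoscale else "on"
    for pool in pools.values():
        pool["pg_autoscale_mode"] = mode
    return pools
```

Setting the flag maps to `apply_noautoscale(pools, True)` and unsetting it to `apply_noautoscale(pools, False)` in this sketch.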

3 years agoMerge pull request #44642 from vshankar/wip-53458
Yuri Weinstein [Tue, 25 Jan 2022 16:04:50 +0000 (08:04 -0800)]
Merge pull request #44642 from vshankar/wip-53458

pacific: qa: wait for purge queue operations to finish

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44639 from vshankar/wip-53912
Yuri Weinstein [Tue, 25 Jan 2022 16:04:15 +0000 (08:04 -0800)]
Merge pull request #44639 from vshankar/wip-53912

pacific: qa: adjust for MDSs to get deployed before verifying their availability

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44623 from lxbsz/wip-53908
Yuri Weinstein [Tue, 25 Jan 2022 16:03:47 +0000 (08:03 -0800)]
Merge pull request #44623 from lxbsz/wip-53908

pacific: mds: remove the duplicated or incorrect respond

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44622 from lxbsz/wip-53860
Yuri Weinstein [Tue, 25 Jan 2022 16:03:21 +0000 (08:03 -0800)]
Merge pull request #44622 from lxbsz/wip-53860

pacific: mds: dump tree '/' when the path is empty

Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44621 from lxbsz/wip-53861
Yuri Weinstein [Tue, 25 Jan 2022 16:02:37 +0000 (08:02 -0800)]
Merge pull request #44621 from lxbsz/wip-53861

pacific: qa: do not use any time related suffix for *_op_timeouts

Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44620 from lxbsz/wip-53864
Yuri Weinstein [Tue, 25 Jan 2022 16:01:47 +0000 (08:01 -0800)]
Merge pull request #44620 from lxbsz/wip-53864

pacific: mds: directly return just after responding the link request

Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
3 years agoMerge pull request #44516 from nmshelke/wip-53777-pacific
Yuri Weinstein [Tue, 25 Jan 2022 16:00:51 +0000 (08:00 -0800)]
Merge pull request #44516 from nmshelke/wip-53777-pacific

pacific: mgr/stats: exception handling for ceph fs perf stats command

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44514 from batrick/i53736
Yuri Weinstein [Tue, 25 Jan 2022 16:00:19 +0000 (08:00 -0800)]
Merge pull request #44514 from batrick/i53736

pacific: mds: recursive scrub does not trigger stray reintegration

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agoMerge pull request #44512 from MrFreezeex/wip-52631-pacific
Yuri Weinstein [Tue, 25 Jan 2022 15:59:05 +0000 (07:59 -0800)]
Merge pull request #44512 from MrFreezeex/wip-52631-pacific

pacific: mds: add mds_dir_max_entries config option

Reviewed-by: Milind Changire <mchangir@redhat.com>
3 years agomonitoring: Add unit tests for OSD panels in ceph-cluster dashboard 44775/head
Patrick Seidensal [Thu, 9 Dec 2021 14:01:54 +0000 (15:01 +0100)]
monitoring: Add unit tests for OSD panels in ceph-cluster dashboard

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 7d7488018ea30dc61174bafcad01bb3eac8aa9bb)

3 years agomonitoring: fix display ceph_osd_in in Grafana panel
Patrick Seidensal [Thu, 9 Dec 2021 13:59:49 +0000 (14:59 +0100)]
monitoring: fix display ceph_osd_in in Grafana panel

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 4a6b2c1dfbbe7182beaf510c4a7297a79c6e2524)

3 years agomgr/prometheus: Fix regression with OSD/host details/overview dashboards
Patrick Seidensal [Mon, 25 Oct 2021 13:00:14 +0000 (15:00 +0200)]
mgr/prometheus: Fix regression with OSD/host details/overview dashboards

Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.

As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk.  This leads to issues which cannot be
solved by PromQL alone (many-to-many PromQL errors).  The data we
expected is simply different in some rare cases.

I have not found a PromQL-only solution to this issue. What we basically
need is the following.

1. Match on labels `host` and `instance` to get one or more OSD names
   from a metadata metric (`ceph_disk_occupation`) to let a user know
   about which OSDs belong to which disk.

2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
   in which case the value of `ceph_daemon` must not refer to more than
   a single OSD. The exact opposite to requirement 1.

As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.

Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk).  This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.

`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.

    foo * on(ceph_daemon) group_left ceph_disk_occupation

`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).

    foo * on(device,instance)
    group_left(ceph_daemon) ceph_disk_occupation_human

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 18d3a71618a5e3bc3cbd0bce017fb7b9c18c2ca0)
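
To make the difference between the two metrics concrete, here is a minimal sketch of how their label sets could be derived. It is not the mgr/prometheus implementation; the function name, the input mapping, and the choice of one series per daemon for `ceph_disk_occupation` are assumptions based only on the description above.

```python
# Hypothetical sketch: builds label sets only, not real Prometheus metrics.
def build_occupation_series(disks):
    """disks: dict mapping (instance, device) -> list of OSD daemon names.

    Returns (per_daemon, human):
      per_daemon: one label set per OSD, so 'ceph_daemon' names a single OSD
                  and is safe to match on (ceph_disk_occupation).
      human:      one label set per disk, with all OSDs joined by '+', meant
                  for display only (ceph_disk_occupation_human).
    """
    per_daemon = []
    human = []
    for (instance, device), daemons in sorted(disks.items()):
        for daemon in sorted(daemons):
            per_daemon.append(
                {"ceph_daemon": daemon, "device": device, "instance": instance})
        human.append(
            {"ceph_daemon": "+".join(sorted(daemons)),
             "device": device, "instance": instance})
    return per_daemon, human
```

With two OSDs sharing a disk, the display variant carries `osd.1+osd.2` in a single series, while the matching variant keeps `osd.1` and `osd.2` apart.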

3 years agomgr/prometheus: Refactoring: Introduce type aliases
Patrick Seidensal [Mon, 25 Oct 2021 08:51:35 +0000 (10:51 +0200)]
mgr/prometheus: Refactoring: Introduce type aliases

Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 154d3525b19135a929851c0b027da19abda20ebe)

3 years agoMerge pull request #44708 from guits/wip-53962-pacific
Guillaume Abrioux [Mon, 24 Jan 2022 12:39:15 +0000 (13:39 +0100)]
Merge pull request #44708 from guits/wip-53962-pacific

pacific: ceph-volume: show RBD devices as not available

3 years agoMerge pull request #44534 from rhcs-dashboard/wip-53834-pacific
Ernesto Puerta [Fri, 21 Jan 2022 19:44:47 +0000 (20:44 +0100)]
Merge pull request #44534 from rhcs-dashboard/wip-53834-pacific

pacific: mgr/dashboard: Update Angular version to 12

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoqa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password.txt file 44727/head
Kevin Zhao [Thu, 22 Jul 2021 06:58:20 +0000 (07:58 +0100)]
qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password.txt file

To allow running multiple instances of the same tests.

Fixes: https://tracker.ceph.com/issues/51792
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
(cherry picked from commit d04ef800abd671a564795eba198ca976619b4cc7)

3 years agoMerge pull request #44701 from guits/wip-53955-pacific
Guillaume Abrioux [Fri, 21 Jan 2022 12:48:04 +0000 (13:48 +0100)]
Merge pull request #44701 from guits/wip-53955-pacific

pacific: ceph-volume: don't use MultiLogger in find_executable_on_host()

3 years agoceph-volume: filter RBD devices from the device inventory 44708/head
Michael Fritch [Tue, 18 Jan 2022 22:15:45 +0000 (15:15 -0700)]
ceph-volume: filter RBD devices from the device inventory

Avoid running `blkid` or deploying OSDs on RBD devices by ensuring they
do not appear in the `ceph-volume inventory`

Fixes: https://tracker.ceph.com/issues/53846
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 47325ec3ec5ce1d53c5eae2952f631e95b7135fe)
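
The idea of the filter can be sketched as below. This is a hypothetical illustration, not ceph-volume's code: it assumes mapped RBD devices show up as `/dev/rbd<N>`, and the helper name is made up for the example.

```python
import os

# Hypothetical sketch: assumes mapped RBD devices appear as /dev/rbd<N>.
def filter_rbd_devices(device_paths):
    """Return the device paths that do not look like RBD block devices."""
    return [path for path in device_paths
            if not os.path.basename(path).startswith("rbd")]
```

For example, `filter_rbd_devices(["/dev/sda", "/dev/rbd0"])` keeps only `/dev/sda` under this naming assumption.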

3 years agoMerge pull request #44681 from guits/split-cephadm-distros
Adam King [Thu, 20 Jan 2022 19:51:04 +0000 (14:51 -0500)]
Merge pull request #44681 from guits/split-cephadm-distros

qa: split distro for rados/cephadm/smoke tests

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoceph-volume: don't use MultiLogger in find_executable_on_host() 44701/head
Guillaume Abrioux [Wed, 19 Jan 2022 14:04:20 +0000 (15:04 +0100)]
ceph-volume: don't use MultiLogger in find_executable_on_host()

This generates a lot of unnecessary messages on the terminal.

Fixes: https://tracker.ceph.com/issues/53934
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3be55621600be3ebc9c70295a3a351dab426b3a3)

3 years agoMerge pull request #44480 from rhcs-dashboard/wip-53616-pacific
Ernesto Puerta [Thu, 20 Jan 2022 17:22:54 +0000 (18:22 +0100)]
Merge pull request #44480 from rhcs-dashboard/wip-53616-pacific

pacific: mgr/prometheus: expose ceph healthchecks as metrics

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
3 years agoqa: split distro for rados/cephadm/smoke tests 44681/head
Guillaume Abrioux [Thu, 20 Jan 2022 10:29:52 +0000 (11:29 +0100)]
qa: split distro for rados/cephadm/smoke tests

There was a difference between master and pacific.
The hwe kernel modification for Ubuntu 20.04 should be done
only for cephadm tests. Modifying `qa/distros/all/ubuntu_20.04.yaml` broke
many tests.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
3 years agoMerge pull request #44635 from sebastian-philipp/pacific-backport-44506
Sebastian Wagner [Thu, 20 Jan 2022 10:09:29 +0000 (11:09 +0100)]
Merge pull request #44635 from sebastian-philipp/pacific-backport-44506

pacific: qa/suites/orch/cephadm: Also run the rbd/iscsi suite

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44596 from idryomov/wip-xfstests-qemu-cert-pacific
Yuri Weinstein [Wed, 19 Jan 2022 22:05:38 +0000 (14:05 -0800)]
Merge pull request #44596 from idryomov/wip-xfstests-qemu-cert-pacific

pacific: qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #44594 from idryomov/wip-diff-iterate-parent-fix-pacific
Yuri Weinstein [Wed, 19 Jan 2022 22:04:47 +0000 (14:04 -0800)]
Merge pull request #44594 from idryomov/wip-diff-iterate-parent-fix-pacific

pacific: librbd: restore diff-iterate include_parent functionality in fast-diff mode

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years agoMerge pull request #44547 from cfsnyder/wip-53839-pacific
Yuri Weinstein [Wed, 19 Jan 2022 22:04:06 +0000 (14:04 -0800)]
Merge pull request #44547 from cfsnyder/wip-53839-pacific

pacific: librbd: diff-iterate reports incorrect offsets in fast-diff mode

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
3 years agopybind/mgr/progress: enforced try and except on accessing event dictionary 44672/head
Kamoltat [Wed, 12 Jan 2022 02:41:01 +0000 (02:41 +0000)]
pybind/mgr/progress: enforced try and except on accessing event dictionary

There is a race condition where an event gets
deleted while the progress module iterates
through the ``events`` dictionary. Without a
``try and except``, this causes an unhandled
exception and crashes the module.

This commit enforces ``try and except``
on every part of the code where we access
the ``events`` dictionary.

Fixes: https://tracker.ceph.com/issues/53803
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit b70d4a9caae0eb859e10b68f93573d507625d267)

Conflicts:
src/pybind/mgr/progress/module.py - trivial-fix
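
A minimal sketch of the pattern, assuming a plain dictionary of events keyed by id. The helper names and the event layout are hypothetical and do not come from src/pybind/mgr/progress/module.py; iterating over a copy of the keys is an extra precaution added for the example.

```python
# Hypothetical sketch: not the progress module's actual code.
def event_progress(events, ev_id):
    """Return the progress of an event, or None if it has been removed."""
    try:
        return events[ev_id]["progress"]
    except KeyError:
        # The event was deleted between learning its id and this lookup.
        return None


def progress_snapshot(events):
    """Collect progress for all events that still exist at lookup time."""
    result = {}
    # Iterate over a copy of the keys so that deletions do not break the loop.
    for ev_id in list(events):
        progress = event_progress(events, ev_id)
        if progress is not None:
            result[ev_id] = progress
    return result
```

A lookup for an id that has already been deleted returns None here instead of raising an unhandled KeyError.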

3 years agoMerge pull request #44626 from sebastian-philipp/pacific-backport-42905
Sebastian Wagner [Wed, 19 Jan 2022 15:11:53 +0000 (16:11 +0100)]
Merge pull request #44626 from sebastian-philipp/pacific-backport-42905

pacific: python-common: improve OSD spec error messages

Reviewed-by: Michael Fritch <mfritch@suse.com>
3 years agodoc/cephadm: remove duplicate deployment scenario section 44660/head
Melissa Li [Tue, 18 Jan 2022 21:53:04 +0000 (16:53 -0500)]
doc/cephadm: remove duplicate deployment scenario section

Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 2222f26a37137a2f70b3f736ffad16c51a6b4e44)

3 years agoMerge pull request #44644 from guits/wip-53916-pacific
Sebastian Wagner [Wed, 19 Jan 2022 12:35:41 +0000 (13:35 +0100)]
Merge pull request #44644 from guits/wip-53916-pacific

pacific: ceph-volume: fix regression introduced via #43536

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agoMerge pull request #44652 from rhcs-dashboard/wip-53921-pacific
Ernesto Puerta [Wed, 19 Jan 2022 12:14:59 +0000 (13:14 +0100)]
Merge pull request #44652 from rhcs-dashboard/wip-53921-pacific

pacific: mgr/dashboard: Refactoring dashboard cephadm checks

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #44650 from aaSharma14/wip-53828-pacific
Ernesto Puerta [Wed, 19 Jan 2022 12:03:24 +0000 (13:03 +0100)]
Merge pull request #44650 from aaSharma14/wip-53828-pacific

pacific: mgr/dashboard: monitoring:Implement BlueStore onode hit/miss counters into the dashboard

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agopython-common/tests: Remove filestore tests in test_disk_selector.py 44626/head
Sebastian Wagner [Thu, 25 Nov 2021 16:38:35 +0000 (17:38 +0100)]
python-common/tests: Remove filestore tests in test_disk_selector.py

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 1c40ca1e37e5e798cfd9cf317f39b11dd22ea086)

3 years agopython-common: Don't validate ServiceSpec.from_json() in `orch ls`
Sebastian Wagner [Wed, 10 Nov 2021 14:54:42 +0000 (15:54 +0100)]
python-common: Don't validate ServiceSpec.from_json() in `orch ls`

Unfortunately, `ceph orch ls` may return invalid OSD specs for
OSDs not associated with any specs.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 3f38583b7189d99be360d8475fe6ef8cd53dee7c)

Conflicts:
src/pybind/mgr/orchestrator/module.py

3 years agopython-common: HostSpec: add `validate()`
Sebastian Wagner [Wed, 22 Sep 2021 11:46:52 +0000 (13:46 +0200)]
python-common: HostSpec: add `validate()`

Adjust HostSpec interface to ServiceSpec

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 7c6d922dead8480cd1f2cd05be7ccd1d8d5b7dd8)

Conflicts:
  src/python-common/ceph/deployment/service_spec.py

3 years agopython-common: DriveGroupSpec: move placement validation to validate()
Sebastian Wagner [Wed, 1 Sep 2021 13:46:12 +0000 (15:46 +0200)]
python-common: DriveGroupSpec: move placement validation to validate()

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 311860412e840e6b31e04b80a9de5e9ae05e7fb7)

3 years agopython-common: DriveGroupSpec: Allow unnamed OSD specs
Sebastian Wagner [Wed, 1 Sep 2021 13:36:01 +0000 (15:36 +0200)]
python-common: DriveGroupSpec: Allow unnamed OSD specs

Because it never actually worked as expected.

Remove the duplicated service_id check, because it's already
verified by the parent method.

Fixes: https://tracker.ceph.com/issues/46253
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 8b567e132d75711179febac126c5ec8a250b8952)

Conflicts:
src/python-common/ceph/deployment/service_spec.py

3 years agopython-common: Improve DriveSelection error messages
Sebastian Wagner [Tue, 24 Aug 2021 12:57:27 +0000 (14:57 +0200)]
python-common: Improve DriveSelection error messages

Fixes: https://tracker.ceph.com/issues/50685
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 74f29b97ea3331d43391cd40fe843104a2c15c3d)

3 years agopython-common: OSD specs: Improve quality of error messages
Sebastian Wagner [Tue, 24 Aug 2021 10:56:21 +0000 (12:56 +0200)]
python-common: OSD specs: Improve quality of error messages

Fixes: https://tracker.ceph.com/issues/47401
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 4142c52d7406bb67042d9ad7b26d8e84f5a734ba)

Conflicts:
src/python-common/ceph/deployment/drive_group.py

3 years agopython-common: Remove duplicated DriveGroupSpec.__repr__ and __eq__
Sebastian Wagner [Tue, 24 Aug 2021 12:31:56 +0000 (14:31 +0200)]
python-common: Remove duplicated DriveGroupSpec.__repr__ and __eq__

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit b91f81801af40c213adfbc88c8fd148b4edf3ede)

Conflicts:
src/python-common/ceph/deployment/drive_group.py

3 years agomgr/orch: re-raise to make debugging easier
Sebastian Wagner [Wed, 22 Sep 2021 12:20:24 +0000 (14:20 +0200)]
mgr/orch: re-raise to make debugging easier

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 38b52f715fa581f3540ad6fc4c595ab0ede83ece)

3 years agoMerge pull request #44627 from sebastian-philipp/pacific-backport-44228
Sebastian Wagner [Wed, 19 Jan 2022 10:39:36 +0000 (11:39 +0100)]
Merge pull request #44627 from sebastian-philipp/pacific-backport-44228

pacific: mgr/cephadm: fix 'cephadm osd activate' on existing osd devices

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #44625 from sebastian-philipp/pacific-backport-43149
Sebastian Wagner [Wed, 19 Jan 2022 10:39:13 +0000 (11:39 +0100)]
Merge pull request #44625 from sebastian-philipp/pacific-backport-43149

pacific: mgr/cephadm: Add client.admin keyring when upgrading from older version

Reviewed-by: Michael Fritch <mfritch@suse.com>
3 years agoqa/cephadm: install hwe kernel only for focal 44644/head
Guillaume Abrioux [Fri, 14 Jan 2022 17:20:10 +0000 (18:20 +0100)]
qa/cephadm: install hwe kernel only for focal

Let's install the hwe kernel only on Ubuntu focal; otherwise we only shift the
problem to Ubuntu bionic, given that the hwe kernel for bionic is 5.4.

Fixes: https://tracker.ceph.com/issues/53863
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5c0f0698a5b8db75ae9bcdca311a68a1589ee0a5)

3 years agoqa/nvme_loop: fix an issue on ubuntu 18.04
Guillaume Abrioux [Thu, 13 Jan 2022 21:46:03 +0000 (22:46 +0100)]
qa/nvme_loop: fix an issue on ubuntu 18.04

The following command:

```
echo /dev/sda | tee /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/device_path
```

makes nvme_loop fail because, fascinatingly, it adds an unexpected newline.

See:
```
/dev/sda
/dev/sda

1
tee: /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/enable: No such file or directory
/dev/sda
1
```

Other distros don't have the same behavior:

```
CentOS 8
/dev/sda
/dev/sda
1

Ubuntu 20.04
/dev/sda
/dev/sda
1
```
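
For illustration only, a minimal Python sketch of writing a value without a trailing newline. The helper name `write_attr` and the temporary file standing in for the configfs attribute are assumptions; this is not the change the commit itself makes.

```python
import os
import tempfile


def write_attr(path: str, value: str) -> None:
    """Write value to path exactly as given, with no trailing newline."""
    with open(path, "w") as f:
        f.write(value)


# A temporary file stands in for the configfs attribute (hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    attr = os.path.join(tmp, "device_path")
    write_attr(attr, "/dev/sda")
    with open(attr) as f:
        content = f.read()
```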

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8e22fb3da9bfbdc75d88beb66543716afb19511)

3 years agoceph-volume: fix regression introduced via #43536
Guillaume Abrioux [Mon, 10 Jan 2022 09:21:53 +0000 (10:21 +0100)]
ceph-volume: fix regression introduced via #43536

The recent changes from PR #43536 introduced a regression that prevents
running ceph-volume in a containerized context on Ubuntu 18.04.

The path of the `lvs` binary differs between CentOS 8 and Ubuntu 18.04
(`/usr/sbin/lvs` and `/sbin/lvs` respectively). As a result, ceph-volume running
in a CentOS 8 container sees the `lvs` binary at `/usr/sbin/lvs` and tries to
run it with `nsenter` on the host, which is running Ubuntu 18.04.
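
One way to avoid this class of problem is to resolve the binary against the host's filesystem rather than the container's. The sketch below is an assumption about how that could look, not the code from this commit; `find_host_binary`, `HOST_BIN_DIRS` and `host_root` are hypothetical names.

```python
import os
from typing import Optional, Sequence

# Candidate directories are an assumption for this sketch.
HOST_BIN_DIRS = ("/usr/sbin", "/sbin", "/usr/bin", "/bin")


def find_host_binary(name: str, host_root: str = "/",
                     dirs: Sequence[str] = HOST_BIN_DIRS) -> Optional[str]:
    """Return the path of `name` as seen on the host, or None if not found.

    host_root is where the host filesystem is visible (hypothetical;
    "/" when not running in a container).
    """
    for d in dirs:
        candidate = os.path.join(d, name)
        # Check existence under the host's root, not the container's.
        if os.path.isfile(os.path.join(host_root, candidate.lstrip("/"))):
            return candidate
    return None
```

The returned path is the one to pass to the command executed on the host, since that is where the binary actually lives.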

Fixes: https://tracker.ceph.com/issues/53812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffdc92d4d03b9ae7415b548192a572cfc5ea)

3 years agomgr/dashboard: Refactoring dashboard cephadm checks 44652/head
Nizamudeen A [Thu, 13 Jan 2022 12:58:56 +0000 (18:28 +0530)]
mgr/dashboard: Refactoring dashboard cephadm checks

I isolated all the test suites into their respective files
so that it is easier to add more tests in the future.

I also gave priority to the host actions.

Create OSD checks are now written so that OSDs
are created only on the intended hosts. This makes
the host draining process easier and less time-consuming.

Also tried to address the flaky force maintenance checks.

Removed some duplicated code.

Improved the service creation part to reduce the time taken
for its completion.

Fixes: https://tracker.ceph.com/issues/53905
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit b6759b75c9fc4d3fb565201aa6bbe0c2473fd3d4)

3 years agoqa/dashboard: ensure node 16 is installed 44534/head
Ernesto Puerta [Thu, 13 Jan 2022 16:21:12 +0000 (17:21 +0100)]
qa/dashboard: ensure node 16 is installed

For Ubuntu: https://github.com/nodesource/distributions#manual-installation

Fixes: https://tracker.ceph.com/issues/53843
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 7225b68e46173350954beb418ecd43e9eca4d179)

3 years agomgr/dashboard: monitoring: Implement BlueStore onode hit/miss counters into the dashboard 44650/head
Aashish Sharma [Mon, 13 Dec 2021 12:03:02 +0000 (17:33 +0530)]
mgr/dashboard: monitoring: Implement BlueStore onode hit/miss counters into the dashboard

Provide the details pulled from BlueStore stats in order to display the onode hit/miss counters.

Fixes: https://tracker.ceph.com/issues/53577
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 15aa4dffa91b325014024d3e35603d88330b87cc)

3 years agoMerge pull request #44467 from rhcs-dashboard/wip-53780-pacific
Ernesto Puerta [Tue, 18 Jan 2022 20:01:51 +0000 (21:01 +0100)]
Merge pull request #44467 from rhcs-dashboard/wip-53780-pacific

pacific: mgr/dashboard: fix orchestrator/02-hosts-inventory.e2e failure

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
3 years agoMerge pull request #44533 from rhcs-dashboard/wip-53825-pacific
Ernesto Puerta [Tue, 18 Jan 2022 19:58:47 +0000 (20:58 +0100)]
Merge pull request #44533 from rhcs-dashboard/wip-53825-pacific

pacific: mgr/dashboard: add test coverage for API docs (SwaggerUI)

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #44529 from sebastian-philipp/pacific-backport-43901-44341
Sebastian Wagner [Tue, 18 Jan 2022 13:55:31 +0000 (14:55 +0100)]
Merge pull request #44529 from sebastian-philipp/pacific-backport-43901-44341

pacific: mgr/cephadm: Add snmp-gateway service support

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
3 years agoqa: wait for purge queue operations to finish 44642/head
Venky Shankar [Tue, 23 Nov 2021 09:37:01 +0000 (04:37 -0500)]
qa: wait for purge queue operations to finish

TestFragmentation.test_deep_split relies on `num_strays`
to reach zero expecting that the purge threads would
have deleted the directory entries. However, checking
`num_strays` cannot be relied on since PurgeQueue merely
journals the purge item (see PurgeQueue::push) followed
by the StrayManager marking the stray as removed thereby
accounting `num_strays`.

So, add an additional condition to check if the purge
threads have finished processing items.
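
A minimal sketch of such a combined wait, assuming the test can take a snapshot of the relevant counters. The `purge_items_pending` key and both helper names are hypothetical and are not taken from the actual test code.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout: float = 60.0,
             interval: float = 1.0) -> None:
    """Poll condition() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within timeout")
        time.sleep(interval)


def purge_done(counters: dict) -> bool:
    """True only when strays are gone AND the purge queue has drained.

    `counters` is a hypothetical snapshot; the key names are assumptions.
    """
    return (counters["num_strays"] == 0
            and counters["purge_items_pending"] == 0)
```

Checking both conditions in one predicate means the test cannot pass on `num_strays` alone.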

Fixes: http://tracker.ceph.com/issues/52487
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit d9c79983230a9237422998771db4b4c450aed949)

3 years agoqa: adjust for MDSs to get deployed before verifying their availability 44639/head
Venky Shankar [Tue, 11 Jan 2022 09:05:03 +0000 (14:35 +0530)]
qa: adjust for MDSs to get deployed before verifying their availability

The check happens when some MDSs are *just* deployed by cephadm causing
jobs to fail with:

     Command failed on smithi016 with status 1: 'sudo /home/ubuntu/cephtest/cephadm \
     --image docker.io/ceph/ceph:v16.2.4 shell -c /etc/ceph/ceph.conf -k \
     /etc/ceph/ceph.client.admin.keyring --fsid 403bfcae-706b-11ec-8c32-001a4aab830c \
     -- bash -c \'ceph --format=json mds versions | jq -e ". | add == 4"\''
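
For readers unfamiliar with the jq expression: `add == 4` sums the per-version daemon counts and compares the total to 4. A rough Python equivalent, with a hypothetical helper name and assuming the output is a JSON object mapping version strings to counts:

```python
import json


def all_mds_running(versions_json: str, expected: int) -> bool:
    """Sum the per-version daemon counts and compare to `expected`."""
    versions = json.loads(versions_json)
    return sum(versions.values()) == expected
```

If the check runs before all MDS daemons have started, the sum is lower than expected and the command exits with status 1, as in the failure above.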

Fixes: http://tracker.ceph.com/issues/53857
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 8939d8c14b911e8f57a46c442e31185ce3ca5d63)

3 years agodoc/cephadm: improve the developer's guide a bit 44636/head
Radoslaw Zarzynski [Mon, 10 Jan 2022 14:10:33 +0000 (14:10 +0000)]
doc/cephadm: improve the developer's guide a bit

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 4c58d71d2bcd6b89e1578b844d8092b692cec4b2)

3 years agodoc/cephadm: fix a typo in developing-cephadm.rst
Radoslaw Zarzynski [Tue, 4 Jan 2022 15:39:13 +0000 (15:39 +0000)]
doc/cephadm: fix a typo in developing-cephadm.rst

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit e513869fd36459518178ac321e8dda61836d4631)

3 years agoqa/suites/orch/cephadm: Also run the rbd/iscsi suite 44635/head
Sebastian Wagner [Mon, 10 Jan 2022 09:45:36 +0000 (10:45 +0100)]
qa/suites/orch/cephadm: Also run the rbd/iscsi suite

Adding a new workload test to our suite.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 651192aacc4ac695a03f4ab0f7ffa045632d5d11)

3 years agoqa/suites/orch/cephadm/osds: test 'ceph cephadm osd activate' 44627/head
Sage Weil [Thu, 16 Dec 2021 15:00:05 +0000 (10:00 -0500)]
qa/suites/orch/cephadm/osds: test 'ceph cephadm osd activate'

Make sure this command behaves when the /var/lib/ceph osd.NNN dir is
removed.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 867bf04b74d510a544d9555afc56d5cd6657874d)

3 years agomgr/cephadm/services/osd: skip found osds that already have daemons
Sage Weil [Mon, 6 Dec 2021 15:19:57 +0000 (10:19 -0500)]
mgr/cephadm/services/osd: skip found osds that already have daemons

If we are trying to deploy new or newly-found osds, we can skip the ones
that already have cephadm daemons deployed.

Fixes: https://tracker.ceph.com/issues/53491
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit dc3d45bbe8c3bfedee57da619616c0be489cd233)

Conflicts:
src/pybind/mgr/cephadm/services/osd.py

3 years agomgr/cephadm: allow activation of OSDs that have previously started
Sage Weil [Mon, 6 Dec 2021 15:19:16 +0000 (10:19 -0500)]
mgr/cephadm: allow activation of OSDs that have previously started

When this code was introduced way back in ea987a0e56db106f7c76d11f86b3e602257f365e,
for some reason I was focused only on freshly created OSDs.  The
get_osd_uuid_map() helper is used by deploy_osd_daemons_for_existing_osds()
which is called not only by OSD creation but also by 'ceph cephadm
osd activate', which is meant to instantiate daemons for existing OSD
devices (e.g., devices that were reattached to a new server, or whose
/var/lib/ceph/$fsid/osd.$id directory was lost for some other reason).
However, if we ignore OSDs with up_from > 0, then we can't recreate a
daemon instance for such existing OSDs--arguably the most important ones,
since they may hold real data.
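
The effect of the `up_from` filter can be sketched as follows. This is an illustration under assumptions, not the cephadm implementation: `osd_uuid_map` and its `only_fresh` flag are hypothetical, and the input is assumed to be a list of OSD map entries with `osd`, `uuid` and `up_from` fields.

```python
from typing import Dict, List


def osd_uuid_map(osds: List[dict], only_fresh: bool = False) -> Dict[str, str]:
    """Map OSD id -> uuid from OSD map entries.

    With only_fresh=True, OSDs that have ever been up (up_from > 0) are
    skipped, which is the behaviour described above as the problem.
    """
    result = {}
    for o in osds:
        if only_fresh and o.get("up_from", 0) > 0:
            continue  # previously started OSD is ignored
        result[str(o["osd"])] = o.get("uuid", "")
    return result
```

With the filter enabled, an OSD that has already served data never appears in the map, so no daemon can be recreated for it.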

Fixes: https://tracker.ceph.com/issues/53491
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 40aeac7f52c80df0daa99bb664e3d672da3bc249)

3 years agopython-common: move test_valid_snmp_gateway_spec from mgr/cephadm 44529/head
Sebastian Wagner [Mon, 20 Dec 2021 10:48:43 +0000 (11:48 +0100)]
python-common: move test_valid_snmp_gateway_spec from mgr/cephadm

We have to validate to_json() now as well, as we have special enums.
Otherwise we might end up with !!python... representations.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 303843b476b442d0d398680b23aa244633768f29)

3 years agopython-common: move test_invalid_snmp_gateway_spec from mgr/cephadm
Sebastian Wagner [Mon, 20 Dec 2021 10:37:40 +0000 (11:37 +0100)]
python-common: move test_invalid_snmp_gateway_spec from mgr/cephadm

Let's keep the tests in the same package where the class is defined.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit c652ae74795252f875594b09627064d97ff2a762)

3 years agomgr/cephadm: SNMP: don't write urls manually
Sebastian Wagner [Thu, 16 Dec 2021 16:57:50 +0000 (17:57 +0100)]
mgr/cephadm: SNMP: don't write urls manually

This is just broken for non-trivial URLs. Don't set a bad example.
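
As an illustration of why hand-written URLs break, here is a sketch that assembles a URL with the standard library instead of string concatenation. `build_url` here is a hypothetical helper written for this example, not the function the commit uses.

```python
from typing import Optional
from urllib.parse import quote, urlunsplit


def build_url(scheme: str, host: str, port: Optional[int] = None,
              path: str = "") -> str:
    """Assemble a URL, bracketing IPv6 literals and escaping the path."""
    # An IPv6 literal contains ':' and must be wrapped in brackets.
    netloc = f"[{host}]" if ":" in host else host
    if port is not None:
        netloc += f":{port}"
    return urlunsplit((scheme, netloc, quote(path), "", ""))
```

Naive formatting such as `f"{scheme}://{host}:{port}"` produces an ambiguous result for an IPv6 host, which is one example of a non-trivial URL.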

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 3f47c2293b9ace730d6f76c613ef2106f274ea32)

3 years agomgr/cephadm: SNMP: Don't write default values into the store
Sebastian Wagner [Thu, 16 Dec 2021 16:51:07 +0000 (17:51 +0100)]
mgr/cephadm: SNMP: Don't write default values into the store

Enable us to change defaults in the future.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 5e3cc4d6c167b7d5bdd0f08aa90ed7e7d0779b25)