git.apps.os.sepia.ceph.com Git - ceph.git/log
Nizamudeen A [Wed, 9 Feb 2022 15:36:16 +0000 (21:06 +0530)]
mgr/dashboard: change the readFile to readFileSync
Apparently the readFile I added in #44934 is async, which is not what
we want, so change it to the synchronous call, readFileSync.
Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit
cbfdd551d9c1e67c2757056ac1119c058f4aa704 )
Nizamudeen A [Tue, 8 Feb 2022 06:20:29 +0000 (11:50 +0530)]
mgr/dashboard: set appropriate baseline branch for applitools
All the dashboard PRs are checked against a baseline branch called
'default' in the visual regression testing. This causes issues when
testing PRs in different branches. For example, master and pacific
currently have to save two different screenshots, since the two of them
differ slightly.
Disabling the applitools logs as well, because they are too noisy.
Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit
40c902ac59b758a314f6a123d71cb59342523dac )
Ernesto Puerta [Fri, 4 Feb 2022 16:39:05 +0000 (17:39 +0100)]
Merge pull request #43897 from k0ste/wip-53234-pacific
pacific: mgr/prometheus: Make prometheus standby behaviour configurable
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: k0ste <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: rsommer <NOT@FOUND>
Ernesto Puerta [Fri, 4 Feb 2022 16:36:43 +0000 (17:36 +0100)]
Merge pull request #44775 from p-se/wip-53882-pacific
pacific: mgr/dashboard: fix Grafana OSD/host panels
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Kamoltat Sirivadhna [Wed, 2 Feb 2022 22:10:42 +0000 (17:10 -0500)]
Merge pull request #44672 from kamoltat/wip-ksirivad-pacific-backport-44553
pacific: pybind/mgr/progress: enforced try and except on accessing event dictionary
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Yuri Weinstein [Wed, 2 Feb 2022 17:26:51 +0000 (09:26 -0800)]
Merge pull request #44840 from mchangir/pacific-avoid-mon-sanity-assertion-on-startup
qa: skip sanity check during upgrade
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 2 Feb 2022 17:25:24 +0000 (09:25 -0800)]
Merge pull request #44259 from sseshasa/wip-53551-pacific
pacific: osd/OSDMap: Add health warning if 'require-osd-release' != current release
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sridhar Seshasayee [Fri, 3 Dec 2021 09:55:32 +0000 (15:25 +0530)]
PendingReleaseNotes: Release note 'require-osd-release' health warning
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit
34f18fa45f3e98de4f35959bdeb1d11730f3f291 )
Conflicts:
PendingReleaseNotes
- Add the release note under the correct release heading.
Patrick Donnelly [Wed, 15 Dec 2021 17:57:00 +0000 (12:57 -0500)]
qa: set pacific require-osd-release to avoid health warning
Fixes: https://tracker.ceph.com/issues/53615
Fixes: bd815bd9d6ecdecaab3d2dd9e0f5a18aa795d749
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
bc2eaba8c6616c3469ac85d36ee21f3e9765cf42 )
Conflicts:
- Changed release name in commit message from 'quincy' to 'pacific'.
../fs/upgrade/featureful_client/old_client/tasks/2-upgrade.yaml
../fs/upgrade/featureful_client/upgraded_client/tasks/2-upgrade.yaml
../fs/upgrade/nofs/tasks/1-upgrade.yaml
- Changed release name from 'quincy' to 'pacific' when setting the
'require-osd-release' flag in the above files.
../fs/upgrade/volumes/import-legacy/tasks/2-upgrade.yaml
- Changed release name from 'octopus' to 'pacific' when setting the
'require-osd-release' flag in the above file.
Neha Ojha [Wed, 1 Dec 2021 01:22:46 +0000 (01:22 +0000)]
qa/suites/upgrade/octopus-x/stress-split-no-cephadm: remove msgr2
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
6ad7a8a597e6314cf9310f9f7c2f01ef82bc8fa3 )
Neha Ojha [Wed, 1 Dec 2021 01:15:14 +0000 (01:15 +0000)]
qa: test upgrades with hybrid allocator
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
df67040a4c34593adb4585086671543233d90f5a )
Neha Ojha [Wed, 1 Dec 2021 01:12:15 +0000 (01:12 +0000)]
qa: rename octopus install correctly
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
3b15a044550903d4074fc421a2ef1f24fc7e4023 )
Neha Ojha [Wed, 1 Dec 2021 00:39:57 +0000 (00:39 +0000)]
qa: remove leftovers from nautilus
pglog_hardlimit and msgr2
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
ed4bb05bd945c5d30cb70b88b1f8db0eb64a6ab1 )
Sridhar Seshasayee [Thu, 13 Jan 2022 13:27:09 +0000 (18:57 +0530)]
qa/suites/upgrade: Fix/Modify upgrade tests to work with 'pacific' release.
This commit is not a cherry-pick and fixes the following issues unique to
the pacific release:
1. Fixes the nautilus-x and octopus-x upgrade tests to work with the
pacific release by updating the post upgrade step to set the
'require-osd-release' flag to 'pacific'. This is done by using
'.qa/releases/pacific.yaml' for all the tests.
2. Fixed an issue in 'upgrade-mon-osd-mds.yaml' under both the
nautilus-x/parallel, octopus-x/parallel-no-cephadm tests. The
'wait-for-healthy' check should not be performed after all the osds
are upgraded since the 'require-osd-release' warning comes into
effect. This check is delayed until after the 'require-osd-release'
flag is set to 'pacific'.
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Sridhar Seshasayee [Mon, 22 Nov 2021 15:16:02 +0000 (20:46 +0530)]
osd/OSDMap: Add health warning if 'require-osd-release' != current release
After all OSDs are upgraded to a new release, generate a health warning if
the 'require-osd-release' flag doesn't match the new release version.
This will result in the cluster showing a warning in the health state until
the flag is set properly.
Fixes: https://tracker.ceph.com/issues/51984
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit
bd815bd9d6ecdecaab3d2dd9e0f5a18aa795d749 )
Conflicts:
src/osd/OSDMap.cc
- Removed checks for non-existent ceph_release_t 'quincy' flag from
OSDMap::pending_require_osd_release().
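The health check described in the commit above can be sketched as follows. This is a simplified Python model, not the C++ code in `src/osd/OSDMap.cc`; the `RELEASES` ordering and the function name `require_osd_release_warning` are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Release names in upgrade order (assumed list for this sketch).
RELEASES = ["nautilus", "octopus", "pacific"]


def require_osd_release_warning(osd_releases: Iterable[str],
                                require_osd_release: str) -> Optional[str]:
    """Return a warning if every OSD runs a newer release than the flag."""
    ranks = [RELEASES.index(r) for r in osd_releases]
    if not ranks:
        return None
    # Warn only once *all* OSDs are upgraded, i.e. even the oldest OSD
    # is newer than the release named by the flag.
    oldest = min(ranks)
    if oldest > RELEASES.index(require_osd_release):
        return (f"all OSDs are running {RELEASES[oldest]} or later but "
                f"require_osd_release is {require_osd_release}")
    return None
```

In this model, a cluster that is only partly upgraded produces no warning, which matches the "after all OSDs are upgraded" wording of the commit message.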
Yuri Weinstein [Wed, 2 Feb 2022 00:04:42 +0000 (16:04 -0800)]
Merge pull request #44387 from trociny/wip-53702-pacific
pacific: qa/tasks: improve backfill_toofull test
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Wed, 2 Feb 2022 00:04:08 +0000 (16:04 -0800)]
Merge pull request #44325 from k0ste/wip-53621-pacific
pacific: mgr/devicehealth: fix missing timezone from time delta calculation
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Yuri Weinstein [Wed, 2 Feb 2022 00:03:32 +0000 (16:03 -0800)]
Merge pull request #44205 from k0ste/wip-53488-pacific
pacific: mgr/prometheus: define module options for standby
Reviewed-by: Adam King <adking@redhat.com>
Yuri Weinstein [Wed, 2 Feb 2022 00:02:34 +0000 (16:02 -0800)]
Merge pull request #44175 from cfsnyder/wip-51172-pacific
pacific: common/PriorityCache: low perf counters priorities for submodules.
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Yuri Weinstein [Tue, 1 Feb 2022 22:12:17 +0000 (14:12 -0800)]
Merge pull request #44181 from myoungwon/pacific-50192
pacific: osd: recover unreadable snapshot before reading ref. count info
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Tue, 1 Feb 2022 22:10:21 +0000 (14:10 -0800)]
Merge pull request #43955 from cfsnyder/wip-53201-pacific
pacific: osd: fix 'ceph osd stop <osd.nnn>' doesn't take effect
Reviewed-by: Laura Flores <lflores@redhat.com>
Yuri Weinstein [Tue, 1 Feb 2022 20:42:40 +0000 (12:42 -0800)]
Merge pull request #44212 from k0ste/wip-53494-pacific
pacific: mgr: fix locking for MetadataUpdate::finish
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Tue, 1 Feb 2022 20:41:52 +0000 (12:41 -0800)]
Merge pull request #44202 from myoungwon/pacific-53486
pacific: test: increase retry duration when calculating manifest ref. count
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Tue, 1 Feb 2022 20:40:49 +0000 (12:40 -0800)]
Merge pull request #44173 from cfsnyder/wip-51150-pacific
pacific: osd: set r only if succeed in FillInVerifyExtent
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Tue, 1 Feb 2022 20:40:09 +0000 (12:40 -0800)]
Merge pull request #44096 from cfsnyder/wip-53388-pacific
pacific: osd/OSDMap.cc: clean up pg_temp for nonexistent pgs
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Yuri Weinstein [Tue, 1 Feb 2022 20:39:04 +0000 (12:39 -0800)]
Merge pull request #43882 from ifed01/wip-ifed-fix-53011-pac
pacific: os/bluestore: use proper prefix when removing undecodable Share Blob.
Reviewed-by: Neha Ojha <nojha@redhat.com>
Milind Changire [Mon, 31 Jan 2022 11:22:45 +0000 (16:52 +0530)]
qa: skip sanity check during upgrade
Fixes: https://tracker.ceph.com/issues/54064
Signed-off-by: Milind Changire <mchangir@redhat.com>
Ernesto Puerta [Thu, 27 Jan 2022 10:27:06 +0000 (11:27 +0100)]
Merge pull request #44727 from cfsnyder/wip-51825-pacific
pacific: qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password…
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Cory Snyder <csnyder@iland.com>
Reviewed-by: kevinzs2048 <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Yuri Weinstein [Wed, 26 Jan 2022 23:39:24 +0000 (15:39 -0800)]
Merge pull request #44540 from kamoltat/wip-ksirivad-backport-pacific-43716
pacific: mgr/autoscaler: Introduce noautoscale flag
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
Adam King [Wed, 26 Jan 2022 14:50:37 +0000 (09:50 -0500)]
Merge pull request #44660 from sebastian-philipp/pacific-backport-44647
pacific: doc/cephadm: remove duplicate deployment scenario section
Reviewed-by: Adam King <adking@redhat.com>
Adam King [Wed, 26 Jan 2022 14:46:08 +0000 (09:46 -0500)]
Merge pull request #44636 from sebastian-philipp/pacific-backport-44510
pacific: doc/cephadm: improve the development doc a bit
Reviewed-by: Adam King <adking@redhat.com>
Yuri Weinstein [Wed, 26 Jan 2022 00:27:11 +0000 (16:27 -0800)]
Merge pull request #44584 from vumrao/wip-vumrao-53876
pacific: osd/PeeringState: separate history's pruub from pg's
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kamoltat [Wed, 22 Dec 2021 21:42:52 +0000 (21:42 +0000)]
docs: Added noautoscale to docs + release notes
Updated the docs in
https://docs.ceph.com/en/latest/rados/operations/placement-groups/
and updated the release notes to reflect noautoscale flag.
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit
9baed0394e03de41f1921693bb33badd1922fa97 )
Conflicts:
PendingReleaseNotes - trivial fix
Yuri Weinstein [Tue, 25 Jan 2022 19:53:58 +0000 (11:53 -0800)]
Merge pull request #44513 from batrick/i53714
pacific: mds: fails to reintegrate strays if destdn's directory is full (ENOSPC)
Reviewed-by: Milind Changire <mchangir@redhat.com>
Kamoltat [Wed, 8 Dec 2021 15:15:50 +0000 (15:15 +0000)]
qa: Added workunit test for noautoscale flag
Set and unset the noautoscale flag and
evaluate whether the results are what
we expected. Also evaluate whether
the flag is correct when we
create new pools.
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit
bb42c71e7e059be2cc4d1d4408e475b15b1c6340 )
Conflicts:
test-noautoscale-flag.yaml
- modified pre-mgr-command to not create
device health monitor
Kamoltat [Wed, 8 Dec 2021 15:13:38 +0000 (15:13 +0000)]
pybind/mgr/autoscaler: Introduce noautoscale flag
`noautoscale` flag is a feature where the
user can choose to flip the switch between
turning autoscale `on` and `off` for all
pools with a single command.
`osd pool set noautoscale` will turn
autoscale mode `off` for all pools.
`osd pool unset noautoscale` will turn
autoscale mode `on` for all pools.
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit
be17f041bab90d8f93c3e52df74cdf6c28b44ef2 )
Conflicts:
src/pybind/mgr/pg_autoscaler/module.py - trivial fix
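The behaviour of the flag described above can be modelled with a toy sketch. This is not the `pg_autoscaler` module code; the pool dictionary and the function name `set_noautoscale` are hypothetical and only illustrate that one switch flips the mode of every pool.

```python
from typing import Dict


def set_noautoscale(pools: Dict[str, str], noautoscale: bool) -> Dict[str, str]:
    """Return a copy of pools with every autoscale mode set together."""
    # Setting the flag turns autoscaling off everywhere; unsetting turns it on.
    mode = "off" if noautoscale else "on"
    return {name: mode for name in pools}
```

For example, `set_noautoscale({"rbd": "on", "cephfs": "warn"}, True)` yields `off` for both pools, regardless of their previous modes.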
Yuri Weinstein [Tue, 25 Jan 2022 16:04:50 +0000 (08:04 -0800)]
Merge pull request #44642 from vshankar/wip-53458
pacific: qa: wait for purge queue operations to finish
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:04:15 +0000 (08:04 -0800)]
Merge pull request #44639 from vshankar/wip-53912
pacific: qa: adjust for MDSs to get deployed before verifying their availability
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:03:47 +0000 (08:03 -0800)]
Merge pull request #44623 from lxbsz/wip-53908
pacific: mds: remove the duplicated or incorrect respond
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:03:21 +0000 (08:03 -0800)]
Merge pull request #44622 from lxbsz/wip-53860
pacific: mds: dump tree '/' when the path is empty
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:02:37 +0000 (08:02 -0800)]
Merge pull request #44621 from lxbsz/wip-53861
pacific: qa: do not use any time related suffix for *_op_timeouts
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:01:47 +0000 (08:01 -0800)]
Merge pull request #44620 from lxbsz/wip-53864
pacific: mds: directly return just after responding the link request
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:00:51 +0000 (08:00 -0800)]
Merge pull request #44516 from nmshelke/wip-53777-pacific
pacific: mgr/stats: exception handling for ceph fs perf stats command
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 16:00:19 +0000 (08:00 -0800)]
Merge pull request #44514 from batrick/i53736
pacific: mds: recursive scrub does not trigger stray reintegration
Reviewed-by: Milind Changire <mchangir@redhat.com>
Yuri Weinstein [Tue, 25 Jan 2022 15:59:05 +0000 (07:59 -0800)]
Merge pull request #44512 from MrFreezeex/wip-52631-pacific
pacific: mds: add mds_dir_max_entries config option
Reviewed-by: Milind Changire <mchangir@redhat.com>
Patrick Seidensal [Thu, 9 Dec 2021 14:01:54 +0000 (15:01 +0100)]
monitoring: Add unit tests for OSD panels in ceph-cluster dashboard
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit
7d7488018ea30dc61174bafcad01bb3eac8aa9bb )
Patrick Seidensal [Thu, 9 Dec 2021 13:59:49 +0000 (14:59 +0100)]
monitoring: fix display ceph_osd_in in Grafana panel
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit
4a6b2c1dfbbe7182beaf510c4a7297a79c6e2524 )
Patrick Seidensal [Mon, 25 Oct 2021 13:00:14 +0000 (15:00 +0200)]
mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.
As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk. This leads to issues which cannot be
approached by PromQL alone (many-to-many PromQL errors). The data we
expected is simply different in some rare cases.
I have not found a sole PromQL solution to this issue. What we basically
need is the following.
1. Match on labels `host` and `instance` to get one or more OSD names
from a metadata metric (`ceph_disk_occupation`) to let a user know
about which OSDs belong to which disk.
2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
in which case the value of `ceph_daemon` must not refer to more than
a single OSD. The exact opposite to requirement 1.
As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.
Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk). This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.
`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.
foo * on(ceph_daemon) group_left ceph_disk_occupation
`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).
foo * on(device,instance)
group_left(ceph_daemon) ceph_disk_occupation_human
Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit
18d3a71618a5e3bc3cbd0bce017fb7b9c18c2ca0 )
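The split into two metrics described in the commit above can be illustrated with a small sketch. It is not the `mgr/prometheus` implementation; the input row shape and the function name `occupation_metrics` are assumptions. It only shows how one label set per OSD (for matching) differs from one label set per disk (for display, with OSD names joined by `+`).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (ceph_daemon, device, instance) rows; a hypothetical input shape.
Row = Tuple[str, str, str]


def occupation_metrics(rows: List[Row]):
    """Build per-daemon label sets and human-readable per-disk label sets."""
    # One label set per OSD: safe for on(ceph_daemon) vector matching.
    per_daemon = [
        {"ceph_daemon": daemon, "device": dev, "instance": inst}
        for daemon, dev, inst in rows
    ]
    # One label set per disk: OSD names joined with '+', for display only.
    by_disk: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for daemon, dev, inst in rows:
        by_disk[(dev, inst)].append(daemon)
    human = [
        {"ceph_daemon": "+".join(sorted(daemons)), "device": dev, "instance": inst}
        for (dev, inst), daemons in sorted(by_disk.items())
    ]
    return per_daemon, human
```

With two OSDs sharing `/dev/sda`, the first list keeps `osd.1` and `osd.2` as separate entries, while the second carries a single `osd.1+osd.2` entry for that disk.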
Patrick Seidensal [Mon, 25 Oct 2021 08:51:35 +0000 (10:51 +0200)]
mgr/prometheus: Refactoring: Introduce type aliases
Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit
154d3525b19135a929851c0b027da19abda20ebe )
Guillaume Abrioux [Mon, 24 Jan 2022 12:39:15 +0000 (13:39 +0100)]
Merge pull request #44708 from guits/wip-53962-pacific
pacific: ceph-volume: show RBD devices as not available
Ernesto Puerta [Fri, 21 Jan 2022 19:44:47 +0000 (20:44 +0100)]
Merge pull request #44534 from rhcs-dashboard/wip-53834-pacific
pacific: mgr/dashboard: Update Angular version to 12
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Kevin Zhao [Thu, 22 Jul 2021 06:58:20 +0000 (07:58 +0100)]
qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password.txt file
To allow running multiple instances of the same tests.
Fixes: https://tracker.ceph.com/issues/51792
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
(cherry picked from commit
d04ef800abd671a564795eba198ca976619b4cc7 )
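A minimal sketch of the idea behind the commit above, assuming the test only needs some file containing the password: use a uniquely named temporary file instead of a fixed path under `/tmp`. The function name `write_password_file` is hypothetical and the actual change in the test suite may differ.

```python
import os
import tempfile


def write_password_file(password: str) -> str:
    """Write the password to a uniquely named temp file and return its path."""
    # mkstemp picks a unique name, so concurrent test runs do not
    # overwrite each other's file.
    fd, path = tempfile.mkstemp(prefix="test_sanitize_password_", suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write(password)
    return path
```

The caller is responsible for removing the file once the test has finished.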
Guillaume Abrioux [Fri, 21 Jan 2022 12:48:04 +0000 (13:48 +0100)]
Merge pull request #44701 from guits/wip-53955-pacific
pacific: ceph-volume: don't use MultiLogger in find_executable_on_host()
Michael Fritch [Tue, 18 Jan 2022 22:15:45 +0000 (15:15 -0700)]
ceph-volume: filter RBD devices from the device inventory
Avoid running `blkid` or deploying OSDs on RBD devices by ensuring they
do not appear in the `ceph-volume inventory`
Fixes: https://tracker.ceph.com/issues/53846
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit
47325ec3ec5ce1d53c5eae2952f631e95b7135fe )
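The filtering described in the commit above might look roughly like the sketch below. It assumes RBD block devices can be recognised by a `/dev/rbd` path prefix; the real ceph-volume change may detect them differently, and `filter_rbd_devices` is a hypothetical name.

```python
from typing import Iterable, List


def filter_rbd_devices(device_paths: Iterable[str]) -> List[str]:
    """Drop RBD block devices (assumed to be named /dev/rbd*) from a list."""
    return [p for p in device_paths if not p.startswith("/dev/rbd")]
```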
Adam King [Thu, 20 Jan 2022 19:51:04 +0000 (14:51 -0500)]
Merge pull request #44681 from guits/split-cephadm-distros
qa: split distro for rados/cephadm/smoke tests
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Guillaume Abrioux [Wed, 19 Jan 2022 14:04:20 +0000 (15:04 +0100)]
ceph-volume: don't use MultiLogger in find_executable_on_host()
This generates a lot of unnecessary messages on the terminal.
Fixes: https://tracker.ceph.com/issues/53934
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
3be55621600be3ebc9c70295a3a351dab426b3a3 )
Ernesto Puerta [Thu, 20 Jan 2022 17:22:54 +0000 (18:22 +0100)]
Merge pull request #44480 from rhcs-dashboard/wip-53616-pacific
pacific: mgr/prometheus: expose ceph healthchecks as metrics
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
Guillaume Abrioux [Thu, 20 Jan 2022 10:29:52 +0000 (11:29 +0100)]
qa: split distro for rados/cephadm/smoke tests
There was a difference between master and pacific.
The hwe kernel modification for Ubuntu 20.04 should be done
only for cephadm tests. Modifying `qa/distros/all/ubuntu_20.04.yaml` broke
many tests.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Sebastian Wagner [Thu, 20 Jan 2022 10:09:29 +0000 (11:09 +0100)]
Merge pull request #44635 from sebastian-philipp/pacific-backport-44506
pacific: qa/suites/orch/cephadm: Also run the rbd/iscsi suite
Reviewed-by: Adam King <adking@redhat.com>
Yuri Weinstein [Wed, 19 Jan 2022 22:05:38 +0000 (14:05 -0800)]
Merge pull request #44596 from idryomov/wip-xfstests-qemu-cert-pacific
pacific: qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Wed, 19 Jan 2022 22:04:47 +0000 (14:04 -0800)]
Merge pull request #44594 from idryomov/wip-diff-iterate-parent-fix-pacific
pacific: librbd: restore diff-iterate include_parent functionality in fast-diff mode
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Yuri Weinstein [Wed, 19 Jan 2022 22:04:06 +0000 (14:04 -0800)]
Merge pull request #44547 from cfsnyder/wip-53839-pacific
pacific: librbd: diff-iterate reports incorrect offsets in fast-diff mode
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Kamoltat [Wed, 12 Jan 2022 02:41:01 +0000 (02:41 +0000)]
pybind/mgr/progress: enforced try and except on accessing event dictionary
There is a certain race condition scenario where
an event gets deleted while the progress module
iterates through the ``events`` dictionary.
Without a ``try and except``, this will cause
an unhandled exception error and will crash
the module.
This commit will enforce ``try and except``
on every part of the code where we are accessing
the ``events`` dictionary.
Fixes: https://tracker.ceph.com/issues/53803
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit
b70d4a9caae0eb859e10b68f93573d507625d267 )
Conflicts:
src/pybind/mgr/progress/module.py - trivial-fix
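The guard described in the commit above can be sketched as follows. This is an illustration, not the code in `src/pybind/mgr/progress/module.py`; the function name `get_event_progress` and the `"progress"` key are assumptions.

```python
from typing import Any, Dict, Optional


def get_event_progress(events: Dict[str, Any], ev_id: str) -> Optional[float]:
    """Look up an event's progress, tolerating concurrent removal."""
    try:
        return events[ev_id]["progress"]
    except KeyError:
        # The event was deleted between listing and lookup; skip it
        # instead of letting the exception crash the caller.
        return None
```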
Sebastian Wagner [Wed, 19 Jan 2022 15:11:53 +0000 (16:11 +0100)]
Merge pull request #44626 from sebastian-philipp/pacific-backport-42905
pacific: python-common: improve OSD spec error messages
Reviewed-by: Michael Fritch <mfritch@suse.com>
Melissa Li [Tue, 18 Jan 2022 21:53:04 +0000 (16:53 -0500)]
doc/cephadm: remove duplicate deployment scenario section
Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit
2222f26a37137a2f70b3f736ffad16c51a6b4e44 )
Sebastian Wagner [Wed, 19 Jan 2022 12:35:41 +0000 (13:35 +0100)]
Merge pull request #44644 from guits/wip-53916-pacific
pacific: ceph-volume: fix regression introduced via #43536
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Ernesto Puerta [Wed, 19 Jan 2022 12:14:59 +0000 (13:14 +0100)]
Merge pull request #44652 from rhcs-dashboard/wip-53921-pacific
pacific: mgr/dashboard: Refactoring dashboard cephadm checks
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Ernesto Puerta [Wed, 19 Jan 2022 12:03:24 +0000 (13:03 +0100)]
Merge pull request #44650 from aaSharma14/wip-53828-pacific
pacific: mgr/dashboard: monitoring:Implement BlueStore onode hit/miss counters into the dashboard
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Sebastian Wagner [Thu, 25 Nov 2021 16:38:35 +0000 (17:38 +0100)]
python-common/tests: Remove filestore tests in test_disk_selector.py
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
1c40ca1e37e5e798cfd9cf317f39b11dd22ea086 )
Sebastian Wagner [Wed, 10 Nov 2021 14:54:42 +0000 (15:54 +0100)]
python-common: Don't validate ServiceSpec.from_json() in `orch ls`
Unfortunately, `ceph orch ls` may return invalid OSD specs for
OSDs not associated with any specs.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
3f38583b7189d99be360d8475fe6ef8cd53dee7c )
Conflicts:
src/pybind/mgr/orchestrator/module.py
Sebastian Wagner [Wed, 22 Sep 2021 11:46:52 +0000 (13:46 +0200)]
python-common: HostSpec: add `validate()`
Adjust HostSpec interface to ServiceSpec
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
7c6d922dead8480cd1f2cd05be7ccd1d8d5b7dd8 )
Conflicts:
src/python-common/ceph/deployment/service_spec.py
Sebastian Wagner [Wed, 1 Sep 2021 13:46:12 +0000 (15:46 +0200)]
python-common: DriveGroupSpec: move placement validation to validate()
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
311860412e840e6b31e04b80a9de5e9ae05e7fb7 )
Sebastian Wagner [Wed, 1 Sep 2021 13:36:01 +0000 (15:36 +0200)]
python-common: DriveGroupSpec: Allow unnamed OSD specs
Because it never actually worked as expected.
Remove the duplicated service_id check, because it's already
verified by the parent method.
Fixes: https://tracker.ceph.com/issues/46253
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
8b567e132d75711179febac126c5ec8a250b8952 )
Conflicts:
src/python-common/ceph/deployment/service_spec.py
Sebastian Wagner [Tue, 24 Aug 2021 12:57:27 +0000 (14:57 +0200)]
python-common: Improve DriveSelection error messages
Fixes: https://tracker.ceph.com/issues/50685
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
74f29b97ea3331d43391cd40fe843104a2c15c3d )
Sebastian Wagner [Tue, 24 Aug 2021 10:56:21 +0000 (12:56 +0200)]
python-common: OSD specs: Improve quality of error messages
Fixes: https://tracker.ceph.com/issues/47401
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
4142c52d7406bb67042d9ad7b26d8e84f5a734ba )
Conflicts:
src/python-common/ceph/deployment/drive_group.py
Sebastian Wagner [Tue, 24 Aug 2021 12:31:56 +0000 (14:31 +0200)]
python-common: Remove duplicated DriveGroupSpec.__repr__ and __eq__
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
b91f81801af40c213adfbc88c8fd148b4edf3ede )
Conflicts:
src/python-common/ceph/deployment/drive_group.py
Sebastian Wagner [Wed, 22 Sep 2021 12:20:24 +0000 (14:20 +0200)]
mgr/orch: re-raise to make debugging easier
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit
38b52f715fa581f3540ad6fc4c595ab0ede83ece )
Sebastian Wagner [Wed, 19 Jan 2022 10:39:36 +0000 (11:39 +0100)]
Merge pull request #44627 from sebastian-philipp/pacific-backport-44228
pacific: mgr/cephadm: fix 'cephadm osd activate' on existing osd devices
Reviewed-by: Adam King <adking@redhat.com>
Sebastian Wagner [Wed, 19 Jan 2022 10:39:13 +0000 (11:39 +0100)]
Merge pull request #44625 from sebastian-philipp/pacific-backport-43149
pacific: mgr/cephadm: Add client.admin keyring when upgrading from older version
Reviewed-by: Michael Fritch <mfritch@suse.com>
Guillaume Abrioux [Fri, 14 Jan 2022 17:20:10 +0000 (18:20 +0100)]
qa/cephadm: install hwe kernel only for focal
Let's install the hwe kernel only on Ubuntu focal, otherwise we only shift the
problem to Ubuntu bionic, given that the hwe kernel for bionic is 5.4.
Fixes: https://tracker.ceph.com/issues/53863
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
5c0f0698a5b8db75ae9bcdca311a68a1589ee0a5 )
Guillaume Abrioux [Thu, 13 Jan 2022 21:46:03 +0000 (22:46 +0100)]
qa/nvme_loop: fix an issue on ubuntu 18.04
The following command:
```
echo /dev/sda | tee /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/device_path
```
makes nvme_loop fail because fascinatingly, it adds an unexpected newline.
See:
```
/dev/sda
/dev/sda
1
tee: /sys/kernel/config/nvmet/subsystems/sda/namespaces/1/enable: No such file or directory
/dev/sda
1
```
Other distros don't have the same behavior:
```
CentOS 8
/dev/sda
/dev/sda
1
Ubuntu 20.04
/dev/sda
/dev/sda
1
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f8e22fb3da9bfbdc75d88beb66543716afb19511 )
Guillaume Abrioux [Mon, 10 Jan 2022 09:21:53 +0000 (10:21 +0100)]
ceph-volume: fix regression introduced via #43536
The recent changes from PR #43536 introduced a regression that prevents
running ceph-volume in a containerized context on Ubuntu 18.04.
The path of the `lvs` binary differs between CentOS 8 and Ubuntu 18.04
(`/usr/sbin/lvs` and `/sbin/lvs` respectively). This means that ceph-volume running
in the container on CentOS 8 sees the `lvs` binary at `/usr/sbin/lvs` and tries to
run it with `nsenter` on the host, which is running Ubuntu 18.04.
Fixes: https://tracker.ceph.com/issues/53812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
95e88cda3df76b59b548ae808df0ef7f19db1f63 )
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
3c93ffdc92d4d03b9ae7415b548192a572cfc5ea )
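The problem described in the commit above comes from resolving a binary path inside the container and then using it on the host. A minimal sketch of resolving the path against the host filesystem instead is shown below. It is not the ceph-volume fix itself; `find_on_host`, `HOST_BIN_DIRS`, and the `host_root` parameter are hypothetical names, and the directory list is an assumption.

```python
import os
from typing import Iterable, Optional

# Directories to probe on the host; an assumed list for this sketch.
HOST_BIN_DIRS = ("/usr/local/sbin", "/usr/local/bin",
                 "/usr/sbin", "/usr/bin", "/sbin", "/bin")


def find_on_host(name: str, host_root: str = "/",
                 dirs: Iterable[str] = HOST_BIN_DIRS) -> Optional[str]:
    """Return the host-side path of an executable, or None if it is absent."""
    for d in dirs:
        candidate = os.path.join(d, name)
        # Check under host_root (e.g. a bind-mounted host filesystem),
        # but return the path as the host itself would see it.
        if os.path.isfile(os.path.join(host_root, candidate.lstrip("/"))):
            return candidate
    return None
```

With a host that only ships `/sbin/lvs`, the lookup returns `/sbin/lvs` even when the container's own copy lives in `/usr/sbin`.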
Nizamudeen A [Thu, 13 Jan 2022 12:58:56 +0000 (18:28 +0530)]
mgr/dashboard: Refactoring dashboard cephadm checks
I isolated all the test suites into their respective files
so that in future it is easier to add more tests to them.
I also gave priority to the host actions.
Create OSD checks are now written in a way that OSDs
are created only on the intended hosts. This will make
the host draining process easier and less time consuming.
Also tried to address the flaky force maintenance checks.
Removed some duplicated code.
Improved the service creation part to reduce the time taken
for its completion.
Fixes: https://tracker.ceph.com/issues/53905
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit
b6759b75c9fc4d3fb565201aa6bbe0c2473fd3d4 )
Ernesto Puerta [Thu, 13 Jan 2022 16:21:12 +0000 (17:21 +0100)]
qa/dashboard: ensure node 16 is installed
For Ubuntu: https://github.com/nodesource/distributions#manual-installation
Fixes: https://tracker.ceph.com/issues/53843
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit
7225b68e46173350954beb418ecd43e9eca4d179 )
Aashish Sharma [Mon, 13 Dec 2021 12:03:02 +0000 (17:33 +0530)]
mgr/dashboard: monitoring:Implement BlueStore onode hit/miss counters into the dashboard
Provide the details pulled from Bluestore stats in order to display the onode hit/miss counters
Fixes: https://tracker.ceph.com/issues/53577
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit
15aa4dffa91b325014024d3e35603d88330b87cc )
Ernesto Puerta [Tue, 18 Jan 2022 20:01:51 +0000 (21:01 +0100)]
Merge pull request #44467 from rhcs-dashboard/wip-53780-pacific
pacific: mgr/dashboard: fix orchestrator/02-hosts-inventory.e2e failure
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
Ernesto Puerta [Tue, 18 Jan 2022 19:58:47 +0000 (20:58 +0100)]
Merge pull request #44533 from rhcs-dashboard/wip-53825-pacific
pacific: mgr/dashboard: add test coverage for API docs (SwaggerUI)
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Sebastian Wagner [Tue, 18 Jan 2022 13:55:31 +0000 (14:55 +0100)]
Merge pull request #44529 from sebastian-philipp/pacific-backport-43901-44341
pacific: mgr/cephadm: Add snmp-gateway service support
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Venky Shankar [Tue, 23 Nov 2021 09:37:01 +0000 (04:37 -0500)]
qa: wait for purge queue operations to finish
TestFragmentation.test_deep_split relies on `num_strays`
to reach zero expecting that the purge threads would
have deleted the directory entries. However, checking
`num_strays` cannot be relied on since PurgeQueue merely
journals the purge item (see PurgeQueue::push) followed
by the StrayManager marking the stray as removed thereby
accounting `num_strays`.
So, add an additional condition to check if the purge
threads have finished processing items.
Fixes: http://tracker.ceph.com/issues/52487
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit d9c79983230a9237422998771db4b4c450aed949)
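The extra condition described above can be sketched as a polling loop that waits for both the stray count and the purge queue to drain. This is a minimal illustration, not the test's actual code: `wait_until_true`, `strays_purged`, the `get_counters` callable and the `pq_executing` key are assumptions made for the example; only `num_strays` is named in the commit message.

```python
import time
from typing import Callable, Dict


def wait_until_true(condition: Callable[[], bool], timeout: float,
                    period: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll `condition` every `period` seconds until it holds or `timeout` elapses."""
    elapsed = 0.0
    while not condition():
        if elapsed >= timeout:
            raise TimeoutError(f"condition not met after {elapsed} seconds")
        sleep(period)
        elapsed += period


def strays_purged(get_counters: Callable[[], Dict[str, int]]) -> bool:
    """True only when no strays remain AND the purge queue is idle.

    `pq_executing` is an assumed counter name standing in for "purge
    threads are still processing items".
    """
    counters = get_counters()
    return counters["num_strays"] == 0 and counters["pq_executing"] == 0
```

Checking `num_strays` alone would return early in the first sample below, because the stray is already accounted as removed while the purge is still executing.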
Venky Shankar [Tue, 11 Jan 2022 09:05:03 +0000 (14:35 +0530)]
qa: adjust for MDSs to get deployed before verifying their availability
The check happens when some MDSs are *just* deployed by cephadm causing
jobs to fail with:
Command failed on smithi016 with status 1: 'sudo /home/ubuntu/cephtest/cephadm \
--image docker.io/ceph/ceph:v16.2.4 shell -c /etc/ceph/ceph.conf -k \
/etc/ceph/ceph.client.admin.keyring --fsid 403bfcae-706b-11ec-8c32-001a4aab830c \
-- bash -c \'ceph --format=json mds versions | jq -e ". | add == 4"\''
Fixes: http://tracker.ceph.com/issues/53857
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 8939d8c14b911e8f57a46c442e31185ce3ca5d63)
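The failing check above pipes `ceph --format=json mds versions` into `jq -e ". | add == 4"`, which sums the per-version daemon counts and compares the total with 4. A rough Python equivalent of that comparison is sketched below; the function names are hypothetical, and the input is assumed to be a JSON object mapping a version string to a daemon count.

```python
import json


def deployed_mds_count(versions_json: str) -> int:
    """Sum the per-version daemon counts, like jq's `add` applied to an object."""
    return sum(json.loads(versions_json).values())


def all_mds_deployed(versions_json: str, expected: int) -> bool:
    """True once the number of reporting MDS daemons matches `expected`."""
    return deployed_mds_count(versions_json) == expected
```

While cephadm is still deploying daemons the sum is lower than expected, so a caller would need to retry this check rather than run it once.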
Radoslaw Zarzynski [Mon, 10 Jan 2022 14:10:33 +0000 (14:10 +0000)]
doc/cephadm: improve the developer's guide a bit
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 4c58d71d2bcd6b89e1578b844d8092b692cec4b2)
Radoslaw Zarzynski [Tue, 4 Jan 2022 15:39:13 +0000 (15:39 +0000)]
doc/cephadm: fix a typo in developing-cephadm.rst
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit e513869fd36459518178ac321e8dda61836d4631)
Sebastian Wagner [Mon, 10 Jan 2022 09:45:36 +0000 (10:45 +0100)]
qa/suites/orch/cephadm: Also run the rbd/iscsi suite
Adding a new workload test to our suite.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 651192aacc4ac695a03f4ab0f7ffa045632d5d11)
Sage Weil [Thu, 16 Dec 2021 15:00:05 +0000 (10:00 -0500)]
qa/suites/orch/cephadm/osds: test 'ceph cephadm osd activate'
Make sure this command behaves when the /var/lib/ceph osd.NNN dir is
removed.
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 867bf04b74d510a544d9555afc56d5cd6657874d)
Sage Weil [Mon, 6 Dec 2021 15:19:57 +0000 (10:19 -0500)]
mgr/cephadm/services/osd: skip found osds that already have daemons
If we are trying to deploy new or newly-found osds, we can skip the ones
that already have cephadm daemons deployed.
Fixes: https://tracker.ceph.com/issues/53491
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit dc3d45bbe8c3bfedee57da619616c0be489cd233)
Conflicts:
src/pybind/mgr/cephadm/services/osd.py
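The skip described above amounts to filtering the found OSD ids against the ids that already have a cephadm daemon. The sketch below only illustrates that idea; `osds_needing_daemons` and its arguments are hypothetical and do not reflect the code in `osd.py`.

```python
from typing import Iterable, List


def osds_needing_daemons(found_osd_ids: Iterable[str],
                         existing_daemon_ids: Iterable[str]) -> List[str]:
    """Return the found OSD ids that do not yet have a deployed daemon."""
    existing = set(existing_daemon_ids)
    # Preserve discovery order so deployment stays deterministic.
    return [osd_id for osd_id in found_osd_ids if osd_id not in existing]
```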
Sage Weil [Mon, 6 Dec 2021 15:19:16 +0000 (10:19 -0500)]
mgr/cephadm: allow activation of OSDs that have previously started
When this code was introduced way back in ea987a0e56db106f7c76d11f86b3e602257f365e,
for some reason I was focused only on freshly created OSDs. The
get_osd_uuid_map() helper is used by deploy_osd_daemons_for_existing_osds()
which is called not only by OSD creation but also by 'ceph cephadm
osd activate', which is meant to instantiate daemons for existing OSD
devices (e.g., devices that were reattached to a new server, or whose
/var/lib/ceph/$fsid/osd.$id directory was lost for some other reason).
However, if we ignore OSDs with up_from > 0, then we can't recreate a
daemon instance for such existing OSDs--arguably the most important ones,
since they may hold real data.
Fixes: https://tracker.ceph.com/issues/53491
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 40aeac7f52c80df0daa99bb664e3d672da3bc249)
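To illustrate the effect of the `up_from > 0` filter described above, the sketch below builds an id-to-uuid map from OSD map entries with and without that filter. It is an assumption-laden stand-in, not the real `get_osd_uuid_map()`: the function name, the `only_never_started` flag and the entry layout (`osd`, `uuid`, `up_from` keys) are chosen for the example.

```python
from typing import Dict, List


def osd_uuid_map(osds: List[dict], only_never_started: bool) -> Dict[str, str]:
    """Map OSD id -> uuid, optionally dropping OSDs that have started before."""
    result: Dict[str, str] = {}
    for osd in osds:
        # up_from > 0 means the OSD has been up at some point in the past.
        if only_never_started and osd["up_from"] > 0:
            continue
        result[str(osd["osd"])] = osd.get("uuid", "")
    return result
```

With the filter on, an OSD that has already been up (and may hold real data) disappears from the map, so no daemon could be recreated for it; with the filter off, it stays eligible for activation.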
Sebastian Wagner [Mon, 20 Dec 2021 10:48:43 +0000 (11:48 +0100)]
python-common: move test_valid_snmp_gateway_spec from mgr/cephadm
We have to validate to_json() now as well, as we have special enums.
Otherwise we might end up with !!python... representations.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 303843b476b442d0d398680b23aa244633768f29)
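The concern above is that enum members leaking out of `to_json()` get serialized as Python-specific objects instead of plain values. A minimal sketch of the kind of check such a test can make follows; `Version` and `GatewaySpec` are simplified, hypothetical stand-ins rather than the real spec classes, and the check uses the standard `json` module instead of a YAML dumper.

```python
import json
from enum import Enum


class Version(Enum):
    """Hypothetical enum standing in for a spec's special enum field."""
    V2C = "V2c"
    V3 = "V3"


class GatewaySpec:
    """Hypothetical spec with one enum-typed field."""

    def __init__(self, destination: str, version: Version):
        self.destination = destination
        self.version = version

    def to_json(self) -> dict:
        # Emit the enum's plain value, never the Enum member itself.
        return {"destination": self.destination, "version": self.version.value}
```

A test can then assert that the output survives a round trip through a plain serializer and contains only primitive types.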
Sebastian Wagner [Mon, 20 Dec 2021 10:37:40 +0000 (11:37 +0100)]
python-common: move test_invalid_snmp_gateway_spec from mgr/cephadm
Let's keep the tests in the same package where the class is defined.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit c652ae74795252f875594b09627064d97ff2a762)
Sebastian Wagner [Thu, 16 Dec 2021 16:57:50 +0000 (17:57 +0100)]
mgr/cephadm: SNMP: don't write urls manually
This is just broken for non-trivial URLs. Don't be a bad example.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 3f47c2293b9ace730d6f76c613ef2106f274ea32)
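One way hand-written URLs break is with IPv6 hosts, which must be bracketed before a port is appended. The sketch below shows the general idea of assembling a URL with `urllib.parse` instead of string formatting; the `build_url` helper here is written for this example and is not claimed to be what the commit uses.

```python
from urllib.parse import urlunsplit


def build_url(scheme: str, host: str, port: int, path: str = "/") -> str:
    """Assemble a URL, bracketing bare IPv6 addresses so the port stays unambiguous."""
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    return urlunsplit((scheme, f"{host}:{port}", path, "", ""))
```

A naive `f"http://{host}:{port}"` would turn the host `::1` into `http://::1:9093`, where the port cannot be told apart from the address.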
Sebastian Wagner [Thu, 16 Dec 2021 16:51:07 +0000 (17:51 +0100)]
mgr/cephadm: SNMP: Don't write default values into the store
Enable us to change defaults in the future.
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 5e3cc4d6c167b7d5bdd0f08aa90ed7e7d0779b25)
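The reasoning above can be sketched as follows: if only explicitly set values are persisted and defaults are applied at read time, a later change of the default reaches every stored spec that never overrode it. The class, the `port` field and the `DEFAULT_PORT` value are hypothetical, chosen only for illustration.

```python
from typing import Optional

DEFAULT_PORT = 9464  # hypothetical default, applied when reading, never stored


class GatewaySettings:
    """Hypothetical settings object with one optional field."""

    def __init__(self, port: Optional[int] = None):
        self.port = port  # None means "use whatever the current default is"

    def to_json(self) -> dict:
        # Persist only values the user set explicitly.
        out: dict = {}
        if self.port is not None:
            out["port"] = self.port
        return out

    def effective_port(self) -> int:
        return DEFAULT_PORT if self.port is None else self.port
```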