git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
3 years ago cephadm: skip podman check during `rm-repo`
Michael Fritch [Fri, 10 Sep 2021 13:38:48 +0000 (07:38 -0600)]
cephadm: skip podman check during `rm-repo`

allow the `rm-repo` command to succeed when podman is not installed

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit fd977773a57e12003fb02bdc762bf6bc89d785a1)

3 years ago doc/cephadm: Removing a service
Sebastian Wagner [Sat, 11 Sep 2021 17:15:38 +0000 (19:15 +0200)]
doc/cephadm: Removing a service

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 7af138e089bf0972a2067f84fe9dd6cd4588e7f8)

3 years ago doc/cephadm: Add lots of links to other chapters
Sebastian Wagner [Sat, 11 Sep 2021 18:02:44 +0000 (20:02 +0200)]
doc/cephadm: Add lots of links to other chapters

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit d9ec8eb7a8da3c7dff40d7ed89feaebf7cadd37d)

3 years ago cephadm: show podman version during `check-host`
Michael Fritch [Mon, 23 Aug 2021 13:47:56 +0000 (07:47 -0600)]
cephadm: show podman version during `check-host`

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 44aee33945f285ed4366b960e9526ed9d1984382)

3 years ago cephadm: avoid unhandled `AttributeError`
Michael Fritch [Thu, 19 Aug 2021 20:06:32 +0000 (14:06 -0600)]
cephadm: avoid unhandled `AttributeError`

when docker/podman are not present

Fixes: https://tracker.ceph.com/issues/51818
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 4d5694a9f0977a22c2a6dac680d594ab3feb070b)

3 years ago mgr/cephadm: show unhandled exceptions during `host add`
Michael Fritch [Thu, 19 Aug 2021 21:21:06 +0000 (15:21 -0600)]
mgr/cephadm: show unhandled exceptions during `host add`

138700e59bcd assumes stderr will always have a line containing the
prefix 'ERROR', which leads to an empty error reason when `check-host`
fails with an unhandled exception

Fixes: https://tracker.ceph.com/issues/51818
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit dac9225085a1f6d2eeaf209fc3d77c54208db2e8)
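The failure mode described in this commit message can be illustrated with a minimal Python sketch. It is not the actual mgr/cephadm code: the function name `extract_error_reason` and the exact fallback behaviour are assumptions made for illustration. The idea is that when no stderr line starts with 'ERROR', the whole output is reported instead of an empty reason.

```python
def extract_error_reason(stderr_lines):
    """Return a non-empty failure reason from check-host style stderr output.

    Hypothetical helper for illustration only; not taken from mgr/cephadm.
    """
    # Prefer the explicit 'ERROR' lines when the command printed any.
    errors = [line for line in stderr_lines if line.startswith('ERROR')]
    if errors:
        return '\n'.join(errors)
    # An unhandled exception prints a traceback with no 'ERROR' prefix,
    # so fall back to the full (non-blank) output rather than an empty string.
    return '\n'.join(line for line in stderr_lines if line.strip())


handled = extract_error_reason(['INFO: checking', 'ERROR: podman not found'])
unhandled = extract_error_reason([
    'Traceback (most recent call last):',
    "AttributeError: 'NoneType' object has no attribute 'name'",
])
```

With this fallback, `handled` contains only the 'ERROR' line, while `unhandled` carries the traceback text so the operator still sees why `host add` failed.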

3 years ago mgr/cephadm: Add OSDService.post_remove()
Sebastian Wagner [Tue, 31 Aug 2021 09:38:14 +0000 (11:38 +0200)]
mgr/cephadm: Add OSDService.post_remove()

Do not remove the osd.N keyring, if we failed to deploy the OSD, because
we cannot recover from it. The OSD keys are created by ceph-volume and not by
us.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit d7a4c5794034e60e94dd12951f7dbf4685647686)

3 years ago mgr/cephadm: Add MonService.post_remove()
Sebastian Wagner [Tue, 31 Aug 2021 09:01:11 +0000 (11:01 +0200)]
mgr/cephadm: Add MonService.post_remove()

We should never remove the mon keyring. Let's move
this piece of code into the MonService class

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 71eaf35aa755735574f8bc53b38fa1bac550792c)

3 years ago cephadm: (re)add command argv logging
Michael Fritch [Mon, 30 Aug 2021 15:40:55 +0000 (09:40 -0600)]
cephadm: (re)add command argv logging

introduced by 81a7df0498d and inadvertently removed by 3afec2ab30c

Fixes: https://tracker.ceph.com/issues/52484
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 6d18759bcb75c68c3a2d421e5d39c6cee8c18526)

3 years ago cephadm: add thread ident to log messages
Michael Fritch [Mon, 30 Aug 2021 15:18:15 +0000 (09:18 -0600)]
cephadm: add thread ident to log messages

can be used to filter msgs from a specific cephadm command

Fixes: https://tracker.ceph.com/issues/52484
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 2f1482890bad99cf29623585c6e4f8abf15cecc5)
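As a rough illustration of how a thread identifier in each log line allows filtering, here is a small sketch using Python's standard `logging` module. The format string, the logger name `cephadm-sketch`, and the helper `make_logger` are assumptions for this example and are not claimed to match cephadm's real logging configuration.

```python
import io
import logging
import threading


def make_logger(stream):
    """Build a logger whose lines carry the emitting thread's ident in hex."""
    logger = logging.getLogger('cephadm-sketch')  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # %(thread)x renders the LogRecord's thread ident as hexadecimal.
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(thread)x %(levelname)s %(message)s'))
    logger.handlers = [handler]
    return logger


stream = io.StringIO()
log = make_logger(stream)
log.info('check-host started')

# Keep only the lines emitted by the current thread.
ident = format(threading.get_ident(), 'x')
matching = [line for line in stream.getvalue().splitlines()
            if f' {ident} ' in line]
```

Filtering on the ident field is what lets messages from one command invocation be separated from interleaved output of others.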

3 years ago qa/distros: Remove stale kubic distros
Sebastian Wagner [Fri, 3 Sep 2021 08:13:54 +0000 (10:13 +0200)]
qa/distros: Remove stale kubic distros

Because they're broken

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 65e2cc084073ea7e05ecbc06f5773617676708ff)

3 years ago qa/distros/podman: Add rhel_8.rhel_8.4_container_tools_3.0.yaml
Sebastian Wagner [Thu, 2 Sep 2021 09:48:13 +0000 (11:48 +0200)]
qa/distros/podman: Add rhel_8.rhel_8.4_container_tools_3.0.yaml

mainly for the cephfs suite

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 0293580b502da3dc874430861c6cfac976403a67)

3 years ago doc/cephadm: monitoring: Further Reading
Sebastian Wagner [Mon, 30 Aug 2021 11:14:30 +0000 (13:14 +0200)]
doc/cephadm: monitoring: Further Reading

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit dc1180c485f91f86df0194b0769234d51a450816)

3 years ago .github/labeler: Add monitoring
Sebastian Wagner [Mon, 30 Aug 2021 10:45:56 +0000 (12:45 +0200)]
.github/labeler: Add monitoring

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 5624a62024ef64209d16d619e4a610870a244f37)

3 years ago doc/cephadm: monitoring: Add "Adding Alertmanager webhooks"
Sebastian Wagner [Mon, 30 Aug 2021 10:37:26 +0000 (12:37 +0200)]
doc/cephadm: monitoring: Add "Adding Alertmanager webhooks"

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 90f4cc017a49859224f0817fefc31ae459f5deec)

3 years ago doc/cephadm: monitoring: Add "Setting up Grafana"
Sebastian Wagner [Mon, 30 Aug 2021 10:26:23 +0000 (12:26 +0200)]
doc/cephadm: monitoring: Add "Setting up Grafana"

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit d17613086697252f31b57cddef1813b3dc2625d8)

3 years ago doc/cephadm: monitoring: move "deploying w/o" up
Sebastian Wagner [Mon, 30 Aug 2021 10:23:13 +0000 (12:23 +0200)]
doc/cephadm: monitoring: move "deploying w/o" up

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 5aa5fea8ee3e57e7636627031f651113a1d31cb4)

3 years ago doc/cephadm: monitoring: default placements
Sebastian Wagner [Mon, 30 Aug 2021 10:20:53 +0000 (12:20 +0200)]
doc/cephadm: monitoring: default placements

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit efd79a4adcfefc887bb42f1dc6882eec18576c1d)

3 years ago Merge pull request #43748 from tchaikov/pacific-doc-build
Sebastian Wagner [Mon, 1 Nov 2021 13:38:31 +0000 (14:38 +0100)]
Merge pull request #43748 from tchaikov/pacific-doc-build

pacific: admin/doc-requirements.txt: pin Sphinx at 3.5.4

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years ago admin/doc-requirements.txt: pin Sphinx at 3.5.4 43748/head
Kefu Chai [Sat, 30 Oct 2021 03:18:17 +0000 (11:18 +0800)]
admin/doc-requirements.txt: pin Sphinx at 3.5.4

* pin Sphinx at 3.5.4
* pin docutils at 0.18

at least the combination of these two versions
is known to compile.

to address the bug reported at
https://sourceforge.net/p/docutils/bugs/431/

the backtrace looks like:

/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/sphinx/util/docutils.py:285:
RemovedInSphinx30Warning: function based directive support is now
deprecated. Use class based directive instead.
  warnings.warn('function based directive support is now deprecated. '

Exception occurred:
  File
"/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/docutils/writers/html5_polyglot/__init__.py",
line 445, in section_title_tags
    if (ids and self.settings.section_self_link
AttributeError: 'Values' object has no attribute 'section_self_link'

please note this change is not cherry-picked from
master, because master already bumped Sphinx to 3.5.4
in 4968baa2523bd2a5ca6be147b26bc28906a864c9.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
3 years ago Merge pull request #43543 from rhcs-dashboard/wip-52870-pacific
Yuri Weinstein [Thu, 28 Oct 2021 20:02:43 +0000 (13:02 -0700)]
Merge pull request #43543 from rhcs-dashboard/wip-52870-pacific

pacific: mgr/dashboard: clean-up controllers and API backward versioning compatibility

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
3 years ago Merge pull request #43417 from trociny/wip-51646-pacific
Yuri Weinstein [Wed, 27 Oct 2021 13:16:54 +0000 (06:16 -0700)]
Merge pull request #43417 from trociny/wip-51646-pacific

pacific: osd/OSD: mkfs need wait for transcation completely finish

Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years ago Merge pull request #43562 from lxbsz/vino_fix
Yuri Weinstein [Wed, 27 Oct 2021 13:15:50 +0000 (06:15 -0700)]
Merge pull request #43562 from lxbsz/vino_fix

Pacific: test/libcephfs: put inodes after lookup

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
3 years ago Merge pull request #43559 from batrick/i52654-pacific
Yuri Weinstein [Wed, 27 Oct 2021 13:14:59 +0000 (06:14 -0700)]
Merge pull request #43559 from batrick/i52654-pacific

pacific: pybind/mgr/cephadm: set allow_standby_replay during CephFS upgrade

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
3 years ago Merge pull request #43475 from lxbsz/tracker_52876
Yuri Weinstein [Wed, 27 Oct 2021 13:13:26 +0000 (06:13 -0700)]
Merge pull request #43475 from lxbsz/tracker_52876

pacific: test: shutdown the mounter after test finishes

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
3 years ago Merge pull request #43644 from aaSharma14/wip-52965-pacific
Ernesto Puerta [Wed, 27 Oct 2021 10:23:54 +0000 (12:23 +0200)]
Merge pull request #43644 from aaSharma14/wip-52965-pacific

pacific: mgr/dashboard: monitoring: grafonnet refactoring for radosgw dashboards

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #43619 from smithfarm/wip-53005-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:38:12 +0000 (13:38 -0700)]
Merge pull request #43619 from smithfarm/wip-53005-pacific

pacific: rgw/tracing: unify SO version numbers within librgw2 package

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #43512 from neha-ojha/wip-52770-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:29:56 +0000 (13:29 -0700)]
Merge pull request #43512 from neha-ojha/wip-52770-pacific

pacific: os/bluestore: list obj which equals to pend

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years ago Merge pull request #43513 from neha-ojha/wip-52620-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:29:11 +0000 (13:29 -0700)]
Merge pull request #43513 from neha-ojha/wip-52620-pacific

pacific: osd: fix partial recovery become whole object recovery after restart osd

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years ago Merge pull request #43511 from neha-ojha/wip-52843-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:28:38 +0000 (13:28 -0700)]
Merge pull request #43511 from neha-ojha/wip-52843-pacific

pacific: msg/async/ProtocolV2: Set the recv_stamp at the beginning of receiving a message

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years ago Merge pull request #43445 from k0ste/wip-52848-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:27:30 +0000 (13:27 -0700)]
Merge pull request #43445 from k0ste/wip-52848-pacific

pacific: mgr: Add check to prevent mgr from crashing

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago Merge pull request #43437 from trociny/wip-52831-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:26:46 +0000 (13:26 -0700)]
Merge pull request #43437 from trociny/wip-52831-pacific

pacific: osd: re-cache peer_bytes on every peering state activate

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #43421 from callithea/wip-52289-pacific
Yuri Weinstein [Tue, 26 Oct 2021 20:26:16 +0000 (13:26 -0700)]
Merge pull request #43421 from callithea/wip-52289-pacific

pacific: qa/tasks/mgr: skip test_diskprediction_local on python>=3.8

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #43353 from kamoltat/wip-ksirivad-backport-pacific-37544
Yuri Weinstein [Tue, 26 Oct 2021 20:24:37 +0000 (13:24 -0700)]
Merge pull request #43353 from kamoltat/wip-ksirivad-backport-pacific-37544

pacific: mgr/progress: optimize global recovery && introduce 5 seconds interval

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years ago mgr/dashboard: monitoring: grafonnet refactoring for hosts dashboards 43644/head
Aashish Sharma [Fri, 8 Oct 2021 10:03:13 +0000 (15:33 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for hosts dashboards

This PR intends to refactor hosts dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit f7714de294dd7376a9a8ae5131aa429322b459c3)

Conflicts:
monitoring/grafana/dashboards/jsonnet/grafana_dashboards.jsonnet (merging all the jsonnet dashboards in one PR)

3 years ago Merge pull request #43646 from rhcs-dashboard/wip-53026-pacific
Ernesto Puerta [Mon, 25 Oct 2021 14:00:47 +0000 (16:00 +0200)]
Merge pull request #43646 from rhcs-dashboard/wip-53026-pacific

pacific: mgr/dashboard: pin a version for autopep8 and pyfakefs

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago mgr/dashboard: pin a version for autopep8 and pyfakefs 43646/head
Nizamudeen A [Mon, 25 Oct 2021 08:42:57 +0000 (14:12 +0530)]
mgr/dashboard: pin a version for autopep8 and pyfakefs

Fixes: https://tracker.ceph.com/issues/53024
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 946dab4f608ec47e0a3cfefdf8e7d1afda69117f)

3 years ago mgr/dashboard: monitoring: grafonnet refactoring for cephfs dashboards
Aashish Sharma [Fri, 8 Oct 2021 10:07:17 +0000 (15:37 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for cephfs dashboards

This PR intends to refactor cephfs dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit ed954b0e6ce24fbae66f78f7e4f90416b9ed7749)

3 years ago mgr/dashboard: monitoring: grafonnet refactoring for osds dashboards
Aashish Sharma [Fri, 8 Oct 2021 09:58:13 +0000 (15:28 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for osds dashboards

This PR intends to refactor osds dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit e490e2f3abe707a2e891171f3c230d44e282c601)

3 years ago mgr/dashboard: monitoring: grafonnet refactoring for pools dashboards
Aashish Sharma [Fri, 8 Oct 2021 09:52:46 +0000 (15:22 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for pools dashboards

This PR intends to refactor pools dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 8c48821c21f7a6b248de10ff6750a63bab1e4948)

3 years ago mgr/dashboard: monitoring: grafonnet refactoring for rbd dashboards
Aashish Sharma [Fri, 8 Oct 2021 09:42:41 +0000 (15:12 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for rbd dashboards

This PR intends to refactor rbd dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit e737aaa000a31e2f37ca90eb813f031a42edef3b)

3 years ago mgr/dashboard: monitoring: grafonnet refactoring for radosgw dashboards
Aashish Sharma [Fri, 8 Oct 2021 09:30:09 +0000 (15:00 +0530)]
mgr/dashboard: monitoring: grafonnet refactoring for radosgw dashboards

This PR intends to refactor radosgw dashboards using grafonnet

Fixes: https://tracker.ceph.com/issues/52777
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit eb01954cd999430417555628e0099f645d371746)

3 years ago rgw/tracing: unify SO version numbers within librgw2 package 43619/head
Nathan Cutler [Wed, 20 Oct 2021 10:51:02 +0000 (12:51 +0200)]
rgw/tracing: unify SO version numbers within librgw2 package

The librgw2 package contains several SO files. Two of those - librgw_op_tp.so
and librgw_rados_tp.so - had a different version number than the main librgw.

This was a violation of the openSUSE Shared Library Packaging Policy [1] but it
also seems like a "violation" of common sense.

[1] https://en.opensuse.org/openSUSE:Shared_library_packaging_policy#Package_naming

Fixes: https://tracker.ceph.com/issues/52979
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 172d6e01d5079f445044da9fe0823ceb353bdc86)

3 years ago Merge pull request #43548 from rzarzynski/pacific-50483
Yuri Weinstein [Thu, 21 Oct 2021 13:41:46 +0000 (06:41 -0700)]
Merge pull request #43548 from rzarzynski/pacific-50483

pacific: msgr/async: fix unsafe access in unregister_conn()

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #43610 from rhcs-dashboard/wip-pr_triage_dashboard-pacific
Ernesto Puerta [Thu, 21 Oct 2021 08:53:19 +0000 (10:53 +0200)]
Merge pull request #43610 from rhcs-dashboard/wip-pr_triage_dashboard-pacific

.github: add dashboard PRs to Dashboard project

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years ago Merge pull request #43440 from rhcs-dashboard/wip-52835-pacific
Ernesto Puerta [Thu, 21 Oct 2021 08:52:42 +0000 (10:52 +0200)]
Merge pull request #43440 from rhcs-dashboard/wip-52835-pacific

pacific: qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years ago .github/pr-triage: rename GH token 43610/head
Ernesto Puerta [Mon, 11 Oct 2021 11:05:34 +0000 (13:05 +0200)]
.github/pr-triage: rename GH token

Repo projects use GITHUB_TOKEN instead of MY_GITHUB_TOKEN:
https://github.com/srggrs/assign-one-project-github-action/blob/master/entrypoint.sh#L19

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2220646c2085f6967e61d21ff19145666f5a1285)

3 years ago .github: add dashboard PRs to Dashboard project
Ernesto Puerta [Fri, 8 Oct 2021 16:43:25 +0000 (18:43 +0200)]
.github: add dashboard PRs to Dashboard project

This action automatically adds PRs with 'dashboard' label to the
'Dashboard' project (https://github.com/ceph/ceph/projects/6).

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit ed55c527f10237c0ab48038639a971e85f8e1377)

3 years ago Merge pull request #43200 from batrick/i52639
Yuri Weinstein [Wed, 20 Oct 2021 15:35:09 +0000 (08:35 -0700)]
Merge pull request #43200 from batrick/i52639

pacific: MDSMonitor: handle damaged state from standby-replay

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years ago qa/tasks/backfill_toofull: make test work when compression on 43437/head
Mykola Golub [Wed, 13 Oct 2021 15:22:09 +0000 (18:22 +0300)]
qa/tasks/backfill_toofull: make test work when compression on

The osd backfill reservation does not take compression into account so
we need to operate with "uncompressed" bytes when calculating nearfull
ratio.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 429ac06cbb44b8a8263beb0d0780a01cedb517ba)

3 years ago Merge pull request #43267 from cfsnyder/wip-52588-pacific
Guillaume Abrioux [Mon, 18 Oct 2021 15:55:31 +0000 (17:55 +0200)]
Merge pull request #43267 from cfsnyder/wip-52588-pacific

pacific: ceph-volume: fix lvm activate --all --no-systemd

3 years ago Merge pull request #43523 from rhcs-dashboard/wip-52911-pacific
Ernesto Puerta [Mon, 18 Oct 2021 15:11:27 +0000 (17:11 +0200)]
Merge pull request #43523 from rhcs-dashboard/wip-52911-pacific

pacific:  mgr/dashboard: replace "Ceph-cluster" Client connections with active-standby MGRs

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #43541 from rhcs-dashboard/wip-52931-pacific
Ernesto Puerta [Mon, 18 Oct 2021 15:09:29 +0000 (17:09 +0200)]
Merge pull request #43541 from rhcs-dashboard/wip-52931-pacific

pacific: mgr/dashboard: Fix orchestrator/01-hosts.e2e-spec.ts failure

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #43240 from callithea/wip-52292-pacific
Ernesto Puerta [Mon, 18 Oct 2021 15:08:22 +0000 (17:08 +0200)]
Merge pull request #43240 from callithea/wip-52292-pacific

pacific: mgr/dashboard: visual tests: Add more ignore regions for dashboard component

Reviewed-by: aaryanporwal <NOT@FOUND>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago mgr/dashboard: replace string version with class 43543/head
Ernesto Puerta [Fri, 24 Sep 2021 15:46:42 +0000 (17:46 +0200)]
mgr/dashboard: replace string version with class

* APIVersion:
  * Moved to a separate file
  * Added doctests
  * Added sentinel values:
    * DEFAULT = 1.0
    * EXPERIMENTAL = 0.1
    * NONE = 0.0
  * Added to_mime_type() helper method
* Controllers.__init__:
  * Added type hints
  * Replaced string versions with APIVersions
* Feedback controller:
  * Replaced with EXPERIMENTAL (probably it should be NONE)

Fixes: https://tracker.ceph.com/issues/52480
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
 Conflicts:
src/pybind/mgr/dashboard/controllers/__init__.py
   - Remove the current changes and keep the incoming new changes
src/pybind/mgr/dashboard/controllers/crush_rule.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/docs.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/feedback.py
   - Deleted the file since feedback module isn't backported to pacific
src/pybind/mgr/dashboard/controllers/host.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/openapi.yaml
   - Generated a new openapi yaml file
src/pybind/mgr/dashboard/tests/__init__.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_docs.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_host.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_tools.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_versioning.py
   - Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/crush_rule.py
   - Removed the MethodMap decorator which updates the version of the
     endpoint to 2.0, because the changes that caused that version
     update were not backported to pacific
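The APIVersion changes listed in this commit message (sentinel values and a `to_mime_type()` helper) can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the dashboard's actual implementation: the `NamedTuple` base, the module-level sentinel names, the `subtype` parameter, and the exact MIME string layout are all choices made for this example.

```python
from typing import NamedTuple


class APIVersion(NamedTuple):
    """A (major, minor) API version; tuples compare field by field."""
    major: int
    minor: int

    def to_mime_type(self, subtype: str = 'json') -> str:
        # Assumed vendor MIME layout, used here only for illustration.
        return f'application/vnd.ceph.api.v{self.major}.{self.minor}+{subtype}'


# Sentinel values named in the commit message.
DEFAULT = APIVersion(1, 0)
EXPERIMENTAL = APIVersion(0, 1)
NONE = APIVersion(0, 0)

default_mime = DEFAULT.to_mime_type()
```

Replacing bare version strings with a small value type like this makes versions comparable (`EXPERIMENTAL < DEFAULT`) and keeps the MIME formatting in one place.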

3 years ago test: shutdown the mounter after test finishes 43475/head
Xiubo Li [Sat, 9 Oct 2021 03:12:18 +0000 (11:12 +0800)]
test: shutdown the mounter after test finishes

This was missed when resolving the conflicts in the previous backport
commit (5772641cb9bde083).

Fixes: https://tracker.ceph.com/issues/52876
Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years ago test/libcephfs: put inodes after lookup 43562/head
Patrick Donnelly [Tue, 14 Sep 2021 17:02:12 +0000 (13:02 -0400)]
test/libcephfs: put inodes after lookup

Otherwise, the client umount will hang due to inability to trim the
inodes looked up using the low-level interface. This results in slow-op
warnings and an eviction:

2021-09-11T17:23:31.097+0000 7f99c3522700  0 log_channel(cluster) log [WRN] : evicting unresponsive client smithi176 (9756), after 303.924 seconds
2021-09-11T17:23:31.097+0000 7f99c3522700 10 mds.0.server autoclosing stale session client.9756 172.21.15.176:0/3891214934 last renewed caps 303.924s ago

From: /ceph/teuthology-archive/yuriw-2021-09-11_16:21:09-smoke-pacific-distro-basic-smithi/6385038/remote/smithi175/log/ceph-mds.b.log.gz

Fixes: https://tracker.ceph.com/issues/52572
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c0252063b94d811dc7863058999856ac5614d1eb)

 Conflicts:
src/test/libcephfs/test.cc

3 years ago qa: add test for cephfs upgrade sequence 43559/head
Patrick Donnelly [Fri, 1 Oct 2021 16:06:50 +0000 (12:06 -0400)]
qa: add test for cephfs upgrade sequence

This also checks max_mds>1 and allow_standby_replay are restored to
previous values.

Future work can add tests for multiple file systems (or volumes).

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b1420e5771927f5c659e0e5edbc5714035f3df09)

3 years ago qa: add tasks to check mds upgrade state
Patrick Donnelly [Fri, 1 Oct 2021 16:05:42 +0000 (12:05 -0400)]
qa: add tasks to check mds upgrade state

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 5a7382214fe4dbd4b79773c6e732512ade22793a)

3 years ago qa: add note about where caps are generated
Patrick Donnelly [Fri, 1 Oct 2021 16:05:12 +0000 (12:05 -0400)]
qa: add note about where caps are generated

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit dbe5573ed4781cb4b214e701c77be7bc2cddabf3)

3 years ago qa: move CephManager cluster instantiation to subtask
Patrick Donnelly [Tue, 5 Oct 2021 17:31:02 +0000 (13:31 -0400)]
qa: move CephManager cluster instantiation to subtask

This needs to be available for the cephfs_setup task so administration
mounts can run ceph commands, potentially through `cephadm shell`.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7812cfb6744fc3bce50e26aa7dd6a4e47a43bb23)

3 years ago pybind/mgr/cephadm: disable allow_standby_replay during CephFS upgrade
Patrick Donnelly [Sat, 18 Sep 2021 00:15:01 +0000 (20:15 -0400)]
pybind/mgr/cephadm: disable allow_standby_replay during CephFS upgrade

Following procedure in [1].

Also: harden checks for active. Ensure "up" and "in" are both [0]. There
should be no standby-replay daemon.

[1] https://docs.ceph.com/en/pacific/cephfs/upgrading/

Fixes: https://tracker.ceph.com/issues/52654
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit bca21f01ce3bb32e0951f0fe15da88a81750a191)

3 years ago pybind/mgr/cephadm: always do mds upgrade sequence
Patrick Donnelly [Thu, 23 Sep 2021 23:49:31 +0000 (19:49 -0400)]
pybind/mgr/cephadm: always do mds upgrade sequence

Minor versions also require this sequence.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 4affb5c7029f6b83d640aa7b7206d9cf61e75f1d)

3 years ago mgr/dashboard: make modified API endpoints backward compatible
Avan Thakkar [Thu, 23 Sep 2021 11:15:16 +0000 (16:45 +0530)]
mgr/dashboard: make modified API endpoints backward compatible

Fixes: https://tracker.ceph.com/issues/52480
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Introducing APIVersion class to handle versioning for API endpoints and making
them backward compatible.

3 years ago mgr/dashboard: clean-up controllers
Ernesto Puerta [Tue, 7 Sep 2021 15:07:48 +0000 (17:07 +0200)]
mgr/dashboard: clean-up controllers

Fixes: https://tracker.ceph.com/issues/52589
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
 Conflicts:
src/pybind/mgr/dashboard/CMakeLists.txt
   - Added some tests in the CephTest section

3 years ago msgr/async: fix unsafe access in unregister_conn() 43548/head
Sage Weil [Mon, 19 Apr 2021 14:26:30 +0000 (09:26 -0500)]
msgr/async: fix unsafe access in unregister_conn()

We were looking at anon_conns and accepting_conns without holding
the lock (deleted_lock is not sufficient).

Drop this test, and move the decrements:

- inc when we add to conns or anon_conns (no changes there)
- dec when we remove from deleted_conns (several different paths!)

Fixes: https://tracker.ceph.com/issues/49237
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit d51d80b3234e17690061f65dc7e1515f4244a5a3)
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago mgr/dashboard: Fix orchestrator/01-hosts.e2e-spec.ts failure 43541/head
Nizamudeen A [Thu, 7 Oct 2021 15:36:29 +0000 (21:06 +0530)]
mgr/dashboard: Fix orchestrator/01-hosts.e2e-spec.ts failure

The test fails when deleting a host because the agent daemon is
present on that host. It's not possible to simply delete a host. We need
to drain it first and then delete it.

Fixes: https://tracker.ceph.com/issues/52764
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit db5cfb15e55dadf7bd5c381f53a4ea548fcea152)

3 years ago mgr/dashboard: replace Client connections with active-stdby mgrs 43523/head
Avan Thakkar [Thu, 30 Sep 2021 22:26:42 +0000 (03:56 +0530)]
mgr/dashboard: replace Client connections with active-stdby mgrs

Fixes: https://tracker.ceph.com/issues/52121
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit d388c5e958ddf5447c78db50ca2061bb443d2227)

3 years ago osd: fix partial recovery become whole object recovery after restart osd 43513/head
Jianwei Zhang [Mon, 13 Sep 2021 10:13:18 +0000 (18:13 +0800)]
osd: fix partial recovery become whole object recovery after restart osd

support SERVER_OCTOPUS feature for pg_missing_item::encode()

Fixes: https://tracker.ceph.com/issues/52583
Signed-off-by: Jianwei Zhang <jianwei1216@qq.com>
(cherry picked from commit dcdb188b6f577551fb377ba34145419f81322b03)

3 years ago os/bluestore: list obj which equals to pend 43512/head
Kefu Chai [Fri, 24 Sep 2021 15:33:03 +0000 (23:33 +0800)]
os/bluestore: list obj which equals to pend

otherwise we could have failures like

scrub : stat mismatch, got 3/4 objects, 1/2 clones, 3/4 dirty, 3/4 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 49/56 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes."

where the numbers of scrubbed object, clones, dirty and omap are always
less than the total number of corresponding numbers, if the PG contains
object(s) whose hash happens to be 0xffffffff.

in this change, if the calculated hash of the upper bound is greater
than the maximum possible number represented by uint32_t, in addition to
setting the hash of the upper bound hobj to 0xffffffff, we also set the
nspace of hobj of the upper bound to "\xff", so that the upper bound
is greater than an hobj whose hash happens to be 0xffffffff. please note,
the nspace of "\xff" is not an ascii string, so it's not likely to be
less than a real-world nspace of an hobj.

with this new *greater* upper bound, we are able to include the previous
missing hobj when listing the objects in a PG. so the scrub won't be
annoyed when the number of objects does not match.

Fixes: https://tracker.ceph.com/issues/52705
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit ffab13bcd9006c1f961a24b8016df9d1fe06ba1d)
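The upper-bound reasoning in this commit message can be illustrated with a deliberately simplified Python model. It assumes an object key is just a `(hash, nspace)` tuple compared lexicographically and that listing uses a half-open range `[start, end)`; the real hobject_t ordering in BlueStore has more fields and is not reproduced here. The names `upper_bound` and `list_range` are hypothetical.

```python
MAX_HASH = 0xFFFFFFFF  # largest value representable by uint32_t


def upper_bound(next_hash, patched):
    """Clamp an overflowing upper-bound hash to MAX_HASH.

    With patched=True the nspace is also set to b'\\xff', so the bound sorts
    after ordinary objects whose hash is exactly MAX_HASH (toy model only).
    """
    if next_hash > MAX_HASH:
        return (MAX_HASH, b'\xff') if patched else (MAX_HASH, b'')
    return (next_hash, b'')


def list_range(objects, start, end):
    """List objects in the half-open key range [start, end)."""
    return [o for o in sorted(objects) if start <= o < end]


objects = [(0x10, b''), (MAX_HASH, b'')]
start = (0, b'')
before_fix = list_range(objects, start, upper_bound(MAX_HASH + 1, patched=False))
after_fix = list_range(objects, start, upper_bound(MAX_HASH + 1, patched=True))
```

In this model `before_fix` omits the object whose hash is 0xffffffff because it equals the exclusive bound, while `after_fix` includes it, which mirrors the stat mismatch the scrub reported.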

3 years ago os/bluestore: use scope_guard to log latency
Kefu Chai [Wed, 22 Sep 2021 16:42:33 +0000 (00:42 +0800)]
os/bluestore: use scope_guard to log latency

simpler this way, and avoid using `goto`.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 715a83822ebc1a3d102d1ec13323b69db0600719)

3 years ago msg/async/ProtocolV2: replace ltt_recv_stamp with recv_stamp 43511/head
Dongdong Tao [Tue, 28 Sep 2021 06:40:43 +0000 (14:40 +0800)]
msg/async/ProtocolV2: replace ltt_recv_stamp with recv_stamp

Fixes: https://tracker.ceph.com/issues/52739
Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
(cherry picked from commit 1b1a91c31ba6078caff045c499b8737e0068460f)

3 years agomsg/async/ProtocolV2: Set the recv_stamp at the beginning of receiving a message...
taodd [Sat, 25 Sep 2021 03:56:02 +0000 (11:56 +0800)]
msg/async/ProtocolV2: Set the recv_stamp at the beginning of receiving a message instead of after receiving.

Fixes: https://tracker.ceph.com/issues/52739
Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
(cherry picked from commit 5ca30f396bface2a8e95a0efb1b97f8c1b64de1c)

3 years agoMerge pull request #43368 from tchaikov/pacific-pr-39602
Yuri Weinstein [Tue, 12 Oct 2021 12:44:53 +0000 (05:44 -0700)]
Merge pull request #43368 from tchaikov/pacific-pr-39602

pacific: mgr/influx: use "N/A" for unknown hostname

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43351 from rhcs-dashboard/wip-52772-pacific
Yuri Weinstein [Tue, 12 Oct 2021 12:44:18 +0000 (05:44 -0700)]
Merge pull request #43351 from rhcs-dashboard/wip-52772-pacific

pacific: qa/mgr/dashboard: add extra wait to test

Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #43347 from rhcs-dashboard/wip-52763-pacific
Yuri Weinstein [Tue, 12 Oct 2021 12:43:31 +0000 (05:43 -0700)]
Merge pull request #43347 from rhcs-dashboard/wip-52763-pacific

pacific: mgr/dashboard: Move force maintenance test to the workflow test suite

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agoMerge pull request #43167 from ktdreyer/pacific-52610-cmake-thread-libs-init
Yuri Weinstein [Tue, 12 Oct 2021 12:41:57 +0000 (05:41 -0700)]
Merge pull request #43167 from ktdreyer/pacific-52610-cmake-thread-libs-init

pacific: cmake: link Threads::Threads instead of CMAKE_THREAD_LIBS_INIT

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43199 from vshankar/wip-52627
Yuri Weinstein [Fri, 8 Oct 2021 13:34:17 +0000 (06:34 -0700)]
Merge pull request #43199 from vshankar/wip-52627

pacific: mgr/mirroring: remove unnecessary fs_name arg from daemon status command

Reviewed-by: Xiubo Li <xiubli@redhat.com>
3 years agoMerge pull request #43198 from vshankar/wip-52444
Yuri Weinstein [Fri, 8 Oct 2021 13:33:42 +0000 (06:33 -0700)]
Merge pull request #43198 from vshankar/wip-52444

pacific: cephfs-mirror: shutdown ClusterWatcher on termination

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
3 years agoMerge pull request #43148 from lxbsz/fair_mutex
Yuri Weinstein [Fri, 8 Oct 2021 13:32:12 +0000 (06:32 -0700)]
Merge pull request #43148 from lxbsz/fair_mutex

pacific: mds: switch mds_lock to fair mutex to fix the slow performance issue

Reviewed-by: Jeff Layton <jlayton@redhat.com>
3 years agoMetricCollector.h: Add check to prevent mgr from crashing 43445/head
Aswin Toni [Fri, 1 Oct 2021 14:12:22 +0000 (16:12 +0200)]
MetricCollector.h: Add check to prevent mgr from crashing

Fixes: https://tracker.ceph.com/issues/52801
Signed-off-by: Aswin Toni <aswin.toni@cern.ch>
(cherry picked from commit 9a05872fdd499575961ee1a8d188d19054841eb8)

3 years agoqa/mgr/dashboard/test_pool: don't check HEALTH_OK 43440/head
Ernesto Puerta [Wed, 22 Sep 2021 12:25:44 +0000 (14:25 +0200)]
qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Fixes: https://tracker.ceph.com/issues/48845
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2283cb068b82033b14587c7bac6a28440221dcd8)

3 years agoqa/suites/rados: add backfill_toofull test
Mykola Golub [Thu, 9 Sep 2021 11:44:25 +0000 (14:44 +0300)]
qa/suites/rados: add backfill_toofull test

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 76743e005866664795e9240460734b31108824e2)

3 years agoqa/tasks/ceph_manager: fix assertion
Mykola Golub [Sun, 23 May 2021 08:55:33 +0000 (11:55 +0300)]
qa/tasks/ceph_manager: fix assertion

The osd may be 0.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit e0a926a2c18d76225fd4d4051bc19b9a1917b932)
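
A minimal sketch of the kind of pitfall this fix points at, assuming the original assertion tested the truthiness of an OSD id. The function names and the candidate list are hypothetical; the actual code in qa/tasks/ceph_manager is not shown in this log.

```python
def pick_osd_truthy(candidates):
    """Hypothetical buggy variant: rejects OSD id 0 because 0 is falsy."""
    osd = candidates[0] if candidates else None
    assert osd, "no osd found"
    return osd

def pick_osd(candidates):
    """Hypothetical fixed variant: only None means 'no osd'."""
    osd = candidates[0] if candidates else None
    assert osd is not None, "no osd found"
    return osd
```

With `candidates = [0]`, the first variant raises `AssertionError` even though osd.0 is a valid choice, while the second returns 0.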

3 years agoosd: re-cache peer_bytes on every peering state activate
Mykola Golub [Mon, 30 Aug 2021 06:58:04 +0000 (07:58 +0100)]
osd: re-cache peer_bytes on every peering state activate

peer_bytes is used for backfill reservation request and may be
reset if backfill is interrupted, and we want it set back before
continuing backfill and re-sending the reservation request.

Fixes: https://tracker.ceph.com/issues/52448
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit bdfdf96d2f6c3cf7e5595ae5b8238fd4c0b3c6bc)

3 years agoMerge pull request #43348 from cfsnyder/wip-52350-pacific
Yuri Weinstein [Tue, 5 Oct 2021 15:00:24 +0000 (08:00 -0700)]
Merge pull request #43348 from cfsnyder/wip-52350-pacific

pacific: rgw: fix sts memory leak

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #42643 from cfsnyder/wip-51803-pacific
Yuri Weinstein [Tue, 5 Oct 2021 14:59:37 +0000 (07:59 -0700)]
Merge pull request #42643 from cfsnyder/wip-51803-pacific

pacific: rgw/notifications: send correct size in case of delete marker creation

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoqa/tasks/mgr: skip test_diskprediction_local on python>=3.8 43421/head
Kefu Chai [Wed, 7 Apr 2021 05:38:27 +0000 (13:38 +0800)]
qa/tasks/mgr: skip test_diskprediction_local on python>=3.8

query the python version before trying to test diskprediction_local

Fixes: https://tracker.ceph.com/issues/50196
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 39b2b5edc008900d531be95ece1ce75a1e036914)
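
The version check could look roughly like the sketch below. It assumes the Python version is available to the test driver as a dotted string such as "3.8.10"; that format and both function names are assumptions for illustration, not the actual selftest command output or test code.

```python
def parse_version(text):
    """Turn a dotted version string into a (major, minor) tuple of ints."""
    return tuple(int(part) for part in text.strip().split(".")[:2])

def should_skip_diskprediction_local(version_text):
    """Skip the test when the node runs python 3.8 or newer."""
    return parse_version(version_text) >= (3, 8)
```

Comparing integer tuples rather than strings keeps versions such as 3.10 ordered after 3.8.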

3 years agomgr/selftest: add a command for querying python version
Kefu Chai [Wed, 7 Apr 2021 06:40:05 +0000 (14:40 +0800)]
mgr/selftest: add a command for querying python version

so the test driver can skip certain tests based on the version of python
runtime on the test node

Fixes: https://tracker.ceph.com/issues/50196
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 91bc0e54ab816fca12a08817c261bbbf65606726)

3 years agoosd/OSD: mkfs need wait for transaction completely finish 43417/head
Chen Fan [Wed, 9 Jun 2021 05:29:03 +0000 (13:29 +0800)]
osd/OSD: mkfs need wait for transaction completely finish

when running ceph-osd mkfs, the block data is sometimes written
incompletely by the time the ceph-osd process exits. we need to
wait for the transaction to complete.

Signed-off-by: Chen Fan <fan.chen@easystack.cn>
(cherry picked from commit 0ffadad3a83b3ca634d7d58a80c84d1d8761e2ea)

3 years agoMerge pull request #43235 from MrFreezeex/wip-51839-pacific
Yuri Weinstein [Mon, 4 Oct 2021 15:18:03 +0000 (08:18 -0700)]
Merge pull request #43235 from MrFreezeex/wip-51839-pacific

pacific: ceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>
3 years agoMerge pull request #43264 from cfsnyder/wip-52332-pacific
Yuri Weinstein [Fri, 1 Oct 2021 15:17:47 +0000 (08:17 -0700)]
Merge pull request #43264 from cfsnyder/wip-52332-pacific

pacific: cmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/

Reviewed-by: Michael Fritch <mfritch@suse.com>
3 years agoMerge pull request #43099 from cfsnyder/wip-51952-pacific
Yuri Weinstein [Thu, 30 Sep 2021 22:53:16 +0000 (15:53 -0700)]
Merge pull request #43099 from cfsnyder/wip-51952-pacific

pacific: osd: fix to recover adjacent clone when set_chunk is called

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #43306 from myoungwon/pacific-backport-52322
Yuri Weinstein [Thu, 30 Sep 2021 22:52:46 +0000 (15:52 -0700)]
Merge pull request #43306 from myoungwon/pacific-backport-52322

pacific: osd: fix to allow inc manifest leaked

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agomgr/influx: use "N/A" for unknown hostname 43368/head
Kefu Chai [Mon, 22 Feb 2021 05:53:42 +0000 (13:53 +0800)]
mgr/influx: use "N/A" for unknown hostname

in theory, there is a chance that get_metadata() returns None, so let's use
"N/A" in this case.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e457ca50011f70cf01a62323998af233a484f338)
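
A minimal sketch of the fallback, assuming the metadata is a dict with a `hostname` key when present. The `get_metadata` callable passed in here is a stand-in for the module's method, and `hostname_for` is a hypothetical helper, not the influx module's code; falling back to "N/A" when the key itself is missing is an extra assumption of this sketch.

```python
def hostname_for(get_metadata, svc_type, svc_id):
    """Return the daemon's hostname, or "N/A" when metadata is unavailable."""
    metadata = get_metadata(svc_type, svc_id)
    if metadata is None:
        return "N/A"
    # Assumption of this sketch: also tolerate a missing 'hostname' key.
    return metadata.get("hostname", "N/A")
```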

3 years agomgr/progress: optimize global recovery module 43353/head
Kamoltat [Mon, 5 Oct 2020 09:38:35 +0000 (09:38 +0000)]
mgr/progress: optimize global recovery module

Instead of fetching `pg_stats` from the python
part of the manager module, we filter out the pgs
that are in active + clean state in ActivePyModules.cc,
then pass these pgs along with `reported_epoch` and
the `total_num_pgs` of the cluster to the global recovery
module.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit fa92db1b37e5633e89fc39a4653c39973bf23867)

3 years agomgr/test_progress.py: Delay recover in test_progress
Kamoltat [Tue, 13 Jul 2021 19:14:43 +0000 (19:14 +0000)]
mgr/test_progress.py: Delay recover in test_progress

Changes some of the tests in teuthology to make
them more deterministic.
Using:

`ceph osd set norecover` and
`ceph osd set nobackfill` when marking osds in
or out. This delays the recovery and makes
sure the test cases get the chance to check
that events are actually popping up in
the progress module.

took out test_osd_cannot_recover from
tasks/mgr/test_progress.py since it is no longer
a relevant test case: recovery gets
triggered regardless of whether the pg is unmoved.

Ignoring `OSDMAP_FLAGS` in teuthology
because we are using norecover and nobackfill
to delay the recovery process, which
creates a health warning that would otherwise fail the
teuthology test.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 5f33f2f6e0609b452db47b341aaf6d5889917563)

3 years agopybind/mgr/progress: introduce 5 second sleep interval
Kamoltat [Tue, 13 Jul 2021 19:06:44 +0000 (19:06 +0000)]
pybind/mgr/progress: introduce 5 second sleep interval

The current progress module only checks pg stats
and osdmap when it is notified by the cluster.
However, this is expensive in a large cluster
with many pools and osds, so we
change it to only check both pg stats and osdmap
every 5 seconds.

in the function _osd_in_out() we now calculate
`is_relocated` by: old_osds != new_osds such that
it does not matter if the difference between osds
is positive or negative.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 4504749b81f9cb11d92d5f280565aff3f243adf3)
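
The two ideas above can be sketched as follows, assuming `old_osds` and `new_osds` are collections of OSD ids compared as sets, and that the periodic check is a plain loop with a sleep. The `serve` loop, its parameters, and the set conversion are assumptions for illustration; the progress module's actual loop is not shown in this log.

```python
import time

def is_relocated(old_osds, new_osds):
    """True when the OSD sets differ, whether OSDs were added or removed."""
    return set(old_osds) != set(new_osds)

def serve(check, should_stop, sleep=time.sleep, interval=5):
    """Run check() once per interval instead of on every notification."""
    while not should_stop():
        check()
        sleep(interval)

# A one-directional set difference would miss an OSD being added:
old_osds, new_osds = {1, 2}, {1, 2, 3}
one_way = bool(old_osds - new_osds)       # False: nothing was removed
both_ways = is_relocated(old_osds, new_osds)  # True: the sets differ
```

Passing `sleep` as a parameter keeps the loop testable without actually waiting five seconds.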

3 years agopybind/mgr/progress/test_progress.py: fix type of reported_epoch
Neha Ojha [Wed, 30 Jun 2021 19:50:00 +0000 (19:50 +0000)]
pybind/mgr/progress/test_progress.py: fix type of reported_epoch

because reported_epoch is an int, not a string

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit a8f3a0eb83653ce6b50aaccd43bdc456e6394484)

3 years agopybind/mgr/progress/module.py: no need to cast reported_epoch and _start_epoch
Neha Ojha [Wed, 30 Jun 2021 19:38:15 +0000 (19:38 +0000)]
pybind/mgr/progress/module.py: no need to cast reported_epoch and _start_epoch

reported_epoch is an int, see 22128e3de697f3fdf66faf3fe3b701a3a599968f
and _start_epoch is also an int, see type annotations in
2af2afa5e9191115bb6f0b36194830ffb91938bf

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit da268faed8e7a3eacb68b1c92855dc3a43225961)
Signed-off-by: Kamoltat <ksirivad@redhat.com>