mgr/dashboard: introduce memory and cpu usage for daemons
Introducing two new columns, memory and CPU usage, in the Cluster->Host->Daemons table.
Fixes: https://tracker.ceph.com/issues/55218
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Co-authored-by: Aashish Sharma <aasharma@redhat.com>
Conflicts:
src/pybind/mgr/cephadm/module.py
- _process_ls_output() doesn't exist in Pacific because the agent hasn't been backported yet, so
the equivalent changes need to be made in serve.py instead (see the sketch below).
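For reference, a minimal sketch of the serve.py side of the change, assuming the entries returned by `cephadm ls` carry per-daemon memory/CPU fields; the key and attribute names are illustrative and not necessarily the exact Pacific code:

```python
def update_daemon_usage(daemon, ls_entry: dict) -> None:
    """Copy per-daemon resource usage from a `cephadm ls` entry onto the
    daemon description object the dashboard later reads.
    Key and attribute names here are illustrative."""
    mem = ls_entry.get('memory_usage')    # resident memory in bytes, if reported
    cpu = ls_entry.get('cpu_percentage')  # e.g. '2.5%', if reported
    if mem is not None:
        daemon.memory_usage = int(mem)
    if cpu is not None:
        daemon.cpu_percentage = cpu
```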
In `master` the milestone step exits and prevents the remaining steps from running. I previously tried the `continue-on-error` flag, but it didn't work, so let's try putting that step at the end.
Laura Flores [Mon, 16 May 2022 22:59:42 +0000 (17:59 -0500)]
qa/suites/rados/thrash-erasure-code-big/thrashers: add `osd max backfills` setting to mapgap and pggrow
All `rados/thrash-erasure-code-big` tests that die due to the “wait_for_recovery” timeout have one thing in common: They contain either `thrashers/pggrow` or `thrashers/mapgap`.
The difference between pggrow and mapgap and the non-offending thrashers (default, careful, fastread, and morepggrow) is that pggrow and mapgap lack an override setting for `osd max backfills`, the maximum number of backfill operations allowed to or from a single OSD. The higher the number, the quicker the recovery. By default this value is 1; the non-offending thrashers override it in their .yaml files with a value greater than 1, while pggrow and mapgap do not.
The mclock op scheduler is known to override `osd max backfills` with a high value, but all of the thrash-erasure-code-big thrashers have their op queue set to “debug_random”, which chooses randomly between op queues (the debug_random op queue is set to override the default mclock_scheduler in qa/config/rados.yaml). So, coupled with the “debug_random” op queue, the low `osd max backfills` setting causes some tests to time out during recovery.
WITHOUT `osd max backfills`, as they are now, “mapgap” and “pggrow” tests die due to timed-out recovery about 17/100 times, as seen here with a pggrow test: http://pulpito.front.sepia.ceph.com/lflores-2022-05-18_14:24:29-rados:thrash-erasure-code-big-master-distro-default-smithi/
WITH `osd max backfills` specified, as I have suggested in this PR, 99/100 tests passed, with one test failing for a different reason:
http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_22:40:27-rados:thrash-erasure-code-big-master-distro-default-smithi/
I also scheduled 145 tests WITH `osd max backfills` that are a mix of pggrow and mapgap thrashers. 144/145 tests passed, with one test failing for a different reason. http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_15:27:54-rados:thrash-erasure-code-big-master-distro-default-smithi/
Fixes: https://tracker.ceph.com/issues/51076
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 40062676c2ceed49b9fa147127ffa83ba6118e2a)
Adam King [Fri, 1 Apr 2022 12:20:28 +0000 (08:20 -0400)]
mgr/cephadm: make UpgradeState from_json a bit safer
This way, downgrades to whatever version this change lands in
(and any later one) shouldn't break just because new parameters
were added to UpgradeState. We can't do much about downgrades
to versions older than this one, but this should help in the
future.
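A minimal sketch of the idea, with placeholder fields (not the actual cephadm class): drop any keys this version doesn't know about before constructing the object, so state persisted by a newer release doesn't break from_json here.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional

@dataclass
class UpgradeState:
    # illustrative fields only; the real UpgradeState has more
    target_name: str = ''
    progress_id: str = ''
    paused: bool = False

    @classmethod
    def from_json(cls, data: Optional[Dict[str, Any]]) -> Optional['UpgradeState']:
        if not data:
            return None
        # keep only the keys this version knows about, so extra parameters
        # written by a newer mgr are ignored instead of raising TypeError
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```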
Adam King [Mon, 28 Mar 2022 16:10:15 +0000 (12:10 -0400)]
mgr/cephadm: split _do_upgrade into sub functions
This function was around 500 lines and difficult to work
with. Splitting it into sub functions should hopefully make
it a bit easier to understand and make changes to.
Volker Theile [Tue, 10 May 2022 13:25:54 +0000 (15:25 +0200)]
cephadm: prometheus: The generatorURL in alerts is only using hostname
Prometheus is currently using only the hostname in the 'generatorURL' of an alert, which causes issues when clicking the URL in the Ceph Dashboard or elsewhere, because in most cases the bare hostname of the node running the Prometheus container is not resolvable.
To fix that, the command line argument '--web.external-url' must be appended in the systemd unit file of the Prometheus container, e.g. '--web.external-url http://foo.bar:9095', where an FQDN hostname is used.
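A minimal sketch of how such an argument could be assembled, assuming the host's FQDN is what should end up in the URL (the helper name and port default are illustrative, not the actual cephadm code):

```python
import socket

def prometheus_external_url_args(port: int = 9095) -> list:
    """Illustrative helper: build the extra Prometheus command line
    arguments so the generatorURL in alerts carries a resolvable FQDN
    instead of the bare container hostname."""
    fqdn = socket.getfqdn()  # e.g. 'foo.bar' instead of just 'foo'
    return ['--web.external-url', f'http://{fqdn}:{port}']
```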
Volker Theile [Mon, 9 May 2022 13:31:15 +0000 (15:31 +0200)]
mgr/dashboard: Creating and editing Prometheus AlertManager silences is buggy
When creating a new monitoring silence the form is pre-filled with the wrong alert data. The alert data from the very first object in the API response is always used instead of the alert identified by the 'fingerprint' property.
The same problem applies to editing silences: the selected silence is not edited; instead the first one in the returned API response is always used rather than the one with the specified 'id' property.
The main problem with the original implementation is that the Prometheus Alertmanager API endpoints /api/v1/[alerts/silences] do not support querying. To fix that, filtering is done in the frontend.
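Since the endpoints cannot be queried for a single object, the client has to select the matching entry itself. A rough sketch of that selection, written in Python for brevity (the actual fix lives in the TypeScript frontend; function names are illustrative):

```python
from typing import Any, Dict, List, Optional

def find_alert(alerts: List[Dict[str, Any]], fingerprint: str) -> Optional[Dict[str, Any]]:
    """Pick the alert matching the given fingerprint instead of blindly
    using the first element of the API response."""
    return next((a for a in alerts if a.get('fingerprint') == fingerprint), None)

def find_silence(silences: List[Dict[str, Any]], silence_id: str) -> Optional[Dict[str, Any]]:
    """Same idea for silences, matched on their 'id' property."""
    return next((s for s in silences if s.get('id') == silence_id), None)
```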
Zac Dover [Wed, 18 May 2022 10:36:53 +0000 (20:36 +1000)]
doc/start: s/3/three/ in intro.rst
I'm changing "3" to "three" for two reasons:
1. It's correct.
2. This allows me to test backports into Octopus, Pacific, and Quincy.
I am particularly interested in seeing what happens when I attempt
the backport into Octopus, because backports into Octopus have
failed before. This will provide me with another data point.
Adam King [Thu, 18 Nov 2021 20:22:39 +0000 (15:22 -0500)]
mgr/cephadm: re-use old ip when re-adding hosts if necessary
When a host is re-added without an explicit ip, we can default to the old
ip we had stored for the host rather than keeping the loopback address or
throwing an exception. We only want to actually raise an error when the
only remaining options are to error out or to use a resolved loopback address.
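A sketch of the intended behaviour, with illustrative names (not the actual cephadm logic): prefer an explicit ip, then a non-loopback resolved address, then the previously stored ip, and only raise when nothing usable is left.

```python
import socket
from typing import Optional

def resolve_host_addr(hostname: str, explicit_addr: Optional[str],
                      old_addr: Optional[str]) -> str:
    if explicit_addr:
        return explicit_addr
    resolved = socket.gethostbyname(hostname)
    if not resolved.startswith('127.'):
        return resolved
    if old_addr and not old_addr.startswith('127.'):
        # host re-added without an ip: fall back to what we had stored
        return old_addr
    raise RuntimeError(
        f'Cannot resolve a usable ip for {hostname}; got loopback {resolved}')
```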
Redouane Kachach [Tue, 17 May 2022 15:26:39 +0000 (17:26 +0200)]
mgr/cephadm: stripping out / from the end of the url
Fixes: https://tracker.ceph.com/issues/55638
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 17032f6be22e9efc3e199d7e35091025bfaae965)
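The normalization itself is small; a one-function sketch of the idea (the function name is illustrative):

```python
def normalize_url(url: str) -> str:
    # drop any trailing '/' so URLs built from this base don't end up with a
    # double slash; note rstrip removes all trailing slashes, not just one
    return url.rstrip('/')
```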
mgr/cephadm: do not add _admin label when no-minimize-config is provided
Fixes: https://tracker.ceph.com/issues/52727
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 01c8999d0354a71a7ef8526aab9b39e30d67c1bb)
Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]
cephadm: avoid crashing on expected non-zero exit
- Avoid crashing when a call out to an external program returns a non-zero
exit status that is expected.
Some programs communicate information other than error/no error through
their exit status; `systemctl status`, for example, returns different exit
codes depending on the actual state of the units in question.
In cases where this is expected, crashing with a RuntimeError exception
is inappropriate and should be avoided (see the sketch below).
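A hypothetical sketch of the pattern (parameter names are illustrative, not the real cephadm API): the caller declares which exit codes are expected, and only unexpected ones raise.

```python
import subprocess
from typing import Iterable, List

def call(cmd: List[str], ok_return_codes: Iterable[int] = (0,)) -> subprocess.CompletedProcess:
    res = subprocess.run(cmd, capture_output=True, text=True)
    if res.returncode not in ok_return_codes:
        raise RuntimeError(
            f'{" ".join(cmd)} failed unexpectedly (rc={res.returncode}): {res.stderr}')
    return res

# `systemctl status` encodes unit state in its exit code (e.g. 3 for an
# inactive unit), so such codes should be accepted rather than treated as fatal:
# call(['systemctl', 'status', 'ceph-osd@0.service'], ok_return_codes=(0, 3, 4))
```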
Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
(cherry picked from commit a02be6f22fa18094cd8758700ab74581b6ce1701)
Cory Snyder [Tue, 17 May 2022 09:24:53 +0000 (05:24 -0400)]
mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex
The mgr process can deadlock if the GIL is held while attempting to lock a mutex.
Relevant regressions were introduced in commit a356bac. This fixes those regressions
and also cleans up some unnecessary yielding of the GIL.
ceph-volume/tests: reject loop devices in lvm.conf
The current task doesn't work (typo?).
Otherwise api/lvm.py can't work properly: functions such as
`get_single_lv()` and many others don't return the expected results.
Indeed, lvm is confused by the nvme_loop setup.
This adds support for complex OSD creation with the command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.
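A purely illustrative sketch of the idea (not the exact CLI syntax or mgr code): free-form key=value tokens from the command line are turned into keyword arguments that could be handed to DriveGroupSpec(**kwargs).

```python
from typing import Any, Dict, List

def parse_spec_args(tokens: List[str]) -> Dict[str, Any]:
    kwargs: Dict[str, Any] = {}
    for token in tokens:
        key, _, value = token.partition('=')
        if value.lower() in ('true', 'false'):   # booleans arrive as strings
            kwargs[key] = value.lower() == 'true'
        elif ',' in value:                       # simple comma-separated lists
            kwargs[key] = value.split(',')
        else:
            kwargs[key] = value
    return kwargs

# e.g. ['data_devices=/dev/sdb,/dev/sdc', 'encrypted=true'] ->
# {'data_devices': ['/dev/sdb', '/dev/sdc'], 'encrypted': True}
```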
Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.
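A sketch of that check, with illustrative reason strings (not an authoritative list): a device flagged unavailable by ceph-volume is only skipped if it was rejected for a reason we cannot work around.

```python
from typing import Iterable

# rejection reasons that still allow reuse as a DB device during OSD
# replacement; the exact strings here are examples only
REUSABLE_REASONS = {'LVM detected', 'locked'}

def usable_for_replacement(rejected_reasons: Iterable[str]) -> bool:
    return all(reason in REUSABLE_REASONS for reason in rejected_reasons)
```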
Sage Weil [Thu, 12 Aug 2021 15:12:59 +0000 (11:12 -0400)]
ceph-volume: activate: try simple mode too
This is of dubious value to cephadm since /etc/ceph/osd/* won't be
populated inside of a container. However, it makes sense from a purely
ceph-volume perspective.
Sage Weil [Thu, 5 Aug 2021 16:02:22 +0000 (12:02 -0400)]
ceph-volume: lvm activate: infer bluestore or filestore
No need to require --filestore and/or --bluestore args since we can tell
from the LV tags which one it is.
We can't drop the arguments without breaking existing users, though, so
redefine them to mean *force* bluestore or filestore activation (even
though this will error out if the tags don't match).
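A sketch of the inference, assuming the LV tags expose a ceph.type value (the tag names and mapping are illustrative, not the exact ceph-volume logic):

```python
from typing import Mapping

def infer_objectstore(lv_tags: Mapping[str, str]) -> str:
    if lv_tags.get('ceph.type') == 'block':
        return 'bluestore'
    if lv_tags.get('ceph.type') in ('data', 'journal'):
        return 'filestore'
    raise RuntimeError('unable to infer objectstore from LV tags')
```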
dparmar18 [Fri, 25 Mar 2022 08:18:54 +0000 (13:48 +0530)]
doc/cephfs/add-remove-mds: added cephadm note, refined "Adding an MDS"
Description:
1) Add a note about using cephadm for setting up the cluster and MDS(s);
also mention the use of the ceph orchestrator if one needs to set up
MDS(s) manually.
2) Change the term `data point` to `directory` in point 1 under the
"Adding an MDS" section for better clarity.
Adam Kupczyk [Fri, 29 Apr 2022 21:32:43 +0000 (23:32 +0200)]
kv/RocksDBStore: Remove feature to make WholeSpaceIterator based on bounded iterator
The iterator-bounding feature was introduced to limit RocksDB iterators so that they
are less likely to traverse over tombstones.
It is used when listing keys in a fixed range, for example the OMAP keys of a specific object.
Extending this logic to WholeSpaceIterator is problematic,
since the prefix must be taken into account.
Fixes: https://tracker.ceph.com/issues/55444
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
Adds a precondition to RocksDBStore::get_cf_handle(string, IteratorBounds)
to avoid duplicating the logic of its only caller (RocksDBStore::get_iterator).
Assertions will fail if the preconditions are not met.
bluestore: add config option to allow rocksdb iterator bounds to be disabled
Add osd_rocksdb_iterator_bounds_enabled config option to allow rocksdb iterator bounds to be disabled.
Also includes minor refactoring to shorten code associated with IteratorBounds initialization in bluestore.
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit ca3ccd9)
Conflicts:
src/common/options/osd.yaml.in
Cherry-pick notes:
- Conflicts due to option definition in common/options.cc in Pacific vs. common/options/osd.yaml.in in later releases
bluestore: set upper and lower bounds on rocksdb omap iterators
Limits RocksDB omap Seek operations to the relevant key range of the object's omap.
This prevents RocksDB from unnecessarily iterating over delete range tombstones in
irrelevant omap CF shards. Avoids extreme performance degradation commonly caused
by tombstones generated from RGW bucket resharding cleanup. Also prefers CFIteratorImpl
over ShardMergeIteratorImpl when we can determine that all keys within the specified
IteratorBounds must be in a single CF.