git.apps.os.sepia.ceph.com Git - ceph.git/log

commit | commitdiff | tree

Greg Farnum [Tue, 30 Nov 2021 18:27:54 +0000 (18:27 +0000)]

test: test OSDMap::is_blocklisted in unit tests

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 952af3388466d530c6268ff9b98bcdd725cce738)

Conflicts:
src/test/osd/TestOSDMap.cc
Signed-off-by: Greg Farnum <gfarnum@redhat.com>

commit | commitdiff | tree

Greg Farnum [Tue, 30 Nov 2021 18:29:46 +0000 (18:29 +0000)]

osd: Check range_blocklist in is_blocklisted(): we actually blocklist ranges

Carry a parallel map from cidr addresses to a new
range_bits class (stored entirely as ephemeral state) so that we
don't need to re-compute masks and bit mappings too often, and to
separate out the unpleasant ipv6 bit mapping logic. Then check
against those with range_bits::matches() the same way we check
for equality on specific-entity matches. Nice and simple loops!

Fixes: https://tracker.ceph.com/issues/53050
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 3e26209cbc61cb7fbd4e3f310a28c4cd0f6bb287)
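
The matching described above can be illustrated in Python. This is only a sketch of the idea, not the C++ code from the commit: `RangeBits` is a hypothetical stand-in for `range_bits`, and the `is_blocklisted` helper and its arguments are assumptions made for illustration. The mask and masked base address are computed once per range, and the lookup then loops over the ranges after the exact-match check.

```python
import ipaddress
from typing import Iterable, Set


class RangeBits:
    """Hypothetical Python stand-in for the commit's range_bits helper."""

    def __init__(self, cidr: str) -> None:
        network = ipaddress.ip_network(cidr, strict=False)
        self.version = network.version
        # Compute the mask and the masked base address once, up front.
        self.mask = int(network.netmask)
        self.base = int(network.network_address) & self.mask

    def matches(self, addr: str) -> bool:
        ip = ipaddress.ip_address(addr)
        if ip.version != self.version:
            return False  # an IPv4 range never matches an IPv6 address
        return (int(ip) & self.mask) == self.base


def is_blocklisted(addr: str, exact: Set[str], ranges: Iterable[RangeBits]) -> bool:
    # Exact entries first (set lookup), then a simple loop over the ranges.
    if addr in exact:
        return True
    return any(bits.matches(addr) for bits in ranges)
```

For example, `is_blocklisted("192.168.1.77", set(), [RangeBits("192.168.1.0/24")])` returns `True`.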

commit | commitdiff | tree

Greg Farnum [Tue, 16 Nov 2021 18:41:08 +0000 (18:41 +0000)]

mon: dump range blocklist when dumping regular blocklist

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit dc09905f1e95201ba8257b70c60c9985eee6ffdb)

commit | commitdiff | tree

Greg Farnum [Tue, 2 Nov 2021 00:38:50 +0000 (00:38 +0000)]

osdmap: convert get_blocklist() to provide the entity/IP and range blocklists

Providing a non-range-aware blocklist accessor would just be
asking for trouble, so don't.

The ugly part of this is how the Objecter is currently just
throwing the range blocklist on the end of its own list. The in-tree
callers are okay with this, and I'd like to look at removing the
blocklist events API from librados entirely -- it exposes "OSD-only"
state to clients and, as evidenced by this patch series, is not
particularly stable.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 9c5e21a685b58e4be0360279d9d22efd513edab2)

commit | commitdiff | tree

Greg Farnum [Wed, 8 Dec 2021 21:32:58 +0000 (21:32 +0000)]

mon: take blocklist ranges as a subcommand, not implicitly from address format

I discovered in testing with CephFS that this tends to interpret client IPs
(which don't have ports, but do have nonces) as invalid ranges. So give it
a separate input keyword that has to be applied first.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 73a1f1b51e586ff7476ff4f4c1682abd0a317074)

commit | commitdiff | tree

Greg Farnum [Mon, 15 Nov 2021 20:06:50 +0000 (20:06 +0000)]

mon: check 'nonce' validity for cidr ranges

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 5c903e5b0a48f60dcf644f83478f97136d7dc56c)

commit | commitdiff | tree

Greg Farnum [Mon, 15 Nov 2021 20:42:35 +0000 (20:42 +0000)]

mon: trim range_blocklist alongside the regular one

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 4b08448131ff63213f65ac2c2454d53158663ca2)

commit | commitdiff | tree

Greg Farnum [Thu, 28 Oct 2021 23:04:23 +0000 (23:04 +0000)]

mon: osdmon: simplify maybe_rm_from_pending_blocklists

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 871427881a60f7a203d08373a1ae1e6db9e2976b)

commit | commitdiff | tree

Greg Farnum [Thu, 28 Oct 2021 22:34:40 +0000 (22:34 +0000)]

mon: osdmon: allow users to enter range blocklists.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 93617f7f4c6ba3463ab4c6e2df3cc2df9b00fc12)

Conflicts:
src/include/ceph_features.h
Signed-off-by: Greg Farnum <gfarnum@redhat.com>

commit | commitdiff | tree

Greg Farnum [Wed, 27 Oct 2021 21:06:37 +0000 (21:06 +0000)]

mon: osdmon: don't overwrite type for entity_addr_t which is a cidr range

Doing so makes it no longer a cidr range entity_addr_t.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 9a1a01f8814df175d2d2c7a81f701d161cb4bab8)

commit | commitdiff | tree

Greg Farnum [Thu, 28 Oct 2021 20:44:49 +0000 (20:44 +0000)]

mon: osdmon: extract blocklist manipulation functions into lambdas

I'm about to add new range blocklists that match the existing IP/entity
ones, and don't want to have separate update logic.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 37fbf121fa6ee76387f07b766bccab5e2b82bbc1)

commit | commitdiff | tree

Greg Farnum [Thu, 28 Oct 2021 22:00:27 +0000 (22:00 +0000)]

osdmap: store new range_blocklist, updated as we do the existing blocklist

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit c0b87d9aca6f61ffe726ce3407059c527b319cbe)

commit | commitdiff | tree

Greg Farnum [Mon, 25 Oct 2021 19:53:04 +0000 (19:53 +0000)]

msg: common: allow entity_addr_t to store a CIDR address range

This required very little change to the existing code. Use with care, because
existing code expects an IP address instead of a range, but it saves on
writing a new parser.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 8941450ff17336b0ed60947e365a8bffcc4a32b0)

commit | commitdiff | tree

Greg Farnum [Tue, 2 Nov 2021 00:34:34 +0000 (00:34 +0000)]

mds: Server: Simplify apply_blocklist and usage of the OSDMap's blocklist

This previously re-implemented a bunch of the OSDMap::is_blocklisted()
function, and wasn't actually any faster to run -- the list of new blocklists
may be smaller than the full set, but OSDMap::blocklist is an unordered_map
of constant lookup time so it shouldn't slow things down. More importantly,
this is much simpler, less likely to be buggy from duplicate code, and lets
the MDS off the hook for dealing with range blocklisting.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 79f7576401cc9d857f84396314d7476336c0e271)

commit | commitdiff | tree

Greg Farnum [Mon, 1 Nov 2021 23:52:53 +0000 (23:52 +0000)]

client: Simplify blocklist tracking and interface

I'm not sure if the blocklist events tracking in Client.cc was ever
the simplest way to track that state, but it definitely isn't now. We
can just hand our addr_vec to the OSDMap and ask it -- it handles
version compatibility issues and, happily, means the Client doesn't
need to learn to deal with ranges directly.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 5f57daedc9550aaeb8b55e2c8dc71b6f27372e84)

commit | commitdiff | tree

Ernesto Puerta [Tue, 31 May 2022 17:52:34 +0000 (19:52 +0200)]

Merge pull request #46448 from ceph/fix-triage-pacific

pacific: .github: continue on error and reorder milestone step

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Tue, 31 May 2022 17:16:13 +0000 (19:16 +0200)]

Merge pull request #46204 from rhcs-dashboard/wip-55570-pacific

pacific: mgr/dashboard: fix ssl cert validation for ingress service creation

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Mon, 18 Apr 2022 16:50:52 +0000 (18:50 +0200)]

.github/pr-triage: reorder milestone step

In `master` the milestone step exits and causes remaining tasks not to be run. I previously tried with the `continue-on-error` flag, but it didn't work, so let's try putting that step at the end.

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit d8c0229b90cc20e89f7037a72af8b5d41b6b0861)

commit | commitdiff | tree

Ernesto Puerta [Thu, 17 Mar 2022 19:53:31 +0000 (20:53 +0100)]

.github: continue on error

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit b6791ee09a49398cdef39faae5f2e72f43061d64)

commit | commitdiff | tree

Laura Flores [Sun, 29 May 2022 23:06:44 +0000 (18:06 -0500)]

Merge pull request #46391 from ljflores/wip-55745-pacific

commit | commitdiff | tree

Patrick Donnelly [Fri, 27 May 2022 12:29:05 +0000 (08:29 -0400)]

Merge PR #46336 into pacific

* refs/pull/46336/head:
	16.2.9
	mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex

commit | commitdiff | tree

Ernesto Puerta [Fri, 27 May 2022 10:55:16 +0000 (12:55 +0200)]

Merge pull request #46277 from votdev/wip-55642-pacific

pacific: mgr/dashboard: Creating and editing Prometheus AlertManager silences is buggy

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>

commit | commitdiff | tree

Ernesto Puerta [Fri, 27 May 2022 10:50:49 +0000 (12:50 +0200)]

Merge pull request #46379 from rhcs-dashboard/wip-55738-pacific

pacific: mgr/dashboard: form field validation icons overlap with other icons

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: nSedrickm <NOT@FOUND>

commit | commitdiff | tree

Ernesto Puerta [Fri, 27 May 2022 09:11:42 +0000 (11:11 +0200)]

Merge pull request #46343 from rhcs-dashboard/wip-55718-pacific

pacific: mgr/dashboard: customizable log-in page text/banner

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Fri, 27 May 2022 09:10:59 +0000 (11:10 +0200)]

Merge pull request #46228 from rhcs-dashboard/wip-55415-pacific

pacific: mgr/dashboard: fix wrong pg status processing

Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Thu, 26 May 2022 16:15:43 +0000 (18:15 +0200)]

Merge pull request #46322 from rhcs-dashboard/wip-55690-pacific

pacific: mgr/dashboard: unselect rows in datatables

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

commit | commitdiff | tree

Laura Flores [Mon, 16 May 2022 22:59:42 +0000 (17:59 -0500)]

qa/suites/rados/thrash-erasure-code-big/thrashers: add `osd max backfills` setting to mapgap and pggrow

All `rados/thrash-erasure-code-big` tests that die due to the “wait_for_recovery” timeout have one thing in common: They contain either `thrashers/pggrow` or `thrashers/mapgap`.

The difference between pggrow and mapgap vs. all other non-offending thrashers (default, careful, fastread, and morepggrow) is that they lack an override setting for `osd max backfills`. `osd max backfills` is the max number of backfill operations allowed to/from an OSD. The higher the number, the quicker the recovery. By default, this value is 1. On all of the non-offending thrashers (default, careful, fastread, and morepggrow), the default 1 value gets overridden in their .yaml files with a value > 1. This is not the case for pggrow and mapgap, however, as they lack an `osd max backfills` override setting.

The mclock op scheduler is known to override `osd max backfills` with a high value, but all of the thrash-erasure-code-big thrashers have their op queue set to “debug_random”, which chooses randomly between op queues (the debug_random op queue is set to override the default mclock_scheduler in qa/config/rados.yaml). So, coupled with the “debug_random” op queue, the low `osd max backfills` setting is causing some tests to time out in recovery.

WITHOUT `osd max backfills`, as they are now, “mapgap” and “pggrow” tests die due to timed-out recovery about 17/100 times, as seen here with a pggrow test: http://pulpito.front.sepia.ceph.com/lflores-2022-05-18_14:24:29-rados:thrash-erasure-code-big-master-distro-default-smithi/

WITH `osd max backfills` specified, as I have suggested in this PR, 99/100 tests passed, with one test failing for a different reason:
http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_22:40:27-rados:thrash-erasure-code-big-master-distro-default-smithi/

I also scheduled 145 tests WITH `osd max backfills` that are a mix of pggrow and mapgap thrashers. 144/145 tests passed, with one test failing for a different reason. http://pulpito.front.sepia.ceph.com/lflores-2022-05-17_15:27:54-rados:thrash-erasure-code-big-master-distro-default-smithi/

Fixes: https://tracker.ceph.com/issues/51076
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 40062676c2ceed49b9fa147127ffa83ba6118e2a)

commit | commitdiff | tree

Adam King [Wed, 25 May 2022 13:36:04 +0000 (09:36 -0400)]

Merge pull request #46359 from adk3798/pacific-staggered-upgrade

pacific: mgr/cephadm: staggered upgrade

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>

commit | commitdiff | tree

Adam King [Wed, 25 May 2022 13:33:09 +0000 (09:33 -0400)]

Merge pull request #45964 from adk3798/pacific-raw-osd

pacific: mgr/cephadm: Raw OSD Support

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Sarthak0702 [Wed, 11 May 2022 18:57:47 +0000 (00:27 +0530)]

mgr/dashboard: form field validation icons overlap with other icons

Signed-off-by: Sarthak0702 <sarthak.dev.0702@gmail.com>
(cherry picked from commit 0bd2d023026af737b1894f74a545f039a6ec2428)

commit | commitdiff | tree

Adam King [Mon, 23 May 2022 22:53:38 +0000 (18:53 -0400)]

Merge pull request #46352 from mgfritch/backport-46218-pacific

pacific: cephadm: prometheus: The generatorURL in alerts is only using hostname

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>

commit | commitdiff | tree

Adam King [Tue, 19 Apr 2022 17:20:45 +0000 (13:20 -0400)]

doc/cephadm: staggered upgrade docs

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 6a68def64eb720ef0eeace7c0d19c48cb1f6e5bb)

commit | commitdiff | tree

Adam King [Wed, 13 Apr 2022 04:36:02 +0000 (00:36 -0400)]

mgr/cephadm: unit test for staggered upgrade param validation

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 0a46fcb423133e662254ec1aad3704bcaf5e101b)

Conflicts:
src/pybind/mgr/cephadm/tests/test_upgrade.py

commit | commitdiff | tree

Adam King [Tue, 12 Apr 2022 16:39:26 +0000 (12:39 -0400)]

qa/suites/orch/cephadm: staggered upgrade test

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 791e1d20b363c5960e11263312293383e2748a9d)

commit | commitdiff | tree

Adam King [Fri, 1 Apr 2022 13:41:01 +0000 (09:41 -0400)]

mgr/cephadm: make use of new upgrade control parameters

Fixes: https://tracker.ceph.com/issues/54135
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit c1f3497b43bff6f7640161807dce01dc089ce405)

Conflicts:
src/pybind/mgr/cephadm/upgrade.py

commit | commitdiff | tree

Adam King [Fri, 1 Apr 2022 12:20:28 +0000 (08:20 -0400)]

mgr/cephadm: make UpgradeState from_json a bit safer

This way, when downgrading to any version that includes this
change, new parameters added to UpgradeState later shouldn't
break anything. We can't do much about downgrades to versions
older than this one, but this should help in the future.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit aeaa0b5fd87068a31bfa61dd088c49affce42419)
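
One way to get this kind of tolerance is to drop any stored keys the current constructor does not know about. The sketch below is an illustration under assumptions, not the code from the commit: the field set is simplified and hypothetical, and using `inspect.signature` is just one possible way to find the accepted parameters.

```python
import inspect
from typing import Any, Dict, Optional


class UpgradeState:
    # Simplified, hypothetical field set; the real class has more parameters.
    def __init__(self, target_name: str, progress_id: str, paused: bool = False) -> None:
        self.target_name = target_name
        self.progress_id = progress_id
        self.paused = paused

    @classmethod
    def from_json(cls, data: Optional[Dict[str, Any]]) -> Optional["UpgradeState"]:
        if not data:
            return None
        # Keep only the keys this version's constructor understands, so that
        # state written by a newer version does not raise a TypeError here.
        known = set(inspect.signature(cls.__init__).parameters) - {"self"}
        return cls(**{k: v for k, v in data.items() if k in known})
```

With this approach, a stored dictionary carrying an extra key written by a newer release still loads, and the unknown key is silently ignored.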

commit | commitdiff | tree

Adam King [Wed, 30 Mar 2022 13:49:56 +0000 (09:49 -0400)]

mgr/cephadm: add new args and validation for staggered upgrade

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit e6b0fe0e4859f83ca69d14d89f9e47f0ea74e770)

Conflicts:
src/pybind/mgr/orchestrator/module.py

commit | commitdiff | tree

Adam King [Mon, 28 Mar 2022 16:10:15 +0000 (12:10 -0400)]

mgr/cephadm: split _do_upgrade into sub functions

This function was around 500 lines and difficult to work
with. Splitting it into sub functions should hopefully make
it a bit easier to understand and make changes to.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7b83c51fe63ae006b15dcf509c08a722f104788e)

Conflicts:
src/pybind/mgr/cephadm/upgrade.py

commit | commitdiff | tree

Volker Theile [Tue, 10 May 2022 13:25:54 +0000 (15:25 +0200)]

cephadm: prometheus: The generatorURL in alerts is only using hostname

Prometheus is currently using only the hostname in the 'generatorURL' of an alert which causes issues when clicking on the URL in the Ceph Dashboard or somewhere else, because in most cases the hostname of the node that is running the Prometheus container is not resolvable.

To fix that, the command line argument '--web.external-url' must be appended in the systemd unit file of the Prometheus container, e.g. '--web.external-url http://foo.bar:9095', where an FQDN is used as the hostname.

Fixes: https://tracker.ceph.com/issues/55595
Signed-off-by: Volker Theile <vtheile@suse.com>
(cherry picked from commit 4281dc1bbc466dd061781a984b34bb0eafaf482f)
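
As a rough illustration of the fix, the extra argument can be built from the node's FQDN. The helper name, the default port, and the "must contain a dot" check are assumptions made for this sketch; only the `--web.external-url` flag and the example URL come from the message above.

```python
from typing import List


def prometheus_external_url_args(fqdn: str, port: int = 9095) -> List[str]:
    """Build the extra Prometheus argument (hypothetical helper)."""
    # Crude heuristic, assumed for illustration: a bare hostname has no dot
    # and is unlikely to be resolvable from the user's browser.
    if "." not in fqdn:
        raise ValueError(f"{fqdn!r} does not look like a fully qualified name")
    return [f"--web.external-url=http://{fqdn}:{port}"]
```

Calling `prometheus_external_url_args("foo.bar")` yields `['--web.external-url=http://foo.bar:9095']`.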

commit | commitdiff | tree

Adam King [Thu, 19 May 2022 22:14:20 +0000 (18:14 -0400)]

Merge pull request #46327 from adk3798/pacific-batch-may1

pacific: cephadm batch backport May

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

commit | commitdiff | tree

Adam King [Thu, 19 May 2022 22:13:03 +0000 (18:13 -0400)]

Merge pull request #46309 from adk3798/pacific-public-network-bootstrap

pacific: cephadm: improve network handling during bootstrap

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

commit | commitdiff | tree

Adam King [Thu, 19 May 2022 22:10:55 +0000 (18:10 -0400)]

Merge pull request #44769 from guits/wip-54009-pacific

pacific: ceph-volume: zap osds in rollback_osd()

Reviewed-by: Teoman ONAY <tonay@redhat.com>

commit | commitdiff | tree

Sarthak0702 [Thu, 14 Apr 2022 10:17:21 +0000 (15:47 +0530)]

mgr/dashboard: customizable log-in page text/banner

Fixes: https://tracker.ceph.com/issues/55231
Signed-off-by: Sarthak0702 <sarthak.dev.0702@gmail.com>
(cherry picked from commit 9f8bcd764e6d488d488e6ba1c05c2972329827b7)

commit | commitdiff | tree

Volker Theile [Mon, 9 May 2022 13:31:15 +0000 (15:31 +0200)]

mgr/dashboard: Creating and editing Prometheus AlertManager silences is buggy

When creating a new monitoring silence, the form is pre-filled with the wrong alert data. It always uses the alert data from the very first object in the list of the API response rather than the alert identified by the 'fingerprint' property.

The same problem applies to editing silences. The selected silence is not the one edited; it is always the first one in the list returned by the API rather than the one with the specified 'id' property.

The main problem of the original implementation is that the Prometheus Alertmanager API endpoints /api/v1/[alerts/silences] do not support querying. To fix that, filtering is done in the frontend.

Fixes: https://tracker.ceph.com/issues/55578
Signed-off-by: Volker Theile <vtheile@suse.com>
(cherry picked from commit 658486b566f0f9cac2fc0225c4cd78702f943d40)

commit | commitdiff | tree

zdover23 [Wed, 18 May 2022 23:21:07 +0000 (09:21 +1000)]

Merge pull request #46326 from zdover23/wip-pr-46315-backport-to-pacific

pacific: doc/start: s/3/three/ in intro.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

commit | commitdiff | tree

Zac Dover [Wed, 18 May 2022 10:36:53 +0000 (20:36 +1000)]

doc/start: s/3/three/ in intro.rst

I'm changing "3" to "three" for two reasons:

1. It's correct.
2. This allows me to test backports into Octopus, Pacific, and Quincy.
   I am particularly interested to see what happens when I attempt
   the backport into Octopus, because backports into Octopus have
   failed. This will provide me with another unit of data.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 28efcec2d65e85ff2fa54e62b5b134e63ace853b)

commit | commitdiff | tree

Jenkins Build Slave User [Wed, 18 May 2022 19:51:52 +0000 (19:51 +0000)]

16.2.9

commit | commitdiff | tree

Adam King [Tue, 30 Nov 2021 13:45:47 +0000 (08:45 -0500)]

mgr/cephadm: unit test for re-adding host and receiving loopback address

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit abfbbd383cadfa3e2862d939444e0e9218b3cb3b)

commit | commitdiff | tree

Adam King [Thu, 18 Nov 2021 20:22:39 +0000 (15:22 -0500)]

mgr/cephadm: re-use old ip when re-adding hosts if necessary

When a host is re-added without an explicit ip we can default to the old
ip we had stored for the host rather than either keeping the loopback
address or throwing an exception. We only want to actually error when
the only options left are error or use a resolved loopback address.

Fixes: https://tracker.ceph.com/issues/53438
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7e8d8317bef1b35cddd99950e503f57710002e80)
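
The decision described above can be sketched as a small fallback chain. This is an assumption-laden illustration, not the cephadm code: the function names, the argument layout, and the exact order of preference between the resolved and the stored address are all hypothetical.

```python
import ipaddress
from typing import Optional


def is_loopback(addr: str) -> bool:
    try:
        return ipaddress.ip_address(addr).is_loopback
    except ValueError:
        return False  # not an IP literal, so not treated as loopback here


def choose_host_addr(explicit: Optional[str], resolved: str, stored: Optional[str]) -> str:
    """Pick the address for a (re-)added host; hypothetical helper."""
    if explicit:
        return explicit          # the user told us which address to use
    if not is_loopback(resolved):
        return resolved          # name resolution gave something usable
    if stored and not is_loopback(stored):
        return stored            # fall back to the address we had before
    # Only error when the sole remaining choice is a resolved loopback address.
    raise ValueError(f"cannot use loopback address {resolved}; pass an explicit address")
```

For example, `choose_host_addr(None, "127.0.0.1", "10.0.0.5")` returns `"10.0.0.5"`.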

Conflicts:
src/pybind/mgr/cephadm/module.py

commit | commitdiff | tree

Redouane Kachach [Tue, 17 May 2022 15:26:39 +0000 (17:26 +0200)]

mgr/cephadm: stripping out / from the end of the url
Fixes: https://tracker.ceph.com/issues/55638
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 17032f6be22e9efc3e199d7e35091025bfaae965)
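
The message does not say which URL is affected, so the following is only a generic sketch of the normalization, with a hypothetical helper name: trailing slashes are removed so that appending a path later does not produce a double slash.

```python
def strip_trailing_slash(url: str) -> str:
    """Remove any trailing '/' characters from a URL (hypothetical helper)."""
    return url.rstrip("/")


base = strip_trailing_slash("https://host.example.com:8443/")
joined = f"{base}/api/health"  # exactly one slash between base and path
```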

commit | commitdiff | tree

Adam King [Tue, 17 May 2022 00:44:11 +0000 (20:44 -0400)]

mgr/cephadm: force fail over when we want to remove active mgr

Fixes: https://tracker.ceph.com/issues/55679
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 63d13df4eb469fb6f5d85ee06184e7df670aa193)

commit | commitdiff | tree

Redouane Kachach [Mon, 9 May 2022 15:17:30 +0000 (17:17 +0200)]

mgr/cephadm: fixing yaml parsing during bootstrap
Fixes: https://tracker.ceph.com/issues/55555
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 4af8a884416509daa65898335de3d8a355890675)

commit | commitdiff | tree

Adam King [Fri, 13 May 2022 16:53:09 +0000 (12:53 -0400)]

cephadm: fix adoption of osds from custom name clusters

Fixes: https://tracker.ceph.com/issues/55654
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 361c71c7321929898a9cc381b05f4cd65aba36f7)

commit | commitdiff | tree

Redouane Kachach [Thu, 21 Apr 2022 10:01:44 +0000 (12:01 +0200)]

mgr/cephadm: do not add _admin label when no-minimize-config is provided
Fixes: https://tracker.ceph.com/issues/52727
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 01c8999d0354a71a7ef8526aab9b39e30d67c1bb)

commit | commitdiff | tree

Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]

cephadm: avoid crashing on expected non-zero exit

- Avoid crashing when a call out to an external program returns a
non-zero exit status that is expected.

Some programs communicate information other than error/no error
through their exit status. E.g. `systemctl status` will return different
exit codes depending on the actual status of the units in question.
In cases where this is expected, crashing with a RuntimeError exception
is inappropriate and should be avoided.

Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
(cherry picked from commit a02be6f22fa18094cd8758700ab74581b6ce1701)
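
A minimal sketch of the idea, assuming a wrapper that lets the caller declare which exit codes are expected. The `call` function and its `allowed_codes` parameter are hypothetical names chosen for this example and are not cephadm's actual interface.

```python
import subprocess
import sys
from typing import Iterable, List, Tuple


def call(cmd: List[str], allowed_codes: Iterable[int] = (0,)) -> Tuple[str, int]:
    """Run cmd and raise only for exit codes the caller did not expect."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode not in set(allowed_codes):
        raise RuntimeError(f"{cmd!r} failed with exit status {proc.returncode}")
    return proc.stdout, proc.returncode


# Stand-in for a status-style command that reports state through its exit code.
status_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
_, code = call(status_cmd, allowed_codes=(0, 3))
```

The caller can then branch on `code` instead of handling an exception for a result that was never an error.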

commit | commitdiff | tree

David Galloway [Wed, 18 May 2022 19:32:21 +0000 (15:32 -0400)]

Merge pull request #46302 from cfsnyder/wip-cfsnyder-gil-deadlock-fix-pacific

pacific: mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex

commit | commitdiff | tree

Sarthak0702 [Wed, 9 Mar 2022 12:10:20 +0000 (17:40 +0530)]

mgr/dashboard: unselect rows in datatables

Fixes: https://tracker.ceph.com/issues/53244
Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit b79e2a6c6a9368a4fc167b05970db463cd60edab)

commit | commitdiff | tree

Cory Snyder [Tue, 17 May 2022 09:24:53 +0000 (05:24 -0400)]

mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex

The mgr process can deadlock if the GIL is held while attempting to lock a mutex.
Relevant regressions were introduced in commit a356bac. This fixes those regressions
and also cleans up some unnecessary yielding of the GIL.

Fixes: https://tracker.ceph.com/issues/55687
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 46a7c1c61189334d55e54ef16fa627e3d9e5a905)

commit | commitdiff | tree

Redouane Kachach [Thu, 5 May 2022 14:08:12 +0000 (16:08 +0200)]

mgr/cephadm: fixing ipv6/128 and ipv4/32 subnets handling
Fixes: https://tracker.ceph.com/issues/51257
Fixes: https://tracker.ceph.com/issues/53496
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 75945ad74cf614b3516abd3a50de56cbaab58346)
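
Single-address subnets are an easy edge case to get wrong. As a hedged illustration (not the code from the commit), Python's `ipaddress` module can test membership in a way that treats /32 and /128 networks like any other; the helper name is hypothetical.

```python
import ipaddress


def ip_in_subnet(ip: str, subnet: str) -> bool:
    """Return True if ip belongs to subnet, including /32 and /128 subnets."""
    network = ipaddress.ip_network(subnet, strict=False)
    address = ipaddress.ip_address(ip)
    # Addresses of a different IP version are never members of the network.
    return address.version == network.version and address in network
```

Here `ip_in_subnet("10.1.2.3", "10.1.2.3/32")` is `True`, while `ip_in_subnet("10.1.2.4", "10.1.2.3/32")` is `False`.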

commit | commitdiff | tree

Redouane Kachach [Thu, 5 May 2022 13:53:49 +0000 (15:53 +0200)]

mgr/cephadm: fixing ipv6 handling during bootstrap
Fixes: https://tracker.ceph.com/issues/55556
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit ae0cbacd1d8d78f41a06fd3b5cd3c0fd693e4c0f)

commit | commitdiff | tree

Redouane Kachach [Fri, 1 Apr 2022 16:03:42 +0000 (18:03 +0200)]

mgr/cephadm: Adding cephadm networking configuration checks+refactoring
Fixes: https://tracker.ceph.com/issues/55174
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit e0bafe6b1da104782b29edf7035d7bc93f89e12f)

Conflicts:
src/cephadm/cephadm
src/cephadm/tests/test_cephadm.py

commit | commitdiff | tree

Redouane Kachach [Wed, 30 Mar 2022 13:48:40 +0000 (15:48 +0200)]

mgr/cephadm: fixing public network conf parsing
Fixes: https://tracker.ceph.com/issues/55132
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3ef6341e8ef5fe6a01f15c847f6bc9e2205d4d97)

commit | commitdiff | tree

Anthony D'Atri [Tue, 17 May 2022 19:35:09 +0000 (12:35 -0700)]

Merge pull request #45878 from dparmar18/backport_mdsdoc_pacific

pacific: doc/cephfs/add-remove-mds: added cephadm note, refined "Adding an MDS"

commit | commitdiff | tree

zdover23 [Tue, 17 May 2022 15:08:47 +0000 (01:08 +1000)]

Merge pull request #46288 from zdover23/wip-doc-tracker-55676-backport-pacific

pacific: doc/dev: update basic-workflow.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 30 Mar 2022 14:18:26 +0000 (16:18 +0200)]

ceph-volume/tests: reject loop devices in lvm.conf

The current task doesn't work (typo?).
Otherwise api/lvm.py can't work properly; functions such as
`get_single_lv()` and many others don't return the expected results.
Indeed, lvm is confused because of the nvme_loop setup.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a5fab15e44517ac63f3fd257989e81b8127b86d9)

commit | commitdiff | tree

Guillaume Abrioux [Mon, 28 Mar 2022 22:01:39 +0000 (00:01 +0200)]

ceph-volume: do not leave pv when zapping osds

When zapping a device and no vg/lv are left, the pv should be
removed too.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7f007e7fc75b4d6e7465c684f7e5b2458883dcc5)

commit | commitdiff | tree

Guillaume Abrioux [Wed, 23 Mar 2022 09:04:45 +0000 (10:04 +0100)]

orchestrator: support complex osd creation

This adds the support of complex OSD creation with command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.

Usage:
```
ceph orch daemon add osd host:data_devices=device1,device2,db_devices=device3,osds_per_device=2,...
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8aa2f4745adff0ba3c7a0731cf48ccc1c85b33f3)
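
To show how an argument in the form above might be split into host and options, here is a hypothetical parser. It is a sketch based only on the usage string, not the orchestrator's implementation: a token containing `=` starts a new option, and a token without `=` is treated as another value for the preceding option.

```python
from typing import Dict, List, Optional, Tuple


def parse_osd_spec(arg: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split 'host:key=v1,v2,key2=v3' into (host, {key: [values]})."""
    host, sep, rest = arg.partition(":")
    if not sep or not host or not rest:
        raise ValueError("expected host:key=value[,value...][,key=value...]")
    options: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for token in rest.split(","):
        if "=" in token:
            # A new option begins; its first value follows the '='.
            current, value = token.split("=", 1)
            options[current] = [value]
        elif current is not None:
            # No '=': this is one more value for the current option.
            options[current].append(token)
        else:
            raise ValueError(f"value {token!r} appears before any option name")
    return host, options
```

For instance, `parse_osd_spec("host1:data_devices=/dev/sdb,/dev/sdc,osds_per_device=2")` returns `("host1", {"data_devices": ["/dev/sdb", "/dev/sdc"], "osds_per_device": ["2"]})`.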

commit | commitdiff | tree

Guillaume Abrioux [Tue, 22 Mar 2022 15:35:58 +0000 (16:35 +0100)]

DriveSelection: skip unavailable devices

Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3a88547559769f4dd438f6557cef22ef9004fa2a)

Conflicts:
src/python-common/ceph/deployment/inventory.py

commit | commitdiff | tree

Guillaume Abrioux [Fri, 11 Mar 2022 09:29:35 +0000 (10:29 +0100)]

ceph-volume: various fixes in arg_validators

If a device with an FS is passed, ceph-volume should abort
the OSD creation.

Fixes: https://tracker.ceph.com/issues/54535
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f4b830dcfb45eda81eabf18a8461ac4e1bf642e)

Conflicts:
src/ceph-volume/ceph_volume/devices/lvm/common.py
src/ceph-volume/ceph_volume/tests/util/test_arg_validators.py
src/ceph-volume/ceph_volume/util/arg_validators.py

commit | commitdiff | tree

Guillaume Abrioux [Wed, 23 Mar 2022 09:07:05 +0000 (10:07 +0100)]

doc/cephadm: fix a typo

s/osd_crush_choose_leaf_type/osd_crush_chooseleaf_type

```
[ceph: root@adm-1 /]# ceph config set global osd_crush_choose_leaf_type 0
Error EINVAL: unrecognized config option 'osd_crush_choose_leaf_type'
[ceph: root@adm-1 /]# ceph config set global osd_crush_chooseleaf_type 0
[ceph: root@adm-1 /]#
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d43189c17b03420674ea5424666388b8272c2580)

commit | commitdiff | tree

Guillaume Abrioux [Mon, 14 Mar 2022 14:40:47 +0000 (14:40 +0000)]

ceph-volume/tests: speed up tox tests

Let's use `--numprocesses=auto` in order to speed up the unit tests execution.

See the difference, without `--numprocesses=auto`:
```

... omitted output ...

real    1m22.884s
user    0m23.003s
sys     0m20.504s
```

with `--numprocesses=auto`:

```

... omitted output ...

real    0m18.767s
user    0m33.056s
sys     0m23.244s
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cd5eb7939ed92b584c45689a3169847811b8518d)
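
The wall-clock improvement can be worked out directly from the two `real` timings quoted above. The helper below is only illustrative arithmetic on those numbers.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a 'XmY.YYYs' timing into seconds."""
    return minutes * 60 + seconds


serial = to_seconds(1, 22.884)    # real time without --numprocesses=auto
parallel = to_seconds(0, 18.767)  # real time with --numprocesses=auto
speedup = serial / parallel

print(f"wall-clock speedup: {speedup:.1f}x")
```

This gives a speedup of roughly 4.4x in elapsed time. The `user` time goes up (23.003s to 33.056s) because the same work, plus coordination overhead, is spread across several worker processes.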

commit | commitdiff | tree

Adam King [Thu, 10 Mar 2022 17:43:28 +0000 (12:43 -0500)]

mgr/cephadm: generate one c-v raw prepare cmd per data device in raw mode

Fixes: https://tracker.ceph.com/issues/54522
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit b6556e5dbd21192c9207faf84c96f32bd8877d18)

Conflicts:
src/pybind/mgr/cephadm/services/osd.py
src/python-common/ceph/deployment/translate.py
src/python-common/ceph/tests/test_drive_group.py

commit | commitdiff | tree

Sage Weil [Thu, 4 Nov 2021 14:07:14 +0000 (10:07 -0400)]

mgr/orchestrator: improve usage string for 'orch daemon add osd'

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 816bacba35c9861eedbb49d35dc70c7cbe8a5e8e)

commit | commitdiff | tree

Sage Weil [Thu, 12 Aug 2021 15:12:59 +0000 (11:12 -0400)]

ceph-volume: activate: try simple mode too

This is of dubious value to cephadm since /etc/ceph/osd/* won't be
populated inside of a container. However, it makes sense from a purely
ceph-volume perspective.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 101c810a80eea14ab2a1edc8166dbbe76cd9e87a)

commit | commitdiff | tree

Sage Weil [Tue, 3 Aug 2021 18:36:56 +0000 (14:36 -0400)]

mgr/cephadm: identify and instantiate raw osds post-create

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 11d366d4410938c8588b0d212d05b5ebe23efe4d)

Conflicts:
src/pybind/mgr/cephadm/tests/test_cephadm.py

commit | commitdiff | tree

Sage Weil [Tue, 3 Aug 2021 18:36:39 +0000 (14:36 -0400)]

mgr/orchestrator: accept --method arg to 'orch daemon add osd'

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit cef129d90abd73536a141fb375ce35cc7e5081a4)

Conflicts:
src/pybind/mgr/orchestrator/module.py

commit | commitdiff | tree

Sage Weil [Tue, 3 Aug 2021 18:35:27 +0000 (14:35 -0400)]

python-common: drivegroup: add 'method' property

The DriveGroup method can be none, 'raw', or 'lvm'.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 81e46bc64472a88c493b26f5948ae29f71ebfbed)

Conflicts:
src/python-common/ceph/deployment/drive_group.py

commit | commitdiff | tree

Sage Weil [Thu, 5 Aug 2021 17:29:17 +0000 (13:29 -0400)]

ceph-volume: top-level 'activate' command

First try raw, then lvm.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 3d7ceec684b0ac5b83fae4c397b134236fac485e)
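
The "try one mode, then the next" behavior can be sketched as an ordered fallback chain. The names below are hypothetical, and the real command's activation logic is certainly more involved; this only shows the control flow.

```python
from typing import Callable, Sequence, Tuple

# Each entry pairs a mode name with a function that returns True on success.
Activator = Tuple[str, Callable[[str], bool]]


def activate(osd_id: str, activators: Sequence[Activator]) -> str:
    """Try each activation mode in order; return the name of the first that works."""
    for name, try_activate in activators:
        if try_activate(osd_id):
            return name
    raise RuntimeError(f"could not activate osd.{osd_id} with any known mode")


# Stand-in activators: raw fails for this OSD, lvm succeeds.
modes = [("raw", lambda osd_id: False), ("lvm", lambda osd_id: True)]
used = activate("3", modes)
```

Keeping the order in a list makes the precedence explicit: raw is attempted first, and lvm is only reached when raw does not succeed.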

commit | commitdiff | tree

Sage Weil [Thu, 5 Aug 2021 17:23:27 +0000 (13:23 -0400)]

ceph-volume: lvm activate: add --no-tmpfs

This isn't necessary for cephadm, but having this arg match raw activate
makes the interface more consistent.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9dc35338754442b0730b974de4fc3cc6ffb172b6)

commit | commitdiff | tree

Sage Weil [Thu, 5 Aug 2021 16:02:22 +0000 (12:02 -0400)]

ceph-volume: lvm activate: infer bluestore or filestore

No need to require --filestore and/or --bluestore args since we can tell
from the LV tags which one it is.

We can't drop the arguments without breaking existing users, though, so
redefine them to mean *force* bluestore or filestore activation (even
though this will error out if the tags don't match).

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 451feec4b269e2a5816687136adc74082ec8f2f3)
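
The behaviour described above (infer the object store from the LV tags, and treat the old flags as a "force" that errors out on a mismatch) might look roughly like this. The tag key `objectstore` and the function `resolve_objectstore` are assumptions made for this sketch; the real tag names used by ceph-volume are not given in the commit message.

```python
from typing import Mapping, Optional


def resolve_objectstore(lv_tags: Mapping[str, str],
                        force: Optional[str] = None) -> str:
    """Infer the object store from LV tags; an optional force must agree."""
    tagged = lv_tags.get("objectstore")  # hypothetical tag key
    if tagged is None:
        raise ValueError("LV tags do not identify an object store")
    if force is not None and force != tagged:
        # Forcing a type that contradicts the tags is an error, not an override.
        raise ValueError(f"forced {force!r} but LV tags say {tagged!r}")
    return tagged
```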

commit | commitdiff | tree

Sage Weil [Tue, 3 Aug 2021 18:34:54 +0000 (14:34 -0400)]

ceph-volume: raw activate: accept --osd-id and/or --osd-uuid instead of device

This makes it possible to start raw osds based on their uuid/id instead of
device name (which may not be stable).

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 1ce1b3b8ea8ef6b99abe8c14d69d83a47cbaf762)
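
Looking an OSD up by id and/or uuid rather than by device name could be sketched as below. The listing format (dicts with `osd_id`, `osd_uuid`, and `device` keys) and the function `find_raw_osd` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Mapping, Optional


def find_raw_osd(listing: Iterable[Mapping[str, str]],
                 osd_id: Optional[str] = None,
                 osd_uuid: Optional[str] = None) -> Optional[str]:
    """Return the device of the first entry matching every given criterion."""
    if osd_id is None and osd_uuid is None:
        raise ValueError("need osd_id and/or osd_uuid")
    for entry in listing:
        if osd_id is not None and entry["osd_id"] != osd_id:
            continue
        if osd_uuid is not None and entry["osd_uuid"] != osd_uuid:
            continue
        # The device name is looked up at activation time, so it may change
        # between boots without affecting how the OSD is addressed.
        return entry["device"]
    return None
```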

commit | commitdiff | tree

dparmar18 [Fri, 25 Mar 2022 08:18:54 +0000 (13:48 +0530)]

doc/cephfs/add-remove-mds: added cephadm note, refined "Adding an MDS"

Description: 1) Add a note about using cephadm to set up the
   cluster and mds(s); also mention the ceph orchestrator
   for anyone who needs to set up mds(s) manually.
2) Change the term `data point` to `directory` in
   point 1 under the "Adding an MDS" section for
   clarity.

Fixes: https://tracker.ceph.com/issues/54551
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 9e8e1a250e1192cdb1b86650596543d42a2f0401)

commit | commitdiff | tree

Cory Snyder [Tue, 17 May 2022 09:24:53 +0000 (05:24 -0400)]

mgr/ActivePyModules.cc: fix cases where GIL is held while attempting to lock mutex

The mgr process can deadlock if the GIL is held while attempting to lock a mutex.
Relevant regressions were introduced in commit a356bac. This fixes those regressions
and also cleans up some unnecessary yielding of the GIL.

Fixes: https://tracker.ceph.com/issues/55687
Signed-off-by: Cory Snyder <csnyder@iland.com>

commit | commitdiff | tree

Guillaume Abrioux [Tue, 23 Nov 2021 14:33:35 +0000 (15:33 +0100)]

ceph-volume: zap osds in rollback_osd()

rollback_osd() should zap and wipe the device for the corresponding osd
that was being prepared after a failure happens.

Fixes: https://tracker.ceph.com/issues/53376
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit effe65533f4b7248137fcdc0ae966f8438a05b01)

commit | commitdiff | tree

Zac Dover [Wed, 13 Apr 2022 14:09:38 +0000 (00:09 +1000)]

doc/dev: update basic-workflow.rst

This PR updates the basic-workflow.rst file
to serve the needs of people in 2022 who were not
present at jump street.

The text has been refined up to the section called
"Integration Tests" (non-inclusive).

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit a227e4007a5ce66b63e42facf97f89655edf2169)

commit | commitdiff | tree

David Galloway [Mon, 16 May 2022 18:36:37 +0000 (14:36 -0400)]

Merge remote-tracking branch 'gh/pacific' into pacific-release

commit | commitdiff | tree

zdover23 [Sat, 14 May 2022 21:38:31 +0000 (07:38 +1000)]

Merge pull request #46117 from zdover23/wip-doc-pr-46109-backport-to-pacific

Wip doc pr 46109 backport to pacific

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

commit | commitdiff | tree

Jenkins Build Slave User [Thu, 12 May 2022 22:23:14 +0000 (22:23 +0000)]

16.2.8

commit | commitdiff | tree

Ernesto Puerta [Fri, 11 Mar 2022 16:29:07 +0000 (17:29 +0100)]

mgr/dashboard: fix wrong pg status processing

Fixes: https://tracker.ceph.com/issues/54481
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 206dd9d4a71a70c46972597a838fda05ceec03da)

commit | commitdiff | tree

Avan Thakkar [Mon, 2 May 2022 09:03:27 +0000 (14:33 +0530)]

mgr/dashboard: add unit tests for ingress service

Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 2c15c26a71ec3acf67f7005e775832928432c821)

commit | commitdiff | tree

Avan Thakkar [Mon, 2 May 2022 08:02:36 +0000 (13:32 +0530)]

mgr/dashboard: fix ssl cert validation for ingress service creation

Fixes: https://tracker.ceph.com/issues/55511
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 0017fa5bc91570e1cf873b59efa3cd1787c49216)

commit | commitdiff | tree

Yuri Weinstein [Wed, 4 May 2022 04:53:27 +0000 (21:53 -0700)]

Merge pull request #46096 from aclamk/wip-aclamk-unbounded-wholespace-iterator-pacific

pacific: revival and backport of fix for RocksDB optimized iterators

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Ville Ojamo [Mon, 2 May 2022 09:01:51 +0000 (16:01 +0700)]

doc/radosgw: fix pgcalc link

The pgcalc tool has moved to the "old" ceph site so update
the link to avoid a 404.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit 7e1dc469648028d064a6c0faeabe9ecb3c11f32f)

commit | commitdiff | tree

Ville Ojamo [Mon, 2 May 2022 08:59:26 +0000 (15:59 +0700)]

doc/rados/operations: fix pgcalc link

The pgcalc tool has moved to the "old" ceph site so update
the link to avoid a 404.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit 45f8d746acefe01e2416cedf92aadba1555c22f8)

commit | commitdiff | tree

Adam Kupczyk [Fri, 29 Apr 2022 21:32:43 +0000 (23:32 +0200)]

kv/RocksDBStore: Remove feature to make WholeSpaceIterator based on bounded iterator

The iterator-bounding feature was introduced to limit RocksDB iterators so that
they are less likely to traverse tombstones.
It is used when listing keys in a fixed range, for example the omap keys of a specific object.

Extending this logic to WholeSpaceIterator is problematic,
since the prefix must be taken into account.

Fixes: https://tracker.ceph.com/issues/55444
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>

commit | commitdiff | tree

Cory Snyder [Thu, 21 Apr 2022 19:56:06 +0000 (15:56 -0400)]

kv/RocksDBStore: simplify RocksDBStore::get_cf_handle(string, IteratorBounds)

Adds a precondition to RocksDBStore::get_cf_handle(string, IteratorBounds)
to avoid duplicating logic of the only caller (RocksDBStore::get_iterator).
Assertions will fail if preconditions are not met.

Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 55ef16f6cc1d344b09798e566c2470e81928327a)

commit | commitdiff | tree

Cory Snyder [Thu, 21 Apr 2022 17:13:22 +0000 (13:13 -0400)]

bluestore: add config option to allow rocksdb iterator bounds to be disabled

Add osd_rocksdb_iterator_bounds_enabled config option to allow rocksdb iterator bounds to be disabled.
Also includes minor refactoring to shorten code associated with IteratorBounds initialization in bluestore.

Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit ca3ccd9)

Conflicts:
src/common/options/osd.yaml.in

Cherry-pick notes:
- Conflicts due to option definition in common/options.cc in Pacific vs. common/options/osd.yaml.in in later releases

commit | commitdiff | tree

Cory Snyder [Fri, 15 Apr 2022 00:54:15 +0000 (20:54 -0400)]

bluestore: set upper and lower bounds on rocksdb omap iterators

Limits RocksDB omap Seek operations to the relevant key range of the object's omap.
This prevents RocksDB from unnecessarily iterating over delete range tombstones in
irrelevant omap CF shards. Avoids extreme performance degradation commonly caused
by tombstones generated from RGW bucket resharding cleanup. Also prefer CFIteratorImpl
over ShardMergeIteratorImpl when we can determine that all keys within specified
IteratorBounds must be in a single CF.

Fixes: https://tracker.ceph.com/issues/55324
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 850c16c2468c3200a340493c12930543f326b0e1)
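
To see why an upper bound helps, consider this toy model. It is not RocksDB's API, and real delete-range tombstones are represented differently. Entries are sorted `(key, is_tombstone)` pairs; without a bound, the scan only learns the range has ended when it reaches a live key past it, so it must first step over every tombstone in between. The function `scan` and the sample keys are hypothetical.

```python
import bisect
from typing import List, Sequence, Tuple


def scan(entries: Sequence[Tuple[str, bool]], lower: str, upper: str,
         bounded: bool) -> Tuple[List[str], int]:
    """Return (live keys in [lower, upper), number of entries visited)."""
    names = [key for key, _ in entries]
    i = bisect.bisect_left(names, lower)  # seek to the lower bound
    live: List[str] = []
    visited = 0
    while i < len(entries):
        key, dead = entries[i]
        if bounded and key >= upper:
            break  # bounded: the iterator itself stops at the upper bound
        visited += 1
        i += 1
        if dead:
            continue  # tombstones are skipped, but each one costs a visit
        if key >= upper:
            break  # unbounded: the end is noticed only at a live key
        live.append(key)
    return live, visited


entries = [("a.1", False), ("a.2", False),
           ("b.1", True), ("b.2", True), ("b.3", True),
           ("c.1", False)]
```

With this data both modes return the same two live keys for the range `["a.", "b")`, but the bounded scan visits 2 entries while the unbounded one visits 6.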

commit | commitdiff | tree

Yuri Weinstein [Fri, 29 Apr 2022 22:28:11 +0000 (15:28 -0700)]

Merge pull request #46085 from adk3798/pacific-revert-network-handling

pacific: revert bootstrap network handling changes

Reviewed-by: Laura Flores <lflores@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 29 Apr 2022 22:27:17 +0000 (15:27 -0700)]

Merge pull request #46092 from neha-ojha/wip-55444-pacific

pacific: [Revert] bluestore: set upper and lower bounds on rocksdb omap iterators

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
