git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
16 months agodoc/rbd: minor changes to the rbd man page 56257/head
N Balachandran [Mon, 18 Mar 2024 12:22:47 +0000 (17:52 +0530)]
doc/rbd: minor changes to the rbd man page

Fixes typos and grammar for some commands. Adds
additional details for some commands.

Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit 5dcff6a4b8d835fc55e454af977dc5ebad99d37f)

16 months agoMerge pull request #56088 from adk3798/wip-64688-quincy
Adam King [Mon, 18 Mar 2024 12:27:33 +0000 (08:27 -0400)]
Merge pull request #56088 from adk3798/wip-64688-quincy

quincy: mgr/cephadm: fix placement with label and host pattern

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56086 from adk3798/wip-64630-quincy
Adam King [Mon, 18 Mar 2024 12:26:23 +0000 (08:26 -0400)]
Merge pull request #56086 from adk3798/wip-64630-quincy

quincy: mgr/cephadm: catch CancelledError in asyncio timeout handler

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56083 from adk3798/wip-63534-quincy
Adam King [Mon, 18 Mar 2024 12:25:18 +0000 (08:25 -0400)]
Merge pull request #56083 from adk3798/wip-63534-quincy

quincy: mgr/cephadm: fix reweighting of OSD when OSD removal is stopped

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55973 from adk3798/wip-62531-quincy
Adam King [Mon, 18 Mar 2024 12:24:15 +0000 (08:24 -0400)]
Merge pull request #55973 from adk3798/wip-62531-quincy

quincy: mgr/cephadm: allow draining host without removing conf/keyring files

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55763 from ajarr/wip-64555-quincy
Yuri Weinstein [Sat, 16 Mar 2024 16:10:09 +0000 (09:10 -0700)]
Merge pull request #55763 from ajarr/wip-64555-quincy

quincy: qa: Add tests to validate synced images on rbd-mirror

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agoMerge pull request #55664 from idryomov/wip-64423-quincy
Yuri Weinstein [Sat, 16 Mar 2024 16:09:21 +0000 (09:09 -0700)]
Merge pull request #55664 from idryomov/wip-64423-quincy

quincy: librbd: fix split() for SparseExtent and SparseBufferlistExtent

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
16 months agoMerge pull request #55618 from trociny/wip-64463-quincy
Yuri Weinstein [Sat, 16 Mar 2024 16:07:58 +0000 (09:07 -0700)]
Merge pull request #55618 from trociny/wip-64463-quincy

quincy: tools/rbd: make 'children' command support --image-id

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agoMerge pull request #56236 from zdover23/wip-doc-2024-03-16-backport-56182-to-quincy
Anthony D'Atri [Sat, 16 Mar 2024 01:32:38 +0000 (21:32 -0400)]
Merge pull request #56236 from zdover23/wip-doc-2024-03-16-backport-56182-to-quincy

quincy: doc/glossary: add "librados" entry

16 months agodoc/glossary: add "librados" entry 56236/head
Zac Dover [Thu, 14 Mar 2024 06:29:09 +0000 (16:29 +1000)]
doc/glossary: add "librados" entry

Add a "librados" entry to the glossary.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 2a93a8e837a42559f8a81c6fd9274b24f4fdf7f6)

16 months agoMerge pull request #56087 from adk3798/wip-64645-quincy
Adam King [Fri, 15 Mar 2024 19:36:15 +0000 (15:36 -0400)]
Merge pull request #56087 from adk3798/wip-64645-quincy

quincy: cephadm: remove restriction for crush device classes

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56085 from adk3798/wip-63818-quincy
Adam King [Fri, 15 Mar 2024 19:35:27 +0000 (15:35 -0400)]
Merge pull request #56085 from adk3798/wip-63818-quincy

quincy: python-common/drive_selection: fix limit with existing devices

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56084 from adk3798/wip-63816-quincy
Adam King [Fri, 15 Mar 2024 19:34:41 +0000 (15:34 -0400)]
Merge pull request #56084 from adk3798/wip-63816-quincy

quincy: python-common: fix osdspec_affinity check

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56081 from adk3798/wip-63446-quincy
Adam King [Fri, 15 Mar 2024 19:25:17 +0000 (15:25 -0400)]
Merge pull request #56081 from adk3798/wip-63446-quincy

quincy: mgr/cephadm: support for removing host entry from crush map during host removal

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
16 months agoMerge pull request #56080 from adk3798/wip-63435-quincy
Adam King [Fri, 15 Mar 2024 19:24:00 +0000 (15:24 -0400)]
Merge pull request #56080 from adk3798/wip-63435-quincy

quincy: mgr/cephadm: update timestamp on repeat daemon/service events

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56079 from adk3798/wip-63116-quincy
Adam King [Fri, 15 Mar 2024 19:23:21 +0000 (15:23 -0400)]
Merge pull request #56079 from adk3798/wip-63116-quincy

quincy: mgr/cephadm: ceph orch add fails when ipv6 address is surrounded by square brackets.

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agomgr/cephadm: add ability to zap OSDs' devices while draining host 55973/head
Adam King [Mon, 5 Jun 2023 19:05:55 +0000 (15:05 -0400)]
mgr/cephadm: add ability to zap OSDs' devices while draining host

Currently, when cephadm drains a host, it will remove all OSDs on
the host, but provides no option to zap the OSD's devices afterwards.
Given that users draining a host are likely doing so to remove it from
the cluster, it makes sense that some users would want to clean up the devices on the
host that were being used for OSDs. Cephadm already supports zapping
devices outside of host draining, so it shouldn't take much to
add that functionality to the host drain as well.

Fixes: https://tracker.ceph.com/issues/61593
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 85043ff4cee108c152f5aa8af267c85e353c475a)

16 months agomgr/cephadm: add utils class for tracking special host labels
Adam King [Wed, 22 Feb 2023 19:07:58 +0000 (14:07 -0500)]
mgr/cephadm: add utils class for tracking special host labels

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 0e90c7e097c4dafafbb6b669949c2b1ea8de25c8)

Conflicts:
src/pybind/mgr/cephadm/inventory.py

16 months agomgr/cephadm: allow draining host without removing conf/keyring files
Adam King [Tue, 21 Feb 2023 18:53:32 +0000 (13:53 -0500)]
mgr/cephadm: allow draining host without removing conf/keyring files

Fixes: https://tracker.ceph.com/issues/58820
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 871aefb11d0a736d66150fee40c213f4210fead4)

16 months agoMerge pull request #55970 from adk3798/wip-62471-quincy
Adam King [Fri, 15 Mar 2024 19:19:52 +0000 (15:19 -0400)]
Merge pull request #55970 from adk3798/wip-62471-quincy

quincy: mgr/cephadm: pick correct IPs for ingress service based on VIP

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55963 from adk3798/quincy-rgw-frontend-args
Adam King [Fri, 15 Mar 2024 19:19:29 +0000 (15:19 -0400)]
Merge pull request #55963 from adk3798/quincy-rgw-frontend-args

quincy: mgr/cephadm: Adding extra arguments support for RGW frontend

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55960 from adk3798/wip-61676-quincy
Adam King [Fri, 15 Mar 2024 19:18:42 +0000 (15:18 -0400)]
Merge pull request #55960 from adk3798/wip-61676-quincy

quincy: cephadm: allow ports to be opened in firewall during adoption, reconfig, redeploy

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56176 from zdover23/wip-doc-2024-03-14-quincy-compiling-cephadm...
Zac Dover [Thu, 14 Mar 2024 19:46:13 +0000 (05:46 +1000)]
Merge pull request #56176 from zdover23/wip-doc-2024-03-14-quincy-compiling-cephadm-note-2

quincy: doc/cephadm: explain different methods of cephadm delivery

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoqa/workunits/rbd: switch rbd-mirror workunits to bash 55763/head
Ilya Dryomov [Sat, 9 Mar 2024 21:53:44 +0000 (22:53 +0100)]
qa/workunits/rbd: switch rbd-mirror workunits to bash

By making use of here strings in commit ea3a567f7f03 ("qa/workunits:
make wait_for_status_in_pool_dir() reentrant") we grew a dependency on
bash.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 166a2362378b1ff93e43f483f354c428fd6cef9e)
Signed-off-by: Ramana Raja <rraja@redhat.com>
Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh
        -  Commit 3fd8a03887354 not backported
           "qa/workunits/rbd: merge journal and snapshot test scripts"

16 months agoqa: Add tests to validate syncing of images using rbd-mirror
Ramana Raja [Thu, 25 May 2023 16:48:12 +0000 (16:48 +0000)]
qa: Add tests to validate syncing of images using rbd-mirror

Introduce functional tests to validate that the images under
workloads are correctly mirrored between two clusters using snapshot
based mirroring.

Run workload on a primary image using a krbd or nbd client. Take
mirror snapshots of the image under workload. Unmount the mapped image
and calculate its MD5 checksum before demoting it. After demotion,
wait for the mirror status of the image to be 'up+unknown' in both
the clusters. This is to make sure that the non-primary image in the
other cluster is ready to be promoted. Now promote the non-primary
image in the other cluster. Map the promoted image and calculate its
MD5 checksum. Verify that the checksums of the demoted and promoted
images in the two clusters are the same.

The above test is run as part of two different workunits:
 - a workunit that validates the syncing of multiple mirrored images
   with workloads running on them
 - another workunit that validates the syncing of a single mirrored
   image with a workload running on it, where the image is set as
   primary alternately between the two clusters, as happens during
   failover and failback scenarios.

Fixes: https://tracker.ceph.com/issues/61617
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Co-authored-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit b7aae5c3c5a1dd24c4cb7ceb499292af00bae680)

Cherry-pick notes:
- In qa/workunits/rbd/compare_mirror_images.sh, replace
  `wait_for_replaying_status_in_pool_dir` with `wait_for_status_in_pool_dir`
  Commit 3fd8a03 that added `wait_for_replaying_status_in_pool_dir`
  not backported
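
The checksum comparison described in the test above can be sketched as
follows. This is an illustration in Python only, not the code of the
workunits themselves; the function names and the idea of passing the two
mapped images as paths are assumptions made for the example.

```python
import hashlib


def md5_of(path, chunk_size=4 * 1024 * 1024):
    """Return the hex MD5 digest of a file or mapped block device."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large images are never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(demoted_path, promoted_path):
    """True if the demoted and the promoted image have the same checksum."""
    return md5_of(demoted_path) == md5_of(promoted_path)
```

In the sketch, demoted_path would be the image mapped on the first
cluster before demotion and promoted_path the image mapped on the other
cluster after promotion.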

16 months agoMerge pull request #54315 from batrick/wip-63420-quincy
Venky Shankar [Thu, 14 Mar 2024 01:17:42 +0000 (06:47 +0530)]
Merge pull request #54315 from batrick/wip-63420-quincy

quincy: mds: ensure next replay is queued on req drop

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
16 months agoMerge pull request #55318 from adk3798/wip-62447-quincy
Adam King [Wed, 13 Mar 2024 18:21:12 +0000 (14:21 -0400)]
Merge pull request #55318 from adk3798/wip-62447-quincy

quincy: mgr/cephadm: Add "networks" parameter to orch apply rgw

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #53425 from adk3798/quincy-tcmu-custom-configs
Adam King [Wed, 13 Mar 2024 18:20:20 +0000 (14:20 -0400)]
Merge pull request #53425 from adk3798/quincy-tcmu-custom-configs

quincy: cephadm: make custom_configs work for tcmu-runner container

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55473 from idryomov/wip-47287-quincy
Ilya Dryomov [Wed, 13 Mar 2024 18:05:23 +0000 (19:05 +0100)]
Merge pull request #55473 from idryomov/wip-47287-quincy

quincy: librbd: return ENOENT from Snapshot::get_timestamp for nonexistent snap_id

Reviewed-by: Ramana Raja <rraja@redhat.com>
16 months agodoc/cephadm: explain different methods of cephadm delivery 56176/head
Zac Dover [Wed, 13 Mar 2024 17:25:06 +0000 (03:25 +1000)]
doc/cephadm: explain different methods of cephadm delivery

Explain that only in Reef and later releases is cephadm distributed as
an executable compiled from source code. This note is to go into Quincy
and only into Quincy, to direct new users of Ceph whom circumstance has
delivered into the hands of Quincy and who might have the wrong idea
that the documentation of Reef and later releases applies to their
release.

Signed-off-by: Zac Dover <zac.dover@proton.me>
16 months agoMerge pull request #55235 from ifed01/wip-ifed-cache-ratios-qui
Igor Fedotov [Wed, 13 Mar 2024 15:15:50 +0000 (18:15 +0300)]
Merge pull request #55235 from ifed01/wip-ifed-cache-ratios-qui

quincy: osd: make _set_cache_sizes ratio aware of cache_kv_onode_ratio

Reviewed-by: Mark Nelson <mark.nelson@clyso.com>
Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com>
16 months agomgr/cephadm: catch CancelledError in asyncio timeout handler 56086/head
Adam King [Fri, 16 Feb 2024 16:24:32 +0000 (11:24 -0500)]
mgr/cephadm: catch CancelledError in asyncio timeout handler

Specifically, concurrent.futures.CancelledError. At least on
python 3.9, this error can be raised when certain commands
being run asynchronously fail. Not catching this results in
the whole cephadm module crashing with something like

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 94, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 267, in refresh
    r = self._refresh_facts(host)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 370, in _refresh_facts
    val = self.mgr.wait_async(self._run_cephadm_json(
  File "/usr/share/ceph/mgr/cephadm/module.py", line 671, in wait_async
    return self.event_loop.get_result(coro, timeout)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 64, in get_result
    return future.result(timeout)
  File "/lib64/python3.9/concurrent/futures/_base.py", line 444, in result
    raise CancelledError()
concurrent.futures._base.CancelledError

Fixes: https://tracker.ceph.com/issues/64473
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 9c34973932bf3a0ec50c1c63bcba5e35bfe407e5)
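
A minimal sketch of the kind of handling described above, assuming a
get_result() helper that waits on a coroutine scheduled onto an event
loop running in another thread. CommandFailed and the two demo
coroutines are hypothetical names for this example and are not taken
from the cephadm source.

```python
import asyncio
import concurrent.futures
import threading


class CommandFailed(Exception):
    """Hypothetical error type used only in this sketch."""


def get_result(loop, coro, timeout=None):
    """Run coro on a loop owned by another thread and wait for its result."""
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout)
    except concurrent.futures.TimeoutError:
        # Stop the still-running coroutine before reporting the timeout.
        future.cancel()
        raise CommandFailed(f"timed out after {timeout} seconds")
    except concurrent.futures.CancelledError:
        # On Python 3.8 and later CancelledError derives from BaseException,
        # so a plain "except Exception" would not catch it.
        raise CommandFailed("command was cancelled")


async def ok_command():
    return "ok"


async def cancelled_command():
    # Stand-in for an asynchronous command that ends up cancelled.
    raise asyncio.CancelledError()


def demo():
    loop = asyncio.new_event_loop()
    worker = threading.Thread(target=loop.run_forever, daemon=True)
    worker.start()
    try:
        results = [get_result(loop, ok_command(), timeout=5)]
        try:
            get_result(loop, cancelled_command(), timeout=5)
        except CommandFailed as e:
            results.append(str(e))
        return results
    finally:
        loop.call_soon_threadsafe(loop.stop)
        worker.join()
        loop.close()
```

Converting the cancellation into an ordinary exception lets the caller
report a failed command instead of letting the error propagate and take
down the whole module.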

16 months agoMerge pull request #55556 from adk3798/quincy-cephadm-asyncio-timeout-fixup
Adam King [Wed, 13 Mar 2024 14:41:04 +0000 (10:41 -0400)]
Merge pull request #55556 from adk3798/quincy-cephadm-asyncio-timeout-fixup

quincy: mgr/cephadm: fixups for asyncio based timeout

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoqa/cephadm: test --rm-crush-entry host rm flag in host drain test 56081/head
Adam King [Fri, 29 Sep 2023 20:52:37 +0000 (16:52 -0400)]
qa/cephadm: test --rm-crush-entry host rm flag in host drain test

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7870538dc1e19760cd96a3d343ae3d3235f71eb2)

16 months agoqa/cephadm: add teuthology test for host draining
Adam King [Fri, 29 Sep 2023 20:09:48 +0000 (16:09 -0400)]
qa/cephadm: add teuthology test for host draining

This was a gap in our testing in general, but I'm
adding it here right now specifically to use it
to test the "--rm-crush-entry" flag in a follow-up
commit.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 13f16e8d7bb029980d6688680390521253970e9a)

16 months agomgr/cephadm: add --rm-crush-entry flag to host removal
Adam King [Fri, 29 Sep 2023 18:39:10 +0000 (14:39 -0400)]
mgr/cephadm: add --rm-crush-entry flag to host removal

This will tell cephadm to try and remove the
crush bucket for the host at the end of the host
removal process. If this fails, we still consider the
host as having been successfully removed from
cephadm's POV, but the user will get back an error
message telling them we failed to remove the
host from the crush map

Fixes: https://tracker.ceph.com/issues/63031
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit fa0f62aa57755c45c713367620dc834530276b25)

Conflicts:
src/pybind/mgr/cephadm/module.py

16 months agomgr/cephadm: update timestamp on repeat daemon/service events 56080/head
Adam King [Wed, 18 Oct 2023 18:00:05 +0000 (14:00 -0400)]
mgr/cephadm: update timestamp on repeat daemon/service events

If you have a daemon/service event and then an identical
event happens later (e.g. the same daemon is redeployed
multiple times) the events are not updated on the repeat
instances. In cases like this I think it makes more
sense to update the timestamp so users can see the most
recent time the event happened.

Fixes: https://tracker.ceph.com/issues/63238
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 13512cc202c90abd6c5f1e2747d121cc07689d1b)
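
The behavior described above can be sketched roughly like this. The
EventStore class and its methods are hypothetical and only illustrate
the idea of refreshing the timestamp of an identical event; they are
not the cephadm event code.

```python
from datetime import datetime, timezone


class EventStore:
    """Hypothetical store keeping one entry per distinct event."""

    def __init__(self):
        self._events = {}  # (subject, message) -> time the event was last seen

    def record(self, subject, message, when=None):
        # A repeat of an identical event overwrites the stored timestamp
        # instead of being dropped.
        self._events[(subject, message)] = when or datetime.now(timezone.utc)

    def last_seen(self, subject, message):
        return self._events[(subject, message)]

    def __len__(self):
        return len(self._events)
```

Recording the same (subject, message) pair twice leaves a single entry
whose timestamp is that of the most recent occurrence.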

16 months agoMerge pull request #55174 from ronen-fr/wip-64018-quincy
Ronen Friedman [Wed, 13 Mar 2024 12:50:39 +0000 (14:50 +0200)]
Merge pull request #55174 from ronen-fr/wip-64018-quincy

quincy: osd/scrub: increasing max_osd_scrubs to 3

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com>
16 months agoMerge pull request #56134 from zdover23/wip-doc-2024-03-12-backport-56113-to-quincy-2
zdover23 [Wed, 13 Mar 2024 03:45:57 +0000 (13:45 +1000)]
Merge pull request #56134 from zdover23/wip-doc-2024-03-12-backport-56113-to-quincy-2

quincy: doc/cephadm: Improve multiple files

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
16 months agoosd/scrub: increasing max_osd_scrubs to 3 55174/head
Ronen Friedman [Mon, 22 May 2023 15:09:28 +0000 (18:09 +0300)]
osd/scrub: increasing max_osd_scrubs to 3

Bug reports seem to hint that the current default value of
'1' is too low: the cluster is susceptible to scrub scheduling
delays and issues stemming from local software/networking/hardware
problems, even if affecting a very small number of OSDs.

Squid will include a major overhaul of the way scrubs are counted
in the cluster, providing a better solution to the problem. For
now, modifying the default is an effective stop-gap measure.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit cc7b4afda972c144d7ebc679ff7f42d86f1dc493)

16 months agodoc/cephadm: Improve multiple files 56134/head
Anthony D'Atri [Mon, 11 Mar 2024 07:04:47 +0000 (03:04 -0400)]
doc/cephadm: Improve multiple files

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 800dd29e60fcd2bcd27db56d3fe45c58ddf10c8a)

16 months agoMerge pull request #56128 from adk3798/wip-64837-quincy
Nizamudeen A [Tue, 12 Mar 2024 13:22:52 +0000 (18:52 +0530)]
Merge pull request #56128 from adk3798/wip-64837-quincy

quincy: mgr/dashboard: debugging make check failure

16 months agoMerge PR #54374 into quincy
Patrick Donnelly [Tue, 12 Mar 2024 13:17:58 +0000 (09:17 -0400)]
Merge PR #54374 into quincy

* refs/pull/54374/head:
common: resolve config proxy deadlock using refcounted pointers
common: add missing locks in config_proxy methods
common/ceph_mutex: note whether mutex debug methods are usable
qa: add reproducer for obs removal deadlock
qa: narrow search to debug_asok

Reviewed-by: Laura Flores <lflores@redhat.com>
16 months agomgr/dashboard: debugging make check failure 56128/head
Nizamudeen A [Mon, 4 Mar 2024 12:52:48 +0000 (18:22 +0530)]
mgr/dashboard: debugging make check failure

Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 958c60d8a74e3c38abe043e7f2cfbe4224cfb411)

16 months agoMerge pull request #55554 from ceph/wip-yuriw-disable-rbd-upgarde-p2p-quincy
Yuri Weinstein [Mon, 11 Mar 2024 15:20:10 +0000 (08:20 -0700)]
Merge pull request #55554 from ceph/wip-yuriw-disable-rbd-upgarde-p2p-quincy

quincy: qa/suites/upgrade/quincy-p2p: run librbd python API tests from quincy tip

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
16 months agoMerge pull request #55425 from petrutlucian94/wip-64293-quincy
Yuri Weinstein [Mon, 11 Mar 2024 15:16:55 +0000 (08:16 -0700)]
Merge pull request #55425 from petrutlucian94/wip-64293-quincy

quincy: msg: update MOSDOp() to use ceph_tid_t instead of long

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agoMerge pull request #56112 from zdover23/wip-doc-2024-03-11-backport-56091-to-quincy
Anthony D'Atri [Mon, 11 Mar 2024 07:06:15 +0000 (03:06 -0400)]
Merge pull request #56112 from zdover23/wip-doc-2024-03-11-backport-56091-to-quincy

quincy: doc/cephadm: improve host-management.rst

16 months agodoc/cephadm: improve host-management.rst 56112/head
Anthony D'Atri [Sun, 10 Mar 2024 19:49:35 +0000 (15:49 -0400)]
doc/cephadm: improve host-management.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 9fb51bb537e8bd9ea60633091acfc48a58262c3b)

16 months agomgr/cephadm: fix rgw spec migration with simple specs 55963/head
Adam King [Mon, 3 Jul 2023 18:33:34 +0000 (14:33 -0400)]
mgr/cephadm: fix rgw spec migration with simple specs

The rgw spec migration code, intended to formalize
the rgw_frontend_type spec option, doesn't work with
simple specs i.e.

service_type: rgw
service_id: rgw.1
service_name: rgw.rgw.1
placement:
  label: rgw

because the migration code assumes there will always be
a "spec" section inside the spec. This is the case for
more involved rgw specs such as

service_type: rgw
service_id: foo
placement:
  label: rgw
  count_per_host: 2
spec:
  rgw_realm: myrealm
  rgw_zone: myzone
  rgw_frontend_type: "beast"
  rgw_frontend_port: 5000

which is what the migration is actually concerned about
(verification of the rgw_frontend_type in these specs).

In the case where the spec is simpler, we should
just leave the spec alone and move on. Unfortunately
the current code assumes the field will always be
there and hits an unhandled KeyError when trying to
migrate the simpler specs. This causes the
cephadm module to crash shortly after starting an
upgrade to a version that includes this migration
and it's very difficult to find the root cause. This
can be worked around by adding fields to the rgw
spec before upgrade so the "spec" field exists in
the spec and the migration works as intended.

This commit fixes the migration in the simple
case as well as adding testing for that case to
both the unit tests and orch/cephadm teuthology
upgrade tests

Fixes: https://tracker.ceph.com/issues/61889
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 1860ef83877c76c51a20aac48036bd9590572cc2)

Conflicts:
qa/suites/orch/cephadm/upgrade/3-upgrade/simple.yaml
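
A minimal sketch of the difference between the two cases, assuming the
spec has already been loaded into a Python dict. The function name and
the set of accepted frontend types are assumptions for illustration and
are not the actual migration code.

```python
# Assumed set of accepted values, for illustration only.
VALID_FRONTEND_TYPES = {"beast", "civetweb"}


def frontend_type_is_valid(spec_json):
    """Check rgw_frontend_type without assuming a "spec" section exists."""
    # spec_json["spec"] would raise KeyError for a simple spec, so fall
    # back to an empty dict when the section is missing.
    inner = spec_json.get("spec") or {}
    frontend_type = inner.get("rgw_frontend_type")
    if frontend_type is None:
        # Nothing to verify: leave the spec alone and move on.
        return True
    return frontend_type in VALID_FRONTEND_TYPES


simple_spec = {
    "service_type": "rgw",
    "service_id": "rgw.1",
    "service_name": "rgw.rgw.1",
    "placement": {"label": "rgw"},
}

involved_spec = {
    "service_type": "rgw",
    "service_id": "foo",
    "placement": {"label": "rgw", "count_per_host": 2},
    "spec": {
        "rgw_realm": "myrealm",
        "rgw_zone": "myzone",
        "rgw_frontend_type": "beast",
        "rgw_frontend_port": 5000,
    },
}
```

Both example specs from the message above pass this check; only a spec
whose "spec" section carries an unknown rgw_frontend_type is flagged.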

16 months agomgr/cephadm: fix placement with label and host pattern 56088/head
Adam King [Wed, 14 Feb 2024 16:28:11 +0000 (11:28 -0500)]
mgr/cephadm: fix placement with label and host pattern

Previously, when both the label and host pattern were
provided, only the label was actually used for the placement

Fixes: https://tracker.ceph.com/issues/64428
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 106f34ba31c82dd87f4c3f9ad82d8ace81e6c689)
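
As a rough illustration, and assuming the label and the host pattern are
meant to act as two filters that must both match, the selection could
look like the sketch below. filter_hosts() and the shape of the hosts
mapping are hypothetical and not the cephadm scheduler code.

```python
from fnmatch import fnmatch


def filter_hosts(hosts, label=None, host_pattern=None):
    """Return hostnames matching every placement criterion that was given.

    hosts maps each hostname to the set of labels on that host.
    """
    selected = []
    for name, labels in hosts.items():
        if label is not None and label not in labels:
            continue
        # Previously only the label was honoured; here the pattern
        # narrows the label match further.
        if host_pattern is not None and not fnmatch(name, host_pattern):
            continue
        selected.append(name)
    return selected
```

With hosts node1 and node2 labelled rgw, other1 labelled rgw and node3
unlabelled, a placement of label rgw plus pattern node* selects only
node1 and node2.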

16 months agocephadm: remove restriction for crush device classes 56087/head
Seena Fallah [Sun, 11 Feb 2024 21:50:05 +0000 (22:50 +0100)]
cephadm: remove restriction for crush device classes

A restriction introduced in https://github.com/ceph/ceph/commit/6c6cb2f5130dbcf8e42cf03666173948411fc92b prevents OSDs from being created with custom CRUSH device classes.
The CRUSH device class is the key that lets CRUSH distinguish between multiple storage classes, so it must accept any custom name.

Fixes: https://tracker.ceph.com/issues/64382
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 5999196f37bc5cb12de26d5f0aa077229e3ffc42)

Conflicts:
src/python-common/ceph/deployment/translate.py

16 months agopython-common/drive_selection: fix limit with existing devices 56085/head
Adam King [Mon, 27 Nov 2023 20:04:42 +0000 (15:04 -0500)]
python-common/drive_selection: fix limit with existing devices

When devices have already been used for OSDs, they are still
allowed to pass filtering as they are still needed for the
resulting ceph-volume lvm batch command. This was causing an
issue with limit however. Limit adds the devices we've found
that match the filter and existing OSD daemons tied to the spec.
This allows double counting of devices that have been used for
OSDs, as they're counted in terms of being an existing device
and that they match the filter. To avoid this issue, devices
should only be counted towards the limit if they are not already
part of an OSD.

An additional note: The limit feature is only applied for
data devices, so there is no need to worry about the effect
of this change on selection of db, wal, or journal devices.
Also, we would still want to not count these devices if they
did end up passing the data device filter but had been used
for a db/wal/journal device previously.

Fixes: https://tracker.ceph.com/issues/63525
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d3f1a0e1c0b98b9f1251837ecc8edc367e590dad)
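
The counting rule described above can be sketched as follows. Device,
select_devices() and the existing_daemons parameter are hypothetical
names for this example and do not reproduce the DriveSelection code.

```python
from dataclasses import dataclass


@dataclass
class Device:
    path: str
    in_osd: bool  # True if the device is already part of an OSD


def select_devices(candidates, limit, existing_daemons):
    """Pick data devices, counting each OSD towards the limit only once."""
    selected = []
    new_count = 0
    for dev in candidates:
        if dev.in_osd:
            # Still passed through for the resulting batch command, but
            # already accounted for by existing_daemons.
            selected.append(dev)
            continue
        if limit and existing_daemons + new_count >= limit:
            continue
        selected.append(dev)
        new_count += 1
    return selected
```

With two devices already used by OSDs, two unused devices and a limit
of 3, exactly one new device is selected. Counting the used devices a
second time would have reached the limit before any new device could be
picked.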

16 months agopython-common: fix osdspec_affinity check 56084/head
Guillaume Abrioux [Tue, 5 Dec 2023 16:58:07 +0000 (17:58 +0100)]
python-common: fix osdspec_affinity check

When no `service_id` is provided to service spec (osd) it results in
OSDs created with "osdspec_affinity" attribute set to a string
containing "None".

The DriveSelection class relies on the comparison of the actual
value of this attribute with the value of the service_id which has
the python type `None` in that case.

If any existing deployments were created without the service_id
attribute, we now have to support this case and make sure the check
won't filter out devices unexpectedly.

Fixes: https://tracker.ceph.com/issues/63729
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit c68b5af0fb639fccc89d26606c7924c6834bf606)
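
A small sketch of a comparison that tolerates this case. The function
name is hypothetical and the exact form of the real check is not shown
in this message, so treat it as an illustration only.

```python
def affinity_matches(osdspec_affinity, service_id):
    """Compare an OSD's stored affinity with the spec's service_id."""
    if service_id is None:
        # OSDs deployed without a service_id carry the string "None",
        # which never compares equal to the Python value None.
        return osdspec_affinity in (None, "None")
    return osdspec_affinity == service_id
```

A naive osdspec_affinity == service_id test returns False for "None"
against None, which is how devices could get filtered out unexpectedly.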

16 months agomgr/cephadm: fix reweighting of OSD when OSD removal is stopped 56083/head
Adam King [Tue, 7 Nov 2023 20:49:57 +0000 (15:49 -0500)]
mgr/cephadm: fix reweighting of OSD when OSD removal is stopped

Previously, when you ran "ceph orch osd rm stop <osd-id>"
cephadm would pass in a new OSD object to the removal
queue that would not have any of the fields set previously
for the OSD. This was mostly fine when removing it from
the queue as those fields were no longer needed, but an
exception was the initial weight, which you need if
you want to set the weight back when you stop removal.

This patch changes it so it will now remove the actual
OSD object the removal queue stores so that we will
get to use the previously set original weight. It also
changes when we grab the original weight to make it
happen earlier and adds it to the to_json so it survives
any potential mgr failovers.

Fixes: https://tracker.ceph.com/issues/63481
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 99fc4a8d406291b65a53f157442bc54bc67e8b0d)
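
The idea can be sketched as below, assuming a removal queue that stores
OSD objects and serializes them to JSON. QueuedOSD and RemovalQueue are
hypothetical stand-ins for this example, not the cephadm classes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class QueuedOSD:
    osd_id: int
    original_weight: Optional[float] = None

    def to_json(self):
        # Including original_weight lets it survive a mgr failover.
        return asdict(self)

    @classmethod
    def from_json(cls, data):
        return cls(**data)


class RemovalQueue:
    def __init__(self):
        self._osds = {}

    def enqueue(self, osd):
        self._osds[osd.osd_id] = osd

    def stop(self, osd_id):
        # Return the stored object rather than a freshly built one, so the
        # weight recorded at enqueue time is available for reweighting.
        return self._osds.pop(osd_id)

    def dump(self):
        return json.dumps([o.to_json() for o in self._osds.values()])

    @classmethod
    def load(cls, text):
        queue = cls()
        for data in json.loads(text):
            queue.enqueue(QueuedOSD.from_json(data))
        return queue
```

Stopping the removal of an OSD after a dump and load round trip still
yields the weight that was recorded when the OSD was queued.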

16 months agoceph orch add fails when ipv6 address is surrounded by square brackets. 56079/head
Teoman ONAY [Mon, 3 Jul 2023 14:00:20 +0000 (16:00 +0200)]
ceph orch add fails when ipv6 address is surrounded by square brackets.

fixes: https://tracker.ceph.com/issues/61885
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2153448

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 1ea71bee6197ed0357b586498a43d9d726160a43)
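
One way to accept a bracketed address is to strip the brackets before
validating it, as in this sketch. normalize_addr() is a hypothetical
helper written for illustration and is not the code of the fix.

```python
import ipaddress


def normalize_addr(addr):
    """Strip surrounding square brackets and normalize IP addresses."""
    if addr.startswith("[") and addr.endswith("]"):
        addr = addr[1:-1]
    try:
        return str(ipaddress.ip_address(addr))
    except ValueError:
        # Not an IP address (for example a hostname): return it unchanged.
        return addr
```

Both "[fd00::1]" and "fd00::1" then resolve to the same address, while
IPv4 addresses and hostnames pass through untouched.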

16 months agoMerge pull request #55975 from adk3798/wip-62800-quincy
Adam King [Sun, 10 Mar 2024 18:34:45 +0000 (14:34 -0400)]
Merge pull request #55975 from adk3798/wip-62800-quincy

quincy: cephadm: run tcmu-runner through script to do restart on failure

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55974 from adk3798/wip-62796-quincy
Adam King [Sun, 10 Mar 2024 18:26:47 +0000 (14:26 -0400)]
Merge pull request #55974 from adk3798/wip-62796-quincy

quincy: mgr/cephadm: don't use image tag in orch upgrade ls

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55966 from adk3798/wip-62468-quincy
Adam King [Sun, 10 Mar 2024 18:22:13 +0000 (14:22 -0400)]
Merge pull request #55966 from adk3798/wip-62468-quincy

quincy: cephadm: add tcmu-runner to logrotate config

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55965 from adk3798/wip-62461-quincy
Adam King [Sun, 10 Mar 2024 18:21:13 +0000 (14:21 -0400)]
Merge pull request #55965 from adk3798/wip-62461-quincy

quincy: cephadm: support for CA signed keys

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55964 from adk3798/wip-61965-quincy
Adam King [Sun, 10 Mar 2024 18:19:36 +0000 (14:19 -0400)]
Merge pull request #55964 from adk3798/wip-61965-quincy

quincy: mgr/cephadm: add is_host_<status> functions to HostCache

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55962 from adk3798/wip-61685-quincy
Adam King [Sun, 10 Mar 2024 18:18:13 +0000 (14:18 -0400)]
Merge pull request #55962 from adk3798/wip-61685-quincy

quincy: python-common/drive_group: handle fields outside of 'spec' even when 'spec' is provided

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55961 from adk3798/wip-61682-quincy
Adam King [Sun, 10 Mar 2024 18:00:02 +0000 (14:00 -0400)]
Merge pull request #55961 from adk3798/wip-61682-quincy

quincy: python-common/drive_selection: lower log level of limit policy message

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55959 from adk3798/wip-61543-quincy
Adam King [Sun, 10 Mar 2024 17:58:49 +0000 (13:58 -0400)]
Merge pull request #55959 from adk3798/wip-61543-quincy

quincy: cephadm: Adding support to configure public_network cfg section

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55958 from adk3798/quincy-test-custom-config
Adam King [Sun, 10 Mar 2024 17:57:56 +0000 (13:57 -0400)]
Merge pull request #55958 from adk3798/quincy-test-custom-config

quincy: qa/cephadm: testing for extra daemon/container features

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56074 from zdover23/wip-doc-2024-03-09-backport-56068-to-quincy
zdover23 [Sat, 9 Mar 2024 13:39:11 +0000 (23:39 +1000)]
Merge pull request #56074 from zdover23/wip-doc-2024-03-09-backport-56068-to-quincy

quincy: doc/glossary: add "Crimson" entry

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>
16 months agodoc/glossary: add "Crimson" entry 56074/head
Zac Dover [Fri, 8 Mar 2024 17:17:59 +0000 (03:17 +1000)]
doc/glossary: add "Crimson" entry

Add a "Crimson" entry to the glossary.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit b31e061cc087b31a7e9e841dd21e7403a2197378)

16 months agoMerge pull request #56042 from zdover23/wip-doc-2024-03-08-backport-56010-to-quincy
zdover23 [Sat, 9 Mar 2024 04:32:14 +0000 (14:32 +1000)]
Merge pull request #56042 from zdover23/wip-doc-2024-03-08-backport-56010-to-quincy

quincy: doc/start: add Slack invite link

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
16 months agoMerge pull request #56058 from zdover23/wip-doc-2024-03-08-backport-56045-to-quincy
Anthony D'Atri [Fri, 8 Mar 2024 16:00:09 +0000 (11:00 -0500)]
Merge pull request #56058 from zdover23/wip-doc-2024-03-08-backport-56045-to-quincy

quincy: doc/rados: restore PGcalc tool

16 months agodoc/rados: restore PGcalc tool 56058/head
Zac Dover [Thu, 7 Mar 2024 17:29:50 +0000 (03:29 +1000)]
doc/rados: restore PGcalc tool

Restore the PGcalc tool to the documentation suite.

Co-authored-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit eaaf72253123de1a66f163f651046817faa97a1a)

16 months agodoc/start: add Slack invite link 56042/head
Zac Dover [Thu, 7 Mar 2024 03:01:47 +0000 (13:01 +1000)]
doc/start: add Slack invite link

Add a link to the ceph-storage Slack invitation page. Previously the
link went to a plain old "this is the ceph-storage Slack" page that did
not direct the reader to sign up.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit dee319e61204b2ee9ac13562c2c7075ef0f2ea4b)

16 months agoMerge pull request #56013 from zdover23/wip-doc-2024-03-07-backport-55995-to-quincy
Anthony D'Atri [Thu, 7 Mar 2024 15:35:06 +0000 (10:35 -0500)]
Merge pull request #56013 from zdover23/wip-doc-2024-03-07-backport-55995-to-quincy

quincy: doc/architecture: correct typo

16 months agodoc/architecture: correct typo 56013/head
Zac Dover [Wed, 6 Mar 2024 11:40:10 +0000 (21:40 +1000)]
doc/architecture: correct typo

s/client/clients/ where necessary, and add a link to the glossary.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit ae08855cf870173dce2a47a28f3bbb22e7ae0ca2)

16 months agocephadm: make custom_configs work for tcmu-runner container 53425/head
Adam King [Mon, 21 Aug 2023 17:48:56 +0000 (13:48 -0400)]
cephadm: make custom_configs work for tcmu-runner container

This is intended to be a temporary workaround that
allows custom config files to be mounted into
the tcmu-runner container. The hope is to refactor
cephadm's iscsi handling for squid, but a patch
like this could be useful for iscsi in older
releases, where custom config files are currently
unusable for the tcmu-runner container.

What this patch actually does is have us write the
custom config files to a dir for the tcmu-runner
container so that the rest of the logic works without
change. I thought this would be easier to remove later
than a patch that integrates more with the container
mounts or general deployment

The use case in mind is something like

service_type: iscsi
service_id: foo
service_name: iscsi.foo
placement:
  hosts:
  - host1
custom_configs:
  -  mount_path: /etc/tcmu/tcmu.conf
     content: |
       log_level = 4
spec:
  api_password: admin
  api_port: 5000
  api_user: admin
  pool: foo

which would allow users to modify the logging of the
tcmu-runner container for debugging purposes

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit de92392708bf456bba975cc18b3138035d79ae05)

16 months agocephadm: run tcmu-runner through script to do restart on failure 55975/head
Adam King [Tue, 13 Jun 2023 23:54:30 +0000 (19:54 -0400)]
cephadm: run tcmu-runner through script to do restart on failure

Currently, cephadm runs tcmu-runner as a background
process inside the unit file deployed for iscsi
(rbd-target-api is the primary process). This means
if tcmu-runner crashes for whatever reason, systemd
will not attempt to restart it. This commit sets
up a script to serve as the container entrypoint
for the tcmu-runner container that will run
tcmu-runner and also restart it on failure
(unless there are too many failures in a short
period, at which point it gives up).
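
As a rough illustration of the restart policy described above (not the
actual entrypoint script, which is not reproduced here), the logic can be
sketched in Python. The function name, the failure limit and the window
length are all assumptions made for the example:

```python
import time
from collections import deque
from typing import Callable, Deque


def supervise(run_once: Callable[[], int],
              max_failures: int = 5,
              window_seconds: float = 60.0,
              clock: Callable[[], float] = time.monotonic) -> int:
    """Re-run `run_once` until it exits cleanly or fails too often.

    `run_once` stands in for launching the supervised process and returning
    its exit code. The limits are hypothetical defaults, not the values
    used by cephadm. Returns the number of failures seen before stopping.
    """
    recent: Deque[float] = deque()
    failures = 0
    while True:
        if run_once() == 0:
            return failures          # clean exit: nothing to restart
        failures += 1
        now = clock()
        recent.append(now)
        # Forget failures that fall outside the sliding window.
        while recent and now - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) >= max_failures:
            return failures          # too many failures in a short period
```

Failures that are spread far apart never fill the window, so the process
keeps being restarted; only a burst of failures makes the loop give up.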

The hope is to eventually drop use of this script
for a better solution in squid onward, but this
should be helpful on older releases (quincy and
pacific at least) where we won't be able to
bring that better solution

Fixes: https://tracker.ceph.com/issues/61667
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 47eb6b3f62afe993073429b02051ae0343d7aea3)

Conflicts:
src/cephadm/tests/test_cephadm.py

16 months agocephadm: Fix extra_container_args for iSCSI
Raimund Sacherer [Fri, 26 May 2023 15:52:57 +0000 (17:52 +0200)]
cephadm: Fix extra_container_args for iSCSI

extra_container_args were only applied to the rbd_target_api container and not to the
tcmu-runner container.

Signed-off-by: Raimund Sacherer <rsachere@redhat.com>
(cherry picked from commit ad60fc3db644b8bf44a582e79888e2fb15d7ce3a)

Conflicts:
src/cephadm/cephadm

16 months agoModify how Iscsi tcmu-runner container is started within systemd
Teoman ONAY [Tue, 31 May 2022 08:34:05 +0000 (10:34 +0200)]
Modify how Iscsi tcmu-runner container is started within systemd

Modify Iscsi tcmu-runner container to be run daemonized in the same
systemd slice as all other ceph processes

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 79c51938e88dc903f03faa42a94070ff8874a7fa)

16 months agoqa: test_iscsi_pids_limit.sh: increase sleep time
Ilya Dryomov [Mon, 11 Apr 2022 10:45:02 +0000 (12:45 +0200)]
qa: test_iscsi_pids_limit.sh: increase sleep time

It could take longer than 30 seconds to fork off 40000 processes on
a busy system.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit a23b9c99387858fb5299a42e7b79f319ece03641)

16 months agomgr/cephadm: don't use image tag in orch upgrade ls 55974/head
Adam King [Fri, 1 Sep 2023 13:05:04 +0000 (09:05 -0400)]
mgr/cephadm: don't use image tag in orch upgrade ls

Using the tag seems to screw up the auth URL generated
and is unnecessary since we're trying to get a list
of tags for the image anyway.
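
A minimal sketch of dropping the tag from an image reference before
querying the registry, assuming a plain string-based approach. The helper
name strip_tag is hypothetical and this is not the code from the patch:

```python
def strip_tag(image: str) -> str:
    """Return the image reference without a trailing ':tag' or '@digest'.

    A colon only counts as a tag separator when it appears after the last
    '/', so a registry port such as 'localhost:5000' is left alone.
    """
    # Drop a digest suffix first (e.g. '@sha256:...').
    image = image.split('@', 1)[0]
    last_slash = image.rfind('/')
    last_colon = image.rfind(':')
    if last_colon > last_slash:
        return image[:last_colon]
    return image
```

For example, strip_tag('quay.io/ceph/ceph:v17.2.7') gives
'quay.io/ceph/ceph', while 'localhost:5000/ceph/ceph' is returned unchanged.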

Fixes: https://tracker.ceph.com/issues/62679
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 0f4426d3a085fc2f5137593029469c3b20ace77e)

16 months agomgr/cephadm: fix haproxy monitoring endpoint 55970/head
Redouane Kachach [Mon, 14 Nov 2022 17:49:42 +0000 (18:49 +0100)]
mgr/cephadm: fix haproxy monitoring endpoint
Fixes: https://tracker.ceph.com/issues/58021
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 9636d9467f7bbcbda31da7809065000c49326dca)

Conflicts:
src/pybind/mgr/cephadm/tests/test_services.py

16 months agomgr/cephadm: filter hosts that can't support VIP for ingress
Adam King [Tue, 1 Aug 2023 21:43:36 +0000 (17:43 -0400)]
mgr/cephadm: filter hosts that can't support VIP for ingress

Keepalive daemons need the host to have an interface
on which they can set up their VIP. If a host
does not have any interface that can work, we should
filter it out
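
To illustrate the idea, here is a sketch of such a filter using Python's
ipaddress module. The function name and the shape of the host_networks
mapping (hostname to a list of subnets the host has an interface on) are
assumptions for the example, not the data structures cephadm uses:

```python
import ipaddress
from typing import Dict, List


def hosts_supporting_vip(vip: str,
                         host_networks: Dict[str, List[str]]) -> List[str]:
    """Keep only hosts with at least one interface subnet containing the VIP.

    `vip` may be given with or without a prefix length, e.g. '10.0.0.100/24'.
    """
    addr = ipaddress.ip_interface(vip).ip
    return [
        host for host, subnets in host_networks.items()
        # Membership is False when address and network versions differ.
        if any(addr in ipaddress.ip_network(s) for s in subnets)
    ]
```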

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 17bc76f5bb6b3ef8c962ce31a80c5a3a43b5bdd2)

Conflicts:
src/pybind/mgr/cephadm/serve.py
src/pybind/mgr/cephadm/tests/test_services.py

16 months agomgr/cephadm: select IPs/interface based on VIP for keepalive conf
Adam King [Tue, 1 Aug 2023 20:32:06 +0000 (16:32 -0400)]
mgr/cephadm: select IPs/interface based on VIP for keepalive conf

We need to make sure the keepalive conf sets
the unicast src and peer IPs to be the ones
in the same subnet as the VIP we're setting up,
as well as specify the correct interface. Otherwise,
the keepalive daemons don't speak to each other
properly and all end up going into MASTER state.
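
A sketch of the selection described above, assuming each host is known by
a list of its IP addresses. Both function names and the ips_by_host
mapping are hypothetical and only meant to show the subnet check:

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


def pick_ip_in_vip_subnet(vip: str, host_ips: List[str]) -> Optional[str]:
    """Return the first host IP that lies in the VIP's subnet, if any."""
    subnet = ipaddress.ip_interface(vip).network
    for ip in host_ips:
        if ipaddress.ip_address(ip) in subnet:
            return ip
    return None


def unicast_settings(vip: str, this_host: str,
                     ips_by_host: Dict[str, List[str]]
                     ) -> Tuple[Optional[str], List[str]]:
    """Choose the unicast source IP for this host and the peer IPs.

    Hosts with no IP in the VIP's subnet contribute no peer entry.
    """
    src = pick_ip_in_vip_subnet(vip, ips_by_host[this_host])
    peers = []
    for host, ips in ips_by_host.items():
        if host == this_host:
            continue
        peer_ip = pick_ip_in_vip_subnet(vip, ips)
        if peer_ip is not None:
            peers.append(peer_ip)
    return src, peers
```

With a VIP of 10.0.0.100/24, a host that owns both 192.168.1.10 and
10.0.0.10 would use 10.0.0.10 as its source address rather than whichever
address happens to be listed first.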

Fixes: https://tracker.ceph.com/issues/62276
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 58ddc4e20f7cead1f2594241450f4beb5230c746)

Conflicts:
src/pybind/mgr/cephadm/tests/test_services.py

16 months agocephadm: add tcmu-runner to logrotate config 55966/head
Adam King [Fri, 2 Jun 2023 00:06:35 +0000 (20:06 -0400)]
cephadm: add tcmu-runner to logrotate config

This process could be used to set up the tcmu-runner
to log to a file much like other ceph daemons

- create /etc/tcmu directory
- create /etc/tcmu/tcmu.conf file with default options
- change dir to /var/log
- change log level to 4
- add -v /etc/tcmu:/etc/tcmu to tcmu-runner container podman line in unit.run

In order to support this (mostly for debugging) we should
add tcmu-runner to the logrotate config

Fixes: https://tracker.ceph.com/issues/61571
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d5d40e07cae8a1d6a94029c4354d146b0baa3971)

16 months agoqa/cephadm: add test for ca signed keys 55965/head
Adam King [Fri, 7 Jul 2023 15:03:56 +0000 (11:03 -0400)]
qa/cephadm: add test for ca signed keys

Test that bootstraps with a CA signed key using
the use_ca_signed_key cephadm override. Then follows
up by doing a check-host on each host which verifies
the cephadm mgr module can reach and authenticate with
the nodes using the new key setup.

This probably should really be a workunit, but
I didn't want to create a full new section for
this test and I needed a section that didn't
already run the cephadm task for every test. I could
see this being moved into some sort of
"test_special_deployment_scenarios" section in the future

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 141af1c6b156da34418100629cd1407b74c681ad)

16 months agoqa/cephadm: add ca signed key to cephadm task
Adam King [Fri, 7 Jul 2023 14:36:39 +0000 (10:36 -0400)]
qa/cephadm: add ca signed key to cephadm task

To allow bootstrapping a cluster using a CA signed
key instead of the standard pubkey authentication.
Will allow explicit testing of this as we add support
for it

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit bef9617c51b426b1479c374f2055f34e3fe20ed1)

Conflicts:
qa/tasks/cephadm.py

16 months agodoc/cephadm: document setting up CA signed keys in running cluster
Adam King [Sat, 3 Jun 2023 19:42:19 +0000 (15:42 -0400)]
doc/cephadm: document setting up CA signed keys in running cluster

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 2c837ea9cff44d6199ef68c03307e7ff3104adcf)

16 months agodoc/cephadm: document bootstrapping with CA signed keys
Adam King [Sat, 3 Jun 2023 19:28:05 +0000 (15:28 -0400)]
doc/cephadm: document bootstrapping with CA signed keys

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 6b4d9b4427608cb5d1c6e1b3fc958fba3ce0c22d)

16 months agodoc/cephadm: document how to pass self made SSH key pairs to bootstrap
Adam King [Sat, 3 Jun 2023 18:39:05 +0000 (14:39 -0400)]
doc/cephadm: document how to pass self made SSH key pairs to bootstrap

This didn't seem to exist in the install section of
the cephadm docs. Wanted to add it in before adding
documentation for bootstrapping with CA signed keys.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit e09a3765476eedae28905b51b666bee92c6fcf8e)

16 months agomgr/cephadm: add support for CA signed SSH keys setups
Adam King [Sat, 3 Jun 2023 17:31:58 +0000 (13:31 -0400)]
mgr/cephadm: add support for CA signed SSH keys setups

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 50f74d6063b820230f57608a1f800dc2507a3e1f)

16 months agomgr/cephadm: add is_host_<status> functions to HostCache 55964/head
Adam King [Thu, 1 Jun 2023 23:23:45 +0000 (19:23 -0400)]
mgr/cephadm: add is_host_<status> functions to HostCache

A bunch of places were doing list comprehension to see if a host
was unreachable/draining/schedulable by hostname. This is meant to
replace all those instances of list comprehension with a function
call that does the same
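
The shape of the change can be sketched as follows. The Host and
HostCacheSketch classes, the status strings and the method bodies are
simplified assumptions for illustration and do not mirror the real
HostCache:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Host:
    hostname: str
    status: str = ''   # hypothetical values: '', 'draining', 'unreachable'


class HostCacheSketch:
    """Toy stand-in for a host cache that answers per-hostname questions."""

    def __init__(self, hosts: Iterable[Host]) -> None:
        self._hosts: List[Host] = list(hosts)

    def get_draining_hosts(self) -> List[Host]:
        return [h for h in self._hosts if h.status == 'draining']

    def get_unreachable_hosts(self) -> List[Host]:
        return [h for h in self._hosts if h.status == 'unreachable']

    # Callers previously repeated this hostname filtering inline.
    def is_host_draining(self, hostname: str) -> bool:
        return hostname in {h.hostname for h in self.get_draining_hosts()}

    def is_host_unreachable(self, hostname: str) -> bool:
        return hostname in {h.hostname for h in self.get_unreachable_hosts()}
```

A caller then writes cache.is_host_draining('host2') instead of building
and searching a list of hostnames at every call site.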

Fixes: https://tracker.ceph.com/issues/61548
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit b4002529b6ec4fae83ae3958f9fc288b22106f90)

16 months agomgr/cephadm: Adding extra arguments support for RGW frontend
Redouane Kachach [Wed, 26 Oct 2022 09:33:38 +0000 (11:33 +0200)]
mgr/cephadm: Adding extra arguments support for RGW frontend
Fixes: https://tracker.ceph.com/issues/57931
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 2c46c0741962e0e6a5ddbc960dfd21948daf0947)

16 months agopython-common/drive_group: handle fields outside of 'spec' even when 'spec' is provided 55962/head
Adam King [Wed, 31 May 2023 17:08:35 +0000 (13:08 -0400)]
python-common/drive_group: handle fields outside of 'spec' even when 'spec' is provided

Otherwise certain specs such as

service_type: osd
service_id: xxx
service_name: osd.xxx
placement:
  hosts:
  - vm-00
spec:
  osds_per_device: 2
data_devices:
  paths:
  - /dev/vde

fail to apply with

Error EINVAL: ServiceSpec: 'dict' object has no attribute 'validate'

which is not a useful error message. This is caused by the
spec assuming all osd specific fields are either defined
in the 'spec' section or outside of it, but not mixed in.
We could also just consider these specs to be invalid
and just raise a better error message, but it seems easier
to make the minor adjustment for it to work, given there doesn't
seem to be an issue with mixing the styles for specs for
other service types.
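
One way to picture the adjustment is a normalization step that merges the
fields under 'spec' with the fields outside of it before validation. This
is a sketch only: the helper name is hypothetical, and rejecting a field
that appears in both places is an assumption, not necessarily what the
patch does:

```python
from typing import Any, Dict


def flatten_osd_spec(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Merge fields nested under 'spec' with the top-level fields."""
    merged = {k: v for k, v in raw.items() if k != 'spec'}
    for key, value in (raw.get('spec') or {}).items():
        if key in merged:
            # Assumption: treat a field given in both places as an error.
            raise ValueError(f"'{key}' is set both inside and outside 'spec'")
        merged[key] = value
    return merged
```

Applied to the spec above, the result carries both osds_per_device and
data_devices at the same level, so neither style hides the other.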

Fixes: https://tracker.ceph.com/issues/61533
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 12901f617d9f21dcc4de9b039b7ab6484fbc99ca)

16 months agopython-common/drive_selection: lower log level of limit policy message 55961/head
Adam King [Mon, 5 Jun 2023 17:18:06 +0000 (13:18 -0400)]
python-common/drive_selection: lower log level of limit policy message

This gets logged every time cephadm tries to apply a
relevant OSD spec and ends up spamming the logs. There's no reason
we really need this to be at info rather than debug level,
so let's lower it.

Fixes: https://tracker.ceph.com/issues/61592
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 20478025696b4ef39ca3e50c7cbfdb1250f85cfd)

16 months agocephadm: use enum for tracking redeploy/reconfig 55960/head
Adam King [Wed, 31 May 2023 23:38:38 +0000 (19:38 -0400)]
cephadm: use enum for tracking redeploy/reconfig

Since the options are mutually exclusive, using
an enum is preferable to having multiple bools
to track each of them
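
A small sketch of why an enum fits here. The DeployType name, its members
and the conversion helper are illustrative assumptions rather than the
identifiers used in cephadm:

```python
from enum import Enum


class DeployType(Enum):
    # Exactly one of these applies to any given deployment.
    DEFAULT = 'deploy'
    REDEPLOY = 'redeploy'
    RECONFIG = 'reconfig'


def deployment_type(redeploy: bool, reconfig: bool) -> DeployType:
    """Collapse two mutually exclusive flags into a single enum value."""
    if redeploy and reconfig:
        raise ValueError('redeploy and reconfig are mutually exclusive')
    if redeploy:
        return DeployType.REDEPLOY
    if reconfig:
        return DeployType.RECONFIG
    return DeployType.DEFAULT
```

With two bools the invalid combination (both set) has to be guarded
against wherever the flags are read; with the enum it cannot be
represented once the value has been constructed.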

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7081759d48f4e9f21a6482c2f32446d9b1f895ea)

Conflicts:
src/cephadm/cephadm

16 months agocephadm: open ports in firewall when adopting monitoring stack daemons
Adam King [Thu, 13 Apr 2023 17:54:00 +0000 (13:54 -0400)]
cephadm: open ports in firewall when adopting monitoring stack daemons

Otherwise we risk prometheus/alertmanager/grafana
not functioning properly after adoption because the necessary
port in the firewall is not open.

Fixes: https://tracker.ceph.com/issues/59443
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 40a98174ccab080fd540e05b6adadcf82e9c2a78)

16 months agocephadm: still try to open ports in firewall on redeploy/reconfig
Adam King [Thu, 13 Apr 2023 17:05:11 +0000 (13:05 -0400)]
cephadm: still try to open ports in firewall on redeploy/reconfig

Prior to this patch we were discarding the provided
ports on reconfig and redeploy in order to not fail
thinking there was a port conflict with the instance
of the daemon we were about to reconfig/redeploy. However,
it's still desirable for us to make sure the firewall ports
are open when we do a reconfig/redeploy, so this refactors
the port handling approach to have it do that but
still avoid checking for port conflicts. It also includes
an update of the type signature of deploy_daemon
to the py3 style. That wasn't needed for the change
but since I was adding an argument there I thought we might
as well do it now.
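
The intended behaviour can be sketched as below. Everything here is an
assumption made for illustration (the DeployType enum, the prepare_ports
helper and its two callbacks); it is not the deploy_daemon code itself:

```python
from enum import Enum
from typing import Callable, Iterable


class DeployType(Enum):
    DEFAULT = 'deploy'
    REDEPLOY = 'redeploy'
    RECONFIG = 'reconfig'


def prepare_ports(ports: Iterable[int],
                  deploy_type: DeployType,
                  port_in_use: Callable[[int], bool],
                  open_firewall_port: Callable[[int], None]) -> None:
    """Check for conflicts only on a fresh deploy, but always open ports."""
    ports = list(ports)
    if deploy_type is DeployType.DEFAULT:
        # On reconfig/redeploy the existing daemon instance already holds
        # the ports, so a conflict check would report a false positive.
        conflicts = [p for p in ports if port_in_use(p)]
        if conflicts:
            raise RuntimeError(f'ports already in use: {conflicts}')
    for port in ports:
        open_firewall_port(port)
```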

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit fdecd66f1306d3bf60780dbd44c9cb8e63b3892a)

Conflicts:
src/cephadm/cephadm

16 months agocephadm: Adding support to configure public_network cfg section 55959/head
Redouane Kachach [Mon, 22 May 2023 09:15:07 +0000 (11:15 +0200)]
cephadm: Adding support to configure public_network cfg section
Fixes: https://tracker.ceph.com/issues/61330
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 50811d114ec91a3e5e340f6845870597ea498b35)

16 months agoqa/cephadm: test for extra daemon features 55958/head
Adam King [Mon, 19 Jun 2023 18:24:23 +0000 (14:24 -0400)]
qa/cephadm: test for extra daemon features

Specifically, extra_container_args, extra_entrypoint_args,
and custom_configs.

This also provides testing for the CustomContainer
class which previously had no usage in any
of the teuthology tests

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 031bbbc17fda3c7b28b12d11b976629d8c1297ed)

16 months agomgr/cephadm: add extra_entrypoint_args to mon spec
Adam King [Mon, 19 Jun 2023 20:07:31 +0000 (16:07 -0400)]
mgr/cephadm: add extra_entrypoint_args to mon spec

There was no reason for the mon spec to not include
this option. I believe this was just an oversight caused
by the addition of the mon spec and extra_entrypoint_args
in separate PRs around the same time.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 370836d46475d8daa6b26acd6f5330abb932bfed)

Conflicts:
src/python-common/ceph/deployment/service_spec.py

16 months agomgr/cephadm: add extra_container_args and custom_configs to CustomContainer
Adam King [Mon, 19 Jun 2023 19:46:45 +0000 (15:46 -0400)]
mgr/cephadm: add extra_container_args and custom_configs to CustomContainer

CustomContainer was skipped previously for the extra_container_args
and custom_configs feature as these could already be done
using other fields within the custom container service spec
(the "args" and "files" fields respectively). It seems
desirable for us to allow setting these things for custom
containers the same as for other services for uniformity's sake
and this allows us to use custom containers to test
these features.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit b66892a668d487f4b7ede147eb8855d166c3d1f9)

Conflicts:
src/python-common/ceph/deployment/service_spec.py

16 months agoMerge pull request #55516 from afreen23/wip-64368-quincy
Nizamudeen A [Tue, 5 Mar 2024 08:08:44 +0000 (13:38 +0530)]
Merge pull request #55516 from afreen23/wip-64368-quincy

quincy: mgr/dashboard: fix error while accessing roles tab when policy attached

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>