git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
16 months agocephadm: improve cephadm pull usage message 56292/head
Adam King [Fri, 1 Mar 2024 18:22:44 +0000 (13:22 -0500)]
cephadm: improve cephadm pull usage message

Generally, it's uncommon for users to run this
directly, but in case they need to for debugging
purposes, we should include how to pass the
image to be pulled in the usage message.

Additionally, include that this is only to be used
for pulling ceph images in the help message, as
that isn't necessarily clear. Pulling anything
else will result in a traceback as it tries
to run `ceph --version` inside the container.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 90b8ba9fc09ace4e4114152b194239061c7acb31)

16 months agoMerge pull request #56196 from vshankar/wip-64926-reef
Venky Shankar [Tue, 19 Mar 2024 13:21:44 +0000 (18:51 +0530)]
Merge pull request #56196 from vshankar/wip-64926-reef

reef: mds: disable `defer_client_eviction_on_laggy_osds' by default

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
16 months agoMerge pull request #55762 from ajarr/wip-64554-reef
Ilya Dryomov [Tue, 19 Mar 2024 10:08:05 +0000 (11:08 +0100)]
Merge pull request #55762 from ajarr/wip-64554-reef

reef: qa: Add tests to validate synced images on rbd-mirror

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
16 months agoMerge pull request #56262 from zdover23/wip-doc-2024-03-19-backport-56247-to-reef
Zac Dover [Mon, 18 Mar 2024 17:39:38 +0000 (03:39 +1000)]
Merge pull request #56262 from zdover23/wip-doc-2024-03-19-backport-56247-to-reef

reef: docs/rbd: fix typo in arg name

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agodocs/rbd: fix typo in arg name 56262/head
N Balachandran [Mon, 18 Mar 2024 04:02:39 +0000 (09:32 +0530)]
docs/rbd: fix typo in arg name

Replace "{image-}" with "{image-id}" in the "rbd trash rm"
command description.

Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit f3eb489520fd4fae057e61275d16c6c8fd596f3f)

docs/rbd: replace introspect with inspect

Replace "introspect" with "inspect" in the rbd basic commands
description.

Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit ebf2f60f784728c04d8ec59015d666bafcef8218)

docs/rbd: typo in "retrieving image information"

Replace "for the image" with "of the image".

Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit 4fd5c134536d652ae1f9e05ecf52cb81adb3b850)

16 months agoMerge pull request #56256 from zdover23/wip-doc-2024-03-18-backport-56248-to-reef
Anthony D'Atri [Mon, 18 Mar 2024 14:06:04 +0000 (10:06 -0400)]
Merge pull request #56256 from zdover23/wip-doc-2024-03-18-backport-56248-to-reef

reef: doc/rbd: minor changes to the rbd man page

16 months agodoc/rbd: minor changes to the rbd man page 56256/head
N Balachandran [Mon, 18 Mar 2024 12:22:47 +0000 (17:52 +0530)]
doc/rbd: minor changes to the rbd man page

Fixes typos and grammar for some commands. Adds
additional details for some commands.

Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit 5dcff6a4b8d835fc55e454af977dc5ebad99d37f)

16 months agoMerge pull request #56252 from guits/wip-64932-reef
Guillaume Abrioux [Mon, 18 Mar 2024 10:39:37 +0000 (11:39 +0100)]
Merge pull request #56252 from guits/wip-64932-reef

reef: node-proxy: fix RedFishClient.logout() method

16 months agonode-proxy: support more Location value formats 56252/head
Guillaume Abrioux [Fri, 15 Mar 2024 14:20:29 +0000 (14:20 +0000)]
node-proxy: support more Location value formats

After some tests, it turns out that, depending on the hardware,
the 'Location' header returned by the server after login can differ.
I observed the following:

either:

Location: scheme://address:port/redfish/v1/SessionService/Session

or

Location: /redfish/v1/SessionService/Session

A previous tracker [1] was opened because I thought only the first format existed, which is wrong.

Fixes: https://tracker.ceph.com/issues/64951
[1] https://tracker.ceph.com/issues/64894

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit d7ccf26983c41344a12f33b2a30fc79b65cc548f)
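
The two Location formats described in the commit above can be reduced to a single path with the standard library. This is a minimal sketch, not the node-proxy implementation: `location_to_path` is a hypothetical helper name, and the address in the sample value is made up for illustration.

```python
from urllib.parse import urlsplit


def location_to_path(location: str) -> str:
    """Return only the path component of a Location header value.

    Accepts either an absolute URL (scheme://address:port/path) or a
    bare path (/path). Hypothetical helper, not the node-proxy code.
    """
    parts = urlsplit(location)
    # For a bare path, scheme and netloc are empty and .path is the input.
    return parts.path


# Sample values; the address is an illustrative assumption.
absolute = "https://10.0.0.1:443/redfish/v1/SessionService/Session"
relative = "/redfish/v1/SessionService/Session"

print(location_to_path(absolute))
print(location_to_path(relative))
```

Both calls yield the same path, so a caller that only ever stores the result of such a helper does not need to know which format the hardware returned.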

16 months agonode-proxy: fix RedFishClient.logout() method
Guillaume Abrioux [Wed, 13 Mar 2024 13:32:59 +0000 (13:32 +0000)]
node-proxy: fix RedFishClient.logout() method

The endpoint passed down to util.query() is wrong:
it passes the full URL (scheme://addr:port/path) where it should only
pass the path. The cause is that RedFishClient.login() basically stores
the value of the Location header in `self.location`.

As a consequence, the client is unable to log out properly.

Fixes: https://tracker.ceph.com/issues/64894
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit b1d828d1d2f31c02f225bb375d915353582d158a)

16 months agoMerge pull request #56235 from zdover23/wip-doc-2024-03-16-backport-56182-to-reef
Anthony D'Atri [Sat, 16 Mar 2024 01:32:54 +0000 (21:32 -0400)]
Merge pull request #56235 from zdover23/wip-doc-2024-03-16-backport-56182-to-reef

reef: doc/glossary: add "librados" entry

16 months agodoc/glossary: add "librados" entry 56235/head
Zac Dover [Thu, 14 Mar 2024 06:29:09 +0000 (16:29 +1000)]
doc/glossary: add "librados" entry

Add a "librados" entry to the glossary.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 2a93a8e837a42559f8a81c6fd9274b24f4fdf7f6)

16 months agoMerge pull request #56104 from adk3798/wip-64632-reef
Adam King [Fri, 15 Mar 2024 19:50:25 +0000 (15:50 -0400)]
Merge pull request #56104 from adk3798/wip-64632-reef

reef: doc: adding documentation for secure monitoring stack configuration

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
16 months agoMerge pull request #56103 from adk3798/wip-64629-reef
Adam King [Fri, 15 Mar 2024 19:40:38 +0000 (15:40 -0400)]
Merge pull request #56103 from adk3798/wip-64629-reef

reef: mgr/cephadm: catch CancelledError in asyncio timeout handler

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
16 months agoMerge pull request #56094 from adk3798/wip-63533-reef
Adam King [Fri, 15 Mar 2024 19:39:32 +0000 (15:39 -0400)]
Merge pull request #56094 from adk3798/wip-63533-reef

reef: mgr/cephadm: fix reweighting of OSD when OSD removal is stopped

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56029 from asm0deuz/wip-64698-reef
Adam King [Fri, 15 Mar 2024 19:37:50 +0000 (15:37 -0400)]
Merge pull request #56029 from asm0deuz/wip-64698-reef

reef: mgr/cephadm: Allow idmap overrides in nfs-ganesha configuration

Reviewed-by: Adam King <adking@redhat.com>
16 months agoMerge pull request #55915 from mchangir/wip-64223-reef
Yuri Weinstein [Fri, 15 Mar 2024 13:51:58 +0000 (06:51 -0700)]
Merge pull request #55915 from mchangir/wip-64223-reef

reef: qa: bump up scrub status command timeout

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #55829 from kotreshhr/wip-64582-reef
Yuri Weinstein [Fri, 15 Mar 2024 13:51:25 +0000 (06:51 -0700)]
Merge pull request #55829 from kotreshhr/wip-64582-reef

reef: qa: Fix fs/full suite

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #55746 from lxbsz/wip-64222
Yuri Weinstein [Fri, 15 Mar 2024 13:50:38 +0000 (06:50 -0700)]
Merge pull request #55746 from lxbsz/wip-64222

reef: qa/tasks/cephfs/test_misc: switch duration to timeout

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #55743 from lxbsz/wip-64075
Yuri Weinstein [Fri, 15 Mar 2024 13:50:08 +0000 (06:50 -0700)]
Merge pull request #55743 from lxbsz/wip-64075

reef: mds: just wait the client flushes the snap and dirty buffer

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #55742 from lxbsz/wip-64045
Yuri Weinstein [Fri, 15 Mar 2024 13:49:40 +0000 (06:49 -0700)]
Merge pull request #55742 from lxbsz/wip-64045

reef: mds: use explicitly sized types for network and disk encoding

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #54520 from joscollin/wip-63553-reef
Yuri Weinstein [Fri, 15 Mar 2024 13:48:20 +0000 (06:48 -0700)]
Merge pull request #54520 from joscollin/wip-63553-reef

reef: cephfs-top: include the missing fields in --dump output

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #53893 from rishabh-d-dave/wip-63147-reef
Yuri Weinstein [Fri, 15 Mar 2024 13:47:50 +0000 (06:47 -0700)]
Merge pull request #53893 from rishabh-d-dave/wip-63147-reef

reef: client: append to buffer list to save all result from wildcard command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #52581 from rishabh-d-dave/wip-62026-reef
Yuri Weinstein [Fri, 15 Mar 2024 13:47:03 +0000 (06:47 -0700)]
Merge pull request #52581 from rishabh-d-dave/wip-62026-reef

reef: mds: allow all types of mds caps

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
16 months agoMerge pull request #55716 from cbodley/wip-64540-reef
Casey Bodley [Fri, 15 Mar 2024 13:32:53 +0000 (13:32 +0000)]
Merge pull request #55716 from cbodley/wip-64540-reef

reef: rgw: RGWSI_SysObj_Cache::remove() invalidates after successful delete

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
16 months agoMerge pull request #56208 from zdover23/wip-doc-2024-03-15-backport-56188-to-reef
Zac Dover [Fri, 15 Mar 2024 11:03:37 +0000 (21:03 +1000)]
Merge pull request #56208 from zdover23/wip-doc-2024-03-15-backport-56188-to-reef

reef: doc/rbd: add clone mapping command

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agodoc/rbd: add clone mapping command 56208/head
Zac Dover [Thu, 14 Mar 2024 08:37:23 +0000 (18:37 +1000)]
doc/rbd: add clone mapping command

Add a command that explains how to map a formatted clone when the parent
image and the formatted clone have different encryption types.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d34c1879c8886ec7f19c7a93490c4736ae9a6d20)

16 months agoMerge pull request #55356 from cbodley/wip-64228-reef
Yuri Weinstein [Thu, 14 Mar 2024 19:46:29 +0000 (12:46 -0700)]
Merge pull request #55356 from cbodley/wip-64228-reef

reef: rgw/rest: fix url decode of post params for iam/sts/sns

Reviewed-by: Casey Bodley <cbodley@redhat.com>
16 months agoMerge pull request #56186 from zdover23/wip-doc-2024-03-14-backport-56160-to-reef
Zac Dover [Thu, 14 Mar 2024 19:44:46 +0000 (05:44 +1000)]
Merge pull request #56186 from zdover23/wip-doc-2024-03-14-backport-56160-to-reef

reef: doc/rbd: add map information for clone images to rbd-encryption.rst

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
16 months agoMerge pull request #55197 from rzarzynski/wip-rocksdb-compression-reef
Yuri Weinstein [Thu, 14 Mar 2024 17:06:58 +0000 (10:06 -0700)]
Merge pull request #55197 from rzarzynski/wip-rocksdb-compression-reef

reef: common/options: Set LZ4 compression for bluestore RocksDB.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
16 months agomds: disable `defer_client_eviction_on_laggy_osds' by default 56196/head
Venky Shankar [Mon, 4 Mar 2024 13:23:53 +0000 (18:53 +0530)]
mds: disable `defer_client_eviction_on_laggy_osds' by default

This config can result in a single client preventing the MDS from
servicing other clients: once a client is deferred from eviction due to
laggy OSD(s), a new client's cap acquire request can be blocked
until the laggy client resumes operation, i.e., when
the laggy OSD is no longer considered laggy.

Disable the config by default till the issue is fixed.

Fixes: http://tracker.ceph.com/issues/64685
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 109de8bdab86e1adaad580d9e7322c18fa01bc09)

16 months agoMerge pull request #56161 from zdover23/wip-doc-2024-03-13-backport-54173-to-reef
Zac Dover [Thu, 14 Mar 2024 16:34:19 +0000 (02:34 +1000)]
Merge pull request #56161 from zdover23/wip-doc-2024-03-13-backport-54173-to-reef

reef: doc/dev: backport zipapp docs to reef

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
16 months agoqa/workunits/rbd: switch rbd-mirror workunits to bash 55762/head
Ilya Dryomov [Sat, 9 Mar 2024 21:53:44 +0000 (22:53 +0100)]
qa/workunits/rbd: switch rbd-mirror workunits to bash

By making use of here strings in commit ea3a567f7f03 ("qa/workunits:
make wait_for_status_in_pool_dir() reentrant") we grew a dependency on
bash.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 166a2362378b1ff93e43f483f354c428fd6cef9e)
Signed-off-by: Ramana Raja <rraja@redhat.com>
Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh
        -  Commit 3fd8a03887354 not backported
           "qa/workunits/rbd: merge journal and snapshot test scripts"

16 months agoqa: Add tests to validate syncing of images using rbd-mirror
Ramana Raja [Thu, 25 May 2023 16:48:12 +0000 (16:48 +0000)]
qa: Add tests to validate syncing of images using rbd-mirror

Introduce functional tests to validate that the images under
workloads are correctly mirrored between two clusters using snapshot
based mirroring.

Run workload on a primary image using a krbd or nbd client. Take
mirror snapshots of the image under workload. Unmount the mapped image
and calculate its MD5 checksum before demoting it. After demotion,
wait for the mirror status of the image to be 'up+unknown' in both
the clusters. This is to make sure that the non-primary image in the
other cluster is ready to be promoted. Now promote the non-primary
image in the other cluster. Map the promoted image and calculate its
MD5 checksum. Verify that the checksums of the demoted and promoted
images in the two clusters are the same.

The above test is run as part of two different workunits:
 - a workunit that validates the syncing of multiple mirrored images
   with workloads running on them
 - another workunit that validates the syncing of a single mirrored
   image with a workload running on it, where the image is set as
   primary alternately between the two clusters, as happens during
   failover and failback scenarios.

Fixes: https://tracker.ceph.com/issues/61617
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-authored-by: Ilya Dryomov <idryomov@redhat.com>
Co-authored-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit b7aae5c3c5a1dd24c4cb7ceb499292af00bae680)

Cherry-pick notes:
- In qa/workunits/rbd/compare_mirror_images.sh, replace
  `wait_for_replaying_status_in_pool_dir` with `wait_for_status_in_pool_dir`
  Commit 3fd8a03 that added `wait_for_replaying_status_in_pool_dir`
  not backported
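
The checksum verification step described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the workunit itself (which is a shell script): `md5_of` and `images_match` are hypothetical names, and two temporary files stand in for the mapped demoted and promoted images.

```python
import hashlib
import os
import tempfile


def md5_of(path: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file or block device, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(demoted_path: str, promoted_path: str) -> bool:
    """True when both images have identical content according to MD5."""
    return md5_of(demoted_path) == md5_of(promoted_path)


# Stand-in data: two temporary files play the role of the mapped images.
with tempfile.TemporaryDirectory() as tmp:
    demoted = os.path.join(tmp, "demoted.img")
    promoted = os.path.join(tmp, "promoted.img")
    for path in (demoted, promoted):
        with open(path, "wb") as f:
            f.write(b"workload data" * 1024)
    in_sync = images_match(demoted, promoted)

print("images in sync" if in_sync else "checksum mismatch")
```

Reading in fixed-size chunks keeps memory use flat regardless of image size, which matters when the input is a mapped block device rather than a small file.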

16 months agodoc/rbd: add map information for clone images to rbd-encryption.rst 56186/head
N Balachandran [Wed, 13 Mar 2024 11:57:49 +0000 (17:27 +0530)]
doc/rbd: add map information for clone images to rbd-encryption.rst

Add information on the arguments required when mapping the
formatted clone of an encrypted parent image.

Co-authored-by: Zac Dover <zac.dover@proton.me>
Signed-off-by: N Balachandran <nibalach@redhat.com>
(cherry picked from commit 7a2e324a6e1c3e145d3b1e04e6f006defbe0e0b4)

16 months agoMerge pull request #56154 from rhcs-dashboard/wip-64883-reef
Nizamudeen A [Thu, 14 Mar 2024 07:17:05 +0000 (12:47 +0530)]
Merge pull request #56154 from rhcs-dashboard/wip-64883-reef

reef: mgr/dashboard: fix snap schedule time format

Reviewed-by: Nizamudeen A <nia@redhat.com>
16 months agomgr/cephadm: catch CancelledError in asyncio timeout handler 56103/head
Adam King [Fri, 16 Feb 2024 16:24:32 +0000 (11:24 -0500)]
mgr/cephadm: catch CancelledError in asyncio timeout handler

Specifically, concurrent.futures.CancelledError. At least on
python 3.9, this error can be raised when certain commands
being run asynchronously fail. Not catching this results in
the whole cephadm module crashing with something like

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 94, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 267, in refresh
    r = self._refresh_facts(host)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 370, in _refresh_facts
    val = self.mgr.wait_async(self._run_cephadm_json(
  File "/usr/share/ceph/mgr/cephadm/module.py", line 671, in wait_async
    return self.event_loop.get_result(coro, timeout)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 64, in get_result
    return future.result(timeout)
  File "/lib64/python3.9/concurrent/futures/_base.py", line 444, in result
    raise CancelledError()
concurrent.futures._base.CancelledError

Fixes: https://tracker.ceph.com/issues/64473
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 9c34973932bf3a0ec50c1c63bcba5e35bfe407e5)
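
The failure mode in the traceback above comes from `future.result(timeout)` raising `concurrent.futures.CancelledError`. A minimal sketch of catching it alongside the timeout case is shown below. It is not the cephadm code: `get_result` here is a standalone function, and `CommandTimeout` is a hypothetical error type standing in for whatever the caller actually reports.

```python
import concurrent.futures
from typing import Any, Optional


class CommandTimeout(Exception):
    """Hypothetical error type; stands in for the caller's own error."""


def get_result(future: concurrent.futures.Future,
               timeout: Optional[float] = None) -> Any:
    """Wait for a future, turning cancellation and timeout into one error."""
    try:
        return future.result(timeout)
    except concurrent.futures.CancelledError as e:
        # Without this clause the exception propagates to the caller.
        raise CommandTimeout("command was cancelled before returning a result") from e
    except concurrent.futures.TimeoutError as e:
        future.cancel()
        raise CommandTimeout(f"command timed out after {timeout} seconds") from e


# Demonstration with a future that is cancelled before it ever runs.
fut = concurrent.futures.Future()
fut.cancel()
try:
    get_result(fut, timeout=1)
except CommandTimeout as e:
    print(f"handled: {e}")
```

Mapping both exceptions onto a single error type means the calling thread has one failure path to handle instead of crashing on the case it did not anticipate.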

16 months agoMerge pull request #56102 from adk3798/wip-64627-reef
Adam King [Wed, 13 Mar 2024 14:08:18 +0000 (10:08 -0400)]
Merge pull request #56102 from adk3798/wip-64627-reef

reef: cephadm: create ceph-exporter sock dir if it's not present

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56101 from adk3798/wip-64622-reef
Adam King [Wed, 13 Mar 2024 14:06:13 +0000 (10:06 -0400)]
Merge pull request #56101 from adk3798/wip-64622-reef

reef: mgr/cephadm is not defining haproxy tcp healthchecks for Ganesha

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56100 from adk3798/wip-64620-reef
Adam King [Wed, 13 Mar 2024 14:04:40 +0000 (10:04 -0400)]
Merge pull request #56100 from adk3798/wip-64620-reef

reef: cephadm: Add nvmeof to autotuner calculation

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56092 from adk3798/wip-63447-reef
Adam King [Wed, 13 Mar 2024 13:59:21 +0000 (09:59 -0400)]
Merge pull request #56092 from adk3798/wip-63447-reef

reef: mgr/cephadm: support for removing host entry from crush map during host removal

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
16 months agodoc/dev: backport zipapp docs to reef 56161/head
Zac Dover [Wed, 13 Mar 2024 12:04:35 +0000 (22:04 +1000)]
doc/dev: backport zipapp docs to reef

Backport the docs changes in https://github.com/ceph/ceph/pull/54173 to
the Reef release branch. This was not previously done because the docs
changes in PR#54173 were bundled with code changes.

Signed-off-by: Zac Dover <zac.dover@proton.me>
16 months agomgr/dashboard: fix snap schedule time format 56154/head
Ivo Almeida [Mon, 11 Mar 2024 15:09:57 +0000 (15:09 +0000)]
mgr/dashboard: fix snap schedule time format

Fixes: https://tracker.ceph.com/issues/64831
Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
(cherry picked from commit a2942f01ae9bde76c6d562374a0bd8aceeee317e)

16 months agoMerge pull request #56115 from rhcs-dashboard/wip-64826-reef
Nizamudeen A [Wed, 13 Mar 2024 06:20:58 +0000 (11:50 +0530)]
Merge pull request #56115 from rhcs-dashboard/wip-64826-reef

reef: mgr/dashboard: fix snap schedule list toggle cols

Reviewed-by: Nizamudeen A <nia@redhat.com>
16 months agoqa/cephadm: adjust host drain test to handle explicit placement warning 56092/head
Adam King [Mon, 6 Nov 2023 16:19:09 +0000 (11:19 -0500)]
qa/cephadm: adjust host drain test to handle explicit placement warning

Since we're adding a warning if any host is listed explicitly
in the placement of any service when removing the host,
we need to adjust the host drain test that removes a host
without the --force flag to not have the explicit hostname
in the placement for the mon service.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit b4db5e4ffcf0fb345c99986718b16853f76b148a)

16 months agomgr/cephadm: warn when draining host explicitly listed in placement
Adam King [Mon, 16 Oct 2023 19:15:54 +0000 (15:15 -0400)]
mgr/cephadm: warn when draining host explicitly listed in placement

In the case you apply a spec like

```
service_type: node-exporter
placement:
  hosts:
  - host3
```

and then you run `ceph orch host drain host3`, cephadm will remove
the daemon from that host and the placement would now match nothing.

Users should be able to bypass this, as
it generally isn't serious, but it would be good to let them
know they have the host listed explicitly in placements like this
when they want to drain it.

Fixes: https://tracker.ceph.com/issues/63220
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 434e5fe6aa69cad11454d437002015cff55b727a)
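
The check described in the commit above amounts to scanning service specs for the host being drained. The sketch below assumes specs are plain dictionaries shaped like the YAML in the commit message; `services_listing_host` is a hypothetical name, and the real cephadm code works on its own spec objects rather than dicts.

```python
from typing import Dict, List


def services_listing_host(specs: List[Dict], hostname: str) -> List[str]:
    """Return service types whose placement names `hostname` explicitly."""
    matches = []
    for spec in specs:
        # Placements based on labels or patterns have no "hosts" list.
        hosts = spec.get("placement", {}).get("hosts", [])
        if hostname in hosts:
            matches.append(spec["service_type"])
    return matches


# Illustrative specs: the first mirrors the example in the commit message.
specs = [
    {"service_type": "node-exporter", "placement": {"hosts": ["host3"]}},
    {"service_type": "mon", "placement": {"label": "mon"}},
]

for service in services_listing_host(specs, "host3"):
    print(f"warning: host3 is listed explicitly in the placement of {service}")
```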

16 months agoqa/cephadm: test --rm-crush-entry host rm flag in host drain test
Adam King [Fri, 29 Sep 2023 20:52:37 +0000 (16:52 -0400)]
qa/cephadm: test --rm-crush-entry host rm flag in host drain test

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7870538dc1e19760cd96a3d343ae3d3235f71eb2)

16 months agoqa/cephadm: add teuthology test for host draining
Adam King [Fri, 29 Sep 2023 20:09:48 +0000 (16:09 -0400)]
qa/cephadm: add teuthology test for host draining

This was a gap in our testing in general, but I'm
adding it here right now specifically to use it
to test the "--rm-crush-entry" flag in a follow-up
commit.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 13f16e8d7bb029980d6688680390521253970e9a)

16 months agomgr/cephadm: add --rm-crush-entry flag to host removal
Adam King [Fri, 29 Sep 2023 18:39:10 +0000 (14:39 -0400)]
mgr/cephadm: add --rm-crush-entry flag to host removal

This will tell cephadm to try to remove the
crush bucket for the host at the end of the host
removal process. If this fails, we still consider the
host as having been successfully removed from
cephadm's POV, but the user will get back an error
message telling them we failed to remove the
host from the crush map.

Fixes: https://tracker.ceph.com/issues/63031
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit fa0f62aa57755c45c713367620dc834530276b25)

Conflicts:
src/pybind/mgr/cephadm/module.py
src/pybind/mgr/orchestrator/_interface.py
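
The best-effort behaviour described in the commit above (the host removal counts as done even if the crush bucket removal fails) can be sketched as follows. Everything here is hypothetical: `remove_host`, its callback parameters, and the message wording are illustrative assumptions, not the cephadm module's API.

```python
from typing import Callable, Tuple


def remove_host(hostname: str,
                remove_from_inventory: Callable[[str], None],
                remove_crush_bucket: Callable[[str], None],
                rm_crush_entry: bool = False) -> Tuple[bool, str]:
    """Remove a host, then optionally try to remove its crush bucket.

    A failure in the crush step does not undo the host removal; it is
    only reported in the returned message.
    """
    remove_from_inventory(hostname)
    message = f"Removed host '{hostname}'"
    if rm_crush_entry:
        try:
            remove_crush_bucket(hostname)
        except Exception as e:
            message += f", but failed to remove it from the crush map: {e}"
    return True, message


def failing_crush_removal(hostname: str) -> None:
    """Stand-in for a crush removal that fails."""
    raise RuntimeError("bucket not empty")


removed = []
ok, msg = remove_host("host3", removed.append, failing_crush_removal,
                      rm_crush_entry=True)
print(ok, msg)
```

Running the crush step last, and outside the part that can fail the operation, keeps a partially cleaned-up crush map from blocking the host removal itself.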

16 months agoMerge pull request #56108 from adk3798/wip-64635-reef
Adam King [Wed, 13 Mar 2024 01:40:27 +0000 (21:40 -0400)]
Merge pull request #56108 from adk3798/wip-64635-reef

reef: cephadm/nvmeof: scrape nvmeof prometheus endpoint

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56107 from adk3798/wip-64689-reef
Adam King [Wed, 13 Mar 2024 01:39:13 +0000 (21:39 -0400)]
Merge pull request #56107 from adk3798/wip-64689-reef

reef: mgr/cephadm: fix placement with label and host pattern

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56106 from adk3798/wip-64644-reef
Adam King [Wed, 13 Mar 2024 01:38:30 +0000 (21:38 -0400)]
Merge pull request #56106 from adk3798/wip-64644-reef

reef: cephadm: remove restriction for crush device classes

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56105 from adk3798/wip-64634-reef
Adam King [Wed, 13 Mar 2024 01:37:56 +0000 (21:37 -0400)]
Merge pull request #56105 from adk3798/wip-64634-reef

reef: cephadm: rm podman-auth.json if removing last cluster

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56099 from adk3798/wip-64414-reef
Adam King [Wed, 13 Mar 2024 01:36:46 +0000 (21:36 -0400)]
Merge pull request #56099 from adk3798/wip-64414-reef

reef: cephadm: fix get_version for nvmeof

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56098 from adk3798/wip-63985-reef
Adam King [Wed, 13 Mar 2024 01:36:15 +0000 (21:36 -0400)]
Merge pull request #56098 from adk3798/wip-63985-reef

reef: orchestrator: Add summary line to orch device ls output

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56097 from adk3798/wip-63984-reef
Adam King [Wed, 13 Mar 2024 01:35:30 +0000 (21:35 -0400)]
Merge pull request #56097 from adk3798/wip-63984-reef

reef: orchestrator: Fix representation of CPU threads in host ls --detail command

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56096 from adk3798/wip-63817-reef
Adam King [Wed, 13 Mar 2024 01:34:55 +0000 (21:34 -0400)]
Merge pull request #56096 from adk3798/wip-63817-reef

reef: python-common/drive_selection: fix limit with existing devices

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56095 from adk3798/wip-63815-reef
Adam King [Wed, 13 Mar 2024 01:34:16 +0000 (21:34 -0400)]
Merge pull request #56095 from adk3798/wip-63815-reef

reef: python-common: fix osdspec_affinity check

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56093 from adk3798/wip-63448-reef
Adam King [Wed, 13 Mar 2024 01:33:04 +0000 (21:33 -0400)]
Merge pull request #56093 from adk3798/wip-63448-reef

reef: mgr/cephadm: discovery service (port 8765) fails on ipv6 only clusters

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55957 from adk3798/reef-test-custom-config
Adam King [Wed, 13 Mar 2024 01:31:04 +0000 (21:31 -0400)]
Merge pull request #55957 from adk3798/reef-test-custom-config

reef: qa/cephadm: testing for extra daemon/container features

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56090 from adk3798/wip-63434-reef
Adam King [Wed, 13 Mar 2024 01:29:55 +0000 (21:29 -0400)]
Merge pull request #56090 from adk3798/wip-63434-reef

reef: mgr/cephadm: update timestamp on repeat daemon/service events

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #56089 from adk3798/wip-63190-reef
Adam King [Wed, 13 Mar 2024 01:29:39 +0000 (21:29 -0400)]
Merge pull request #56089 from adk3798/wip-63190-reef

reef: mgr/cephadm: make jaeger-collector a dep for jaeger-agent

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55819 from adk3798/reef-cephadm-nvmeof-log-mount
Adam King [Wed, 13 Mar 2024 01:29:17 +0000 (21:29 -0400)]
Merge pull request #55819 from adk3798/reef-cephadm-nvmeof-log-mount

reef: cephadm: Add mount for nvmeof log location

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55555 from adk3798/reef-cephadm-asyncio-timeout-fixup
Adam King [Wed, 13 Mar 2024 01:28:52 +0000 (21:28 -0400)]
Merge pull request #55555 from adk3798/reef-cephadm-asyncio-timeout-fixup

reef: mgr/cephadm: fixups for asyncio based timeout

Reviewed-by: John Mulligan <jmulligan@redhat.com>
16 months agoMerge pull request #55097 from cbodley/wip-63839-reef
Casey Bodley [Tue, 12 Mar 2024 12:32:36 +0000 (12:32 +0000)]
Merge pull request #55097 from cbodley/wip-63839-reef

reef: qa: remove vstart runner from radosgw_admin task

Reviewed-by: Yuri Weinstein <yuriw@redhat.com>
16 months agoMerge pull request #55815 from rhcs-dashboard/wip-64624-reef
afreen23 [Tue, 12 Mar 2024 10:59:43 +0000 (16:29 +0530)]
Merge pull request #55815 from rhcs-dashboard/wip-64624-reef

reef: mgr/dashboard: fix snap schedule date format

Reviewed-by: Afreen <afreen23.git@gmail.com>
16 months agoMerge pull request #56127 from adk3798/wip-64836-reef
Nizamudeen A [Tue, 12 Mar 2024 06:09:23 +0000 (11:39 +0530)]
Merge pull request #56127 from adk3798/wip-64836-reef

reef: mgr/dashboard: debugging make check failure

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
16 months agoMerge pull request #56130 from zdover23/wip-doc-2024-03-12-backport-56113-to-reef
Anthony D'Atri [Mon, 11 Mar 2024 23:01:42 +0000 (19:01 -0400)]
Merge pull request #56130 from zdover23/wip-doc-2024-03-12-backport-56113-to-reef

reef: doc/cephadm: Improve multiple files

16 months agoMerge pull request #55969 from galsalomon66/wip-64693-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:32:23 +0000 (11:32 -0700)]
Merge pull request #55969 from galsalomon66/wip-64693-reef

reef: rgw/S3select: remove assert from csv-parser, adding updates

Reviewed-by: Casey Bodley <cbodley@redhat.com>
16 months agoMerge pull request #55790 from cbodley/wip-64600-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:31:35 +0000 (11:31 -0700)]
Merge pull request #55790 from cbodley/wip-64600-reef

reef: test/rgw: increase timeouts in unittest_rgw_dmclock_scheduler

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
16 months agoMerge pull request #55655 from cbodley/wip-64500-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:29:39 +0000 (11:29 -0700)]
Merge pull request #55655 from cbodley/wip-64500-reef

reef: rgw/datalog: RGWDataChangesLog::add_entry() uses null_yield

Reviewed-by: Adam Emerson <aemerson@redhat.com>
16 months agoMerge pull request #55621 from cbodley/wip-64426-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:28:36 +0000 (11:28 -0700)]
Merge pull request #55621 from cbodley/wip-64426-reef

reef: rgw/putobj: RadosWriter uses part head object for multipart parts

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
16 months agoMerge pull request #55606 from jzhu116-bloomberg/wip-64448-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:28:01 +0000 (11:28 -0700)]
Merge pull request #55606 from jzhu116-bloomberg/wip-64448-reef

reef: rgw: do not copy olh attributes in versioning suspended bucket

Reviewed-by: Casey Bodley <cbodley@redhat.com>
16 months agoMerge pull request #55289 from jzhu116-bloomberg/wip-64088-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:26:10 +0000 (11:26 -0700)]
Merge pull request #55289 from jzhu116-bloomberg/wip-64088-reef

reef: rgw/lc: do not add datalog/bilog for some lc actions

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
16 months agoMerge pull request #55094 from cbodley/wip-63960-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:24:11 +0000 (11:24 -0700)]
Merge pull request #55094 from cbodley/wip-63960-reef

reef: rgw: add headers to guide cache update in 304 response

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
16 months agoMerge pull request #55061 from cbodley/wip-63940-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:23:14 +0000 (11:23 -0700)]
Merge pull request #55061 from cbodley/wip-63940-reef

reef: radosgw-admin: 'zone set' won't overwrite existing default-placement

Reviewed-by: Casey Bodley <cbodley@redhat.com>
16 months agoMerge pull request #54866 from trociny/wip-63777-reef
Yuri Weinstein [Mon, 11 Mar 2024 18:21:38 +0000 (11:21 -0700)]
Merge pull request #54866 from trociny/wip-63777-reef

reef: [rgw][lc][rgw_lifecycle_work_time] adjust timing if the configured end time is less than the start time

Reviewed-by: Casey Bodley <cbodley@redhat.com>
16 months agodoc/cephadm: Improve multiple files 56130/head
Anthony D'Atri [Mon, 11 Mar 2024 07:04:47 +0000 (03:04 -0400)]
doc/cephadm: Improve multiple files

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 800dd29e60fcd2bcd27db56d3fe45c58ddf10c8a)

16 months agomgr/dashboard: debugging make check failure 56127/head
Nizamudeen A [Mon, 4 Mar 2024 12:52:48 +0000 (18:22 +0530)]
mgr/dashboard: debugging make check failure

Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 958c60d8a74e3c38abe043e7f2cfbe4224cfb411)

16 months agoMerge pull request #55931 from ceph/reef-release
Yuri Weinstein [Mon, 11 Mar 2024 15:04:09 +0000 (08:04 -0700)]
Merge pull request #55931 from ceph/reef-release

v18.2.2

Reviewed-by: Laura Flores <lflores@redhat.com>
16 months agoMerge pull request #56059 from rhcs-dashboard/wip-64807-reef
Pedro Gonzalez Gomez [Mon, 11 Mar 2024 13:09:52 +0000 (14:09 +0100)]
Merge pull request #56059 from rhcs-dashboard/wip-64807-reef

reef: mgr/dashboard: add snap schedule M, Y frequencies

Reviewed-by: afreen23 <NOT@FOUND>
16 months agomgr/dashboard: fix snap schedule list toggle cols 56115/head
Ivo Almeida [Fri, 8 Mar 2024 11:40:41 +0000 (11:40 +0000)]
mgr/dashboard: fix snap schedule list toggle cols

Added isInvisible property to CdColumnTable interface to hide column
from 'toggle columns' drop down checkboxes.

Fixes: https://tracker.ceph.com/issues/64813
Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
(cherry picked from commit 1b77baea8dd0781fa897ec6c1f1e06c57a265ed7)

16 months ago Merge pull request #56111 from zdover23/wip-doc-2024-03-11-backport-56091-to-reef
Anthony D'Atri [Mon, 11 Mar 2024 07:06:09 +0000 (03:06 -0400)]
Merge pull request #56111 from zdover23/wip-doc-2024-03-11-backport-56091-to-reef

reef: doc/cephadm: improve host-management.rst

16 months ago doc/cephadm: improve host-management.rst 56111/head
Anthony D'Atri [Sun, 10 Mar 2024 19:49:35 +0000 (15:49 -0400)]
doc/cephadm: improve host-management.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 9fb51bb537e8bd9ea60633091acfc48a58262c3b)

16 months ago qa/cephadm: don't test certain workunits with agent 55555/head
Adam King [Thu, 15 Feb 2024 14:24:23 +0000 (09:24 -0500)]
qa/cephadm: don't test certain workunits with agent

There are a handful of workunits that don't work
with or don't make sense with the agent.
The test for the cephadm timeout only works if
the mgr directly runs ceph-volume inventory which
it won't do with the agent present. The adoption
test is just running direct cephadm commands that
are irrelevant to the agent. The test_orch_cli tests
rely on refresh timings that are different with
the agent running, causing spurious failures.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 7953fe1b3920c92c086c981bf4e3d2c41ea7e450)

16 months ago cephadm/nvmeof: scrape nvmeof prometheus endpoint 56108/head
Avan Thakkar [Thu, 22 Feb 2024 11:00:06 +0000 (16:30 +0530)]
cephadm/nvmeof: scrape nvmeof prometheus endpoint

Fixes: https://tracker.ceph.com/issues/64536
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 93ec6284fb3002b4778c4e54972ff1d864060922)

Conflicts:
src/cephadm/cephadmlib/constants.py
src/pybind/mgr/cephadm/module.py
src/pybind/mgr/cephadm/templates/services/nvmeof/ceph-nvmeof.conf.j2
src/pybind/mgr/cephadm/tests/test_services.py

16 months ago mgr/cephadm: fix placement with label and host pattern 56107/head
Adam King [Wed, 14 Feb 2024 16:28:11 +0000 (11:28 -0500)]
mgr/cephadm: fix placement with label and host pattern

Previously, when both the label and host pattern were
provided, only the label was actually used for the placement.

Fixes: https://tracker.ceph.com/issues/64428
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 106f34ba31c82dd87f4c3f9ad82d8ace81e6c689)
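
The intended behavior described above can be illustrated with a minimal
sketch. This is not the cephadm implementation: `filter_hosts` and its
arguments are hypothetical names, and it assumes the fix means a host must
satisfy both the label and the host pattern when both are given.

```python
import fnmatch
from typing import Dict, List, Optional, Set


def filter_hosts(host_labels: Dict[str, Set[str]],
                 label: Optional[str] = None,
                 host_pattern: Optional[str] = None) -> List[str]:
    """Return hosts matching every criterion that was provided.

    host_labels maps a hostname to its set of labels (hypothetical layout).
    """
    matched = []
    for host, labels in sorted(host_labels.items()):
        # A provided label must be present on the host.
        if label is not None and label not in labels:
            continue
        # A provided pattern must also match; it is not skipped
        # just because a label was given.
        if host_pattern is not None and not fnmatch.fnmatchcase(host, host_pattern):
            continue
        matched.append(host)
    return matched
```

With hosts `node1` (label `osd`), `node2` (label `mon`) and `store1`
(label `osd`), passing both `label="osd"` and `host_pattern="node*"`
selects only `node1` in this sketch.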

16 months ago cephadm: remove restriction for crush device classes 56106/head
Seena Fallah [Sun, 11 Feb 2024 21:50:05 +0000 (22:50 +0100)]
cephadm: remove restriction for crush device classes

A restriction has been introduced here (https://github.com/ceph/ceph/commit/6c6cb2f5130dbcf8e42cf03666173948411fc92b) which doesn't let OSDs be created with custom crush device classes.
Crush Device Class is the key that helps the crush distinguish between multiple storage classes, so it must accept any custom names.

Fixes: https://tracker.ceph.com/issues/64382
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 5999196f37bc5cb12de26d5f0aa077229e3ffc42)

16 months ago cephadm: rm podman-auth.json if removing last cluster 56105/head
Adam King [Wed, 14 Feb 2024 17:02:09 +0000 (12:02 -0500)]
cephadm: rm podman-auth.json if removing last cluster

We have points in rm-cluster where we check that
there are no other clusters on the host. If that
is the case, we can also clear /etc/ceph/podman-auth.json
which gets written out when we log in to a registry
while using podman.

Fixes: https://tracker.ceph.com/issues/64433
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d203a97e1bf1e06433365ea38e3ab2b6430cefff)

16 months ago doc: adding documentation for secure monitoring stack configuration 56104/head
Redouane Kachach [Tue, 27 Feb 2024 14:52:25 +0000 (15:52 +0100)]
doc: adding documentation for secure monitoring stack configuration
Fixes: https://tracker.ceph.com/issues/64596
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 350401ea0ad129c52f1e2b0adb4747d84cb65dcf)

16 months ago cephadm: create ceph-exporter sock dir if it's not present 56102/head
Adam King [Sun, 10 Mar 2024 20:42:51 +0000 (16:42 -0400)]
cephadm: create ceph-exporter sock dir if it's not present

Since this is usually /var/run/ceph/ which ends up getting
created by other daemons as well, it was common to see
ceph-exporter fail to deploy and then deploy fine
once other daemons were present on the host. I don't see any
reason we can't just try to make the directory here instead
of bailing out.

This patch had to be rewritten for reef, as it depended on
changes in cephadm that will not be backported to reef.

Fixes: https://tracker.ceph.com/issues/64491
Signed-off-by: Adam King <adking@redhat.com>
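
As a rough illustration of "try to make the directory instead of bailing
out", the following sketch uses only the Python standard library. The
function name `ensure_sock_dir` and the default mode are assumptions for
the example, not taken from the patch.

```python
import os


def ensure_sock_dir(sock_dir: str, mode: int = 0o755) -> None:
    """Create the socket directory if missing; do nothing if it exists."""
    # exist_ok=True makes this safe when another daemon created it first.
    os.makedirs(sock_dir, mode=mode, exist_ok=True)
```
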
16 months ago mgr/cephadm is not defining haproxy tcp healthchecks for Ganesha 56101/head
avanthakkar [Thu, 5 Oct 2023 12:18:34 +0000 (17:48 +0530)]
mgr/cephadm is not defining haproxy tcp healthchecks for Ganesha

Fixes: https://tracker.ceph.com/issues/62638
Signed-off-by: avanthakkar <avanjohn@gmail.com>
(cherry picked from commit 6a6a9ddd46e5dd2135dfd241fc0dff8ff7472a06)

16 months ago cephadm: add testcase to autotuner 56100/head
Paul Cuzner [Wed, 24 Jan 2024 21:22:37 +0000 (10:22 +1300)]
cephadm: add testcase to autotuner

Adds a testcase for the presence of the nvmeof daemon

Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
(cherry picked from commit 2d4bd1604246436136e11d14bc447c31a1e26a97)

16 months ago cephadm: Add nvmeof to autotuner calculation
Paul Cuzner [Wed, 24 Jan 2024 21:22:13 +0000 (10:22 +1300)]
cephadm: Add nvmeof to autotuner calculation

Add nvmeof to the list of daemons when calculating the
memory to use for OSDs.

Fixes: https://tracker.ceph.com/issues/64020
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
(cherry picked from commit 31e4b8de9631eef2b4b5d9865725b0520637d603)

16 months ago cephadm: fix get_version for nvmeof 56099/head
Adam King [Mon, 29 Jan 2024 16:23:54 +0000 (11:23 -0500)]
cephadm: fix get_version for nvmeof

This needed to be using the container id it was
passed, instead of ctx.image which is likely to
be `None` when this is run.

Fixes: https://tracker.ceph.com/issues/64229
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 70c00e8ba787d9e9106934cfee0e0afa606ce326)

Conflicts:
src/cephadm/cephadmlib/daemons/nvmeof.py

16 months ago orchestrator: Add summary line to orch device ls 56098/head
Paul Cuzner [Thu, 21 Dec 2023 01:12:45 +0000 (20:12 -0500)]
orchestrator: Add summary line to orch device ls

This patch just adds a summary line to the plain
text output of orch device ls when the --summary
switch is given. This helps to quickly understand your
device counts when managing hosts with many devices.

Fixes: https://tracker.ceph.com/issues/63864
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
(cherry picked from commit 50a4cd3a18ce510f25908531d6228e7447f5e72c)

16 months ago orchestrator: Fix representation of threads in host ls 56097/head
Paul Cuzner [Wed, 20 Dec 2023 23:47:51 +0000 (18:47 -0500)]
orchestrator: Fix representation of threads in host ls

This patch fixes the calculation when determining the
number of threads for hosts when using the --detail
parameter.

Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
(cherry picked from commit 5bc735fb6ffbdcacffc3e678b7682f91fe7593c2)

16 months ago python-common/drive_selection: fix limit with existing devices 56096/head
Adam King [Mon, 27 Nov 2023 20:04:42 +0000 (15:04 -0500)]
python-common/drive_selection: fix limit with existing devices

When devices have already been used for OSDs, they are still
allowed to pass filtering as they are still needed for the
resulting ceph-volume lvm batch command. This was causing an
issue with limit, however. Limit adds the devices we've found
that match the filter and existing OSD daemons tied to the spec.
This allows double counting of devices that have been used for
OSDs, as they're counted in terms of being an existing device
and that they match the filter. To avoid this issue, devices
should only be counted towards the limit if they are not already
part of an OSD.

An additional note: The limit feature is only applied for
data devices, so there is no need to worry about the effect
of this change on selection of db, wal, or journal devices.
Also, we would still want to not count these devices if they
did end up passing the data device filter but had been used
for a db/wal/journal device previously.

Fixes: https://tracker.ceph.com/issues/63525
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d3f1a0e1c0b98b9f1251837ecc8edc367e590dad)
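
The counting rule described above can be sketched as follows. This is a
simplified illustration, not the DriveSelection code: `Device`, its
`has_osd` flag, and `select_with_limit` are hypothetical names, and the
sketch assumes a limit of 0 means "no limit".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    path: str
    has_osd: bool  # hypothetical flag: device already backs an OSD


def select_with_limit(candidates: List[Device],
                      existing_osds: int,
                      limit: int) -> List[Device]:
    """Select devices that passed the filter while honoring the limit.

    Devices already used for OSDs are kept (they are still needed for the
    batch command) but are not counted again, since they are already
    represented in existing_osds.
    """
    selected = []
    counted = existing_osds
    for dev in candidates:
        if dev.has_osd:
            # Already counted as an existing OSD; pass it through.
            selected.append(dev)
            continue
        if limit > 0 and counted >= limit:
            # Limit reached; skip further unused devices.
            continue
        selected.append(dev)
        counted += 1
    return selected
```

For example, with two devices already backing OSDs, two unused devices,
and a limit of 3, the sketch keeps both used devices and admits only one
new device.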

16 months ago python-common: fix osdspec_affinity check 56095/head
Guillaume Abrioux [Tue, 5 Dec 2023 16:58:07 +0000 (17:58 +0100)]
python-common: fix osdspec_affinity check

When no `service_id` is provided to service spec (osd) it results in
OSDs created with "osdspec_affinity" attribute set to a string
containing "None".

The DriveSelection class relies on the comparison of the actual
value of this attribute with the value of the service_id which has
the python type `None` in that case.

If any existing deployments were created without the service_id
attribute, we now have to support this case and make sure the check
won't filter out devices unexpectedly.

Fixes: https://tracker.ceph.com/issues/63729
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit c68b5af0fb639fccc89d26606c7924c6834bf606)
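
The mismatch described above (the string "None" stored on the OSD versus
the Python value `None` in the spec) can be illustrated with a small
sketch. `affinity_matches` is a hypothetical helper written for this
example and is not the function changed by the patch.

```python
from typing import Optional


def affinity_matches(osdspec_affinity: str, service_id: Optional[str]) -> bool:
    """Compare an OSD's stored affinity with the spec's service_id."""
    if service_id is None:
        # OSDs created without a service_id carry the literal string "None".
        return osdspec_affinity == "None"
    return osdspec_affinity == service_id
```

A plain `osdspec_affinity == service_id` comparison would return False for
`"None"` against `None`, which is the case the commit message says must
now be supported.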

16 months ago mgr/cephadm: fix reweighting of OSD when OSD removal is stopped 56094/head
Adam King [Tue, 7 Nov 2023 20:49:57 +0000 (15:49 -0500)]
mgr/cephadm: fix reweighting of OSD when OSD removal is stopped

Previously, when you ran "ceph orch osd rm stop <osd-id>"
cephadm would pass in a new OSD object to the removal
queue that would not have any of the fields set previously
for the OSD. This was mostly fine when removing it from
the queue as those fields were no longer needed, but an
exception was the initial weight, which you need if
you want to set the weight back when you stop removal.

This patch changes it so it will now remove the actual
OSD object the removal queue stores so that we will
get to use the previously set original weight. It also
changes when we grab the original weight to make it
happen earlier and adds it to the to_json so it survives
any potential mgr failovers.

Fixes: https://tracker.ceph.com/issues/63481
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 99fc4a8d406291b65a53f157442bc54bc67e8b0d)
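
A minimal sketch of the idea described above: keep the original weight on
the queued object, return that stored object when removal is stopped, and
include the weight in the serialized form so it survives a failover. The
class and method names here (`QueuedOSD`, `RemovalQueue`, `enqueue`,
`stop`) are hypothetical and much simpler than the real mgr/cephadm code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class QueuedOSD:
    osd_id: int
    original_weight: Optional[float] = None


class RemovalQueue:
    def __init__(self) -> None:
        self._osds: Dict[int, QueuedOSD] = {}

    def enqueue(self, osd_id: int, current_weight: float) -> None:
        # Record the weight up front, before draining changes it.
        self._osds[osd_id] = QueuedOSD(osd_id, current_weight)

    def stop(self, osd_id: int) -> Optional[float]:
        # Pop the stored object rather than building a fresh one,
        # so the original weight is still available for restoring.
        osd = self._osds.pop(osd_id, None)
        return osd.original_weight if osd else None

    def to_json(self) -> str:
        # The original weight is part of the persisted state.
        return json.dumps([asdict(o) for o in self._osds.values()])

    @classmethod
    def from_json(cls, data: str) -> "RemovalQueue":
        queue = cls()
        for entry in json.loads(data):
            queue._osds[entry["osd_id"]] = QueuedOSD(**entry)
        return queue
```

In this sketch, a queue rebuilt with `from_json(queue.to_json())` still
returns the weight recorded at `enqueue` time when `stop` is called.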