Zac Dover [Mon, 4 Mar 2024 10:41:16 +0000 (20:41 +1000)]
doc/rados: link to pg setting commands
Link to the instructions for manually setting the number of PGs per
pool, from the mention of placement groups. These instructions are
included here in response to a request from Ronen Friedman on the
occasion of the removal of links to the PGcalc (see
https://github.com/ceph/ceph/pull/55899#pullrequestreview-1912940118).
Zac Dover [Sun, 3 Mar 2024 10:28:00 +0000 (20:28 +1000)]
doc/rados: remove PGcalc from docs
Remove mention of the "PG calc" tool from the documentation. I have
removed all mention of this in one fell swoop to help posterity restore
mention of this tool if we decide we need to do so.
Kefu Chai [Tue, 27 Feb 2024 14:25:32 +0000 (22:25 +0800)]
cmake: bump liburing from 0.7 to 2.5
this allows us to use newer liburing features. Seastar is using
some of them which are not provided by liburing 0.7.
in this change, `--use-libc` is passed to configure. otherwise
it does not link against libc, and the symbles like memset()
won't be available when compiling liburing.so with -fPIC using
clang, which does not pull libc in that case.
Dan Mick [Thu, 29 Feb 2024 19:36:51 +0000 (11:36 -0800)]
.github/workflows/create-backport-trackers.yml: update versions of actions
Getting warning about node16 being deprecated. The workflow doesn't use node
directly, but through the external actions. Moving to node20 requires
changing setup-python version; Bhacaz/checkout-files is deprecated and
recommends actions/checkout.
Zac Dover [Fri, 1 Mar 2024 12:11:14 +0000 (22:11 +1000)]
doc/install: add manual RADOSGW install procedure
Add a manual RADOSGW installation procedure to
doc/install/manual-deployment.rst. This procedure was developed by Janne
Johansson and reported to the ceph-users mailing list on 29 Jan 2024
here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/
Co-authored-by: Janne Johansson <icepic.dz@gmail.com> Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
The rbd-wnbd daemon currently caches one rados context per cluster.
However, it's registering hooks against the global context
admin socket, which won't be available. For this reason,
the "rbd-wnbd stats" command no longer works.
To address this issue, we'll ensure that rbd-wnbd sets command hooks
against the right admin socket instance, leveraging the image
context.
The "rbd-wnbd unmap" command is currently telling the WNBD driver
to remove the mapping without contacting the rbd-wnbd daemon
and waiting for it to perform its cleanup.
For this reason, attempting to delete the image immediately after
unmapping it can fail due to existing watchers.
As a temporary solution, we'll retry the image remove operation.
At a later time, we'll update the "rbd-wnbd unmap" command to go
through the rbd-wnbd daemon, ensuring that all the necessary
cleanup is performed before returning.
While at it, we're dropping a redundant LOG.error call so that we
won't print expected exceptions.
This commit will store the mapping config in the Windows registry
only after initializing the mapping. This ensures that we aren't
replacing the registry settings for already mapped images.
We'll also check if the registry setting was added by us before
cleaning it up.
Lucian Petrut [Mon, 12 Jun 2023 13:16:39 +0000 (13:16 +0000)]
rbd-wnbd: use one daemon process per host
We're currently using one rbd-wnbd process per image mapping.
Since OSD connections aren't shared across those processes,
we end up with an excessive amount of TCP sessions, potentially
exceeding Windows limits:
https://ask.cloudbase.it/question/3598/ceph-for-windows-tcp-session-count/
In order to improve rbd-wnbd's scalability, we're going to use
a single process per host (unless "-f" is passed when mapping the
image, in which case the daemon will run as part of the same
process). This allows OSD sessions to be shared across image
mappings.
Another advantage is that the "ceph-rbd" service starts faster,
especially when having a large number of image mappings.
Zac Dover [Thu, 29 Feb 2024 08:08:10 +0000 (18:08 +1000)]
doc/glossary: improve "MDS" entry
Improve the entry for "MDS" in doc/glossary.rst by linking to the
"ceph-mds" man page and mentioning the relationship between clients and
MDS (or MDSes).
Ramana Raja [Thu, 29 Feb 2024 17:12:19 +0000 (12:12 -0500)]
qa/suites: add diff-continuous and compare-mirror-image tests
... to rbd and krbd suites respectively.
This allows the compare-mirror-image tests introduced in ea3a567
to be run against various kernel branches, e.g., testing branch.
And allows diff_continuous test in rbd_suite to run against distro
kernel.
Fixes: https://tracker.ceph.com/issues/64574 Signed-off-by: Ramana Raja <rraja@redhat.com>
daijufang [Mon, 4 Dec 2023 07:46:29 +0000 (07:46 +0000)]
src/rgw: fix for the multipart interface in the WORM function
1. Save the WORM configuration information in the initialization chunk information for use when merging chunks.
2. Support x-amz-bypass-governance-retention when merging chunks.
John Mulligan [Thu, 22 Feb 2024 18:49:10 +0000 (13:49 -0500)]
qa/tasks: add templating functions to cephadm module
Add functions to cephadm.py that will be later used to template
strings within the yaml files in the cephadm suites. This will be used
to replace the specific subst_vip call with generic calls that let
tests access "any" variables stored on the test ctx.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 20 Feb 2024 15:09:50 +0000 (10:09 -0500)]
qa/tasks: fix VIPs log line
While testing my previous patches were correct I noticed that the string
here was logged exactly as written, and was thus pretty useless. This
was probably meant to be an f-string. So make it one. Also get rid of
the unnecessary map call, the list and IP address type can repr
themselves just fine IMO.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 20 Feb 2024 00:14:52 +0000 (19:14 -0500)]
qa/tasks: change map_vips to raise exceptions instead of returning None
None of the callers of map_vips ever checks for a None return. So
instead of handling any error conditions it would always just blow
up with a semi-obscure TypeError. Convert the function to always
raise an exception (one that tries to breifly explain the condition)
when something goes wrong. I also take the opportunity to make
more clearer logging and reduce an indentation level.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Xuehan Xu [Wed, 28 Feb 2024 05:42:04 +0000 (13:42 +0800)]
crimson/os/seastore: copy attrs and omaps when cloning objects
At present, we just copy attrs and omaps one by one, which is not
efficient but very important in terms of functionality especially for
the teuthology tests
Venky Shankar [Wed, 28 Feb 2024 04:02:42 +0000 (09:32 +0530)]
Merge PR #52258 into main
* refs/pull/52258/head:
client: check mds down status bofore getting mds_gid_t from mdsmap
mgr/dashboard: allow sending back error status code fetching clients fails
Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Rishabh Dave <ridave@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Nizamudeen A <nia@redhat.com> Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Zac Dover [Mon, 26 Feb 2024 10:03:48 +0000 (20:03 +1000)]
doc/rados: add "change public network" procedure
Add a procedure to /doc/rados/operations/add-or-rm-mons.rst that
explains how to change the public_network in a Ceph cluster deployed
with cephadm. This procedure was developed by Eugen Block, and can be
seen in its original form here:
https://heiterbiswolkig.blogs.nde.ag/2024/02/22/cephadm-change-public-network/
Co-authored-by: Eugen Block <eblock@nde.ag> Signed-off-by: Zac Dover <zac.dover@proton.me>
Venky Shankar [Tue, 27 Feb 2024 05:53:13 +0000 (11:23 +0530)]
Merge PR #52859 into main
* refs/pull/52859/head:
qa: test cases to make sure invalid paths don't get updated
mgr/nfs: use helper to validate cephfs path
mgr/nfs: validate path before updating a cephfs export
mgr/nfs: add a helper to validate cephfs path
Adam King [Fri, 16 Feb 2024 16:24:32 +0000 (11:24 -0500)]
mgr/cephadm: catch CancelledError in asyncio timeout handler
Specifically, concurrent.futures.CancelledError. At least on
python 3.9, this error can be raised when certain commands
being run asynchronously fail. Not catching this results in
the whole cephadm module crashing with something like
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/utils.py", line 94, in do_work
return f(*arg)
File "/usr/share/ceph/mgr/cephadm/serve.py", line 267, in refresh
r = self._refresh_facts(host)
File "/usr/share/ceph/mgr/cephadm/serve.py", line 370, in _refresh_facts
val = self.mgr.wait_async(self._run_cephadm_json(
File "/usr/share/ceph/mgr/cephadm/module.py", line 671, in wait_async
return self.event_loop.get_result(coro, timeout)
File "/usr/share/ceph/mgr/cephadm/ssh.py", line 64, in get_result
return future.result(timeout)
File "/lib64/python3.9/concurrent/futures/_base.py", line 444, in result
raise CancelledError()
concurrent.futures._base.CancelledError
Fixes: https://tracker.ceph.com/issues/64473 Signed-off-by: Adam King <adking@redhat.com>
Casey Bodley [Mon, 26 Feb 2024 14:38:52 +0000 (09:38 -0500)]
test/rgw: increase timeouts in unittest_rgw_dmclock_scheduler
1ms sleeps are generally below the timer's resolution. increase run_for()
durations to 50ms to make the tests far less sensitive to timing. in
practice, none of the sleeps actually wait the full 50ms