Dan van der Ster [Tue, 11 Jun 2024 20:31:05 +0000 (13:31 -0700)]
cephadm: disable ms_bind_ipv4 if we will enable ms_bind_ipv6
While bootstrapping an ipv6 cluster with an ipv6 initial mon, cephadm correctly enables ms_bind_ipv6=true.
However, it leaves ms_bind_ipv4 at its default (true).
As a result, daemons (osd, mds, ...) will attempt to bind to both ipv6 and ipv4.
Usually this results in an osdmap and fsmap like the following:
```
osd.2 up in weight 1 up_from 26 up_thru 909 down_at 0 last_clean_interval [0,0) [v2:[xxxx:4f8:d0:4401:3::29]:6800/3680761436,v1:[xxxx:4f8:d0:4401:3::29]:6801/3680761436,v2:0.0.0.0:6802/3680761436,v1:0.0.0.0:6803/3680761436] [v2:[xxxx:4f8:d0:4401:3::29]:6804/3680761436,v1:[xxxx:4f8:d0:4401:3::29]:6805/3680761436,v2:0.0.0.0:6806/3680761436,v1:0.0.0.0:6807/3680761436] exists,up 0978a571-cd00-4eba-b00b-f863603a9a70
```
Dual stack is not supported by kernels (https://tracker.ceph.com/issues/49581), which leads to hard-to-debug issues for end users (corrupt map messages in dmesg).
Fix by disabling ms_bind_ipv4 in the case ipv6 is desired.
Fixes: https://tracker.ceph.com/issues/66436
Signed-off-by: Dan van der Ster <dan.vanderster@clyso.com>
Signed-off-by: Joshua Blanch <joshua.blanch@clyso.com>
(cherry picked from commit 75f0ba5703200f4420a4b53d1c728167daf19909)
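A minimal sketch of the logic described in this commit, not the actual cephadm code. The helper name `messenger_bind_options` is hypothetical, and the standard-library `ipaddress` module is assumed as the way to detect an IPv6 initial mon address:

```python
import ipaddress


def messenger_bind_options(mon_ip: str) -> dict:
    """Return messenger bind settings for a given initial mon address.

    Hypothetical helper for illustration; cephadm's bootstrap code differs.
    """
    # Strip the brackets that often wrap IPv6 literals, e.g. "[::1]".
    addr = ipaddress.ip_address(mon_ip.strip('[]'))
    if addr.version == 6:
        # Enabling IPv6 alone leaves ms_bind_ipv4 at its default (true),
        # so disable it explicitly to avoid dual-stack binding.
        return {'ms_bind_ipv6': 'true', 'ms_bind_ipv4': 'false'}
    # IPv4 mon: leave the defaults untouched.
    return {}
```

With an IPv6 mon address the sketch returns both options, so daemons would bind to IPv6 only; with an IPv4 address it changes nothing.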
Kefu Chai [Thu, 23 May 2024 23:21:51 +0000 (07:21 +0800)]
cephadm: use importlib.metadata for querying ceph_iscsi's version
Use importlib.metadata for querying ceph_iscsi's version and fall back to
pkg_resources, as the former is only available in Python 3.8 and later,
while the latter is deprecated.
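The fallback pattern can be sketched as follows. This is an illustration only: the function name `package_version` is hypothetical, the lookup runs in the local interpreter, and how cephadm actually invokes the query is not shown here.

```python
from typing import Optional


def package_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        # Preferred path: importlib.metadata (standard library since 3.8).
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters: fall back to the deprecated pkg_resources.
        try:
            import pkg_resources
        except ImportError:
            return None
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None
```

Trying the import first, rather than checking the interpreter version, keeps the deprecated module out of the way wherever the newer one exists.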
Kefu Chai [Thu, 23 May 2024 23:16:14 +0000 (07:16 +0800)]
cephadm: extract python() helper to execute python statement
Prepare for a change to use importlib and then fall back to
pkg_resources, as the former is only available in Python 3.8 and later,
while the latter is deprecated.
Laura Flores [Tue, 16 Jul 2024 16:47:35 +0000 (11:47 -0500)]
qa/suites/rados/thrash-old-clients/0-distros$: test on ubuntu_20.04 and drop nautilus
Centos 8 has gone end of life, so we need to choose a different distro on which
to test thrash-old-clients.
thrash-old-clients tests should only support N-3 releases. Nautilus fits with
this, but unfortunately there is no overlapping distro between nautilus, pacific,
octopus, AND quincy (bionic was dropped from quincy, and nautilus does not build
focal). As such, we are only able to test N-2.
Proof that focal is not available for nautilus (this is where the test would search for packages):
https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F20.04%2Fx86_64&ref=nautilus
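For checking other release and distro combinations, the same query string can be assembled programmatically. This is a sketch that only builds the URL and performs no network request; the function name `shaman_search_url` is hypothetical, and the parameter names are taken from the URL above:

```python
from urllib.parse import urlencode

SHAMAN_SEARCH = 'https://shaman.ceph.com/api/search/'


def shaman_search_url(ref: str, distro: str) -> str:
    """Build a shaman search URL for ready, default-flavor ceph builds."""
    params = {
        'status': 'ready',
        'project': 'ceph',
        'flavor': 'default',
        'distros': distro,
        'ref': ref,
    }
    # urlencode percent-encodes the slashes in the distro string.
    return SHAMAN_SEARCH + '?' + urlencode(params)


print(shaman_search_url('nautilus', 'ubuntu/20.04/x86_64'))
```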
Edit the section called "Is mount helper present?", the title of which
prior to this commit was "Is mount helper is present?". Other small
disambiguating improvements have been made to the text in the section.
An unselectable prompt has been added before a command.
Improve "Principles for format change" in doc/dev/encoding.rst. This
commit started as a response to Anthony D'Atri's suggestion here: https://github.com/ceph/ceph/pull/58299/files#r1656985564
Review of this section suggested to me that certain minor English usage
improvements would be of benefit. The numbered lists in this section
could still be made a bit clearer.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 570797e5588b67b8c72e5297b61f84d9aa48dc45)
Ilya Dryomov [Thu, 20 Jun 2024 19:13:56 +0000 (21:13 +0200)]
librbd: make diff-iterate in fast-diff mode aware of encryption
diff-iterate wasn't updated when librbd was being prepared to support
encryption in commit 8d6a47933269 ("librbd: add crypto image dispatch
layer"). This is even noted in [1]:
> The two places I skipped for now are DiffIterate and TrimRequest.
CryptoImageDispatch has since been removed, but diff-iterate in
fast-diff mode is still unaware of encryption and just assumes that all
offsets are raw. This means that the callback gets invoked with
incorrect image offsets when encryption is loaded. For example, for
a LUKS1-formatted image with some data at offsets 0 and 20971520,
diff-iterate with encryption loaded reports
as "exists". For any piece of code that is using diff-iterate to
optimize block-by-block processing (e.g. copy an encrypted source image
to a differently-encrypted destination image), this is fatal: it would
skip processing block 20971520 which has data and instead process block 25165824 which doesn't have any data and was to be skipped, producing
a corrupted destination image.
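The offset mismatch can be illustrated with a small sketch. It assumes the encryption header occupies the first 4194304 bytes of the raw image, which is the difference between the two offsets quoted above (25165824 - 20971520). The function name, the extent representation, and the sample raw extents are hypothetical and are not librbd's API or data from the commit:

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (offset, length) in bytes


def raw_to_image_extents(raw: List[Extent], data_offset: int) -> List[Extent]:
    """Translate raw extents into image extents past the encryption header.

    data_offset is the assumed size of the encryption header in bytes.
    """
    image = []
    for off, length in raw:
        end = off + length
        if end <= data_offset:
            # Extent lies entirely within the header: not image data.
            continue
        start = max(off, data_offset)
        image.append((start - data_offset, end - start))
    return image


HEADER = 4194304  # assumed header size, inferred from the offsets above
# Illustrative raw extents (4 MiB each), not taken from the commit.
raw = [(0, 4194304), (4194304, 4194304), (25165824, 4194304)]
print(raw_to_image_extents(raw, HEADER))
```

Under these assumptions, the raw extent at 25165824 maps back to image offset 20971520, which is the block a caller actually needs to process.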
Conflicts:
src/librbd/api/DiffIterate.cc [ ImageArea support not in
quincy ]
src/test/librbd/test_librbd.cc [ commit 4a5a0a5dd82b ("librbd:
add cloned images encryption API") not in quincy ]
Currently we are laying data only at the beginning of an object.
Extend the skeletons to write to three different offsets in the middle
and also at the end of the object.
Separately, make C and C++ API test variants slightly different in
terms of offsets being targeted to not go through exactly the same
scenario twice.
qa/tasks/cephadm: don't wait for OSDs in create_rbd_pool()
This fails because teuthology.wait_until_osds_up() wants to use
adjust-ulimits wrapper which isn't available in "cephadm shell"
environment. The whole thing is also redundant because cephadm task
is supposed to wait for OSDs to come up earlier, in ceph_osds().
Laura Flores [Wed, 10 Jul 2024 19:49:25 +0000 (14:49 -0500)]
qa/distros/container-hosts: add centos 9 to container hosts
This is a direct merge to quincy that is based on the following
commit: https://github.com/ceph/ceph/commit/c8873c6591d368e12907669c274fd3d6391e3f68
It is not directly backported due to backport complexities.
Laura Flores [Wed, 10 Jul 2024 19:36:31 +0000 (14:36 -0500)]
qa/distros: replace centos8 and rhel8 with centos9
This commit is based on https://github.com/ceph/ceph/commit/7a1dce1ebd883741b5003b9e18d4765526cbbb3e,
but due to backport complexities, it is a direct merge to quincy.
Centos 8 went end of life (as did rhel 8), so we will now
test with centos 9 for quincy.
Ref: https://docs.ceph.com/en/latest/start/os-recommendations/#platforms
Document how to manually pass the search domain to "mon_dns_srv_name" in
doc/rados/configuration/mon-lookup-dns.rst.
This commit is made in response to a request by Lander Duncan that was made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 98938a0312dd0c8e0b293ed9aa2e0760cc9619fa)
Repair the link to cephfs-shell.rst in doc/cephfs/cephfs-shell.rst that
was broken in https://github.com/ceph/ceph/pull/41165/ when
doc/cephfs/cephfs-shell.rst was moved to doc/man/8/cephfs-shell.rst.
This commit is made in response to a request by Lander Duncan that was
made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Pere Diaz Bou [Wed, 26 Jun 2024 13:57:47 +0000 (15:57 +0200)]
doc/rados: update how to install c++ header files
In this example, librados2-devel only installs C header files on Fedora 40,
so I added libradospp-devel to the command to include the C++ header files.
Zac Dover [Mon, 24 Jun 2024 10:32:30 +0000 (20:32 +1000)]
doc/rados: edit troubleshooting-osd.rst
Make minor changes to the "Debugging Slow Requests" section of
doc/rados/troubleshooting/troubleshooting-osd.rst in preparation
for an expansion of this section in response to a request from Joel
Davidow.
Ilya Dryomov [Tue, 11 Jun 2024 16:10:47 +0000 (18:10 +0200)]
librbd: diff-iterate shouldn't crash on an empty byte range
Commit 0b5ba5fedf70 ("librbd/object_map: add support for ranged
diff-iterate") introduced a regression for the case when whole_object
parameter is set to true. Despite DiffRequest being called into and
another DiffIterate potentially being spawned recursively, an empty
byte range previously happened to make it.
Bail on an empty byte range early just like we have always done on an
empty snap id range (i.e. when start and end versions are the same).
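The early-exit idea can be sketched as below. This is a hypothetical Python stand-in rather than the librbd C++ code: the function name, its parameters, and the placeholder that reports the whole range as existing are all assumptions made for illustration.

```python
from typing import Callable


def diff_iterate(offset: int, length: int, from_snap_id: int,
                 end_snap_id: int,
                 callback: Callable[[int, int, bool], None]) -> int:
    """Hypothetical stand-in for a diff-iterate entry point."""
    # Nothing can differ between identical versions.
    if from_snap_id == end_snap_id:
        return 0
    # An empty byte range has nothing to report either; return before
    # any object-map or per-object work is attempted.
    if length == 0:
        return 0
    # Placeholder for the real work: report the whole range as existing.
    callback(offset, length, True)
    return 0
```

Placing both checks at the top means no later code path ever has to handle a zero-length range.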
Zack Cerza [Fri, 14 Jun 2024 19:37:16 +0000 (13:37 -0600)]
qa/tasks/qemu: Fix OS version comparison
See: https://sentry.ceph.com/share/issue/21ed88d705854238bdafbf6711e795ee/
They're strings, not floats.
This surfaced as a result of https://github.com/ceph/teuthology/pull/1953
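A short illustration of why version strings should be compared as neither strings nor floats. Converting to a tuple of integers is one common approach, and the helper name `version_tuple` is hypothetical; the fix in this commit may be written differently.

```python
def version_tuple(version: str) -> tuple:
    """Convert a dotted version string such as '20.04' into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


# String comparison is lexicographic, so '9' sorts after '20.04'.
assert '9' > '20.04'
# Float conversion loses information: 9.10 and 9.1 become the same number.
assert float('9.10') == float('9.1')
# Tuples of ints compare component by component, as intended.
assert version_tuple('9') < version_tuple('20.04')
assert version_tuple('9.10') > version_tuple('9.9')
```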