Jos Collin [Tue, 28 May 2024 14:57:55 +0000 (20:27 +0530)]
cephfs_mirror: Add ErrorListener to maintain blocklisted/failed timestamp in FSMirror
Have FSMirror register a listener with InstanceWatcher/MirrorWatcher which would get invoked when the mirror daemon is blocklisted or failed.
Thus FSMirror can maintain the last blocklisted/failed timestamp and use that for restarting the mirror daemon.
Fixes: https://tracker.ceph.com/issues/64927
Fixes: https://tracker.ceph.com/issues/51964
Fixes: https://tracker.ceph.com/issues/63931
Fixes: https://tracker.ceph.com/issues/63089
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 77ec7bfde7a349b0e06b34ecdf328996c7642d43)
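The listener arrangement described in this commit can be sketched roughly as follows. This is a minimal Python illustration of the pattern, not the C++ code in cephfs_mirror: the method names (set_blocklisted_ts, set_failed_ts, on_blocklisted, on_failed), the single Watcher class standing in for InstanceWatcher/MirrorWatcher, and the needs_restart policy are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class ErrorListener:
    """Hypothetical callback interface invoked by a watcher on error."""

    def set_blocklisted_ts(self) -> None:
        raise NotImplementedError

    def set_failed_ts(self) -> None:
        raise NotImplementedError


class Watcher:
    """Stand-in for InstanceWatcher/MirrorWatcher: it only notifies the listener."""

    def __init__(self, listener: ErrorListener) -> None:
        self._listener = listener

    def on_blocklisted(self) -> None:
        self._listener.set_blocklisted_ts()

    def on_failed(self) -> None:
        self._listener.set_failed_ts()


class FSMirror(ErrorListener):
    """Registers itself with both watchers and records the last error times."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.blocklisted_ts: Optional[float] = None
        self.failed_ts: Optional[float] = None
        self.instance_watcher = Watcher(self)
        self.mirror_watcher = Watcher(self)

    def set_blocklisted_ts(self) -> None:
        self.blocklisted_ts = self._clock()

    def set_failed_ts(self) -> None:
        self.failed_ts = self._clock()

    def needs_restart(self, now: float, interval: float) -> bool:
        # Assumed policy: restart once `interval` seconds have passed
        # since any recorded blocklisted/failed timestamp.
        stamps = (self.blocklisted_ts, self.failed_ts)
        return any(now - t >= interval for t in stamps if t is not None)
```

Injecting the clock keeps the timestamps testable; a caller would poll needs_restart with the current time and a configured interval.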
Edit the section called "Is mount helper present?", the title of which
prior to this commit was "Is mount helper is present?". Other small
disambiguating improvements have been made to the text in the section.
An unselectable prompt has been added before a command.
Improve "Principles for format change" in doc/dev/encoding.rst. This
commit started as a response to Anthony D'Atri's suggestion here: https://github.com/ceph/ceph/pull/58299/files#r1656985564
Review of this section suggested to me that certain minor English usage
improvements would be of benefit. The numbered lists in this section
could still be made a bit clearer.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 570797e5588b67b8c72e5297b61f84d9aa48dc45)
Ilya Dryomov [Thu, 20 Jun 2024 19:13:56 +0000 (21:13 +0200)]
librbd: make diff-iterate in fast-diff mode aware of encryption
diff-iterate wasn't updated when librbd was being prepared to support
encryption in commit 8d6a47933269 ("librbd: add crypto image dispatch
layer"). This is even noted in [1]:
> The two places I skipped for now are DiffIterate and TrimRequest.
CryptoImageDispatch has since been removed, but diff-iterate in
fast-diff mode is still unaware of encryption and just assumes that all
offsets are raw. This means that the callback gets invoked with
incorrect image offsets when encryption is loaded. For example, for
a LUKS1-formatted image with some data at offsets 0 and 20971520,
diff-iterate with encryption loaded reports
as "exists". For any piece of code that is using diff-iterate to
optimize block-by-block processing (e.g. copy an encrypted source image
to a differently-encrypted destination image), this is fatal: it would
skip processing block 20971520 which has data and instead process block 25165824 which doesn't have any data and was to be skipped, producing
a corrupted destination image.
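The offset mismatch described above can be illustrated with a small sketch. It assumes the encrypted data starts at a fixed raw offset of 4194304 bytes, a value inferred only from the two figures in this message (25165824 - 20971520); the function name and the clamping behaviour are hypothetical and are not taken from librbd.

```python
# Assumed raw offset at which image data begins, inferred from the
# figures in the commit message: 25165824 - 20971520 = 4194304.
DATA_OFFSET = 25165824 - 20971520


def raw_to_image_extent(raw_offset, length, data_offset=DATA_OFFSET):
    """Map a raw (offset, length) extent to image coordinates.

    Returns None when the extent lies entirely in the header area that
    precedes the data; otherwise clamps the extent to the data area and
    shifts it down by data_offset.
    """
    end = raw_offset + length
    if end <= data_offset:
        return None
    start = max(raw_offset, data_offset)
    return (start - data_offset, end - start)


# The raw extent holding the second piece of data maps back to image
# offset 20971520, which is the block a caller must actually process.
print(raw_to_image_extent(25165824, 4194304))
```

A caller that skips this mapping treats 25165824 as an image offset, which is the failure mode the message describes.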
Currently we are laying data only at the beginning of an object.
Extend the skeletons to write to three different offsets in the middle
and also at the end of the object.
Separately, make C and C++ API test variants slightly different in
terms of offsets being targeted to not go through exactly the same
scenario twice.
After rollback started being tested in commit b3977c53c930
("test/librbd: make rollback in TestGroup.add_snapshot{,PP}
meaningful"), these tests can fail on comparing post-rollback
data to expected data if run with exclusive lock disabled.
This doesn't occur with exclusive lock enabled because the RBD
cache gets invalidated implicitly before releasing the lock.
While at it, pass LIBRADOS_OP_FLAG_FADVISE_FUA to avoid relying
on any cache settings that happen to be in effect.
Ilya Dryomov [Fri, 14 Jun 2024 12:04:39 +0000 (14:04 +0200)]
librbd: disallow group snap rollback if memberships don't match
Before proceeding with group rollback, ensure that the set of images
that took part in the group snapshot matches the set of images that are
currently part of the group. Otherwise, because we preserve affected
snapshots when an image is removed from the group, data loss can ensue
where an image gets rolled back while part of another group or not part
of any group but long repurposed for something else.
Similarly, ensure that the group snapshot is complete.
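The two preconditions can be sketched as a simple set comparison. This is an illustrative Python outline only: the function name, the use of plain image IDs, and the exception type are assumptions, and librbd's actual check is implemented in C++ with its own types and error codes.

```python
def check_group_rollback(snap_image_ids, current_image_ids, snap_complete):
    """Raise ValueError unless a group snapshot rollback is safe to start.

    snap_image_ids:    IDs of images that took part in the group snapshot
    current_image_ids: IDs of images currently in the group
    snap_complete:     whether the group snapshot is in the complete state
    """
    if not snap_complete:
        raise ValueError("group snapshot is not complete")
    snap_set = set(snap_image_ids)
    current_set = set(current_image_ids)
    if snap_set != current_set:
        missing = sorted(snap_set - current_set)  # removed since the snapshot
        extra = sorted(current_set - snap_set)    # added since the snapshot
        raise ValueError(
            f"group membership changed: missing={missing} extra={extra}"
        )
```

Comparing sets rather than lists makes the check independent of image ordering, and reporting both differences tells the user which images to re-add or remove before retrying.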
Document how to manually pass the search domain to "mon_dns_srv_name" in
doc/rados/configuration/mon-lookup-dns.rst.
This commit is made in response to a request by Lander Duncan that was made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 98938a0312dd0c8e0b293ed9aa2e0760cc9619fa)
Laura Flores [Fri, 14 Jun 2024 21:24:20 +0000 (16:24 -0500)]
qa/suites/upgrade/reef-p2p/reef-p2p-parallel: increment upgrade to 18.2.2
Instead of installing 18.2.0, which still contains the osdmap crc bug tracked
in https://tracker.ceph.com/issues/63389, we should install v18.2.2 since this contains
the fix. Then, we upgrade to reef_latest. In this scenario, we do not expect to see the
crc bug. If we test any upgrade path before that, we will hit the warning and the test will fail.
Fixes: https://tracker.ceph.com/issues/66505
Signed-off-by: Laura Flores <lflores@ibm.com>
Casey Bodley [Wed, 26 Jun 2024 16:11:10 +0000 (12:11 -0400)]
qa/rgw/upgrade/pacific: remove centos_8.stream.yaml and rely on ubuntu_20.04.yaml
We can't test this pacific->reef upgrade path on centos because pacific doesn't
have centos 9 builds, and reef no longer has centos 8 builds. Only test
this upgrade on ubuntu focal, which is still supported for both releases.
This commit targets the reef branch directly because this rgw/upgrade/pacific
suite no longer exists on the main and squid branches.
Adam King [Fri, 14 Jun 2024 15:59:27 +0000 (11:59 -0400)]
qa/crimson-rados: remove centos 8 symlinks
As we're trying to drop centos 8 from the distros we
test on, these symlinks are now dead and need to be
cleaned up. In main, there was no replacement for
these symlinks (it just relies on the
crimson-supported-all-distro dir for its distro),
so I'm just removing them here.
Adam King [Fri, 7 Jun 2024 17:36:31 +0000 (13:36 -0400)]
qa/distros: add ubuntu 22.04 for containerized tests
Partial backport of 0fa3eb67387eaf403b5a6e716a81582949dcecf1
that adds the symlinks for the containerized tests to use
ubuntu 22.04 but leaves out the part dropping ubuntu 20.04
Adam King [Mon, 11 Dec 2023 20:44:30 +0000 (15:44 -0500)]
qa/cephadm: fix iscsi pids limit check for centos 9
Centos 9 uses cgroups v2, which has a slightly
different file location for pids.max. This commit
updates the test to also check the new location
so the test can pass on centos 9.
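A check that tolerates both layouts might look like the sketch below. The two relative paths are assumptions based on the usual cgroup v1 and v2 mount layouts, not paths quoted from the test, and read_pids_max is a hypothetical helper.

```python
from pathlib import Path

# Assumed locations: the first is the typical cgroup v1 layout, the
# second the typical cgroup v2 (unified hierarchy) layout.
PIDS_MAX_CANDIDATES = (
    "sys/fs/cgroup/pids/pids.max",
    "sys/fs/cgroup/pids.max",
)


def read_pids_max(root="/", candidates=PIDS_MAX_CANDIDATES):
    """Return the contents of the first pids.max file found, else None."""
    for rel in candidates:
        path = Path(root) / rel
        if path.is_file():
            return path.read_text().strip()
    return None
```

Taking the root as a parameter lets the lookup be exercised against a temporary directory instead of the live /sys tree.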
Adam King [Mon, 11 Dec 2023 18:59:42 +0000 (13:59 -0500)]
qa/cephadm: use quincy for add-repo test
There are no centos 9 builds for octopus, so if we
want to start testing on centos 9 as a distro, we need
the add-repo test to be done on a newer release
for which there are actual builds.
the subsuite had a supported-all-distro$/ subdirectory, but that only
contained centos_8.yaml. qa/tasks/rabbitmq.py is hardcoded to use 'yum'
and rpm packages, so replace supported-all-distro$ with a link to
centos_latest.yaml
Zac Dover [Thu, 27 Jun 2024 18:09:50 +0000 (04:09 +1000)]
cephadm: use importlib.metadata for querying ceph_iscsi's version
Use importlib.metadata for querying ceph_iscsi's version and fall back to
pkg_resources, as the former is only available in Python 3.8 and later, while
the latter is deprecated.
Refs https://tracker.ceph.com/issues/66201
This commit is functionally equivalent to a Reef-targeted backport of
https://github.com/ceph/ceph/pull/57685.
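The import fallback described in this commit follows a common pattern, sketched below. This is not the code from the pull request: get_package_version is a hypothetical helper, and returning None for a missing package is a choice made for the example.

```python
def get_package_version(name):
    """Return the installed version of `name`, or None if it is not found.

    Prefers importlib.metadata (Python 3.8+) and falls back to the
    deprecated pkg_resources only when importlib.metadata is unavailable.
    """
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        version = None

    if version is not None:
        try:
            return version(name)
        except PackageNotFoundError:
            return None

    try:
        import pkg_resources
    except ImportError:
        return None
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


print(get_package_version("ceph_iscsi"))
```

Only the import is guarded by the ImportError handler, so a genuine lookup failure on a modern interpreter does not silently fall through to pkg_resources.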
Repair the link to cephfs-shell.rst in doc/cephfs/cephfs-shell.rst that
was broken in https://github.com/ceph/ceph/pull/41165/ when
doc/cephfs/cephfs-shell.rst was moved to doc/man/8/cephfs-shell.rst.
This commit is made in response to a request by Lander Duncan that was
made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Dhairya Parmar [Mon, 6 Nov 2023 14:24:20 +0000 (19:54 +0530)]
qa: refactor client upgrade yamls and other minor touchups
* start testing new_ops and stress_tests with both drivers (i.e. fuse and kclient);
therefore moved 0-clients/ from tasks/3-workload/new_ops/ to tasks/ and renamed it to
2-clients/
* since new_ops/ and stress_tests/ now share the common upgrade yaml, moved the
test yamls (in stress_tests/1-tests) directly under 3-workload/stress_tests/
* renamed 1-client-sanity.yaml in new_ops/ to newops.yaml
Nizamudeen A [Wed, 26 Jun 2024 13:22:40 +0000 (18:52 +0530)]
mgr/dashboard: fix clone async validators with different groups
Provide a way to dynamically update the async validator based on the
selector field, so that when the selected value changes, the dependent
field (such as the clone name) is validated again against the new value.