Edit the section called "Is mount helper present?", the title of which
prior to this commit was "Is mount helper is present?". Other small
disambiguating improvements have been made to the text in the section.
An unselectable prompt has been added before a command.
Ilya Dryomov [Thu, 20 Jun 2024 19:13:56 +0000 (21:13 +0200)]
librbd: make diff-iterate in fast-diff mode aware of encryption
diff-iterate wasn't updated when librbd was being prepared to support
encryption in commit 8d6a47933269 ("librbd: add crypto image dispatch
layer"). This is even noted in [1]:
> The two places I skipped for now are DiffIterate and TrimRequest.
CryptoImageDispatch has since been removed, but diff-iterate in
fast-diff mode is still unaware of encryption and just assumes that all
offsets are raw. This means that the callback gets invoked with
incorrect image offsets when encryption is loaded. For example, for
a LUKS1-formatted image with some data at offsets 0 and 20971520,
diff-iterate with encryption loaded reports
as "exists". For any piece of code that is using diff-iterate to
optimize block-by-block processing (e.g. copy an encrypted source image
to a differently-encrypted destination image), this is fatal: it would
skip processing block 20971520 which has data and instead process block 25165824 which doesn't have any data and was to be skipped, producing
a corrupted destination image.
Currently we are laying data only at the beginning of an object.
Extend the skeletons to write to three different offsets in the middle
and also at the end of the object.
Separately, make C and C++ API test variants slightly different in
terms of offsets being targeted to not go through exactly the same
scenario twice.
After rollback started being tested in commit b3977c53c930
("test/librbd: make rollback in TestGroup.add_snapshot{,PP}
meaningful"), these tests can fail on comparing post-rollback
data to expected data if run with exclusive lock disabled.
This doesn't occur with exclusive lock enabled because the RBD
cache gets invalidated implicitly before releasing the lock.
While at it, pass LIBRADOS_OP_FLAG_FADVISE_FUA to avoid relying
on any cache settings that happen to be in effect.
Ilya Dryomov [Fri, 14 Jun 2024 12:04:39 +0000 (14:04 +0200)]
librbd: disallow group snap rollback if memberships don't match
Before proceeding with group rollback, ensure that the set of images
that took part in the group snapshot matches the set of images that are
currently part of the group. Otherwise, because we preserve affected
snapshots when an image is removed from the group, data loss can ensue
where an image gets rolled back while part of another group or not part
of any group but long repurposed for something else.
Similarly, ensure that the group snapshot is complete.
Document how to manually pass the search domain to "mon_dns_srv_name" in
doc/rados/configuration/mon-lookup-dns.rst.
This commit is made in response to a request by Lander Duncan that was made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 98938a0312dd0c8e0b293ed9aa2e0760cc9619fa)
Laura Flores [Fri, 14 Jun 2024 21:24:20 +0000 (16:24 -0500)]
qa/suites/upgrade/reef-p2p/reef-p2p-parallel: increment upgrade to 18.2.2
Instead of installing 18.2.0, which still contains the osdmap crc bug tracked
in https://tracker.ceph.com/issues/63389, we should install v18.2.2 since this contains
the fix. Then, we upgrade to reef_latest. In this scenario, we do not expect to see the
crc bug. If we test any upgrade path before that, we will hit the warning and the test will fail.
Fixes: https://tracker.ceph.com/issues/66505 Signed-off-by: Laura Flores <lflores@ibm.com>
Casey Bodley [Wed, 26 Jun 2024 16:11:10 +0000 (12:11 -0400)]
qa/rgw/upgrade/pacific: remove centos_8.stream.yaml and rely on ubuntu_20.04.yaml
we can't test this pacific->reef upgrade path on centos because pacific doesn't
have centos 9 builds, and reef no longer has centos 8 builds. only test
this upgrade on ubuntu focal which is still supported for both releases
this commit targets the reef branch directly because this rgw/upgrade/pacific
suite no longer exists on main and squid branches
Adam King [Fri, 14 Jun 2024 15:59:27 +0000 (11:59 -0400)]
qa/crimson-rados: remove centos 8 symlinks
As we're trying to drop centos 8 from the distros we
test on these symlinks are now dead and need to be
cleaned up. In main, there was no replacement for
these symlinks (it just relies on the
crimson-supposted-all-distro dir for its distro)
so I'm just removing them here.
Adam King [Fri, 7 Jun 2024 17:36:31 +0000 (13:36 -0400)]
qa/distros: add ubuntu 22.04 for containerized tests
Partial backport of 0fa3eb67387eaf403b5a6e716a81582949dcecf1
that adds the symlinks for the containerized tests to use
ubuntu 22.04 but leaves out the part dropping ubuntu 20.04
Adam King [Mon, 11 Dec 2023 20:44:30 +0000 (15:44 -0500)]
qa/cephadm: fix iscsi pids limit check for centos 9
Centos 9 uses cgroups v2 which has a slightly
different file location for the pids.max. This commit
updates the test to also check the new location
so the test can pass on centos 9
Adam King [Mon, 11 Dec 2023 18:59:42 +0000 (13:59 -0500)]
qa/cephadm: use quincy for add-repo test
There are no centos 9 build for octopus, so if we
want to start testing on cnetos 9 as a distro we need
the add-repo test to be done on a newer release
for which there are actual builds
the subsuite had a supported-all-distro$/ subdirectory, but that only
contained centos_8.yaml. qa/tasks/rabbitmq.py is hardcoded to use 'yum'
and rpm packages, so replace supported-all-distro$ with a link to
centos_latest.yaml
Repair the link to cephfs-shell.rst in doc/cephfs/cephfs-shell.rst that
was broken in https://github.com/ceph/ceph/pull/41165/ when
doc/cephfs/cephfs-shell.rst was moved to doc/man/8/cephfs-shell.rst.
This commit is made in response to a request by Lander Duncan that was
made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Dhairya Parmar [Mon, 6 Nov 2023 14:24:20 +0000 (19:54 +0530)]
qa: refactor client upgrade yamls and other minor touchups
* start testing new_ops and stress_tests with both the drivers(i.e. fuse and kclient)
therefore moved 0-clients/ from tasks/3-workload/new_ops/ to tasks/ and renamed it to
2-clients/
* since new_ops/ and stress_tests/ now share the common upgrade yaml, moved the
tests yamls(in stress_tests/1-tests) directly under 3-workload/stress_tests/
* renamed 1-client-sanity.yaml in new_ops/ to newops.yaml
Nizamudeen A [Wed, 26 Jun 2024 13:22:40 +0000 (18:52 +0530)]
mgr/dashboard: fix clone async validators with different groups
Providing a way to dynamically update the async validator based on the
selector field so that when the selected value changes, the depended
field like the clone name gets validated again against the new value
Pere Diaz Bou [Wed, 26 Jun 2024 13:57:47 +0000 (15:57 +0200)]
doc/rados: update how to install c++ header files
In this example librados2-devel only install C header files on fedora 40,
therefore I added libradospp-devel to the command to include C++ header files.
Zac Dover [Mon, 24 Jun 2024 10:32:30 +0000 (20:32 +1000)]
doc/rados: edit troubleshooting-osd.rst
Make minor changes to the "Debugging Slow Requests" section of
doc/rados/troubleshooting/troubleshooting-osd.rst in preparation
for an expansion of this section in response to a reqeust from Joel
Davidow.
Nizamudeen A [Fri, 24 May 2024 15:16:17 +0000 (20:46 +0530)]
mgr/dashboard: add dueTime to rgw bucket validator
the unique async validator which checks if the typed bucket is existing
or not in the bucket creation form sends a request to the backend on
each keystroke. Each keystroke will raise an exception if the bucket is
not found.
Nizamudeen A [Fri, 7 Jun 2024 13:49:42 +0000 (19:19 +0530)]
mgr/dashboard: fix edit bucket failing in other selected gateways
even if I select gateway 8002, the bucket policy req seems to go through 8000 and doesn't find the bucket
```
2024-06-07T13:40:33.161+0000 7f563be00700 0 [dashboard DEBUG rest_client] RGW REST API GET req: /hello?policy data: None
2024-06-07T13:40:33.164+0000 7f563be00700 0 [dashboard DEBUG urllib3.connectionpool] http://172.20.0.5:8000 "GET /hello?policy HTTP/1.1" 404 174
2024-06-07T13:40:33.164+0000 7f563be00700 0 [dashboard ERROR rest_client] RGW REST API failed GET req status: 404
2024-06-07T13:40:33.164+0000 7f563be00700 0 [dashboard ERROR exception] Internal Server Error
Traceback (most recent call last):
File "/ceph/src/pybind/mgr/dashboard/services/exception.py", line 47, in dashboard_exception_handler
return handler(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
return self.callable(*self.args, **self.kwargs)
File "/ceph/src/pybind/mgr/dashboard/controllers/_base_controller.py", line 263, in inner
ret = func(*args, **kwargs)
File "/ceph/src/pybind/mgr/dashboard/controllers/_rest_controller.py", line 193, in wrapper
return func(*vpath, **params)
File "/ceph/src/pybind/mgr/dashboard/controllers/rgw.py", line 463, in get
result['bucket_policy'] = self._get_policy(bucket_name)
File "/ceph/src/pybind/mgr/dashboard/controllers/rgw.py", line 381, in _get_policy
return rgw_client.get_bucket_policy(bucket)
File "/ceph/src/pybind/mgr/dashboard/rest_client.py", line 543, in func_wrapper
**kwargs)
File "/ceph/src/pybind/mgr/dashboard/services/rgw_client.py", line 957, in get_bucket_policy
raise e
File "/ceph/src/pybind/mgr/dashboard/services/rgw_client.py", line 949, in get_bucket_policy
request = request()
File "/ceph/src/pybind/mgr/dashboard/rest_client.py", line 325, in __call__
data, raw_content, headers)
File "/ceph/src/pybind/mgr/dashboard/rest_client.py", line 428, in do_request
resp.content)
dashboard.rest_client.RequestException: RGW REST API failed request with status code 404
(b'{"Code":"NoSuchBucket","Message":"","BucketName":"hello","RequestId":"tx0000'
b'0d73bbbad485175ea-0066630dd1-18785-zone1-zg1-realm1","HostId":"18785-zone1-z'
b'g1-realm1-zg1-realm1"}')
```
But for the same bucket the encryption and other req goes through the correct gateway
```
2024-06-07T13:40:32.704+0000 7f563be00700 0 [dashboard DEBUG urllib3.connectionpool] http://172.20.0.5:8002 "GET /hello?versioning HTTP/1.1" 200 2
2024-06-07T13:40:32.745+0000 7f563be00700 0 [dashboard DEBUG rest_client] RGW REST API GET res status: 200 content: {}
2024-06-07T13:40:32.745+0000 7f563be00700 0 [dashboard INFO rgw_client] Found RGW daemon with configuration: host=172.20.0.5, port=8000, ssl=False
2024-06-07T13:40:32.746+0000 7f563be00700 0 [dashboard INFO rgw_client] Found RGW daemon with configuration: host=172.20.0.5, port=8002, ssl=False
2024-06-07T13:40:32.746+0000 7f563be00700 0 [dashboard DEBUG rest_client] RGW REST API GET req: /hello?encryption data: None
2024-06-07T13:40:32.747+0000 7f563be00700 0 [dashboard DEBUG urllib3.connectionpool] http://172.20.0.5:8002 "GET /hello?encr
```
commit [1] introduced a behavior change.
`ceph-volume lvm prepare` used to create VGs/LVs when it was passed partitions
for db and/or wal devices. Since commit [1] has been introduced, it made ceph-volume
consume the partition directly, it doesn't create LV anymore. Although this
doesn't prevent from creating OSDs, this is a behavior change.
Ilya Dryomov [Tue, 11 Jun 2024 16:10:47 +0000 (18:10 +0200)]
librbd: diff-iterate shouldn't crash on an empty byte range
Commit 0b5ba5fedf70 ("librbd/object_map: add support for ranged
diff-iterate") introduced a regression for the case when whole_object
parameter is set to true. Despite DiffRequest being called into and
another DiffIterate potentially being spawned recursively, an empty
byte range previously happened to make it.
Bail on an empty byte range early just like we have always done on an
empty snap id range (i.e. when start and end versions are the same).
After the rollback assert in TestGroup.add_snapshot{,PP} was made
meaningful in the previous commit, it fails in mock tests which means
that rollback has never been exercised properly...
While I confess to not following file->snap_id == CEPH_NOSNAP branch
especially given how file variable is shadowed, it's pretty clear that
get_snap_read() doesn't belong here -- the snapshot selected for reads
has nothing to do with rollback. Replacing it with the rollback snap
ID makes sense of the other branches and makes the tests in question
pass.
Ilya Dryomov [Thu, 13 Jun 2024 14:24:43 +0000 (16:24 +0200)]
test/librbd: make rollback in TestGroup.add_snapshot{,PP} meaningful
The rollback assert doesn't really test anything -- because orig_data
and test_data are written to non-overlapping areas, the test would pass
even if rbd_group_snap_rollback() does nothing (i.e. rollback isn't
performed) as long as the call returns 0.