Ilya Dryomov [Mon, 6 May 2024 06:16:01 +0000 (08:16 +0200)]
qa/workunits/rbd: wait for replaying status in bootstrap tests
wait_for_replay_complete() doesn't wait for image status to get
updated. This didn't matter previously because these tests are run on
two different pools and nothing else was following.
Zac Dover [Tue, 28 May 2024 16:27:53 +0000 (02:27 +1000)]
doc/dev: add note about intro of perf counters
Add a note to the "perf counter" section of doc/dev/perf_counters.rst
that explains that this feature was introduced in the Reef release of
Ceph. This note will prevent us from accidentally backporting
perf-counter-related PRs to Quincy.
Rishabh Dave [Mon, 27 May 2024 19:37:35 +0000 (01:07 +0530)]
doc/developer_guide: update doc about installing teuthology
There are 2 more ways to install teuthology. Approach with boostrap
script is easier and more convenient while other approach is more
elaborate but manual, document both of them. Don't delete the currently
documented approach because it lets users install teuthology
conveniently in a custom virtual environment. So, keep all three.
Zac Dover [Mon, 27 May 2024 11:09:40 +0000 (21:09 +1000)]
doc/cephfs: s/subvolumegroups/subvolume groups
Use the term "subvolume groups" instead of "subvolumegroups" where the
term appears in plain English. The string "subvolumegroups" is correct
in commands, and remains unchanged.
Also add formatting to command output, to make clearer that the output
is indeed output.
Zac Dover [Tue, 28 May 2024 00:49:35 +0000 (10:49 +1000)]
doc/dev: add target links to perf_counters.rst
Add target links to perf_counters.rst, to remedy the failure to backport
the docs changes in https://github.com/ceph/ceph/pull/53003.
(https://github.com/ceph/ceph/pull/53003 mixed code and docs changes, so
it is understandable why the backport was not achieved back in October,
when the merge to main occurred.)
Ilya Dryomov [Thu, 23 May 2024 16:15:08 +0000 (18:15 +0200)]
.github: expand tests label to all files under qa
The test job definition under qa/suites is an integral part of almost
any test. Often, the test logic is split between the task or workunit
and respective snippet(s) under qa/suites.
Other files under qa are less used, but still related to nothing but
testing, so just add the label on all of it.
Zac Dover [Mon, 20 May 2024 11:55:16 +0000 (21:55 +1000)]
doc/cephfs: edit "Cloning Snapshots" in fs-volumes.rst
Edit the "Cloning Snapshots" section in doc/cephfs/fs-volumes.rst. This
commit represents only a grammar pass. A future commit (and future PR)
will separate this section into subsections by command.
Zac Dover [Mon, 20 May 2024 06:29:44 +0000 (16:29 +1000)]
doc/cephfs: separate commands into sections
Separate commands so that each command has its own subsection in the
section "FS Subvolumes" in the file doc/cephfs/fs-volumes.rst.
Previously, the list of commands for manipulating subvolumes was one
long, unbroken list and the beginning of one section could easily be
mistaken for the end of the previous section.
Matthew Vernon [Wed, 22 May 2024 15:31:33 +0000 (16:31 +0100)]
doc: clarify use of location: in host spec
It wasn't clear that you can specify more than one element of the CRUSH hierarchy in a spec file, nor that it might be useful to do so (e.g. to ensure the host ends up beneath the default root).
So update the text to make it clearer, and similarly the example.
Dan Mick [Wed, 22 May 2024 22:25:51 +0000 (15:25 -0700)]
doc/dev/release-process.rst: note new 'project' arguments
Support added to the release scripts (from ceph-build.git) to
work for ceph-iscsi, so 'project' must be passed to these scripts,
and will appear in the prerelease pathnames. See also
https://github.com/ceph/ceph-build/pull/2243 and
https://github.com/ceph/ceph-container/pull/2210
Ilya Dryomov [Sun, 12 May 2024 09:15:36 +0000 (11:15 +0200)]
qa/suites/krbd: drop pre-single-major test
Single-major mapping scheme was introduced in 2014 and became the
default in 2017. It's getting increasingly difficult to build and,
more importantly, to boot a 10 year old kernel with recent userspace
(systemd, etc). If someone is still running such a kernel, it's
really unlikely that they would have the most recent rbd CLI tool
installed.
Zac Dover [Sun, 12 May 2024 01:39:34 +0000 (11:39 +1000)]
doc/cephfs: edit fs-volumes.rst (1 of x) followup
Include the suggestions for improving doc/cephfs/fs-volumes.rst made by
Anthony D'Atri here
https://github.com/ceph/ceph/pull/57415#discussion_r1597362110
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit cb700d804b4390fd9f55444dcfc04dfebac3a1bf)
Patrick Donnelly [Sat, 11 May 2024 01:27:08 +0000 (21:27 -0400)]
Merge PR #57343 into reef
* refs/pull/57343/head:
reef: qa: do not use `fs authorize` for two fs
PendingReleaseNotes: add note on the client incompatibility health warning and feature bit
doc/cephfs: add client_mds_auth_caps client feature bit
doc/cephfs: add missing client feature bits
doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
mds: raise health warning if client lacks feature for root_squash
mon/MDSMonitor: add note about missing metadata inclusion
mds: check relevant caps for fs include root_squash
mds: refactor out fs_name match in MDSAuthCaps
qa: test for root_squash with multiple caps
qa: pass kwargs to mount from remount
qa: simplify update_attrs and only update relevant keys
client: allow overriding client features
Tested-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
mds: raise health warning if client lacks feature for root_squash
Rather than evict all clients lacking this feature bit, raise a health error
that pushes the administrator to address it. This avoids the surprise of having
all affected clients suddenly evicted in the cluster.
Fixes: https://tracker.ceph.com/issues/65733 Fixes: 954ed30 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 66ff5c9fc8d4664f18b2fa462e96e5548c35951f)
Conflicts:
src/messages/MMDSBeacon.h: missing health beacon type
mon/MDSMonitor: add note about missing metadata inclusion
There is a "client_count" metadata on the health warning that apparently was
intended to be used for aggregating warnings but never was. Add a TODO item for
that.
mds: check relevant caps for fs include root_squash
When denying client reconnects because the MDS caps include root_squash and the
client features do not include CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, ensure those
caps are only for the file system the MDS is joined to.
Fixes: https://tracker.ceph.com/issues/65733 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f79ae86f2c23388f6ecc3177764735e071998e09)
John Mulligan [Fri, 29 Mar 2024 18:04:33 +0000 (14:04 -0400)]
ceph.spec.in: remove command-with-macro line
A comment clearly left as a breadcrumb for a node-proxy manpage is
causing (intermittent) build failures. Remove the line and hope
the manpage is added if/when appropriate.
Fixes: 0dd73643649ddc2366e60de4fe6c078b6e112091 Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 5f25005dfbff51531989d121f26ecae308409356)
rbd-mirror: remove callout when destroying pool replayer
If a pool replayer is removed in an error state (e.g. after failing to
connect to the remote cluster), its callout should be removed as well.
Otherwise, the error would persist causing "daemon health: ERROR"
status to be reported even after a new pool replayer is created and
started successfully.
rbd-mirror: shut down and remove pool replayer if peer changes
The code in Mirror::update_pool_replayers() responsible for shutting
down and removing stale pool replayers kicks in only in case the peer
is removed, but not if the peer changes. However, the code responsible
for (re)starting pool replayers in the same method _does_ create and
start a new pool replayer in that case. As a result, we can end up
with nearly identical pool replayers running at the same time, hogging
OS resources and confusing instance_id tracking logic and mirror status
reporting at the very least.
The root cause is that PeerSpec is matched normally (i.e. based on all
fields) when it comes to m_pool_replayers, and based only on UUID when
it comes to pool_peers. This was missed in commit 5463e1a1e1b7
("rbd-mirror: extract optional peer mon_host/key values from MON").