mon: validate SERVER_REEF on set-require-min-compat-client
Unit testing
-------------
```
[rzarzynski@o06 build]$ bin/unittest_features
...
[ RUN ] features.release_features
1 argonaut features 0x40000 looks like argonaut
2 bobtail features 0x40000 looks like argonaut
3 cuttlefish features 0x40000 looks like argonaut
4 dumpling features 0x42040000 looks like dumpling
5 emperor features 0x42040000 looks like dumpling
6 firefly features 0x20842040000 looks like firefly
7 giant features 0x20842040000 looks like firefly
8 hammer features 0x1020842040000 looks like hammer
9 infernalis features 0x1020842040000 looks like hammer
10 jewel features 0x401020842040000 looks like jewel
11 kraken features 0xc01020842040000 looks like kraken
12 luminous features 0xe01020842240000 looks like luminous
13 mimic features 0xe01020842240000 looks like luminous
14 nautilus features 0xe01020842240000 looks like luminous
15 octopus features 0xe01020842240000 looks like luminous
16 pacific features 0xe01020842240000 looks like luminous
17 quincy features 0xe01020842240000 looks like luminous
18 reef features 0xe010208d2240000 looks like reef
19 squid features 0xe010208d2240000 looks like reef
[ OK ] features.release_features (0 ms)
```
Manual testing
--------------
\### 'quincy` client connected to `main` cluster
There was `ceph -w` from `quincy` running in the background.
```
[rzarzynski@o06 build]$ bin/ceph osd set-require-min-compat-client reef
Error EPERM: cannot set require_min_compat_client to reef: 1 connected client(s) look like luminous (missing 0x80000000); add --yes-i-really-mean-it to do it anyway
```
mon, osd, *: expose upmap-primary in OSDMap::get_features()
This is a minimal fix to ensure only peers understanding
`pg-upmap-primary` are able to connect, and thus to exclude
the possibility of running into the `pg_upmap_primaries.empty()`
assertion in encoders.
Fixes for other problems will follow up.
The intention is to ship this patch in the very next minor
release of reef.
Manual testing
--------------
\### start using upmap-primar is presence of `quincy` client
NOTE: incompatible clients aren't disconnected but this is
known and expected as we lack the machinery.
\### `main` client is still able to connect
```
[rzarzynski@o06 build]$ bin/ceph -w
cluster:
id: d570a7cd-84ca-4fd0-aafb-80138762c6af
health: HEALTH_WARN
11 mgr modules have failed dependencies
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum a (age 64m)
mgr: x(active, since 64m)
osd: 3 osds: 3 up (since 64m), 3 in (since 64m)
\### `quincy` client may connect again
```
[rzarzynski@o06 build-quincy]$ bin/ceph -s -c /home/rzarzynski/ceph2/build/ceph.conf
cluster:
id: d570a7cd-84ca-4fd0-aafb-80138762c6af
health: HEALTH_WARN
11 mgr modules have failed dependencies
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum a (age 77m)
mgr: x(active, since 77m)
osd: 3 osds: 3 up (since 76m), 3 in (since 76m)
Zac Dover [Tue, 28 May 2024 00:49:35 +0000 (10:49 +1000)]
doc/dev: add target links to perf_counters.rst
Add target links to perf_counters.rst, to remedy the failure to backport
the docs changes in https://github.com/ceph/ceph/pull/53003.
(https://github.com/ceph/ceph/pull/53003 mixed code and docs changes, so
it is understandable why the backport was not achieved back in October,
when the merge to main occurred.)
Ilya Dryomov [Thu, 23 May 2024 16:15:08 +0000 (18:15 +0200)]
.github: expand tests label to all files under qa
The test job definition under qa/suites is an integral part of almost
any test. Often, the test logic is split between the task or workunit
and respective snippet(s) under qa/suites.
Other files under qa are less used, but still related to nothing but
testing, so just add the label on all of it.
Zac Dover [Mon, 20 May 2024 11:55:16 +0000 (21:55 +1000)]
doc/cephfs: edit "Cloning Snapshots" in fs-volumes.rst
Edit the "Cloning Snapshots" section in doc/cephfs/fs-volumes.rst. This
commit represents only a grammar pass. A future commit (and future PR)
will separate this section into subsections by command.
Zac Dover [Mon, 20 May 2024 06:29:44 +0000 (16:29 +1000)]
doc/cephfs: separate commands into sections
Separate commands so that each command has its own subsection in the
section "FS Subvolumes" in the file doc/cephfs/fs-volumes.rst.
Previously, the list of commands for manipulating subvolumes was one
long, unbroken list and the beginning of one section could easily be
mistaken for the end of the previous section.
Matthew Vernon [Wed, 22 May 2024 15:31:33 +0000 (16:31 +0100)]
doc: clarify use of location: in host spec
It wasn't clear that you can specify more than one element of the CRUSH hierarchy in a spec file, nor that it might be useful to do so (e.g. to ensure the host ends up beneath the default root).
So update the text to make it clearer, and similarly the example.
Dan Mick [Wed, 22 May 2024 22:25:51 +0000 (15:25 -0700)]
doc/dev/release-process.rst: note new 'project' arguments
Support added to the release scripts (from ceph-build.git) to
work for ceph-iscsi, so 'project' must be passed to these scripts,
and will appear in the prerelease pathnames. See also
https://github.com/ceph/ceph-build/pull/2243 and
https://github.com/ceph/ceph-container/pull/2210
Ilya Dryomov [Sun, 12 May 2024 09:15:36 +0000 (11:15 +0200)]
qa/suites/krbd: drop pre-single-major test
Single-major mapping scheme was introduced in 2014 and became the
default in 2017. It's getting increasingly difficult to build and,
more importantly, to boot a 10 year old kernel with recent userspace
(systemd, etc). If someone is still running such a kernel, it's
really unlikely that they would have the most recent rbd CLI tool
installed.
Zac Dover [Sun, 12 May 2024 01:39:34 +0000 (11:39 +1000)]
doc/cephfs: edit fs-volumes.rst (1 of x) followup
Include the suggestions for improving doc/cephfs/fs-volumes.rst made by
Anthony D'Atri here
https://github.com/ceph/ceph/pull/57415#discussion_r1597362110
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit cb700d804b4390fd9f55444dcfc04dfebac3a1bf)
Patrick Donnelly [Sat, 11 May 2024 01:27:08 +0000 (21:27 -0400)]
Merge PR #57343 into reef
* refs/pull/57343/head:
reef: qa: do not use `fs authorize` for two fs
PendingReleaseNotes: add note on the client incompatibility health warning and feature bit
doc/cephfs: add client_mds_auth_caps client feature bit
doc/cephfs: add missing client feature bits
doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
mds: raise health warning if client lacks feature for root_squash
mon/MDSMonitor: add note about missing metadata inclusion
mds: check relevant caps for fs include root_squash
mds: refactor out fs_name match in MDSAuthCaps
qa: test for root_squash with multiple caps
qa: pass kwargs to mount from remount
qa: simplify update_attrs and only update relevant keys
client: allow overriding client features
Tested-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
mds: raise health warning if client lacks feature for root_squash
Rather than evict all clients lacking this feature bit, raise a health error
that pushes the administrator to address it. This avoids the surprise of having
all affected clients suddenly evicted in the cluster.
Fixes: https://tracker.ceph.com/issues/65733 Fixes: 954ed30 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 66ff5c9fc8d4664f18b2fa462e96e5548c35951f)
Conflicts:
src/messages/MMDSBeacon.h: missing health beacon type
mon/MDSMonitor: add note about missing metadata inclusion
There is a "client_count" metadata on the health warning that apparently was
intended to be used for aggregating warnings but never was. Add a TODO item for
that.
mds: check relevant caps for fs include root_squash
When denying client reconnects because the MDS caps include root_squash and the
client features do not include CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, ensure those
caps are only for the file system the MDS is joined to.
Fixes: https://tracker.ceph.com/issues/65733 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f79ae86f2c23388f6ecc3177764735e071998e09)
John Mulligan [Fri, 29 Mar 2024 18:04:33 +0000 (14:04 -0400)]
ceph.spec.in: remove command-with-macro line
A comment clearly left as a breadcrumb for a node-proxy manpage is
causing (intermittent) build failures. Remove the line and hope
the manpage is added if/when appropriate.
Fixes: 0dd73643649ddc2366e60de4fe6c078b6e112091 Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 5f25005dfbff51531989d121f26ecae308409356)
Zac Dover [Thu, 2 May 2024 12:54:25 +0000 (22:54 +1000)]
doc/cephadm: Reef default images procedure
Address Adam King's request for version-specific
cephadm-container-image-retrieval procedures, which he requested here:
https://github.com/ceph/ceph/pull/57208#discussion_r1586614140
Co-authored-by: Adam King <adking@redhat.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
Add a list of default monitor images to the documentation. This commit
is made in response to a request from Eugen Block, and is made using the
information developed by Mr Block here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/QGC66QIFBKRTPZAQMQEYFXOGZJ7RLWBN/.
Explain that an error message received in response to
"redirect_resolve_ip_addr True" might be caused by having an
insufficiently recent release of Ceph running in your cluster.