Patrick Donnelly [Thu, 29 May 2025 15:04:00 +0000 (11:04 -0400)]
mon: provide emergency mechanism to use mon keyring
If they key is lost for the `mon.` credential, it's very inconvenient to get it
out of the "auth" database in the mon store. So, allow the operator to create a
new keyring for the mons and use it instead to get mons in quorum again.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Thu, 29 May 2025 14:13:40 +0000 (10:13 -0400)]
mon: cycle through keyring or key_server for auth with mons
After commit `mon: use key_server for looking up mon key`, the mons will now
use the key_server to lookup the `mon.` key when a mon connects. We need to
make the mons prefer using that key with authenticating during probing other
mons. However, the protocol doesn't allow falling back to another key. This is
necessary if what's in the key_server database is out-of-date due to an earlier
loss of quorum. In that case, the operator should update the local keyring file
and the mon should give that a try if auth fails.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Thu, 29 May 2025 14:07:52 +0000 (10:07 -0400)]
mon: use key_server for looking up mon key
Note: the key_server is already configured to fallback (via
KeyServerData::extra_secrets) to the Monitor::keyring which is sourced from the
mon's keyring file.
Using the Monitor::key_server allows us to maintain the mon's secret in the
auth database alongside all other secrets. This makes rotating the mons' keys
the same as all other entities in Ceph. Before this, to rotate the mons' key
you would need to turn off all montitors and then rotate the key files
manually. This is obviously disruptive since it's not a rolling upgrade.
If the key is sourced from the Monitor::key_server, then the key can be rotated
and all mons are aware of the new key. The mons can then proceed to restart as
needed in a non-disruptive fashion.
A followup commit will cleanup the monitor to try either its local keyring key
or the key in the key_server (if present) when authenticating with other mons.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Wed, 26 Mar 2025 02:05:09 +0000 (22:05 -0400)]
tools/ceph_authtool: allow configuring a preferred cipher
This makes testing easier as we can configure all keys in the cluster to be the
given "old" type without modifying each location that ceph-authtool is used.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
mon/MonClient: wipe secrets and invalidate tickets on auth epoch change
* This causes service daemons to drop all known service tickets and request new
ones from the auth server.
* This causes the clients (and service daemons) to request new tickets from the
auth server which will include tickets signed with the new service keys.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
This will be used to indicate to clients / service daemons that the auth
service keys have been rotated. Clients and service daemons are expected to
invalidate their tickets and reauth. Service daemons should wipe their service
keys.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Wed, 26 Mar 2025 01:59:34 +0000 (21:59 -0400)]
mon/AuthMonitor: add dump-keys and wipe-rotating-service-keys
`auth dump-keys` allows examining the key types for each entity and also the
rotating session keys. This lets us confirm key upgrades are done as expected.
`wipe-rotating-service-keys` clears out existing non-auth service keys so that we do not
need to wait for the rotating key expiration. It is not disruptive so long as clients
renew their tickets when prompted by the auth epoch change.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Matan Breizman [Mon, 9 Jun 2025 12:07:49 +0000 (12:07 +0000)]
include/common_fwd: Include Crypto classes
CryptoManager::cct is now used in CephContext ctor. To provide this
defintion
any ceph_context.cc target must also include Crypto.cc.
crimson-alien-common library which only had ceph_context.cc must now
also include Crypto.cc.
However, the fact that crimson-common also includes Crypto.cc would
cause multiple defintions
to any Crypto classes methods.
To resolve this, let's wrap all Crypto classes with TOPNSPC::common that
would be forwarded using common_fwd logic.
Yehuda Sadeh [Wed, 28 May 2025 19:51:19 +0000 (15:51 -0400)]
cephx: sign messages using hmac_sha256
if key type is newer than the original AES, calculate message
hash by using HMAC-SHA256.
We cannot use plain aes256k like we do with the aes key because
of the confounder. The other option would be to inject a
confounder, but that would weaken the cipher.
Yehuda Sadeh [Fri, 7 Mar 2025 18:20:58 +0000 (13:20 -0500)]
auth: add a configurable to control rotating keys cipher type
auth_service_cipher: a mon configurable that determines what type of cipher
the rotating keys are using. The configurable can change at runtime. Note
that the change does not invalidate existing keys, these would expire
based on their ttl.
Yehuda Sadeh [Thu, 27 Feb 2025 21:14:06 +0000 (16:14 -0500)]
auth/cephx: modify client + server challenges hashing
This applies when using ciphers that are not the original
AES-128 one. Use the hmac-sha256 hash now. With AES256KRB5
the original method of encrypting the combined challenges
doesn't work as the confounder randomizes the result.
Yehuda Sadeh [Thu, 27 Feb 2025 16:55:37 +0000 (11:55 -0500)]
ceph-authtool: support --key-type param
Also move the encryption handlers out of the ceph_context.
Handlers are now returned as a shared_ptr, to support the
creation of new handlers with different params (such as
the usage param).
Ilya Dryomov [Fri, 23 Jan 2026 13:48:53 +0000 (14:48 +0100)]
qa: don't assume that /dev/sda or /dev/vda is present in unmap.t
Instead of hard-coding the block device name, use the block device that
is backing the filesystem that the test is running on. We can be quite
sure it won't be an RBD device ;)
Disable OSD bench from benchmarking the OSDs for teuthology tests. This is to
help prevent a cluster warning pertaining to the IOPS value not lying within
a typical threshold range from being raised.
The tests can rely on the built-in static values as defined by
osd_mclock_max_capacity_iops_[ssd|hdd] which should be good enough.
Ilya Dryomov [Wed, 21 Jan 2026 18:41:41 +0000 (19:41 +0100)]
qa: krbd_blkroset.t: eliminate a race in the open_count test
Even at QD=1, dd may take less than 10 seconds to work its way to the
end of a 10M image, producing "No space left on device" error instead
of the expected "Operation not permitted" error which is supposed to
arise from the device getting marked read-only while opened.
With https://github.com/ceph/ceph-build/pull/2497 merged we no loger
build Tentacle+Crimson regularly. As Crimson no longer backport changes
into Tentacle, there's no reason to keep testing it.
Matan Breizman [Thu, 22 Jan 2026 10:00:25 +0000 (12:00 +0200)]
container/build.sh: Use dedicated debug tags
https://github.com/ceph/ceph-build/pull/2497 introduced a debug flavor.
This seems to cause conflicts with the image being pushed to quay as one
of the flavors might override the other.
Tag debug build containers explicitly.
Alternative solution would be to skip debug containers all together.
However. these might be useful for development purposes.
Note, prune-quay might also need to be updated once this is merged.
Kefu Chai [Thu, 22 Jan 2026 03:57:37 +0000 (11:57 +0800)]
cmake: fix undefined PY_LDFLAGS in distutils_install_cython_module
The distutils_install_cython_module() function was using ${PY_LDFLAGS}
without defining it, causing the linker to fail with:
/opt/rh/gcc-toolset-13/root/usr/libexec/gcc/x86_64-redhat-linux/13/ld:
cannot find -lrados: No such file or directory
This bug was introduced in commit d22734f6cb0 which changed:
set(ENV{LDFLAGS} "-L${CMAKE_LIBRARY_OUTPUT_DIRECTORY}")
to:
set(ENV{LDFLAGS} "${PY_LDFLAGS}")
However, PY_LDFLAGS was only defined in distutils_add_cython_module(),
not in distutils_install_cython_module(). This meant that during the
install phase, LDFLAGS was set to an empty string, and the linker
couldn't find librados.so and other Ceph libraries in the build
directory.
The bug was exposed by commit 719b74984605b490f23004eb41583a22c934c5fb
which changed rados.pxd to use C preprocessor conditionals (#ifdef
BUILD_DOC) instead of Cython's compile-time IF statements. This meant
the build now required proper linking during the install phase.
Fix by defining PY_LDFLAGS in distutils_install_cython_module():
Ville Ojamo [Fri, 16 Jan 2026 09:43:31 +0000 (16:43 +0700)]
doc/radosgw: change all intra-docs links to use ref (2 of 6)
Part 2 of 6 to make backporting easier. Depends on part 1.
Use the the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for links targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Fri, 16 Jan 2026 08:55:27 +0000 (15:55 +0700)]
doc/radosgw: change all intra-docs links to use ref (1 of 6)
Part 1 of 6 to make backporting easier. Many of the following parts
depend on this.
Use the the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for links targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Use Ceph Object Gateway consistently in multisite.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ronen Friedman [Wed, 21 Jan 2026 12:37:24 +0000 (14:37 +0200)]
Merge pull request #66626 from ronen-fr/wip-rf-aborthp-justdoc
doc/ceph.rst: scrub-related 'tell pgid' commands
Related to https://github.com/ceph/ceph/pull/66515 Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com> Reviewed-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>