This was motivated by confusing persistence of some health warnings during
testing of health warnings for cephx upgrades. Some services are only doing
health checks during ::encode_pending and others during ::tick. Make it
consistent.
Patrick Donnelly [Thu, 29 May 2025 15:57:55 +0000 (11:57 -0400)]
auth: check service key is valid before decryption
CryptoKey::empty is the correct mechanism to check for an invalid key (and this
is codified elsewhere, fixed in this commit). Decryption would fail with an
abort if the key handler was unset. This would happen after rotating the "mon."
key and then restarting one of the mons.
Patrick Donnelly [Thu, 29 May 2025 15:04:00 +0000 (11:04 -0400)]
mon: provide emergency mechanism to use mon keyring
If the key is lost for the `mon.` credential, it's very inconvenient to get it
out of the "auth" database in the mon store. So, allow the operator to create a
new keyring for the mons and use it instead to get mons in quorum again.
Patrick Donnelly [Thu, 29 May 2025 14:13:40 +0000 (10:13 -0400)]
mon: cycle through keyring or key_server for auth with mons
After commit `mon: use key_server for looking up mon key`, the mons will now
use the key_server to look up the `mon.` key when a mon connects. We need to
make the mons prefer that key when authenticating while probing other
mons. However, the protocol doesn't allow falling back to another key. This is
necessary if what's in the key_server database is out-of-date due to an earlier
loss of quorum. In that case, the operator should update the local keyring file
and the mon should give that a try if auth fails.
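
The cycling behaviour described above can be sketched as follows. This is a minimal Python illustration rather than Ceph's implementation: `authenticate_with_fallback`, `try_auth`, and the key values are hypothetical names, and each call to `try_auth` stands for a fresh connection attempt, since the protocol itself cannot fall back to another key mid-exchange.

```python
from typing import Callable, Iterable, Optional


def authenticate_with_fallback(
    candidate_keys: Iterable[bytes],
    try_auth: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first key that authenticates, or None if every key fails.

    Each try_auth call models one complete, independent auth attempt.
    """
    for key in candidate_keys:
        if try_auth(key):
            return key
    return None


# Hypothetical scenario: the key_server entry is out of date after a loss of
# quorum, and the operator has updated the local keyring file.
key_server_key = b"stale-key"
keyring_key = b"current-key"


def peer_accepts(key: bytes) -> bool:
    # Stand-in for the remote mon accepting or rejecting the attempt.
    return key == b"current-key"


# Prefer the key_server key, then give the keyring key a try.
chosen = authenticate_with_fallback([key_server_key, keyring_key], peer_accepts)
```

Ordering the candidates with the key_server key first mirrors the stated preference; the keyring key is only tried once that attempt has failed.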
Patrick Donnelly [Thu, 29 May 2025 14:07:52 +0000 (10:07 -0400)]
mon: use key_server for looking up mon key
Note: the key_server is already configured to fall back (via
KeyServerData::extra_secrets) to the Monitor::keyring which is sourced from the
mon's keyring file.
Using the Monitor::key_server allows us to maintain the mon's secret in the
auth database alongside all other secrets. This makes rotating the mons' keys
the same as all other entities in Ceph. Before this, to rotate the mons' key
you would need to turn off all monitors and then rotate the key files
manually. This is obviously disruptive since it's not a rolling upgrade.
If the key is sourced from the Monitor::key_server, then the key can be rotated
and all mons are aware of the new key. The mons can then proceed to restart as
needed in a non-disruptive fashion.
A followup commit will clean up the monitor to try either its local keyring key
or the key in the key_server (if present) when authenticating with other mons.
Patrick Donnelly [Wed, 26 Mar 2025 02:05:09 +0000 (22:05 -0400)]
tools/ceph_authtool: allow configuring a preferred cipher
This makes testing easier as we can configure all keys in the cluster to be the
given "old" type without modifying each location that ceph-authtool is used.
mon/MonClient: wipe secrets and invalidate tickets on auth epoch change
* This causes service daemons to drop all known service tickets and request new
ones from the auth server.
* This causes the clients (and service daemons) to request new tickets from the
auth server which will include tickets signed with the new service keys.
This will be used to indicate to clients / service daemons that the auth
service keys have been rotated. Clients and service daemons are expected to
invalidate their tickets and reauth. Service daemons should wipe their service
keys.
Patrick Donnelly [Wed, 26 Mar 2025 01:59:34 +0000 (21:59 -0400)]
mon/AuthMonitor: add dump-keys and wipe-rotating-service-keys
`auth dump-keys` allows examining the key types for each entity and also the
rotating session keys. This lets us confirm key upgrades are done as expected.
`wipe-rotating-service-keys` clears out existing non-auth service keys so that we do not
need to wait for the rotating key expiration. It is not disruptive so long as clients
renew their tickets when prompted by the auth epoch change.
Matan Breizman [Mon, 9 Jun 2025 12:07:49 +0000 (12:07 +0000)]
include/common_fwd: Include Crypto classes
CryptoManager::cct is now used in the CephContext ctor. To provide this
definition, any ceph_context.cc target must also include Crypto.cc.
The crimson-alien-common library, which only had ceph_context.cc, must now
also include Crypto.cc.
However, the fact that crimson-common also includes Crypto.cc would
cause multiple definitions of the Crypto classes' methods.
To resolve this, let's wrap all Crypto classes with TOPNSPC::common that
would be forwarded using common_fwd logic.
Yehuda Sadeh [Wed, 28 May 2025 19:51:19 +0000 (15:51 -0400)]
cephx: sign messages using hmac_sha256
If the key type is newer than the original AES, calculate the message
hash by using HMAC-SHA256.
We cannot use plain aes256k like we do with the aes key because
of the confounder. The other option would be to inject a
confounder, but that would weaken the cipher.
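
As a rough illustration of HMAC-SHA256 message signing in general (not the cephx wire format), here is a minimal Python sketch. The function names, the key, and the payload are hypothetical; only the standard `hmac` and `hashlib` modules are used.

```python
import hashlib
import hmac


def sign_message(key: bytes, payload: bytes) -> bytes:
    """Return the 32-byte HMAC-SHA256 tag of payload under key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_message(key: bytes, payload: bytes, signature: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_message(key, payload), signature)


# Hypothetical session key and message body, for illustration only.
session_key = b"\x01" * 32
message = b"example message body"
signature = sign_message(session_key, message)
```

Because the tag is a deterministic function of the key and the payload, the receiver can recompute it and compare, with no confounder involved.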
Yehuda Sadeh [Fri, 7 Mar 2025 18:20:58 +0000 (13:20 -0500)]
auth: add a configurable to control rotating keys cipher type
auth_service_cipher: a mon configurable that determines what type of cipher
the rotating keys use. The configurable can change at runtime. Note
that the change does not invalidate existing keys; these expire
based on their TTL.
Yehuda Sadeh [Thu, 27 Feb 2025 21:14:06 +0000 (16:14 -0500)]
auth/cephx: modify client + server challenges hashing
This applies when using ciphers that are not the original
AES-128 one. Use the hmac-sha256 hash now. With AES256KRB5
the original method of encrypting the combined challenges
doesn't work as the confounder randomizes the result.
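
To show why a confounder gets in the way, the following Python sketch contrasts a deterministic HMAC-SHA256 over the combined challenges with a toy transform that mixes in a random prefix. It rests on assumptions: the challenges are treated as 64-bit integers packed little-endian, `confounded_transform` is a simplified stand-in and not AES256KRB5, and all names are hypothetical.

```python
import hashlib
import hmac
import os
import struct


def challenge_proof(key: bytes, server_challenge: int, client_challenge: int) -> bytes:
    """Deterministic proof: both sides compute the same value independently."""
    combined = struct.pack("<QQ", server_challenge, client_challenge)
    return hmac.new(key, combined, hashlib.sha256).digest()


def confounded_transform(key: bytes, server_challenge: int, client_challenge: int) -> bytes:
    """Toy stand-in for a confounder-based cipher (NOT AES256KRB5).

    A fresh random prefix is mixed in on every call, so two parties cannot
    arrive at the same output by computing it separately.
    """
    confounder = os.urandom(16)
    combined = struct.pack("<QQ", server_challenge, client_challenge)
    return confounder + hmac.new(key, confounder + combined, hashlib.sha256).digest()


shared_key = b"\x07" * 32  # hypothetical shared secret
client_side = challenge_proof(shared_key, 1234, 5678)
server_side = challenge_proof(shared_key, 1234, 5678)
```

The client and server values match, which is the property the challenge comparison relies on; the confounded variant yields a different result on every call.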
Yehuda Sadeh [Thu, 27 Feb 2025 16:55:37 +0000 (11:55 -0500)]
ceph-authtool: support --key-type param
Also move the encryption handlers out of the ceph_context.
Handlers are now returned as a shared_ptr, to support the
creation of new handlers with different params (such as
the usage param).
Jon Bailey [Wed, 20 Aug 2025 10:11:09 +0000 (11:11 +0100)]
osd: Reduce the amount of status invalidations when rolling shards forwards during peering
Currently stats invalidations happen during peering when rolling forward shards.
We can reduce this so we only invalidate the stats when we don't have any other shards at the version we want to roll the stats forwards to.
In the cases where we have a shard with the stats at the correct version, we use those stats instead of invalidating.
If we do not have any shards with the correct version of stats, we do the invalidate as before.
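
The selection rule above can be sketched like this. It is a simplified Python model under stated assumptions: `ShardStats`, `roll_forward_stats`, and the integer versions are hypothetical and do not correspond to the OSD's actual types.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ShardStats:
    version: int          # version the stats were last updated at
    num_objects: int      # example statistic
    invalid: bool = False


def roll_forward_stats(
    current: ShardStats,
    target_version: int,
    other_shards: Dict[int, ShardStats],
) -> ShardStats:
    """Reuse stats from a shard already at target_version, else invalidate."""
    for stats in other_shards.values():
        if stats.version == target_version and not stats.invalid:
            # Another shard has stats at the wanted version: copy them.
            return ShardStats(stats.version, stats.num_objects)
    # No shard has usable stats at that version: invalidate, as before.
    return ShardStats(target_version, current.num_objects, invalid=True)


behind = ShardStats(version=10, num_objects=100)
reused = roll_forward_stats(behind, 12, {1: ShardStats(12, 103)})
invalidated = roll_forward_stats(behind, 12, {1: ShardStats(11, 101)})
```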
* Current primary shard has been absent so has missed the latest few writes
* All the recent writes are partial writes that have not updated shard X
* All the recent writes have completed
The authoritative shard is chosen from the set of primary-capable shards
that have the highest last epoch started; these all have log entries
for the recent writes.
The get log shard is chosen from the set of shards that have the highest
last epoch started; this chooses shard X because it is furthest behind.
The primary shard last update is not less than get log shard last
update so this if statement decides that it has a good enough log:
We then proceed through peering using the primary log and the
log from shard X. Neither have details about the recent writes
which are then incorrectly rolled back.
The if statement should be looking at last_update for the
authoritative shard rather than the get_log_shard. The code
would then realize that it needs to get the log from the
authoritative shard first and then have a second pass
where it gets the log from the get log shard.
Peering would then have information about the partial writes
(obtained from the authoritative shard's log) and could correctly
roll these writes forward by deducing that the get_log_shard
didn't have these log entries because they were partial writes.
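
The difference between the two comparisons can be illustrated with a small Python sketch. It is a model only: versions are plain integers, and `primary_log_is_good_enough` and the three `last_update` values are hypothetical, chosen to match the scenario described above.

```python
def primary_log_is_good_enough(primary_last_update: int, reference_last_update: int) -> bool:
    """The primary's log suffices if it is not behind the reference shard."""
    return primary_last_update >= reference_last_update


# Hypothetical scenario: the primary and shard X both missed the recent
# partial writes, while the authoritative shard has log entries for them.
primary_last_update = 10
get_log_shard_last_update = 10   # shard X, furthest behind
auth_shard_last_update = 15      # authoritative shard

# Comparing against the get log shard wrongly concludes the primary's log
# is good enough.
old_check = primary_log_is_good_enough(primary_last_update, get_log_shard_last_update)

# Comparing against the authoritative shard shows the primary must first
# fetch the log from the authoritative shard.
new_check = primary_log_is_good_enough(primary_last_update, auth_shard_last_update)
```

With the second comparison the primary knows it is missing entries, so a first pass can fetch them before the log from the get log shard is merged.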
Alex Ainscow [Fri, 8 Aug 2025 09:25:53 +0000 (10:25 +0100)]
osd: Fix segfault in EC debug string
The old debug_string implementation was potentially reading up to 3
bytes off the end of an array. It was also doing lots of unnecessary
bufferlist reconstructs. This refactor of the function fixes both
issues.
Bill Scales [Fri, 8 Aug 2025 08:58:14 +0000 (09:58 +0100)]
osd: Optimized EC backfill interval has wrong versions
Fix a bug in the optimized EC code that creates the backfill
interval on the primary. The code builds a map with
the object version for each backfilling shard. When
there are multiple backfill targets, the code was
overwriting oi.version with the version
for a shard that has had partial writes, which
can result in the object not being backfilled.
This can manifest as a data integrity issue, scrub
error or snapshot corruption.
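
The class of bug described here, where a shared value is overwritten while a per-shard map is being built, can be illustrated with a short Python sketch. This is not the OSD code: the function names, shard ids, and integer versions are hypothetical.

```python
from typing import Dict, List


def interval_versions_buggy(
    oi_version: int, partial_versions: Dict[int, int], targets: List[int]
) -> Dict[int, int]:
    """Buggy pattern: the shared object version is clobbered inside the loop."""
    versions = {}
    for shard in targets:
        if shard in partial_versions:
            oi_version = partial_versions[shard]  # leaks into later shards
        versions[shard] = oi_version
    return versions


def interval_versions_fixed(
    oi_version: int, partial_versions: Dict[int, int], targets: List[int]
) -> Dict[int, int]:
    """Fixed pattern: each shard gets its own version, oi_version is untouched."""
    return {shard: partial_versions.get(shard, oi_version) for shard in targets}


# Hypothetical object at version 9; shard 1 was skipped by partial writes and
# is legitimately still at version 5, while shard 2 should be at version 9.
targets = [1, 2]
partial = {1: 5}
buggy = interval_versions_buggy(9, partial, targets)
fixed = interval_versions_fixed(9, partial, targets)
```

In the buggy variant shard 2 is recorded at version 5, so a comparison against its expected version would be made with the wrong value.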