git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ci.git/log
2 weeks ago ceph-dev-pipeline: configure wip-pdonnell-testing-20260210.213732 testing/wip-pdonnell-testing-20260210.213732
Patrick Donnelly [Tue, 10 Feb 2026 21:37:33 +0000 (16:37 -0500)]
ceph-dev-pipeline: configure

See documentation: https://github.com/ceph/ceph-build/tree/main/ceph-trigger-build#git-trailer-parameters

DISTROS: centos10 noble jammy centos9 windows
ARCHS: arm64 x86_64
FLAVORS: debug default

2 weeks ago qa: add osd debugging
Patrick Donnelly [Tue, 10 Feb 2026 21:36:06 +0000 (16:36 -0500)]
qa: add osd debugging

To tune for OSD full warnings.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa: add tentacle for cephx upgrade tests
Patrick Donnelly [Tue, 27 Jan 2026 02:04:16 +0000 (21:04 -0500)]
qa: add tentacle for cephx upgrade tests

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa: do not install ceph-osd-classic on older ceph releases
Patrick Donnelly [Tue, 27 Jan 2026 02:03:55 +0000 (21:03 -0500)]
qa: do not install ceph-osd-classic on older ceph releases

This is a new package for Umbrella.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mgr/cephadm: rotate keyring for core ceph daemons during upgrade
Adam King [Wed, 14 May 2025 17:16:43 +0000 (13:16 -0400)]
mgr/cephadm: rotate keyring for core ceph daemons during upgrade

Specifically, this causes us to rotate the mgr, mon, OSD,
and mds keyrings. The mgr and mon keyrings are rotated as soon
as we see all the mons have been upgraded; the OSD and mds keyrings
are rotated when we reach those daemons in the upgrade order.

Signed-off-by: Adam King <adking@redhat.com>
2 weeks ago auth,mon: print key loading failures once at init
Patrick Donnelly [Tue, 6 Jan 2026 15:50:09 +0000 (10:50 -0500)]
auth,mon: print key loading failures once at init

The routine was structured to print warnings frequently whenever the
config is refreshed (manually or normally). This confused Rook with
verbose debugging warnings so I tried to clean that up in an earlier
commit. This unfortunately broke command-line keyfile/key arguments.
I've resolved that now robustly by putting the error message in a
throwaway buffer that can be printed only from the init routines.

Lastly, this commit changes the preference for sourcing the key to
configs "key", then "keyfile", then "keyring". Before, it checked the
keyring file but, as this is usually set in the ceph.conf, that prevents
overriding with another key.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
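
The sourcing order described in the commit message above ("key", then "keyfile", then "keyring") can be sketched as follows. This is only an illustration under stated assumptions: `KeySources` and `choose_key_source` are hypothetical names, not Ceph's config API, and the sketch models nothing beyond which non-empty setting wins.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the three config values named in the commit
// message; this is not Ceph's actual config interface.
struct KeySources {
  std::string key;      // inline key value
  std::string keyfile;  // path to a file holding a bare key
  std::string keyring;  // path to a keyring file
};

// Return the name of the first non-empty source, checked in the order
// "key", "keyfile", "keyring"; std::nullopt if none is set.
std::optional<std::string> choose_key_source(const KeySources& s) {
  if (!s.key.empty()) return std::string("key");
  if (!s.keyfile.empty()) return std::string("keyfile");
  if (!s.keyring.empty()) return std::string("keyring");
  return std::nullopt;
}
```

Because "keyring" is checked last, a keyring path set in ceph.conf no longer prevents overriding the key with an explicit "key" or "keyfile" setting.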
2 weeks ago common/config: print error when keyfile cannot be read
Patrick Donnelly [Tue, 6 Jan 2026 23:04:59 +0000 (18:04 -0500)]
common/config: print error when keyfile cannot be read

Fixes: 91e8da14312b62f07d68b3aebca3bfb00a9cfc6e
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: use defines for magic usage values
Patrick Donnelly [Wed, 26 Nov 2025 18:25:33 +0000 (13:25 -0500)]
auth/cephx: use defines for magic usage values

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/Crypto: refactor class definition
Patrick Donnelly [Tue, 25 Nov 2025 19:18:16 +0000 (14:18 -0500)]
auth/Crypto: refactor class definition

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago doc/dev/cephx: clarify which session key is used
Patrick Donnelly [Sat, 6 Dec 2025 00:20:33 +0000 (19:20 -0500)]
doc/dev/cephx: clarify which session key is used

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago doc/dev/cephx: use consistent capitalization
Patrick Donnelly [Wed, 3 Dec 2025 19:35:52 +0000 (14:35 -0500)]
doc/dev/cephx: use consistent capitalization

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: fix typo
Patrick Donnelly [Sat, 6 Dec 2025 00:20:56 +0000 (19:20 -0500)]
auth/cephx: fix typo

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: CryptoKey, use dynamic usage keys for sts too
Marcus Watts [Wed, 19 Nov 2025 08:11:26 +0000 (03:11 -0500)]
auth: CryptoKey, use dynamic usage keys for sts too

Implement non-zero usage constants for sts too.

14 sts token

Signed-off-by: Marcus Watts <mwatts@redhat.com>
2 weeks ago auth: CryptoKey, use dynamic usage keys
Marcus Watts [Sat, 15 Nov 2025 08:05:59 +0000 (03:05 -0500)]
auth: CryptoKey, use dynamic usage keys

Use new extended api to implement non-zero usage constants.

3 std::string connection_secret
4 CephXServiceTicket
5 encode(CephXTicketBlob)
10 CephXServiceTicketInfo
11 CephXAuthorize
13 CephXAuthorizeChallenge
15 CephXAuthorizeReply
16 RotatingSecrets

Generally speaking, these keys are constructed by
"CryptoKey::decode" which does not know the context for how the
key will be used, so usage can't be set here.  In a brief
experiment, these usages for keys were invoked by keys decoded
under these routines:
3 4 5 CephxClientHandler::handle_response
11 13 15 CephXTicketHandler::verify_service_ticket_reply

Signed-off-by: Marcus Watts <mwatts@redhat.com>
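
As a rough illustration of the usage numbers listed in the commit message above, the constants could be gathered into a single enumeration. The numeric values are copied from the message; the enumerator names and the `usage_value` helper are hypothetical and are not taken from the Ceph source tree.

```cpp
#include <cstdint>

// Usage numbers come from the commit message; the names are illustrative.
enum class KeyUsage : std::uint32_t {
  ConnectionSecret = 3,     // std::string connection_secret
  ServiceTicket = 4,        // CephXServiceTicket
  TicketBlob = 5,           // encode(CephXTicketBlob)
  ServiceTicketInfo = 10,   // CephXServiceTicketInfo
  Authorize = 11,           // CephXAuthorize
  AuthorizeChallenge = 13,  // CephXAuthorizeChallenge
  AuthorizeReply = 15,      // CephXAuthorizeReply
  RotatingSecrets = 16,     // RotatingSecrets
};

// Convert a usage to the raw number that would be passed down to the
// encrypt/decrypt call.
constexpr std::uint32_t usage_value(KeyUsage u) {
  return static_cast<std::uint32_t>(u);
}
```

Since the message notes that `CryptoKey::decode` does not know how a key will be used, a value like this would have to be supplied at the encrypt or decrypt call site rather than when the key is decoded.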
2 weeks ago auth: CryptoKey, dynamic usage keys
Marcus Watts [Sat, 15 Nov 2025 02:57:53 +0000 (21:57 -0500)]
auth: CryptoKey, dynamic usage keys

Make it possible to specify usage that was not initially declared
when the secret was set in CryptoKey.  As a prerequisite, simplify
the per-usage key storage structures so that it is less heavy-weight
and thread-safe.

Also, extend various encrypt and decrypt apis to enable passing
usage down to CryptoKey.

Signed-off-by: Marcus Watts <mwatts@redhat.com>
2 weeks ago tools/monmaptool: allow monmap ciphers to be modified on an existing monmap
Marcus Watts [Tue, 11 Nov 2025 21:17:58 +0000 (16:17 -0500)]
tools/monmaptool: allow monmap ciphers to be modified on an existing monmap

With this change, the following options
--auth-allowed_ciphers
--auth-service-cipher
--auth-preferred-cipher
can now be set in an existing monmap.

Signed-off-by: Marcus Watts <mwatts@redhat.com>
2 weeks ago auth: CryptoKey, use secret in CryptoKeyHandler
Marcus Watts [Tue, 4 Nov 2025 01:50:24 +0000 (20:50 -0500)]
auth: CryptoKey, use secret in CryptoKeyHandler

Keep only one copy of secret in CryptoKeyHandler.  This will reduce
the number of copies made in memory.  Also introduce bool() and ==
operators so we can hide implementation details.

Signed-off-by: Marcus Watts <mwatts@redhat.com>
2 weeks ago rgw/sts: changing sts key to be compliant with AES256KRB5
Pritha Srivastava [Fri, 25 Jul 2025 08:51:54 +0000 (14:21 +0530)]
rgw/sts: changing sts key to be compliant with AES256KRB5
in vstart script.

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 0aef1b2387ac258b3a80e048cf48a730ea01403c)

2 weeks ago rgw/sts: changes for using new AES256KRB5 cryptohandler
Pritha Srivastava [Fri, 25 Jul 2025 08:47:52 +0000 (14:17 +0530)]
rgw/sts: changes for using new AES256KRB5 cryptohandler
for encrypting/decrypting session tokens.

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 2a20d43fb4b3fc04ea03b911da71d600ba0d6837)

2 weeks ago auth: extend crypto API to support multiple usages per key
Yehuda Sadeh [Tue, 29 Jul 2025 19:56:14 +0000 (15:56 -0400)]
auth: extend crypto API to support multiple usages per key

Signed-off-by: Yehuda Sadeh <ysadehwe@ibm.com>
(cherry picked from commit 0876f64ea7da4e77e0f3bd9fbcafb260ccf23329)

2 weeks ago auth: remove superfluous error log message
Patrick Donnelly [Tue, 16 Sep 2025 20:02:05 +0000 (16:02 -0400)]
auth: remove superfluous error log message

It's also possible that _refresh_config can be called multiple times before the
keyring config has been set (by an arg/env for instance). This would pollute
the log with erroneous error warnings.

MonClient::authenticate already warns about this.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/MonClient: add assertions for monc lock in MonConnection
Patrick Donnelly [Wed, 20 Aug 2025 01:42:14 +0000 (21:42 -0400)]
mon/MonClient: add assertions for monc lock in MonConnection

When handling auth, we want to be sure these methods hold the monc_lock
which protects, in particular, the client authorizer.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago auth: add debugging for client cephx methods
Patrick Donnelly [Wed, 20 Aug 2025 01:36:34 +0000 (21:36 -0400)]
auth: add debugging for client cephx methods

In particular, to see when an auth helper is created/destroyed.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago auth: add debugging for keyring methods
Patrick Donnelly [Wed, 20 Aug 2025 01:35:24 +0000 (21:35 -0400)]
auth: add debugging for keyring methods

In particular, to see when a rotating key ring is created/destroyed.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago auth: use explicit default destructor
Patrick Donnelly [Tue, 19 Aug 2025 21:01:18 +0000 (17:01 -0400)]
auth: use explicit default destructor

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago msg/async: move v1 member init to header
Patrick Donnelly [Fri, 29 Aug 2025 13:35:15 +0000 (09:35 -0400)]
msg/async: move v1 member init to header

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago msg: use MessageRef to manage pointer lifetime
Patrick Donnelly [Tue, 19 Aug 2025 21:27:29 +0000 (17:27 -0400)]
msg: use MessageRef to manage pointer lifetime

To simplify reasoning about upcoming changes to incoming/pending
messages.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago msg/DispatchQueue: add debugging for queue discard
Patrick Donnelly [Wed, 20 Aug 2025 16:22:50 +0000 (12:22 -0400)]
msg/DispatchQueue: add debugging for queue discard

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago msg/Connection: move destructor to object file
Patrick Donnelly [Tue, 19 Aug 2025 21:28:41 +0000 (17:28 -0400)]
msg/Connection: move destructor to object file

To ensure vtable is embedded in Connection object file.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago mds: move messages to be sent
Patrick Donnelly [Tue, 19 Aug 2025 21:02:24 +0000 (17:02 -0400)]
mds: move messages to be sent

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 weeks ago PendingReleaseNotes: add note for cephx upgrade
Patrick Donnelly [Wed, 30 Jul 2025 02:31:05 +0000 (22:31 -0400)]
PendingReleaseNotes: add note for cephx upgrade

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago doc: update cephx details for upgrade procedure
Patrick Donnelly [Wed, 30 Jul 2025 02:33:14 +0000 (22:33 -0400)]
doc: update cephx details for upgrade procedure

And add miscellaneous clarity / wording improvements.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/AuthRegistry: refresh config on startup
Patrick Donnelly [Wed, 30 Jul 2025 02:38:21 +0000 (22:38 -0400)]
auth/AuthRegistry: refresh config on startup

I don't think this makes a functional difference but these configs should be
loaded at startup otherwise it relies on obs startup to load them.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/options: remove auth_supported
Patrick Donnelly [Tue, 22 Jul 2025 20:51:32 +0000 (16:51 -0400)]
common/options: remove auth_supported

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa: check health warnings in cephx upgrade
Patrick Donnelly [Mon, 7 Jul 2025 19:10:31 +0000 (15:10 -0400)]
qa: check health warnings in cephx upgrade

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph: allow configuring key settings for initial monmap
Patrick Donnelly [Mon, 7 Jul 2025 19:19:55 +0000 (15:19 -0400)]
qa/tasks/ceph: allow configuring key settings for initial monmap

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph.key_rotate: provide mechanism to rotate client keys
Patrick Donnelly [Mon, 7 Jul 2025 19:18:38 +0000 (15:18 -0400)]
qa/tasks/ceph.key_rotate: provide mechanism to rotate client keys

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph.healthy: indicate expected failing checks
Patrick Donnelly [Mon, 7 Jul 2025 19:15:31 +0000 (15:15 -0400)]
qa/tasks/ceph.healthy: indicate expected failing checks

We will want to confirm the cluster is healthy despite some checks that we
expect to be failing.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph: add key pruning task
Patrick Donnelly [Mon, 7 Jul 2025 19:11:55 +0000 (15:11 -0400)]
qa/tasks/ceph: add key pruning task

To remove keys we don't care about and that would raise warnings if left behind.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago msg: constify getter
Patrick Donnelly [Tue, 22 Jul 2025 02:50:47 +0000 (22:50 -0400)]
msg: constify getter

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: do not special case caps for mons
Patrick Donnelly [Tue, 22 Jul 2025 02:50:01 +0000 (22:50 -0400)]
auth/cephx: do not special case caps for mons

Yes, the mons always fill in the caps with what is in their KeyServer but it's
confusing to see this special case.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago tools/monmaptool: enable configuring monmap ciphers
Patrick Donnelly [Mon, 9 Jun 2025 15:20:44 +0000 (11:20 -0400)]
tools/monmaptool: enable configuring monmap ciphers

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: provide emergency mechanism to rescue allowed_ciphers
Patrick Donnelly [Tue, 24 Jun 2025 03:27:31 +0000 (23:27 -0400)]
mon: provide emergency mechanism to rescue allowed_ciphers

If the administrator accidentally revokes auth to client.admin, they cannot fix
it because the setting is stored in the monmap. Provide a config to restore
access in such an emergency.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: convert auth configs to monmap settings
Patrick Donnelly [Fri, 6 Jun 2025 19:51:53 +0000 (15:51 -0400)]
mon: convert auth configs to monmap settings

This serves a few purposes:

- Makes sure mons agree on these settings (cannot have differing configs)
- Allows us to set secure defaults for a brand new cluster.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago doc: add new cephx health warnings
Patrick Donnelly [Tue, 24 Jun 2025 02:34:30 +0000 (22:34 -0400)]
doc: add new cephx health warnings

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa: add upgrade suite for cephx
Patrick Donnelly [Thu, 29 May 2025 16:02:38 +0000 (12:02 -0400)]
qa: add upgrade suite for cephx

To test upgrade paths for "aes" key type to "aes256k" including the expected
flows for service key updates and entity rotation.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph: add task to rotate entity keys
Patrick Donnelly [Thu, 29 May 2025 16:11:49 +0000 (12:11 -0400)]
qa/tasks/ceph: add task to rotate entity keys

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/ceph: allow cluster to be brought up with particular cephx key type
Patrick Donnelly [Thu, 29 May 2025 16:11:22 +0000 (12:11 -0400)]
qa/tasks/ceph: allow cluster to be brought up with particular cephx key type

For testing cephx upgrades from older key types.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/radosbench: add extra_args conf
Patrick Donnelly [Mon, 19 May 2025 19:02:48 +0000 (15:02 -0400)]
qa/tasks/radosbench: add extra_args conf

So we can easily add extra debug flags or whatever.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa/tasks/radosbench: add auth_exit_on_failure arg
Patrick Donnelly [Tue, 25 Mar 2025 17:49:13 +0000 (13:49 -0400)]
qa/tasks/radosbench: add auth_exit_on_failure arg

To cause `rados bench` to exit immediately when an auth failure occurs.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago qa: add sequential_yield task
Patrick Donnelly [Wed, 26 Mar 2025 01:53:08 +0000 (21:53 -0400)]
qa: add sequential_yield task

This is identical to the sequential task except it yields after entering each
sub-task.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago include/encoding: add encoder helpers for sized ints
Patrick Donnelly [Tue, 24 Jun 2025 02:37:16 +0000 (22:37 -0400)]
include/encoding: add encoder helpers for sized ints

When the raw type may not match the required encoded size, this helper makes
intent clear and avoids a common verbose pattern:

    intX_t t = val;
    encode(t, bl);

and

    intX_t t;
    decode(t, p);
    val = t;

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
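
The verbose pattern quoted in the commit message above could be wrapped in a pair of templates along these lines. This is a minimal sketch, not the helper the commit adds: `encode_as`, `decode_as`, and the `Buffer` alias are hypothetical names, a plain byte vector stands in for a bufferlist, and a little-endian layout is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

using Buffer = std::vector<std::uint8_t>;  // stand-in for a bufferlist

// Append `val` as exactly sizeof(Wire) little-endian bytes, regardless of
// the width of the in-memory type T.
template <typename Wire, typename T>
void encode_as(const T& val, Buffer& bl) {
  static_assert(std::is_integral_v<Wire>, "wire type must be an integer");
  using U = std::make_unsigned_t<Wire>;
  const U u = static_cast<U>(static_cast<Wire>(val));
  for (std::size_t i = 0; i < sizeof(Wire); ++i) {
    bl.push_back(static_cast<std::uint8_t>(u >> (8 * i)));
  }
}

// Read sizeof(Wire) little-endian bytes starting at `pos`, advance `pos`,
// and assign the result to `val`. Throws std::out_of_range on short input.
template <typename Wire, typename T>
void decode_as(T& val, const Buffer& bl, std::size_t& pos) {
  static_assert(std::is_integral_v<Wire>, "wire type must be an integer");
  using U = std::make_unsigned_t<Wire>;
  U u = 0;
  for (std::size_t i = 0; i < sizeof(Wire); ++i) {
    u |= static_cast<U>(static_cast<U>(bl.at(pos + i)) << (8 * i));
  }
  pos += sizeof(Wire);
  val = static_cast<T>(static_cast<Wire>(u));
}
```

A call such as `encode_as<std::int32_t>(val, bl)` states the encoded size at the call site, which is the intent the temporary `intX_t t` expressed implicitly.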
2 weeks ago mon: add health checks for insecure keys
Patrick Donnelly [Sun, 1 Jun 2025 00:54:30 +0000 (20:54 -0400)]
mon: add health checks for insecure keys

This commit prompted the previous refactor as it was inconvenient to check for
health warnings as part of AuthMonitor::tick and then pass those up via
PaxosService::encode_health.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: cleanup for loop
Patrick Donnelly [Fri, 30 May 2025 18:47:07 +0000 (14:47 -0400)]
mon: cleanup for loop

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/HealthMonitor: refactor quorum_checks/leader_checks as PaxosMap
Patrick Donnelly [Tue, 24 Jun 2025 16:21:55 +0000 (12:21 -0400)]
mon/HealthMonitor: refactor quorum_checks/leader_checks as PaxosMap

To codify protocol and catch bugs.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: refactor health check map through PaxosMap
Patrick Donnelly [Sun, 1 Jun 2025 00:53:55 +0000 (20:53 -0400)]
mon: refactor health check map through PaxosMap

This was motivated by confusing persistence of some health warnings during
testing of health warnings for cephx upgrades. Some services are only doing
health checks during ::encode_pending and others during ::tick. Make it
consistent.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/PaxosMap: add map template for managing Paxos structures
Patrick Donnelly [Tue, 24 Jun 2025 16:14:19 +0000 (12:14 -0400)]
mon/PaxosMap: add map template for managing Paxos structures

To protect access and codify protocol. Based loosely on PaxosFSMap which can be
refactored to use this later.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: improve programmability of key dumps
Patrick Donnelly [Mon, 7 Jul 2025 18:55:57 +0000 (14:55 -0400)]
auth: improve programmability of key dumps

Notably:

- improve names (avoid repeated "keys")
- output type_str

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/entity_name: dump type name as string
Patrick Donnelly [Fri, 13 Jun 2025 20:52:23 +0000 (16:52 -0400)]
common/entity_name: dump type name as string

For easier selection without hard-coded constants.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/entity_name: remove dead method
Patrick Donnelly [Mon, 7 Jul 2025 18:02:05 +0000 (14:02 -0400)]
common/entity_name: remove dead method

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/entity_name: cleanup entity_name::type
Patrick Donnelly [Sat, 31 May 2025 23:52:33 +0000 (19:52 -0400)]
common/entity_name: cleanup entity_name::type

This should use the entity_type_t from the msg headers. The only awkwardness is
that the encode/decode of the type needs to continue using a uint32_t.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago cephx: add note to address technical debt
Patrick Donnelly [Thu, 29 May 2025 16:01:41 +0000 (12:01 -0400)]
cephx: add note to address technical debt

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: check service key is valid before decryption
Patrick Donnelly [Thu, 29 May 2025 15:57:55 +0000 (11:57 -0400)]
auth: check service key is valid before decryption

CryptoKey::empty is the correct mechanism to check for an invalid key (and this
is codified elsewhere, fixed in this commit). Decryption would fail with an
abort if the key handler was unset. This would happen after rotating the "mon."
key and then restarting one of the mons.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: add more debugging for service tickets
Patrick Donnelly [Thu, 29 May 2025 15:57:13 +0000 (11:57 -0400)]
auth: add more debugging for service tickets

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: set error message when decryption fails
Patrick Donnelly [Thu, 29 May 2025 15:53:04 +0000 (11:53 -0400)]
auth/cephx: set error message when decryption fails

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: provide more debugging when sig checks fail
Patrick Donnelly [Thu, 29 May 2025 15:52:34 +0000 (11:52 -0400)]
auth/cephx: provide more debugging when sig checks fail

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: provide emergency mechanism to use mon keyring
Patrick Donnelly [Thu, 29 May 2025 15:04:00 +0000 (11:04 -0400)]
mon: provide emergency mechanism to use mon keyring

If the key is lost for the `mon.` credential, it's very inconvenient to get it
out of the "auth" database in the mon store. So, allow the operator to create a
new keyring for the mons and use it instead to get mons in quorum again.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: cycle through keyring or key_server for auth with mons
Patrick Donnelly [Thu, 29 May 2025 14:13:40 +0000 (10:13 -0400)]
mon: cycle through keyring or key_server for auth with mons

After commit `mon: use key_server for looking up mon key`, the mons will now
use the key_server to look up the `mon.` key when a mon connects.  We need to
make the mons prefer that key when authenticating while probing other
mons. However, the protocol doesn't allow falling back to another key. Such a
fallback is necessary if what's in the key_server database is out-of-date due
to an earlier loss of quorum. In that case, the operator should update the
local keyring file and the mon should give that a try if auth fails.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
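
The retry behaviour described in the commit message above can be pictured as trying each candidate key in turn. This is a hedged sketch only: `first_working_key` and the `try_auth` callback are hypothetical, and each loop iteration stands for a separate authentication attempt, since the message says the protocol itself cannot fall back to another key within one attempt.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Try each candidate key in order (for example the key_server key first,
// then the local keyring key) and return the first one `try_auth` accepts.
// Returns std::nullopt if every attempt fails.
std::optional<std::string> first_working_key(
    const std::vector<std::string>& candidates,
    const std::function<bool(const std::string&)>& try_auth) {
  for (const auto& key : candidates) {
    if (try_auth(key)) {
      return key;
    }
  }
  return std::nullopt;
}
```

With an out-of-date key_server entry, the first attempt would fail and the operator-updated keyring key would be tried next.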
2 weeks ago mon: use key_server for looking up mon key
Patrick Donnelly [Thu, 29 May 2025 14:07:52 +0000 (10:07 -0400)]
mon: use key_server for looking up mon key

Note: the key_server is already configured to fall back (via
KeyServerData::extra_secrets) to the Monitor::keyring which is sourced from the
mon's keyring file.

Using the Monitor::key_server allows us to maintain the mon's secret in the
auth database alongside all other secrets. This makes rotating the mons' keys
the same as all other entities in Ceph. Before this, to rotate the mons' key
you would need to turn off all monitors and then rotate the key files
manually. This is obviously disruptive since it's not a rolling upgrade.

If the key is sourced from the Monitor::key_server, then the key can be rotated
and all mons are aware of the new key. The mons can then proceed to restart as
needed in a non-disruptive fashion.

A followup commit will clean up the monitor to try either its local keyring key
or the key in the key_server (if present) when authenticating with other mons.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: dout chosen addrs after startup
Patrick Donnelly [Thu, 29 May 2025 14:05:55 +0000 (10:05 -0400)]
mon: dout chosen addrs after startup

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/MonClient: improve error message when failing to auth
Patrick Donnelly [Wed, 14 May 2025 23:33:43 +0000 (19:33 -0400)]
mon/MonClient: improve error message when failing to auth

Currently you just see:

    2025-05-14T23:07:37.244+0000 7f00dedd1640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

which is terrible at communicating the problem.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth,mon: add _exit config when auth fails
Patrick Donnelly [Wed, 26 Mar 2025 02:02:26 +0000 (22:02 -0400)]
auth,mon: add _exit config when auth fails

This is largely for testing: we want a client to exit immediately if auth
failures occur. Presently, those clients will try to reconnect forever.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago tools/ceph_authtool: allow configuring a preferred cipher
Patrick Donnelly [Wed, 26 Mar 2025 02:05:09 +0000 (22:05 -0400)]
tools/ceph_authtool: allow configuring a preferred cipher

This makes testing easier as we can configure all keys in the cluster to be the
given "old" type without modifying each location that ceph-authtool is used.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/AuthMonitor: shutdown session connection on auth failure
Patrick Donnelly [Tue, 13 May 2025 16:28:39 +0000 (12:28 -0400)]
mon/AuthMonitor: shutdown session connection on auth failure

Currently the mons will allow the session to persist even though an auth
failure has occurred, probably while trying to obtain new tickets.

A sequence to easily trigger this:

    ceph auth rotate osd.0
    ceph auth wipe-rotating-service-keys

The osd.0 will continue interacting with the mons until restart or a network
interruption occurs.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago msg: add interface to shutdown Connection
Patrick Donnelly [Tue, 13 May 2025 16:26:48 +0000 (12:26 -0400)]
msg: add interface to shutdown Connection

Unfortunately this doesn't work as-is because I couldn't find primitives to
flush the out_queue. It's left as a to-do for now.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: call _wipe_secrets_and_tickets when needed
Matan Breizman [Thu, 12 Jun 2025 09:23:37 +0000 (09:23 +0000)]
crimson/mon/MonClient: call _wipe_secrets_and_tickets when needed

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: refactor Client::handle_monmap
Matan Breizman [Thu, 12 Jun 2025 09:22:22 +0000 (09:22 +0000)]
crimson/mon/MonClient: refactor Client::handle_monmap

Use coroutines, which should help with future changes.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: introduce handle_auth_failure
Matan Breizman [Wed, 11 Jun 2025 12:28:26 +0000 (12:28 +0000)]
crimson/mon/MonClient: introduce handle_auth_failure

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: add asock TODO comment
Matan Breizman [Wed, 11 Jun 2025 12:26:59 +0000 (12:26 +0000)]
crimson/mon/MonClient: add asock TODO comment

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/osd/MonClient: Introduce Client::_wipe_secrets_and_tickets()
Matan Breizman [Wed, 11 Jun 2025 09:38:59 +0000 (09:38 +0000)]
crimson/osd/MonClient: Introduce Client::_wipe_secrets_and_tickets()

Similar to MonClient::_wipe_secrets_and_tickets()

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: imitate Classic's _check_auth_tickets
Matan Breizman [Wed, 11 Jun 2025 09:34:30 +0000 (09:34 +0000)]
crimson/mon/MonClient: imitate Classic's _check_auth_tickets

Imitating this interface from Classic's MonClient::_check_auth_tickets()
should make it easier to understand Crimson's counterpart.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago crimson/mon/MonClient: cleanup redundant private
Matan Breizman [Wed, 11 Jun 2025 09:33:20 +0000 (09:33 +0000)]
crimson/mon/MonClient: cleanup redundant private

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/MonClient: wipe secrets and invalidate tickets on auth epoch change
Patrick Donnelly [Fri, 9 May 2025 18:56:10 +0000 (14:56 -0400)]
mon/MonClient: wipe secrets and invalidate tickets on auth epoch change

* This causes service daemons to drop all known service tickets and request new
  ones from the auth server.

* This causes the clients (and service daemons) to request new tickets from the
  auth server which will include tickets signed with the new service keys.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/AuthMonitor: bump auth epoch when wiping service keys
Patrick Donnelly [Fri, 9 May 2025 18:54:47 +0000 (14:54 -0400)]
mon/AuthMonitor: bump auth epoch when wiping service keys

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/MonmapMonitor: wire up interface to bump auth epoch
Patrick Donnelly [Fri, 9 May 2025 18:19:18 +0000 (14:19 -0400)]
mon/MonmapMonitor: wire up interface to bump auth epoch

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/MonMap: add auth epoch
Patrick Donnelly [Fri, 9 May 2025 18:15:09 +0000 (14:15 -0400)]
mon/MonMap: add auth epoch

This will be used to indicate to clients / service daemons that the auth
service keys have been rotated. Clients and service daemons are expected to
invalidate their tickets and reauth. Service daemons should wipe their service
keys.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
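
The expected reaction to an auth epoch change, as described in the commit message above, might look roughly like this on the receiving side. It is a sketch under assumptions: `AuthState`, its fields, `handle_monmap`, and the 64-bit epoch type are all hypothetical and do not reflect the real MonClient or MonMap interfaces.

```cpp
#include <cstdint>

// Hypothetical client-side state; names are illustrative only.
struct AuthState {
  std::uint64_t seen_auth_epoch = 0;
  bool have_tickets = false;
  bool have_rotating_secrets = false;  // only held by service daemons

  // Called when a new monmap arrives. Returns true if the caller should
  // re-authenticate because the auth epoch advanced.
  bool handle_monmap(std::uint64_t monmap_auth_epoch, bool is_service_daemon) {
    if (monmap_auth_epoch <= seen_auth_epoch) {
      return false;  // nothing rotated since the last map we saw
    }
    seen_auth_epoch = monmap_auth_epoch;
    have_tickets = false;  // clients and daemons invalidate their tickets
    if (is_service_daemon) {
      have_rotating_secrets = false;  // daemons also wipe service keys
    }
    return true;
  }
};
```

Comparing against the last seen epoch keeps repeated deliveries of the same monmap from triggering another round of re-authentication.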
2 weeks ago mon/AuthMonitor: add dump-keys and wipe-rotating-service-keys
Patrick Donnelly [Wed, 26 Mar 2025 01:59:34 +0000 (21:59 -0400)]
mon/AuthMonitor: add dump-keys and wipe-rotating-service-keys

`auth dump-keys` allows examining the key types for each entity and also the
rotating session keys. This lets us confirm key upgrades are done as expected.

`wipe-rotating-service-keys` clears out existing non-auth service keys so that we do not
need to wait for the rotating key expiration. It is not disruptive so long as clients
renew their tickets when prompted by the auth epoch change.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/AuthMonitor: add key-type switch
Patrick Donnelly [Fri, 21 Mar 2025 16:56:06 +0000 (12:56 -0400)]
mon/AuthMonitor: add key-type switch

So it's possible to test with various key-types.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/cmdparse: add another template cmd_getval_or helper
Patrick Donnelly [Fri, 21 Mar 2025 16:57:25 +0000 (12:57 -0400)]
common/cmdparse: add another template cmd_getval_or helper

To mimic the conventional signature where you pass the lvalue you want to set.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/Monitor: perfect forward universal ref of lambda
Patrick Donnelly [Fri, 9 May 2025 18:16:55 +0000 (14:16 -0400)]
mon/Monitor: perfect forward universal ref of lambda

This method doesn't currently work for std::move of a lambda.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon/Monitor: add debugging for monmap handling
Patrick Donnelly [Fri, 9 May 2025 18:19:56 +0000 (14:19 -0400)]
mon/Monitor: add debugging for monmap handling

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago mon: notify_new_monmap via MonmapMonitor::init
Patrick Donnelly [Fri, 13 Jun 2025 19:14:55 +0000 (15:14 -0400)]
mon: notify_new_monmap via MonmapMonitor::init

Otherwise, configurations are not updated during startup.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago doc/man: document new --key-type option for ceph-authtool
Patrick Donnelly [Thu, 29 May 2025 15:11:43 +0000 (11:11 -0400)]
doc/man: document new --key-type option for ceph-authtool

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago tools/ceph_authtool: add help message for key-type switch
Patrick Donnelly [Fri, 21 Mar 2025 16:54:33 +0000 (12:54 -0400)]
tools/ceph_authtool: add help message for key-type switch

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago common/buffer: accept "-" as stdin
Patrick Donnelly [Fri, 21 Mar 2025 16:53:38 +0000 (12:53 -0400)]
common/buffer: accept "-" as stdin

These methods are used for reading files from tools like "authtool". Read from
stdin if the conventional "-" filename is passed.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth/cephx: make some parameters const
Patrick Donnelly [Thu, 29 May 2025 14:01:37 +0000 (10:01 -0400)]
auth/cephx: make some parameters const

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: cleanup error message formatting
Patrick Donnelly [Tue, 27 May 2025 23:25:42 +0000 (19:25 -0400)]
auth: cleanup error message formatting

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth,mon: lookup ticket ttl at runtime
Patrick Donnelly [Wed, 26 Mar 2025 02:04:20 +0000 (22:04 -0400)]
auth,mon: lookup ticket ttl at runtime

and improve debugging.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: add API to invalidate all tickets
Patrick Donnelly [Fri, 9 May 2025 18:52:52 +0000 (14:52 -0400)]
auth: add API to invalidate all tickets

This will prompt the client to request new ones from the auth server.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks ago auth: add API to wipe rotating secrets
Patrick Donnelly [Fri, 9 May 2025 18:52:13 +0000 (14:52 -0400)]
auth: add API to wipe rotating secrets

This is for the service daemon's store of rotating service secrets.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>