auth/cephx: option to disallow unauthorized global_id (re)use

global_id is a cluster-wide unique id that must remain stable for the
lifetime of the client instance. The cephx protocol has a facility to
allow clients to preserve their global_id across reconnects:
(1) the client should provide its global_id in the initial handshake
    message/frame and later include its auth ticket proving previous
    possession of that global_id in the CEPHX_GET_AUTH_SESSION_KEY
    request
(2) the monitor should verify that the included auth ticket is valid
    and has the same global_id and, if so, allow the reclaim
(3) if the reclaim is allowed, the new auth ticket should be
    encrypted with the session key of the included auth ticket to
    ensure authenticity of the client performing reclaim. (The
    included auth ticket could have been snooped when the monitor
    originally shared it with the client or any time the client
    provided it back to the monitor as part of requesting service
    tickets, but only the genuine client would have its session key
    and be able to decrypt.)
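
As a rough illustration of steps (2) and (3) above, the intended
monitor-side check could be sketched as follows. This is not the code
from this patch: AuthTicket, ReclaimResult and check_reclaim are
hypothetical names, the `valid` flag stands in for the real ticket
verification, and the session key is reduced to a plain integer.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-ins; these are not Ceph's actual types.
struct AuthTicket {
    uint64_t global_id;    // global_id the ticket was issued for
    bool valid;            // stands in for the real ticket verification
    uint64_t session_key;  // stands in for the ticket's session key
};

struct ReclaimResult {
    bool allowed;
    // Key the new auth ticket should be encrypted with; set only when
    // the reclaim is allowed.
    std::optional<uint64_t> encrypt_with;
};

// Step (1) is the caller passing in the claimed global_id together with
// the auth ticket the client included, if any.
inline ReclaimResult check_reclaim(uint64_t claimed_global_id,
                                   const std::optional<AuthTicket>& old_ticket) {
    // Step (2): the ticket must be present, valid and carry the same
    // global_id as the one being claimed.
    if (!old_ticket || !old_ticket->valid ||
        old_ticket->global_id != claimed_global_id) {
        return {false, std::nullopt};
    }
    // Step (3): the new ticket is encrypted with the session key of the
    // included ticket, so only its genuine owner can decrypt it.
    return {true, old_ticket->session_key};
}
```

A claim with no ticket at all, or with a ticket issued for a different
global_id, is rejected in this sketch.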
Unfortunately, all of (1), (2) and (3) have been broken for a while:
- (1) was broken in 2016 by commit a2eb6ae3fb57 ("mon/monclient:
  hunt for multiple monitor in parallel") and is addressed in patch
  "mon/MonClient: preserve auth state on reconnects".
- it turns out that (2) has never been enforced. When cephx was
  being designed and implemented in 2009, two changes to the protocol
  raced with each other, pulling it in different directions: commits
  0669ca21f4f7 ("auth: reuse global_id when requesting tickets") and
  fec31964a12b ("auth: when renewing session, encrypt ticket") added
  the reclaim mechanism based strictly on auth tickets, while commit
  5eeb711b6b2b ("auth: change server side negotiation a bit") allowed
  the client to provide global_id in the initial handshake. These
  changes didn't get reconciled and, as a result, a malicious client
  can assign itself any global_id of its choosing by simply passing
  something other than 0 in the MAuth message or AUTH_REQUEST frame,
  without even bothering to supply a ticket. This includes getting
  a global_id that is being used by another client.
- (3) was broken in 2019 with the addition of support for msgr2, where
  the new auth ticket ends up being shared unencrypted. However, the
  root cause is deeper and a malicious client can coerce msgr1 into
  the same. This also goes back to 2009 and is addressed in patch
  "auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys".
Because (2) has never been enforced, no one noticed when (1) got
broken and we began to rely on this flaw for normal operation in
the face of reconnects due to network hiccups or otherwise. As of
today, only pre-luminous userspace clients and kernel clients are
not exercising it on a daily basis.
Bump CephXAuthenticate version and use a dummy v3 to distinguish
between legacy clients that don't (may not) include their auth ticket
and new clients. For new clients, unconditionally disallow claiming
global_id without a corresponding auth ticket. For legacy clients,
introduce a choice between permissive (current behavior, default for
the foreseeable future) and enforcing mode.
If the reclaim is disallowed, return EACCES. While MonClient does
have some provision for global_id changes and we could conceivably
implement enforcement by handing out a fresh global_id instead of
the provided one, those code paths have never been tested and there
are too many ways a sudden global_id change could go wrong.
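
The resulting policy can be summarized with a small sketch. It is only
an illustration under stated assumptions: LegacyMode and
decide_global_id_claim are hypothetical names, the function is assumed
to be consulted only when a client claims a nonzero global_id, and
returning EACCES as a negative errno value is assumed here as the
calling convention.

```cpp
#include <cerrno>

// Hypothetical names, for illustration only.
enum class LegacyMode { Permissive, Enforcing };

// The "dummy v3" of CephXAuthenticate marks a new client.
constexpr int kFirstNewAuthenticateVersion = 3;

// Decide whether a claim of an existing (nonzero) global_id is allowed.
// has_matching_ticket means the client included a valid auth ticket
// carrying the same global_id.
// Returns 0 if the claim is allowed and -EACCES if it is disallowed
// (negative errno is an assumption of this sketch).
inline int decide_global_id_claim(int authenticate_version,
                                  bool has_matching_ticket,
                                  LegacyMode legacy_mode) {
    if (has_matching_ticket) {
        return 0;  // proper reclaim, allowed for any client
    }
    if (authenticate_version >= kFirstNewAuthenticateVersion) {
        return -EACCES;  // new clients: unconditionally disallowed
    }
    // Legacy clients: permissive mode keeps the current behavior,
    // enforcing mode rejects the claim.
    return legacy_mode == LegacyMode::Enforcing ? -EACCES : 0;
}
```

Note that the sketch never substitutes a fresh global_id for the
claimed one: a disallowed claim is rejected outright, for the reasons
given above.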
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit abebd643cc60fa8a7cb82dc29a9d5041fb3c3d36)
Conflicts:
    src/auth/cephx/CephxProtocol.h [ bufferlist vs ceph::buffer::list ]
    src/auth/cephx/CephxServiceHandler.h [ ditto ]
    src/auth/none/AuthNoneServiceHandler.h [ ditto ]