This protocol revision has several goals relative to the original protocol:
-* *Multiplexing*. We will have multiple server entities (e.g.,
- multiple OSDs and clients) coexisting in the same process. We would
- like to share the transport connection (e.g., TCP socket) whenever
- possible.
-* *Signing*. We will allow for traffic to be signed (but not
- necessarily encrypted).
-* *Encryption*. We will incorporate encryption over the wire.
* *Flexible handshaking*. The original protocol's negotiation was not flexible enough to allow for optional features.
+* *Encryption*. We will incorporate encryption over the wire.
* *Performance*. We would like to provide for protocol features
(e.g., padding) that keep computation and memory copies out of the
fast path where possible.
+* *Signing*. We will allow for traffic to be signed (but not
+ necessarily encrypted). This may not be implemented in the initial version.
Definitions
-----------
* *entity*: a Ceph entity instantiation, e.g. 'osd.0'. Each entity
has one or more unique entity_addr_t's by virtue of the 'nonce'
field, which is typically a pid or random value.
-* *stream*: an exchange, passed over a connection, between two unique
- entities. in the future multiple entities may coexist within the
- same process.
* *session*: a stateful session between two entities in which message
exchange is ordered and lossless. A session might span multiple
- connections (and streams) if there is an interruption (TCP connection
- disconnect).
+ connections if there is an interruption (TCP connection disconnect).
* *frame*: a discrete message sent between the peers. Each frame
- consists of a tag (type code), stream id, payload, and (if signing
+ consists of a tag (type code), payload, and (if signing
or encryption is enabled) some other fields. See below for the
structure.
-* *stream id*: a 32-bit value that uniquely identifies a stream within
- a given connection. the stream id implicitly instantiated when the send
- sends a frame using that id.
-* *tag*: a single-byte type code associated with a frame. The tag
+* *tag*: a type code associated with a frame. The tag
determines the structure of the payload.
Phases
------
-A connection has two distinct phases:
+A connection has four distinct phases:
#. banner
-#. frame exchange for one or more strams
-
-A stream has three distinct phases:
-
-#. authentication
-#. message flow handshake
-#. message exchange
+#. authentication frame exchange
+#. message flow handshake frame exchange
+#. message frame exchange
Banner
------
|<-----------+ |
| |
-Frame format and Stream establishment
--------------------------------------
+Frame format
+------------
All further data sent or received is contained by a frame. Each frame has
the form::
- stream_id (le32)
frame_len (le32)
- tag (TAG_* byte)
+ tag (TAG_* le32)
payload
[payload padding -- only present after stream auth phase]
[signature -- only present after stream auth phase]
-* stream_id is generated by the client.
-
* frame_len includes everything after the frame_len le32 up to the end of the
frame (all payloads, signatures, and padding).
* The payload format and length are determined by the tag.
-* The signature portion is only present in a given stream if the
- authentication phase has completed (TAG_AUTH_DONE has been sent) and
- signatures are enabled.
-
-A new stream is created when the client sends a frame with the following tag
-message:
-
-* TAG_NEW_STREAM (client only): starts a new stream::
-
- __u8 my_type (CEPH_ENTITY_TYPE_*)
-
-
-.. ditaa:: +---------+ +--------+
- | Client | | Server |
- +---------+ +--------+
- | send new stream |
- |------------------>|
- | |
+* The signature portion is only present if the authentication phase
+ has completed (TAG_AUTH_DONE has been sent) and signatures are
+ enabled.
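
The frame layout described above (before authentication completes, so with no padding or signature) might be packed and parsed roughly as follows. This is only a sketch under assumptions: the function names and the ``TAG_EXAMPLE`` value are hypothetical, since this document does not assign numeric values to tags.

```python
import struct

# Hypothetical tag value, for illustration only; the document does not
# define numeric values for the TAG_* codes.
TAG_EXAMPLE = 1


def encode_frame(tag: int, payload: bytes) -> bytes:
    """Pack frame_len (le32), tag (le32), and payload into one frame."""
    body = struct.pack("<I", tag) + payload
    # frame_len counts everything after the frame_len field itself.
    return struct.pack("<I", len(body)) + body


def decode_frame(buf: bytes):
    """Return (tag, payload, remaining_bytes) for the first frame in buf."""
    if len(buf) < 4:
        raise ValueError("buffer too short for frame_len")
    (frame_len,) = struct.unpack_from("<I", buf, 0)
    if frame_len < 4:
        raise ValueError("frame_len too small to hold a tag")
    end = 4 + frame_len
    if len(buf) < end:
        raise ValueError("incomplete frame")
    (tag,) = struct.unpack_from("<I", buf, 4)
    return tag, buf[8:end], buf[end:]
```

Returning the unconsumed bytes lets a caller decode several frames from one receive buffer in a loop.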
Authentication
--------------
-* TAG_AUTH_SET_METHOD (client only): set auth method for this connection::
-
- __le32 method;
+* TAG_AUTH_REQUEST: client->server::
- - The selected auth method determines the sig_size and block_size in any
- subsequent messages (TAG_AUTH_DONE and non-auth messages).
+ __le32 method; // CEPH_AUTH_{NONE, CEPHX, ...}
+ __le32 len;
+ method specific payload
* TAG_AUTH_BAD_METHOD (server only): reject client-selected auth method::
__le32 method
__le32 num_methods
- __le32 allowed_methods[num_methods] // CEPH_AUTH_{NONE, CEPHX}
+ __le32 allowed_methods[num_methods] // CEPH_AUTH_{NONE, CEPHX, ...}
- Returns the unsupported/forbidden method along with the list of allowed
authentication methods.
-* TAG_AUTH_REQUEST: client->server::
+* TAG_AUTH_BAD_AUTH: server->client::
+ __le32 error code (e.g., EPERM, EACCES)
__le32 len;
- method specific payload
+ error string;
+
+ - Sent when the authentication fails
-* TAG_AUTH_REPLY: server->client::
+* TAG_AUTH_MORE: server->client or client->server::
__le32 len;
method specific payload
-* TAG_AUTH_BAD_AUTH: server->client:
-
- - Sent when the authentication fails
-
-
-* TAG_AUTH_DONE::
+* TAG_AUTH_DONE: (server->client)::
confounder (block_size bytes of random garbage)
__le64 flags
FLAG_SIGNED 2
signature
- - The client first says AUTH_DONE, and the server replies to
- acknowledge it.
+ - The server is the one to decide authentication has completed.
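
As an illustration of the payloads listed above, the sketch below packs a TAG_AUTH_REQUEST payload and parses a TAG_AUTH_BAD_METHOD payload so that a client could retry with an allowed method. The helper names and the numeric method values are assumptions made for the example; this document does not define them.

```python
import struct

# Illustrative placeholder values; not defined by this document.
AUTH_NONE = 1
AUTH_CEPHX = 2


def encode_auth_request(method: int, auth_payload: bytes) -> bytes:
    """method (le32), len (le32), then the method specific payload."""
    return struct.pack("<II", method, len(auth_payload)) + auth_payload


def decode_bad_method(payload: bytes):
    """Return (rejected_method, allowed_methods) from a BAD_METHOD payload."""
    if len(payload) < 8:
        raise ValueError("payload too short")
    method, num_methods = struct.unpack_from("<II", payload, 0)
    if len(payload) < 8 + 4 * num_methods:
        raise ValueError("truncated allowed_methods array")
    allowed = list(struct.unpack_from("<%dI" % num_methods, payload, 8))
    return method, allowed


def pick_method(preferred, allowed):
    """Return the first preferred method the server allows, or None."""
    for method in preferred:
        if method in allowed:
            return method
    return None
```

A client that receives TAG_AUTH_BAD_METHOD could call ``pick_method`` with its own preference order and, if a method is returned, send a new TAG_AUTH_REQUEST using it.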
Example of authentication phase interaction when the client uses an
.. ditaa:: +---------+ +--------+
| Client | | Server |
+---------+ +--------+
- | set method |
- |---------------->|
| auth request |
|---------------->|
|<----------------|
- | auth reply|
+ | auth more|
| |
- | auth done |
+ |auth more |
|---------------->|
|<----------------|
- | auth done ack |
+ | auth done|
Example of authentication phase interaction when the client uses a forbidden
.. ditaa:: +---------+ +--------+
| Client | | Server |
+---------+ +--------+
- | set method |
+ | auth request |
|---------------->|
- | +---|
- | auth request| |
- |-------------+-->|
- | | |
- |<------------+ |
+ |<----------------|
| bad method |
| |
- | set method |
- |---------------->|
| auth request |
|---------------->|
|<----------------|
- | auth reply|
+ | auth more|
| |
- | auth done |
+ | auth more |
|---------------->|
|<----------------|
- | auth done ack |
+ | auth done|
Message frame format
* If neither FLAG_SIGNED nor FLAG_ENCRYPTED is specified, things are simple::
- stream_id
frame_len
tag
payload
* If FLAG_SIGNED has been specified::
- stream_id
frame_len
tag
payload
* If FLAG_ENCRYPTED has been specified::
- stream_id
frame_len
+ tag
{
- payload_sig_length
payload
payload_padding (out to auth block_size)
} ^ stream cipher
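
The padding step in the encrypted layout can be sketched as below. It assumes that "out to auth block_size" means padding the payload up to the next multiple of ``block_size``, and it fills the padding with zero bytes; the document specifies neither the rounding rule in detail nor the padding contents, so both are assumptions, and the cipher step itself is omitted.

```python
def padding_len(payload_len: int, block_size: int) -> int:
    """Bytes needed to round payload_len up to a multiple of block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (-payload_len) % block_size


def pad_payload(payload: bytes, block_size: int) -> bytes:
    """Append zero bytes (an assumed fill value) as payload_padding."""
    return payload + b"\x00" * padding_len(len(payload), block_size)
```

For example, with a 16-byte block size a 5-byte payload would gain 11 bytes of padding, while a payload that is already 16 bytes long would gain none.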