blob: 47591a7d228d197621f494186637d4e96ed9ea6f [file] [log] [blame]
Framing Format Changes For Native Protocol V5
---------------------------------------------
The framing format for V5 is based on the internode messaging system as it was redesigned for 4.0 in https://issues.apache.org/jira/browse/CASSANDRA-15066[CASSANDRA-15066]. In prior versions, the relationship between `Message` and `Frame` is 1:1, each CQL message being contained in exactly one frame and every frame containing only a single message. The changes introduced here enable frames to contain multiple small messages and for oversized messages to be broken into multiple frames.
The introduction of this new format, brings an unfortunate naming clash between frames from the previous format, with a 1:1 relationship to messages, and the new outer frames. Taking inspiration from SMTP (RFC-5321), what was previously referred to as a `Frame` is better described as an `Envelope` which comprises a `CQL message` along with some attendant metadata.
New Frame Format
----------------
In general, a `frame` comprises a `header`, `payload` and `trailer`; this proposal introduces two specific `frame` formats, `compressed` and `uncompressed`. In both cases, the `payload` is a stream of `CQL Envelopes`, each containing a single `CQL Message`. In effect, the new framing format is a simple wrapper around the previous protocol.
In all cases, a `frame` may or may not be `self contained`. If `self contained`, the `payload` includes
one or more complete `CQL envelopes` and can be fully processed immediately. Otherwise, the `payload` contains some part of a large `CQL envelope`, which has been split into its own sequence of `outer frames`. These are expected to be transmitted/received in order, so a processor can accumulate them as they arrive and process them once all have been received.
The header contains length information for the `payload`, whether or not the frame is `self contained` and a `CRC` to protect the integrity of the `header` itself. There are slight variations in the `header` format between the compressed and uncompressed variants.
The payload is opaque as far as the framing format is concerned, modulo the `self contained` variation.
The `trailer` contains a `CRC` to protect the integrity of the `payload`. As the `payload` includes the `CQL envelopes` including both the message and the attached metadata, such as the `stream ID`, `flags` and `opcode`, this adds a level of protection that previously wasn't available (for example from the implementation in https://issues.apache.org/jira/browse/CASSANDRA-13304[CASSANDRA-13304]).
Uncompressed Format
-------------------
The uncompressed variant uses a 6 byte `header` containing `payload length`, `self contained flag` and `CRC24` for the `header`. The max size for the payload is `128KiB`, and is followed by its `CRC32`.
....
1. Payload length (17 bits)
2. isSelfContained flag (1 bit)
3. Header padding (6 bits)
4. CRC24 of the header (24 bits)
5. Payload (up to 2 ^ 17 - 1 bits)
6. Payload CRC32 (32 bits)
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length |C| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
CRC24 of Header | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
+ +
| Payload |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CRC32 of Payload |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
....
LZ4 Compressed Format
---------------------
The variant with `LZ4` compression uses an `8 byte header`, containing both the compressed and uncompressed lengths of the `payload`, the `self contained flag` and a `CRC24` for the `header`. As with uncompressed frames, the max payload size is `128KiB` and is followed by a `CRC32` trailer. This is the `CRC` of the *compressed* `payload`.
....
1. Compressed length (17 bits)
2. Uncompressed length (17 bits)
3. {@code isSelfContained} flag (1 bit)
4. Header padding (5 bits)
5. CRC24 of Header contents (24 bits)
6. Compressed Payload (up to 2 ^ 17 - 1 bits)
7. CRC32 of Compressed Payload (32 bits)
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Compressed Length | Uncompressed Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|C| | CRC24 of Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| Compressed Payload |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CRC32 of Compressed Payload |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
....
Protocol Changes
----------------
Other than the enclosure of `CQL envelopes` in `frames`, the changes required to the protocol itself are minimal. All message exchanges follow the same `REQUEST/RESPONSE` specifications as before and no changes are required to either the `CQL envelope (frame)` or `CQL message` formats. `CQL envelope` headers contain some information which is now redundant and currently ignored. The `CQL envelope` header contains the protocol version, which was always to some extent redundant as the version is set and enforced at the connection level. It was also previously possible to enable compression at the individual `CQL envelope` level. This is no longer an option, the outer framing format being responsible for compression, which is set for the lifetime of a connection and applies to all messages transmitted throughout it. To that end, the `compression` flag on the `CQL envelope` header is ignored in V5.
** TODO - note on conditional compression
Protocol Negotiation
--------------------
In order to support both V5 and earlier formats, the V5 `outer framing` format is not applied to message exchanges *before* a `STARTUP` exchange is completed. This means that the initial `STARTUP` message and any `OPTIONS` messages which precede it are expected *without* the `outer framing`. Likewise, the responses returned by the server (`SUPPORTED` for `OPTIONS` and either `READY` or `AUTHENTICATE` for `STARTUP`) are transmitted without the `outer framing`.
After sending the response to a `STARTUP` (`READY` or `AUTHENTICATE`), the server will begin encoding and decoding further transmissions according to the protocol version of that `STARTUP` message. Compression of the `outer frames` is dictated by the `COMPRESSION` option sent in the `STARTUP` message. Only `LZ4` compression is currently supported for V5.
Note: `OPTIONS` requests may be sent by the client at any time in the connection lifecycle, both before and after the `STARTUP` exchange. As mentioned, those transmitted _before_ `STARTUP`, as well as the `SUPPORTED` responses the server returns are *not* enclosed in `outer frames`. Any `OPTIONS/SUPPORTED` exchanges _after_ the `STARTUP` exchange *are* formatted according to the protocol version. So in V5, these *will* be enclosed in `outer frames`.