blob: f7327933ec801bc56418ffefbd521c96afae9123 [file] [log] [blame] [view]
---
title: Apache Mesos - RecordIO Data Format
layout: documentation
---
# RecordIO Data Format
## Overview
RecordIO is the name for a set of binary data exchange formats.
The basic idea is to divide the data into individual chunks, called 'records',
and then to prepend to every record its length in bytes, followed by the data.
Since there is no formal specification of the RecordIO format, there tend
to be slight incompatibilities between RecordIO implementations. Common
differences include, for example, a magic value at the beginning of the
stream to indicate the stream format, encoding the record length using
fixed-size fields with native integers or extra fields to suppord compressed
record content.
Therefore, when using a third-party library, one needs to verify that
the implementation matches the RecordIO format used by Mesos, as described
below.
## Mesos RecordIO Format
The [BNF grammar](http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.1)
for a RecordIO-encoded streaming response is:
```
records = *record
record = record-size LF record-data
record-size = 1*DIGIT
record-data = record-size(OCTET)
```
`record-size` should be interpreted as an unsigned 64-bit integer (`uint64`).
For example, a stream may look like:
```
128\n
{"type": "SUBSCRIBED","subscribed": {"framework_id": {"value":"12220-3440-12532-2345"},"heartbeat_interval_seconds":15.0}20\n
{"type":"HEARTBEAT"}675\n
...
```
In pseudo-code, this could be parsed with something like the following:
```
while (true) {
do {
lengthBytes = readline()
} while (lengthBytes.length < 1)
messageLength = parseInt(lengthBytes);
messageBytes = read(messageLength);
process(messageBytes);
}
```
Network intermediaries (e.g., proxies) are free to change the chunk
boundaries; this should not have any effect on the recipient application.
## Streaming HTTP requests
Within Mesos, RecordIO is used as data format for API endpoints
that support streaming responses. The implementation expects the
client to be aware of the following non-standard HTTP header fields
when using such endpoints:
The `Message-Accept` header must be set by clients that request a
streaming response from the server by setting the `Accept` header to
`application/recordio`. It specifies which media types the client
accepts encoded in the RecordIO stream, and its content must have the
same format as the standard HTTP `Accept` header.
The `Message-Content-Type` is set by the server when streaming the
response to an API call. It specifies the content type of the data
encoded in the RecordIO stream and has the same format as the
standard HTTP `Content-Type` header.
Currently, the only supported MIME types for both header fields are
`application/json` and `application/x-protobuf`.