<?xml version="1.0" encoding="UTF-8"?>
<chapter xml:id="guacamole-protocol" xmlns="http://docbook.org/ns/docbook" version="5.0"
xml:lang="en" xmlns:xl="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude">
<title>The Guacamole protocol</title>
<indexterm>
<primary>Guacamole protocol</primary>
</indexterm>
<para>This chapter is an overview of the Guacamole protocol, describing its design and general
use. While a few instructions and their syntax will be described here, this is not an
exhaustive list of all available instructions. The intent is only to list the general types
and usage. If you are looking for the syntax or purpose of a specific instruction, consult
the protocol reference included with the appendices.</para>
<section xml:id="guacamole-protocol-design">
<title>Design</title>
<para>The Guacamole protocol consists of instructions. Each instruction is a comma-delimited
list followed by a terminating semicolon, where the first element of the list is the
instruction opcode, and all following elements are the arguments for that
instruction:</para>
<informalexample>
<programlisting><replaceable>OPCODE</replaceable>,<replaceable>ARG1</replaceable>,<replaceable>ARG2</replaceable>,<replaceable>ARG3</replaceable>,<replaceable>...</replaceable>;</programlisting>
</informalexample>
<para>Each element of the list has a non-negative decimal integer length prefix, separated from
the value of the element by a period. This length denotes the number of Unicode characters
in the value of the element, which is encoded in UTF-8. An empty value is therefore written
as a length of zero followed by the period alone:</para>
<informalexample>
<programlisting><replaceable>LENGTH</replaceable>.<replaceable>VALUE</replaceable></programlisting>
</informalexample>
<para>Any number of complete instructions make up a message, which is sent from client to
server or from server to client. Client-to-server instructions are generally control
instructions (for connecting or disconnecting) and events (mouse and keyboard).
Server-to-client instructions are generally drawing instructions (caching, clipping, drawing
images), using the client as a remote display.</para>
<para>For example, a complete and valid instruction for setting the display size to 1024x768
would be:</para>
<informalexample>
<programlisting>4.size,1.0,4.1024,3.768;</programlisting>
</informalexample>
<para>Here, the instruction would be decoded into four elements: "size", the opcode of the
size instruction, "0", the index of the default layer, "1024", the desired width in
pixels, and "768", the desired height in pixels.</para>
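<para>The encoding rules above can be sketched in a few lines of Python. This is only an
illustration, not code taken from Guacamole: the function name
<function>encode_instruction</function> is hypothetical, and the sketch assumes that
"Unicode characters" means code points, which is what Python's
<function>len</function> counts for a string.</para>
<informalexample>
<programlisting language="python">
```python
def encode_instruction(opcode, *args):
    """Encode one instruction as LENGTH.VALUE elements joined by commas."""
    elements = [str(value) for value in (opcode, *args)]
    # len() on a Python str counts Unicode code points, not UTF-8 bytes
    # (assumption: this matches the protocol's notion of "characters").
    return ",".join(f"{len(e)}.{e}" for e in elements) + ";"


# Encodes the "size" instruction discussed above.
print(encode_instruction("size", 0, 1024, 768))
```
</programlisting>
</informalexample>
<para>The resulting string would then be encoded as UTF-8 before being written to the
connection.</para>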
<para>The structure of the Guacamole protocol is important as it allows the protocol to be
streamed while also being easily parsable by JavaScript. JavaScript does have native
support for conceptually-similar structures like XML or JSON, but neither of those
formats is natively supported in a way that can be streamed; JavaScript requires the
entirety of the XML or JSON message to be available at the time of decoding. The
Guacamole protocol, on the other hand, can be parsed as it is received, and the presence
of length prefixes within each instruction element means that the parser can quickly
skip around from instruction to instruction without having to iterate over every
character.</para>
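<para>To illustrate how the length prefixes allow parsing as data arrives, the following is a
minimal Python sketch of an incremental decoder. The class name
<classname>InstructionParser</classname> and its <function>feed</function> method are
hypothetical, and the sketch assumes the received bytes have already been decoded into text
and that chunks are split on character boundaries. It is not the parser used by
Guacamole.</para>
<informalexample>
<programlisting language="python">
```python
class InstructionParser:
    """Incrementally decodes instructions from chunks of received text."""

    def __init__(self):
        self._buffer = ""    # text received but not yet decoded
        self._elements = []  # elements of the instruction in progress

    def feed(self, chunk):
        """Add received text and return any instructions it completes."""
        self._buffer += chunk
        complete = []
        while True:
            dot = self._buffer.find(".")
            if dot == -1:
                break  # length prefix not fully received yet
            length = int(self._buffer[:dot])
            end = dot + 1 + length  # index of the terminator after the value
            if end >= len(self._buffer):
                break  # value or its terminator not fully received yet
            # The length prefix lets us jump straight to the end of the
            # value without inspecting each character.
            self._elements.append(self._buffer[dot + 1:end])
            terminator = self._buffer[end]
            self._buffer = self._buffer[end + 1:]
            if terminator == ";":
                complete.append(self._elements)
                self._elements = []
            elif terminator != ",":
                raise ValueError(f"unexpected terminator {terminator!r}")
        return complete


parser = InstructionParser()
print(parser.feed("4.size,1.0,4.10"))   # instruction still incomplete
print(parser.feed("24,3.768;5.ready"))  # completes the "size" instruction
```
</programlisting>
</informalexample>
<para>Because incomplete input simply remains in the buffer, the caller can pass along each
chunk exactly as it is received, without waiting for a full message.</para>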
</section>
<section xml:id="guacamole-protocol-handshake">
<title>Handshake phase</title>
<para>The handshake phase is the phase of the protocol entered immediately upon connection.
It begins with a "select" instruction sent by the client which tells the server which
protocol will be loaded:</para>
<informalexample>
<programlisting>6.select,3.vnc;</programlisting>
</informalexample>
<para>After receiving the "select" instruction, the server will load the associated client
support and respond with a list of accepted parameter names using an "args"
instruction:</para>
<informalexample>
<programlisting>4.args,8.hostname,4.port,8.password,13.swap-red-blue,9.read-only;</programlisting>
</informalexample>
<para>After receiving the list of arguments, the client is required to respond with the list
of supported audio, video, and image mimetypes, the optimal display size and resolution,
and the values for all arguments available, even if blank. If any of these requirements
are left out, the connection will close:</para>
<informalexample>
<programlisting>4.size,4.1024,3.768,2.96;
5.audio,9.audio/ogg;
5.video;
5.image,9.image/png,10.image/jpeg;
7.connect,9.localhost,4.5900,0.,0.,0.;</programlisting>
</informalexample>
<para>For clarity, we've put each instruction on its own line, but in the real protocol, no
newlines exist between instructions. In fact, if there is anything after an instruction
other than the start of a new instruction, the connection is closed.</para>
<para>Here, the client is specifying that the optimal display size is 1024x768 at 96 DPI and
it supports Ogg Vorbis audio, but no video, and can accept both PNG and JPEG images. It
wants to connect to localhost at port 5900, and is leaving the three other parameters
blank.</para>
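<para>As a rough sketch of the client's side of this exchange, the Python below builds the
response from the parameter names received in the "args" instruction. The function names
and the <varname>settings</varname> dictionary are hypothetical. The one point it is meant to
show is that "connect" carries exactly one value per advertised parameter, in the same
order, with blanks for anything unset.</para>
<informalexample>
<programlisting language="python">
```python
def encode_instruction(opcode, *args):
    """Encode one instruction as LENGTH.VALUE elements joined by commas."""
    elements = [str(value) for value in (opcode, *args)]
    return ",".join(f"{len(e)}.{e}" for e in elements) + ";"


def handshake_response(arg_names, settings, width, height, dpi,
                       audio=(), video=(), image=()):
    """Build the client's reply to the server's "args" instruction."""
    # One value per advertised parameter, in the server's order.
    # Parameters without a configured value are sent blank.
    values = [settings.get(name, "") for name in arg_names]
    return "".join([
        encode_instruction("size", width, height, dpi),
        encode_instruction("audio", *audio),
        encode_instruction("video", *video),
        encode_instruction("image", *image),
        encode_instruction("connect", *values),
    ])


# Parameter names as received in the example "args" instruction above.
arg_names = ["hostname", "port", "password", "swap-red-blue", "read-only"]
settings = {"hostname": "localhost", "port": "5900"}

print(handshake_response(arg_names, settings, 1024, 768, 96,
                         audio=["audio/ogg"],
                         image=["image/png", "image/jpeg"]))
```
</programlisting>
</informalexample>
<para>Note that the instructions are concatenated with nothing between them, matching the
requirement that no newlines appear in the real protocol.</para>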
<para>Once these instructions have been sent by the client, the server will attempt to
initialize the connection with the parameters received and, if successful, respond with
a "ready" instruction. This instruction contains the ID of the new client connection and
marks the beginning of the interactive phase. The ID is an arbitrary string, but is
guaranteed to be distinct from the IDs of all other active connections, as well as from the
names of all supported protocols:</para>
<informalexample>
<programlisting>5.ready,37.$260d01da-779b-4ee9-afc1-c16bae885cc7;</programlisting>
</informalexample>
<para>The actual interactive phase begins immediately after the "ready" instruction is sent.
Drawing and event instructions pass back and forth until the connection is
closed.</para>
<section xml:id="guacamole-protocol-joining">
<title>Joining an existing connection</title>
<para>Once the handshake phase has completed, that connection is considered active and
can be joined by other connections if the ID is provided instead of a protocol name
via the "select" instruction:</para>
<informalexample>
<programlisting>6.select,37.$260d01da-779b-4ee9-afc1-c16bae885cc7;</programlisting>
</informalexample>
<para>The rest of the handshake phase for a joining connection is identical. Just as
with a new connection, the restrictions or features which apply to the joining
connection are dictated by the parameter values supplied during the
handshake.</para>
</section>
</section>
<section xml:id="guacamole-protocol-drawing">
<title>Drawing</title>
<section xml:id="guacamole-protocol-compositing">
<title>Compositing</title>
<para>The Guacamole protocol provides compositing operations through the use of "channel
masks". The term "channel mask" describes the mechanism used while designing the
protocol to conceptualize and fully enumerate all possible compositing operations
based on four different sources of image data: source image data where the
destination is opaque, source image data where the destination is transparent,
destination image data where the source is opaque, and destination image data where
the source is transparent. Assigning a binary value to each of these "channels"
creates a unique integer ID for every possible compositing operation, and these
operations parallel those described by Porter and Duff in their paper "Compositing
Digital Images". As the HTML5 canvas tag also uses the Porter/Duff operators to
describe its compositing operations (as do other graphical APIs), the Guacamole
protocol is conveniently similar to the compositing support already present in web
browsers, although some operations are not yet supported. The following operations
are all implemented and known to work correctly in all browsers:</para>
<variablelist>
<varlistentry>
<term>B out A (0x02)</term>
<listitem>
<para>Clears the destination where the source is opaque, but otherwise draws
nothing. This is useful for masking.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A atop B (0x06)</term>
<listitem>
<para>Fills with the source where the destination is opaque only.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A xor B (0x0A)</term>
<listitem>
<para>As with logical XOR. Note that this is a compositing operation, not a
bitwise operation. It draws the source where the destination is
transparent, and draws the destination where the source is
transparent.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>B over A (0x0B)</term>
<listitem>
<para>What you would typically expect when drawing, but reversed. The source
appears only where the destination is transparent, as if you were
attempting to draw the destination over the source, rather than the
source over the destination.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A over B (0x0E)</term>
<listitem>
<para>The most common and sensible compositing operation, this draws the
source everywhere, but includes the destination where the source is
transparent.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A + B (0x0F)</term>
<listitem>
<para>Simply adds the components of the source image to the destination
image, capping the result at pure white.</para>
</listitem>
</varlistentry>
</variablelist>
<para>The following operations are all implemented, but may work incorrectly in WebKit
browsers, which always include the destination image where the source is
transparent:</para>
<variablelist>
<varlistentry>
<term>B in A (0x01)</term>
<listitem>
<para>Draws the destination only where the source is opaque, clearing
anywhere the source or destination are transparent.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A in B (0x04)</term>
<listitem>
<para>Draws the source only where the destination is opaque, clearing
anywhere the source or destination are transparent.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A out B (0x08)</term>
<listitem>
<para>Draws the source only where the destination is transparent, clearing
anywhere the source or destination are opaque.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>B atop A (0x09)</term>
<listitem>
<para>Fills with the destination where the source is opaque only.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A (0x0C)</term>
<listitem>
<para>Fills with the source, ignoring the destination entirely.</para>
</listitem>
</varlistentry>
</variablelist>
<para>The following operations are defined, but not implemented, and do not exist as
operations within the HTML5 canvas:</para>
<variablelist>
<varlistentry>
<term>Clear (0x00)</term>
<listitem>
<para>Clears all existing image data in the destination.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>B (0x03)</term>
<listitem>
<para>Does nothing.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A xnor B (0x05)</term>
<listitem>
<para>Adds the source to the destination where the destination or source are
opaque, clearing anywhere the source or destination are transparent.
This is similar to A + B except the aspect of transparency is also
additive.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>(A + B) atop B (0x07)</term>
<listitem>
<para>Adds the source to the destination where the destination is opaque,
preserving the destination otherwise.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>(A + B) atop A (0x0D)</term>
<listitem>
<para>Adds the destination to the source where the source is opaque, copying
the source otherwise.</para>
</listitem>
</varlistentry>
</variablelist>
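<para>The relationship between the four channels and the operation codes listed above can
be sketched in Python as follows. The constant and function names are hypothetical, and the
bit assigned to each channel is inferred from the listed codes (for example, "B in A" is
0x01 and "A out B" is 0x08) rather than taken from the Guacamole source.</para>
<informalexample>
<programlisting language="python">
```python
# Hypothetical names; bit values inferred from the operation codes above.
DST_WHERE_SRC_OPAQUE = 0x01       # alone, this is "B in A"
DST_WHERE_SRC_TRANSPARENT = 0x02  # alone, this is "B out A"
SRC_WHERE_DST_OPAQUE = 0x04       # alone, this is "A in B"
SRC_WHERE_DST_TRANSPARENT = 0x08  # alone, this is "A out B"

CHANNEL_NAMES = {
    DST_WHERE_SRC_OPAQUE: "destination where source is opaque",
    DST_WHERE_SRC_TRANSPARENT: "destination where source is transparent",
    SRC_WHERE_DST_OPAQUE: "source where destination is opaque",
    SRC_WHERE_DST_TRANSPARENT: "source where destination is transparent",
}


def channels(mask):
    """List the image data sources that a channel mask includes."""
    # A bit is set in the mask if OR-ing it in leaves the mask unchanged.
    return [name for bit, name in CHANNEL_NAMES.items()
            if (mask | bit) == mask]


# Operations are built by combining channels.
A_OVER_B = (SRC_WHERE_DST_OPAQUE | SRC_WHERE_DST_TRANSPARENT
            | DST_WHERE_SRC_TRANSPARENT)
A_ATOP_B = SRC_WHERE_DST_OPAQUE | DST_WHERE_SRC_TRANSPARENT

print(hex(A_OVER_B), channels(A_OVER_B))
print(hex(A_ATOP_B), channels(A_ATOP_B))
```
</programlisting>
</informalexample>
<para>With four independent channels there are sixteen possible masks, which accounts for
the sixteen operations listed in this section.</para>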
</section>
<section xml:id="guacamole-protocol-images">
<title>Image data</title>
<para>The Guacamole protocol, like many remote desktop protocols, provides a method of
sending an arbitrary rectangle of image data and placing it either within a buffer
or in a visible rectangle of the screen. Raw image data in the Guacamole protocol is
streamed as PNG, JPEG, or WebP data over a stream allocated with the "img"
instruction. Depending on the format used, image updates sent in this manner can be
RGB or RGBA (alpha transparency) and are automatically palettized if sent using
libguac. The streaming system used for image data is generalized and used by
Guacamole for other types of streams, including audio and file transfer. For more
information about streams in the Guacamole protocol, see <xref
xmlns:xlink="http://www.w3.org/1999/xlink"
linkend="guacamole-protocol-streaming"/>.</para>
<para>Image data can be sent to any specified rectangle within a layer or buffer.
Sending the data to a layer means that the image becomes immediately visible, while
sending the data to a buffer allows that data to be reused later.</para>
</section>
<section xml:id="guacamole-protocol-copying-images">
<title>Copying image data between layers</title>
<para>Image data can be copied from one layer or buffer into another layer or buffer.
This is often used for scrolling (where most of the result of the graphical update
is identical to the previous state) or for caching parts of an image.</para>
<para>Both VNC and RDP provide a means of copying a region of screen data and placing it
somewhere else within the same screen. RDP provides an additional means of copying
data to a cache, or recalling data from that cache and placing it on the screen.
Guacamole takes this concept and simplifies it further, as on-screen and off-screen
image storage are treated the same. The Guacamole "copy" instruction allows you to
copy a rectangle of image data and place it within another layer, whether that
layer is the same as the source layer, a different visible layer, or an off-screen
buffer.</para>
</section>
<section xml:id="guacamole-graphical-primitives">
<title>Graphical primitives</title>
<para>The Guacamole protocol provides basic graphics operations similar to those of
Cairo or the HTML5 canvas. In many cases, these primitives are useful for remote
drawing, and desirable in that they take up less bandwidth than sending
corresponding PNG images. Beware that excessive use of primitives leads to an
increase in client-side processing, which may reduce the performance of a connected
client, especially if that client is on a lower-performance machine like a mobile
phone or tablet.</para>
</section>
<section xml:id="guacamole-protocol-layers">
<title>Buffers and layers</title>
<para>All drawing operations in the Guacamole protocol affect a layer, and each layer
has an integer index which identifies it. When this integer is negative, the layer
is not visible, and can be used for storage or caching of image data. In this case,
the layer is referred to within the code and within documentation as a "buffer".
Layers are created automatically when they are first referenced in an
instruction.</para>
<para>There is one main layer which is always present called the "default layer". This
layer has an index of 0. Resizing this layer resizes the entire remote display.
Other layers default to the size of the default layer upon creation, while buffers
are always created with a size of 0x0, automatically resizing themselves to fit
their contents.</para>
<para>Non-buffer layers can be moved and nested within each other. In this way, layers
provide a simple means of hardware-accelerated compositing. If you need a window to
appear above others, or you have an object which will be moving, or you need the
data beneath an object automatically preserved, a layer is a good way of
accomplishing this. If a layer is nested within another layer, its position is
relative to that of its parent. When the parent is moved or reordered, the child
moves with it. If the child extends beyond the parent's bounds, it will be
clipped.</para>
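<para>The rules for layer indices and nesting can be modelled with a small Python sketch.
The <classname>Layer</classname> class below is purely illustrative and is not part of any
Guacamole API. It only shows that a negative index marks a buffer and that a nested
layer's on-screen position is the sum of its own offset and those of its ancestors.</para>
<informalexample>
<programlisting language="python">
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    """Illustrative model of a layer (hypothetical, not a Guacamole API)."""
    index: int
    x: int = 0
    y: int = 0
    parent: Optional["Layer"] = None

    @property
    def is_buffer(self):
        # A negative index means the layer is an invisible buffer.
        return 0 > self.index

    def absolute_position(self):
        """Position on the display, following the chain of parents."""
        if self.is_buffer:
            raise ValueError("buffers are not visible and have no position")
        if self.parent is None:
            return (self.x, self.y)
        parent_x, parent_y = self.parent.absolute_position()
        return (parent_x + self.x, parent_y + self.y)


default_layer = Layer(index=0)
window = Layer(index=1, x=100, y=50, parent=default_layer)
button = Layer(index=2, x=10, y=20, parent=window)
cache = Layer(index=-1)

print(button.absolute_position())
print(cache.is_buffer)

# Moving the parent moves the child with it.
window.x = 200
print(button.absolute_position())
```
</programlisting>
</informalexample>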
</section>
</section>
<section xml:id="guacamole-protocol-streaming">
<title>Streams and objects</title>
<para>Guacamole supports transfer of clipboard contents, audio, video, and image data, as
well as files and arbitrary named pipes.</para>
<para>Streams are allocated directly with instructions that associate the new stream with
particular semantics and metadata, such as the "audio" or "video" instructions used for
playing media, the "file" instruction used for file transfer, and the "pipe" instruction
for transfer of completely arbitrary data between client and server. In some cases, the
availability and semantics of streams may be explicitly advertised using structured sets
of named streams known as "objects".</para>
<para>Once a stream is allocated, data is sent along the stream in chunks using "blob"
instructions, which may be acknowledged by the receiving end by "ack" instructions. The
end of the stream is finally signalled with an "end" instruction.</para>
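<para>The chunked transfer described above might be sketched in Python as follows. This
sketch assumes that a "blob" instruction carries the stream index followed by
base64-encoded data and that "end" carries only the stream index. Consult the protocol
reference for the authoritative argument lists. The function names and the chunk size are
hypothetical, and the instruction that allocates the stream is omitted.</para>
<informalexample>
<programlisting language="python">
```python
import base64


def encode_instruction(opcode, *args):
    """Encode one instruction as LENGTH.VALUE elements joined by commas."""
    elements = [str(value) for value in (opcode, *args)]
    return ",".join(f"{len(e)}.{e}" for e in elements) + ";"


def stream_instructions(stream_index, data, chunk_size=4096):
    """Yield "blob" instructions for the data, then a final "end"."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: blob payloads are sent as base64 text.
        payload = base64.b64encode(chunk).decode("ascii")
        yield encode_instruction("blob", stream_index, payload)
    yield encode_instruction("end", stream_index)


# A tiny chunk size is used here only to make the output readable.
for instruction in stream_instructions(1, b"hello!", chunk_size=3):
    print(instruction)
```
</programlisting>
</informalexample>
<para>A real sender would typically also wait for "ack" instructions from the receiving
end before sending further blobs, which this sketch leaves out.</para>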
</section>
<section xml:id="guacamole-protocol-events">
<title>Events</title>
<para>When something changes on either side, client or server, such as a key being pressed,
the mouse moving, or clipboard data changing, an instruction describing the event is
sent.</para>
</section>
<section xml:id="guacamole-protocol-disconnecting">
<title>Disconnecting</title>
<para>The server and client can end the connection at any time. There is no requirement for
the server or the client to communicate that the connection needs to terminate. When the
client or server wishes to end the connection, and the reason is known, it can use the
"disconnect" or "error" instructions.</para>
<para>The disconnect instruction is sent by the client when it is disconnecting. This is
largely out of politeness, and the server must be written knowing that the disconnect
instruction may not always be sent in time (guacd is written this way).</para>
<para>If the client does something wrong, or the server detects a problem with the client
plugin, the server sends an error instruction, including a description of the problem in
the parameters. This informs the client that the connection is being closed.</para>
</section>
</chapter>