blob: 1c36b73379f7ae11f1869c04ffe20a819ee863e4 [file] [log] [blame]
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>PLC4X &#x2013; </title>
<script src="../../js/jquery.slim.min.js" type="text/javascript"></script>
<!--script src="../../js/popper.min.js" type="javascript"></script-->
<script src="../../js/bootstrap.bundle.min.js" type="text/javascript"></script>
<!-- The tooling for adding images and links to Apache events -->
<script src="https://www.apachecon.com/event-images/snippet.js" type="text/javascript"></script>
<!-- FontAwesome -->
<link rel="stylesheet" href="../../css/all.min.css" type="text/css"/>
<!-- Bootstrap -->
<link rel="stylesheet" href="../../css/bootstrap.min.css" type="text/css"/>
<!-- Some Maven Site defaults -->
<link rel="stylesheet" href="../../css/maven-base.css" type="text/css"/>
<link rel="stylesheet" href="../../css/maven-theme.css" type="text/css"/>
<!-- The PLC4X version of a bootstrap theme -->
<link rel="stylesheet" href="../../css/themes/plc4x.css" type="text/css" id="pagestyle"/>
<!-- A custom style for printing content -->
<link rel="stylesheet" href="../../css/print.css" type="text/css" media="print"/>
<meta http-equiv="Content-Language" content="en"/>
</head>
<body class="composite">
<nav class="navbar navbar-light navbar-expand-md bg-faded justify-content-center border-bottom">
<!--a href="/" class="navbar-brand d-flex w-50 mr-auto">Navbar 3</a-->
<a href="https://plc4x.apache.org/" id="bannerLeft"><img src="../../images/apache_plc4x_logo_small.png" alt="Apache PLC4X"/></a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#collapsingNavbar3">
<span class="navbar-toggler-icon"></span>
</button>
<div class="navbar-collapse collapse w-100" id="collapsingNavbar3">
<ul class="navbar-nav w-100 justify-content-center">
<li class="nav-item">
<a class="nav-link" href="../../index.html">Home</a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../users/index.html">Users</a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../developers/index.html">Developers</a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../apache/index.html">Apache</a>
</li>
</ul>
<ul class="nav navbar-nav ml-auto justify-content-end">
<li class="nav-item row valign-middle">
<a class="acevent" data-format="wide" data-mode="light" data-event="random" style="width:240px;height:60px;"></a>
</li>
</ul>
</div>
</nav>
<div class="container-fluid">
<div class="row h-100">
<main role="main" class="ml-sm-auto px-4 w-100 h-100">
<div class="sect1">
<h2 id="reverse_engineering_the_deltav_protocol">Reverse Engineering the DeltaV protocol</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This document should describe what we did in order to reverse-engineer the DeltaV protocol.</p>
</div>
<div class="paragraph">
<p>The sole purpose of this document, is to write document our path in order to eventually protect ourselves against accusation of using illegal measures in gaining the information we got.</p>
</div>
<div class="sect2">
<h3 id="starting_point">Starting point</h3>
<div class="paragraph">
<p>We kindly were provided with access to a DeltaV system used for training.</p>
</div>
<div class="paragraph">
<p>Therefore we were able to change a few things without fearing to break anything important.</p>
</div>
<div class="paragraph">
<p>This system consisted of three DeltaV controllers, two Operator-System instances
So in general there were three types of systems involved:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>DeltaV Controller</p>
</li>
<li>
<p>DeltaV Operator-System</p>
</li>
<li>
<p>Laptop running WireShark</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>The first change here, was to replace the network switch with a network hub, so we could capture the entire network traffic.
(By the way &#8230;&#8203; if you happen to still have a working Hub, don&#8217;t give that away. It turns out it&#8217;s really hard to get you hands on hubs these days)</p>
</div>
<div class="paragraph">
<p>With this we let the system run and did a pretty long network capture.</p>
</div>
<div class="paragraph">
<p>This 29MB WireShark capture was what we started working with.</p>
</div>
</div>
<div class="sect2">
<h3 id="identifying_the_protocol">Identifying the protocol</h3>
<div class="paragraph">
<p>As WireShark didn&#8217;t have a DeltaV disector, we had to trace it down ourselves.</p>
</div>
<div class="paragraph">
<p>When looking at the capture, we very quickly filtered out the usual network protocols and the Windows communication and found a lot of binary <code>UDP</code> traffic running on port <code>18507</code>.</p>
</div>
<div class="paragraph">
<p>Special with all of these was that they all started with a payload of <code>0xFACE</code>.</p>
</div>
</div>
<div class="sect2">
<h3 id="first_steps_in_understanding_the_protocol">First steps in understanding the protocol</h3>
<div class="paragraph">
<p>Here it appeared that <code>UDP</code> packets are sent with some sort of payload, which are obviously responded to by packets with a fixed and very small size (64 bytes).</p>
</div>
<div class="paragraph">
<p>So we are assuming these short messages, that seem to partially repeat some of the information of the first packet (but not all), are probably acknowledge packets.</p>
</div>
<div class="paragraph">
<p>From looking at the nature of the traffic and knowing the IP addresses of the DeltaV controllers (PLC) and the Operator Systems (Controll-Software), we could see that this communication is probably subscription based as we could see a lot of the Controller sending packets to the OS, which the OS then acknowleges in regular intervals.</p>
</div>
<div class="paragraph">
<p>When filtering only the <code>UDP</code> packets on port <code>18507</code> we could scroll though the byte representation, we could quickly identify some things.</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Directly after the <code>0xFACE</code> head, came two bytes (short) which was identical for similar packet content, so we at first thought it somehow correlates to a type.</p>
</li>
<li>
<p>After this came another short value which also seemed identical for similar packets. However as there were differences, we assumed this was some sort of sub-type.</p>
</li>
<li>
<p>This was followed by one byte that seemed identical for all messages sent from one node to another, but we did find up to 3 different values in the dumps. So we at first thought this was some sort of connection id</p>
</li>
<li>
<p>The most obvious thing about the byte after that was, that it seemed to increment by one for messages with the same "connection id", so we assumed this was a message-id.</p>
</li>
<li>
<p>There were 3 bytes between the message-id and an obviously constant sequence of 3 bytes which, at first we had no idea what they might be used for.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>In order to prove any assumption we made, we used little programs that used Pcap4J to programmatically check assumptions.</p>
</div>
</div>
<div class="sect2">
<h3 id="decoding_the_wrapper_packet_header">Decoding the wrapper packet header</h3>
<div class="paragraph">
<p>As mentioned before, it appeared obvious that for almost identical packets the first two short values were very similar too, which led us to the assumption that these are <code>type</code> and `sub-type'.
So we wrote a little program, that scanned all the packets and counted how many types we had.
This was a very large number. Also did we output some additional information about packets on the console &#8230;&#8203; one of these pieces of information was the length of the packet.</p>
</div>
<div class="paragraph">
<p>When outputting a sorted list of the type values (maybe it&#8217;s strange, but I like data sorted) it seemed quite obvious, that for a larger presumed <code>type</code> value, also the length of the packet was larger.
Especially when comparing the difference of these values also showed that of a <code>type</code> had a value that is larger than another packet, that also the length was exactly this number of bytes larger.</p>
</div>
<div class="paragraph">
<p>So in the end it turned out that the first short value is not the <code>type</code>, but a packet length-related value.
It even turned out that the number encoded here was constantly 2 bytes short the number of bytes after the 3 constant bytes of <code>0x800400</code>, ending leaving two bytes at the end.</p>
</div>
<div class="paragraph">
<p>These two bytes seemed to be extremely jumpy.
Here we could see that for every packet this value seemed to change quite dramatically, even if the payload was almost identical.
We did see values repeated, but couldn&#8217;t see any correlation between the content of the colliding packets.
So in the end we assume this is some sort of checksum value.</p>
</div>
<div class="paragraph">
<p>With this assumption, we now think that the first short value is the <code>payload-length</code> value.</p>
</div>
<div class="paragraph">
<p>Knowing the first short value no longer is the type, upgraded the second short value to being a new candidate for being a full type.</p>
</div>
<div class="paragraph">
<p>Scanning the communication, it pretty quickly became obvious that here most communication packets have a value of <code>0x0002</code> and some have <code>0x0006</code>.
Only when starting a new OS instance while capturing, made some others pop up: <code>0x0003</code> and <code>0x0004</code>.</p>
</div>
<div class="paragraph">
<p>So we are assuming the <code>0x0003</code> and <code>0x0004</code> packets are related to establishing a connection.
Also we could see the node initiating a connection sends a <code>0x0003</code> packet, which is then replied to by a <code>0x0004</code> packet.</p>
</div>
<div class="paragraph">
<p>Then most information is sent via <code>0x0002</code> packets.
However every now and then there comes a pair of <code>0x0006</code> packets, who&#8217;s length seems to prevent being used to transfer any data.
We are therefore assuming these packets are used for syncing.</p>
</div>
<div class="paragraph">
<p>As mentioned before we did a lot of automatically splitting up PCAPNG files into separate ones depending on different conditions, because this allowed to browse through similar packets using simple human pattern recognition for decoding.</p>
</div>
<div class="paragraph">
<p>Pretty soon after starting to splitting up the streams, so the communication between two nodes was separated into one separate file, we did notice what we assumed to be a <code>connection-id</code> and a <code>mesasage-counter</code> instantly following the type.</p>
</div>
<div class="paragraph">
<p>However when doing the statistical counting and listing of these values, we noticed that after the message-id <code>0xFF</code> came the <code>0x00</code>, as we would have expected, but the assumed connection-id was also incremented by one.
So it turned out that the similarity of the first byte were not due to a connection-id, but rather the counter of just having the same first byte.
After the <code>type</code> byte seems to come a <code>message-id</code> counter.
It seems as if this initialized to some random value during the connection establishment and is then incremented with every message.</p>
</div>
<div class="paragraph">
<p>However our little check did reveal, that this id is identical for a packet following a <code>0x0006</code> packet.
Therefore we are assuming the <code>0x0006</code> packets are used to sync the connection between two nodes.</p>
</div>
<div class="paragraph">
<p>The short value (or a byte value following a 0x00 byte), after this message id seems to be constant for every message sent from ine node to another, but we think to have seen one node use different values for communicating with different nodes.
Also it seems that all Controllers seem to have values from <code>0x0000</code> to <code>0x0009</code> and OSes seem to have values from <code>0x000A</code> to <code>0x000F</code>.
We are assuming this is some sort of <code>sender-id</code> value.</p>
</div>
<div class="paragraph">
<p>The following 3 bytes were quite a mystery to us.
We could see that it seems to be an ever increasing number stored here, however the increment was not linear.
However when interpreting these 3 bytes as a number and filtering only packets initiated by the controller to one particular OS, filtering out Ack messages by accident, we did notice a certain pattern.
When correlating that to the capture file, we could see that the pattern correlated to the pattern in the communication.
We were told, that the OS is configured to update the values once a second and could see that pattern in the capture.
Here was when we noticed that the numeric value in the 3 bytes followed this pattern.
In the end it turns out that if we interpret this 3 byte number where the first 14 bits encoding seconds and the following 10 bits the sub-second part, the values in the timestamp matched the timestamps of the pacet capture.</p>
</div>
<div class="paragraph">
<p>We have to admit that we were a little surprised about this.
Currently we are assuming that this timestamp is used for tracking transmission times.
4 bytes would allow quite a time-range, but would come at a cost of one additional byte per packet, that wouldn&#8217;t be needed for simple run-time checking.
Two bytes however only would allow to encode only about 63 seconds, which is probably too little when thinking about packet-losses and re-transmissions.
So in the end we assume these 3 bytes are a simple timestamp.</p>
</div>
<div class="paragraph">
<p>As mentioned before, the following byte is usually <code>0x80</code> except for the first packet every party exchanges with the other side.
In this case the value is <code>0x00</code>.
So we&#8217;ll just keep that in mind an not worry about interpreting a meaning into this.
The last two bytes of the wrapper packet header are simply constantly <code>0x8000</code> in every packet.
Here too, we&#8217;ll just live with knowing that and not wondering what it could mean.</p>
</div>
</div>
<div class="sect2">
<h3 id="decoding_the_internal_packet">Decoding the internal packet</h3>
<div class="paragraph">
<p>After having decoded what we think is the wrapper-protocol packet structure, we knew decoding the internal protocol would be a bigger challenge.</p>
</div>
<div class="paragraph">
<p>Therefore the test-setup was changed a little.</p>
</div>
<div class="paragraph">
<p>The number of controllers and operator-systems were reduced to one each.
The OS was changed to display only one single value of the controller.
A simulator was introduced, which was connected to the controller and communicated to this via ProfiNet protocol.
This simulator allowed us to provide values to the controller via the backend ProfiNet connection.</p>
</div>
<div class="paragraph">
<p>With this setup we started working on the payload decoding.</p>
</div>
<div class="paragraph">
<p>The first two bytes of the payload seem again to relate to a type as similar packets usually had identical values here.</p>
</div>
<div class="paragraph">
<p>Especially we did a lot of separate captures for all sorts of different operations.
During this we managed to reverse engineer the connection on the wrapper-protocol layer as well how the communication looks inside the internal protocol.</p>
</div>
<div class="sect3">
<h4 id="connecting">Connecting</h4>
<div class="paragraph">
<p>When an OS is booted, the following communication could be observed:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>The OS sends a wrapper-protocol type <code>0x0003</code> to the controller.</p>
</li>
<li>
<p>The controller responds with a <code>0x0004</code> message back to the controller.</p>
</li>
<li>
<p>After receiving the <code>0x0004</code> message, the OS sends a type <code>0x0002</code> message with a wrapped type <code>0x0501</code></p>
</li>
<li>
<p>This packet is acknowledged by the controller</p>
</li>
<li>
<p>The controller sends a very similar packet as the <code>0x0501</code> packet back to the OS</p>
</li>
<li>
<p>The OS acknowledges this packet back to the controller</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>From looking at the packet in WireShark&#8217;s byte and ASCII view, we could see ASCII characters that seem to relate to test strings.
A quick consultation of the guy in charge of these systems made us realize that these strings refer to parts of the controllers software.
Digging even deeper made us realize that the 4 byte values relate to unix timestamps - at least we could decode the byte values to absolutely sensible dates.
So it seems that In these packets each participant compares it&#8217;s software state in order to know all are using compatible versions.</p>
</div>
<div class="paragraph">
<p>This does make sense.
When connecting to a remote system, checking the version compatibility is surely an important thing to do.</p>
</div>
</div>
<div class="sect3">
<h4 id="normal_operation">Normal operation</h4>
<div class="paragraph">
<p>When just starting the system and not changing anything, we did notice those previously mentioned <code>0x0006</code> type messages we think are responsible for keeping the connections in sync.</p>
</div>
<div class="paragraph">
<p>What is also noticeable when just starting a System and not logging in to Windows yet, is that the DeltaV Operator System seems to connect before a user logs in to Windows.
After the connection is established, the Controller almost immediately starts sending alarm telegrams (0x03XX types), but no value telegrams (0x04xx).</p>
</div>
<div class="paragraph">
<p>It also appears that there is some upper bound for the size of packets.
As when seeing a lot of alarms being sent in 0x0304 messages, sometimes these are split up into multiple messages sent in very short intervals.
The last of these usually being smaller than the rest.</p>
</div>
</div>
<div class="sect3">
<h4 id="changing_values">Changing values</h4>
<div class="paragraph">
<p>So now we started changing values.
First we started changing valid values into other valid values.</p>
</div>
<div class="paragraph">
<p>So whenever we changed a value immediately a packet with wrapper-type <code>0x0002</code> was written from the controller to the os with a payload-type <code>0x0403</code>.</p>
</div>
<div class="paragraph">
<p>Whenever we changed values to values that raise alarms, we could see the same <code>0x0403</code> messages being sent, but the payload of these being a lot larger, but additionally packets with payload-types starting with <code>0x03</code> being sent.</p>
</div>
<div class="paragraph">
<p>So after increasing a value to a value that raises alarms, we got a payload-type <code>0x0304</code> message containing a string-like representation of the Name of the alarm raised as well as probably a name of a user or role.
Also did we notice the <code>0x0403</code> packets seem to contain String constants that relate exactly to the text displayed at the bottom of each screen window.</p>
</div>
<div class="paragraph">
<p>We assume payload-types starting with <code>0x04</code> relate to normal data exchange and ones with <code>0x03</code> relating to events and alarms.</p>
</div>
</div>
<div class="sect3">
<h4 id="decoding_value_changes">Decoding value changes</h4>
<div class="paragraph">
<p>After coming to the conclusion that wrapper-type <code>0x0002</code> messages with a payload-type <code>0x0403</code> seem to relate to value changes, we started filtering for exactly these packages.</p>
</div>
<div class="paragraph">
<p>Next we created some captures that contain all sorts of documented value changes.
So we started with the value 0 and incremented that in steps of 1 till 20 and then in steps of 10 up to 200, going back to 0 and then down to the minimum value of -40.</p>
</div>
<div class="paragraph">
<p>And we could see that there were certain parts of the packet that seem to change when changing values, most of the packet remaining constant.
When changing a value back to the original value, the packets payload was identical in this part.
As the packets didn&#8217;t all have the same content and some times blocks were added and left away in regular intervals, lead us to the impression, that these packets too consist of a fixed packet header followed by blocks.
In our case the changing blocks all had two indicator bytes <code>0x0b08</code> with the 4 bytes after that changing according to the input.
Sometimes a block of bytes was added before this block, then the entire content was just pushed back.</p>
</div>
<div class="paragraph">
<p>The last of such blocks would always be a sequence of 9 bytes: <code>0x01000000000000000000</code>, so we assume this is sort of the <code>terminating block</code> of such a packet.</p>
</div>
<div class="paragraph">
<p>One thing quite obvious is that the header does again seem to contain a counter of one byte which is incremented for every <code>0x0403</code> packet.</p>
</div>
<div class="paragraph">
<p>As we currently wanted to understand how the data is encoded we decided to ignore all other parts and concentrate on the block we identified as containing the information we were looking for.</p>
</div>
<div class="paragraph">
<p>Now how should we interpret this information. At first I was told we were working with an unsigned 16 bit integer.
Unfortunately we could see 4 bytes changing.
It took us quite a while.
As mentioned before we created a table containing the value we input and what we could see in the packet.
Unfortunately we started seeing patterns, but couldn&#8217;t quite get the meaning, however we did notice the "end" of the hex-values alternating around <code>0x00001</code> and <code>0xFFFFF</code>.
This didn&#8217;t help much, so we converted these 4 bytes into a numeric value.
And we did notice the numbers being a lot bigger for negative values, which made us think, maybe the first bit is used for the sign.
We could confirm that a value of +20 and -20 did produce a value that was equal, except for the first bit.
Unfortunately still we didn&#8217;t quite understand how to interpet the other 31 bits.
So we converted the payload to it&#8217;s binary representation and this is when we finally found out what it meant.
So in reality this value was not a 16 bit integer, but a <code>IEEE-754 Floating Point</code> value.
Here the first bit is interpreted as sign, the following an 8 bit exponent and a 23 bit mantisse.
This explained the end of the value alternating around 0.</p>
</div>
<div class="paragraph">
<p>With this we were able to write a simple console application that was able to instantly display value changes.</p>
</div>
<div class="paragraph">
<p>However when using what we learnt on the old captures we did, it turned out that unfortunately we could no longer find these <code>0x0b08</code> sequences, however we did find a lot of <code>0x08</code> followed by a 32 bit floating point value.
So it seems that <code>0x08</code> indicates the type of an <code>signed 32 bit IEE 754 floating point value</code>.
Perhaps the <code>0x0b</code> part referred to some sort subscription-id.</p>
</div>
<div class="paragraph">
<p>As we seem to be doing a subscription based communication, the OS has to tell the controller what information it is interested in.</p>
</div>
<div class="paragraph">
<p>Right now we are assuming that in one of the other <code>0x04</code> packets a subscription is initiated and assigned with some sort of id and this id is used to identify which value is actually changed.</p>
</div>
<div class="paragraph">
<p>We will hopefully be able to decode this addressing problem in one of our next reverse-engineering sessions.</p>
</div>
</div>
<div class="sect3">
<h4 id="decoding_strings">Decoding Strings</h4>
<div class="paragraph">
<p>As we mentioned before, that we could see content in the packages that were sometimes readable from just looking at the payload, we decided to have another look at these.</p>
</div>
<div class="paragraph">
<p>After some time, we did notice, that it seems that Strings have a type of <code>0x00</code> and are followed by one byte or are a short value containing the length of the string value.
This is then followed by each character being output using two bytes.
We assume this is in order to allow transferring <code>UTF-16</code> (two byte) big characters, but never found any.
Right now the first byte of every block is simply <code>0x00</code> followed by the <code>UTF-8</code> character value.</p>
</div>
</div>
</div>
</div>
</div>
</main>
<footer class="pt-4 my-md-5 pt-md-5 w-100 border-top">
<div class="row justify-content-md-center" style="font-size: 13px">
<div class="col col-6 text-center">
Copyright &#169; 2017&#x2013;2022 <a href="https://www.apache.org/">The Apache Software Foundation</a>.
All rights reserved.<br/>
Apache PLC4X, PLC4X, Apache, the Apache feather logo, and the Apache PLC4X project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries. All other marks mentioned may be trademarks or registered trademarks of their respective owners.
<br/><div style="text-align:center;">Home screen image taken from <a
href="https://flic.kr/p/chEftd">Flickr</a>, "Tesla Robot Dance" by Steve Jurvetson, licensed
under <a href="https://creativecommons.org/licenses/by/2.0/">CC BY 2.0 Generic</a>, image cropped
and blur effect added.</div>
</div>
</div>
</footer>
</div>
</div>
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="../../js/jquery.slim.min.js"></script>
<script src="../../js/popper.min.js"></script>
<script src="../../js/bootstrap.min.js"></script>
<script type="text/javascript">
$('.carousel .carousel-item').each(function(){
var next = $(this).next();
if (!next.length) {
next = $(this).siblings(':first');
}
next.children(':first-child').clone().appendTo($(this));
for (let i = 0; i < 3; i++) {
next=next.next();
if (!next.length) {
next = $(this).siblings(':first');
}
next.children(':first-child').clone().appendTo($(this));
}
});
</script>
</body>
</html>