//
// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
:imagesdir: ../../../images/
== The MSpec format
The `MSpec` format (Message Specification) was the result of a brainstorming session after evaluating a lot of other options.
We simply sat down and started to write some imaginary format (`imaginary` was even the initial name we used), created parsers for it afterwards, and fine-tuned both the spec and the parsers while implementing the first protocols and language templates.
It's a text-based format.
=== Messages
At the root level of these specs is a set of `type` and `discriminatedType` blocks.
Additionally `enum` and `dataIo` elements can be specified.
`type` elements are objects that are independent of the input.
You can think of these as message structures which do not have variable parts.
In the case of protocols these are usually "envelopes", which allow specifying some sort of header and data.
An example would be the `TPKTPacket` of the S7 format:
....
[type 'TPKTPacket'
[const uint 8 'protocolId' '0x03']
[reserved uint 8 '0x00']
[implicit uint 16 'len' 'payload.lengthInBytes + 4']
[field COTPPacket 'payload']
]
....
In the above example we have a `TPKT` packet which starts with a fixed value (usually a magic byte marking the start of the frame or packet),
followed by another marker byte, a length field and the payload.
Most messages are encoded as a sequence of Type-Length-Value (TLV) elements; the above is one example of it.
The reason for constructing messages in such a way is historical as well as practical.
The receiver of such a message can decide, based on the type, whether a received message is of interest, and skip it if not.
The author of the code also gets the chance to allocate enough memory for the data to come, and can check whether a value matches a predefined contract.
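To make the envelope idea concrete, here is a minimal Python sketch of how a parser for the `TPKTPacket` type above might behave. It is not the code generated from the spec; the function name `parse_tpkt`, the big-endian byte order and the use of `print` for the warning are assumptions made for illustration only.

```python
import struct

def parse_tpkt(data: bytes) -> bytes:
    """Parse a TPKT frame and return the raw payload bytes (illustrative only)."""
    # Header layout assumed here: uint 8, uint 8, uint 16, big-endian.
    protocol_id, reserved, length = struct.unpack_from(">BBH", data, 0)
    # [const uint 8 'protocolId' '0x03']: a mismatch invalidates the message.
    if protocol_id != 0x03:
        raise ValueError(f"unexpected protocolId 0x{protocol_id:02x}")
    # [reserved uint 8 '0x00']: a mismatch is only reported, not fatal.
    if reserved != 0x00:
        print(f"warning: reserved byte is 0x{reserved:02x}")
    # [implicit uint 16 'len' 'payload.lengthInBytes + 4']: total frame length.
    if length < 4 or length > len(data):
        raise ValueError("length field does not fit the received data")
    return data[4:length]

frame = bytes([0x03, 0x00, 0x00, 0x07]) + b"abc"
payload = parse_tpkt(frame)
```

Because the length is known up front, the receiver can slice out exactly the payload it needs, or skip the whole frame, without interpreting the payload itself.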
What happens when we have a variable kind of value or different fields depending on the type?
That's where the `discriminatedType` comes into play.
A `discriminatedType` is a message whose content depends on the input.
Every discriminated type can contain at most one `discriminator` field and exactly one `typeSwitch` element.
For example part of the spec for the S7 format looks like this:
....
[discriminatedType 'S7Message'
[const uint 8 'protocolId' '0x32']
[discriminator uint 8 'messageType']
[reserved uint 16 '0x0000']
[field uint 16 'tpduReference']
[implicit uint 16 'parameterLength' 'parameter.lengthInBytes']
[implicit uint 16 'payloadLength' 'payload.lengthInBytes']
[typeSwitch 'messageType'
['0x01' S7MessageRequest
]
['0x03' S7MessageResponse
[field uint 8 'errorClass']
[field uint 8 'errorCode']
]
['0x07' S7MessageUserData
]
]
[field S7Parameter 'parameter' ['messageType']]
[field S7Payload 'payload' ['messageType', 'parameter']]
]
....
As in the above example of the `TPKT` packet, we have a bunch of fields plus a `typeSwitch`.
The `typeSwitch` is located at the place where additional fields from other message variants might come.
Types start with an opening square bracket `[` and end with a closing one `]`.
Each variant of a message is constructed from two values, the value of the discriminator and the variant's name, followed by 0 or more field declarations.
=== Field types
`type` and `discriminatedType` define the kinds of messages a given protocol ships.
You probably noticed that these top-level elements contain children.
These are the fields that declare parts of the message.
The top-level elements and top-level packets of a protocol form, as said above, an envelope.
We use it when we communicate, but what matters for the receiver is the payload, the data which is passed.
Fields are sections of payload.
Each field declaration consists of a field type, its data type, and a name.
The `MSpec` format defines several kinds of fields to describe the data that comes with a message.
- const: a field that always expects a given fixed value; breaking this contract causes the message to be marked as invalid and dropped.
- reserved: like `const`, a field that expects a given value, but it only warns if the payload does not match.
- field: a value of a given type, which may be a simple or a complex typed object; the kinds of types are explained below.
- array: a sequence of elements of a given type.
- optional: a field which is only present under some conditions.
- implicit: a field which is required for parsing or writing, but whose value is defined through other parts of the message; a common use is the length of a "data" section placed before the actual data.
- discriminator: a special kind of field whose value is later used to determine the concrete type of a message.
When a `discriminator` field is defined you will have to provide a `typeSwitch` field.
`MSpec` permits at most one discriminator per type.
The `discriminator` field can only be declared in a `discriminatedType`.
- typeSwitch: a field that is a placeholder for sections of data dependent on sub-types, which are declared inline.
A type switch may be defined only under a `discriminatedType`.
The full syntax and explanations of these types follow in the below chapters.
Another thing we have to explain is how types are specified.
You probably noticed the basic principle for definitions, which looks like this:
```
[<field-type> <data-type> <name>]
```
In general, we distinguish between two kinds of data types used in field declarations:
- simple types, also referred to as simple data types
- complex types, also called complex data types
==== Simple Types
Simple data types are ones that can be read and interpreted directly using standard notations.
Usually, raw data which comes in a message can be described using the format:
{base-type} {size}
The base types available in `MSpec` are currently:
- *bit*: Simple boolean value.
- *uint*: The input is treated as unsigned integer value.
- *int*: The input is treated as signed integer value.
- *float*: The input is treated as signed floating point number.
- *ufloat*: The input is treated as unsigned floating point number.
- *string*: The input is treated as string.
- *time*: The input is treated as time.
- *date*: The input is treated as date.
- *datetime*: The input is treated as date time.
The size value then provides how many `bits` should be read.
In case of `string` types, it refers to the number of characters.
At the physical level, everything can be seen as a sequence of bits forming bytes.
When we move to the message level we want to define payloads to start working with necessary logic.
To keep the implementation of that logic relatively straightforward, we turn the underlying bytes into the values they represent.
Thanks to that, MSpec can give a programmer who starts to implement a protocol standard types representing numbers, strings and dates instead of byte- and bit-level operations.
Still, a programmer who wants to implement custom logic for a given part of a message can, for example, read a signed byte by stating `int 8` in a data type definition.
If a protocol declares that a given part of a message is a positive (unsigned) number coded on two bytes, then it can be declared as `uint 16`.
Note that `MSpec` does not use the notation of `short`, `long` and such. This is intentional. We want to avoid the confusion caused by the different handling of such types across programming languages.
Instead, all simple data types are defined using type name and its length.
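As an illustration of how `{base-type} {size}` declarations translate into reads, the following Python sketch pulls values of arbitrary bit width from a byte string. The `BitReader` class and its method names are hypothetical, and big-endian bit order with two's complement signed integers is an assumption of this sketch, not something the text above prescribes.

```python
class BitReader:
    """Reads big-endian values of arbitrary bit width from a byte string."""

    def __init__(self, data: bytes):
        self.bits = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read_uint(self, size: int) -> int:
        # `uint {size}`: take the next `size` bits as an unsigned number.
        if size > self.remaining:
            raise EOFError("not enough bits left")
        self.remaining -= size
        return (self.bits >> self.remaining) & ((1 << size) - 1)

    def read_int(self, size: int) -> int:
        # `int {size}`: same bits, interpreted as two's complement.
        value = self.read_uint(size)
        if value >= 1 << (size - 1):
            value -= 1 << size
        return value

    def read_bit(self) -> bool:
        # `bit`: a single boolean flag.
        return self.read_uint(1) == 1

reader = BitReader(bytes([0xFF, 0x01, 0x02, 0x80]))
signed_byte = reader.read_int(8)    # int 8   -> -1
two_bytes = reader.read_uint(16)    # uint 16 -> 258 (0x0102)
flag = reader.read_bit()            # bit     -> True
```

Naming the width explicitly, as in `read_uint(16)`, mirrors the `uint 16` notation and avoids any dependence on what `short` or `long` means in a particular language.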
==== Complex Types
Because simple data types are limited to a very narrow set, and messages can contain complex data, `MSpec` allows a programmer to define custom types.
In contrast to simple data types, complex data types can be "constructed", meaning that they can join multiple fields of simple type, mix simple and complex data types or use just complex data types.
These types can also accept additional parameters to group common parts of different messages used across the protocol.
Complex data types are parsed and serialized using the same logic as other messages.
Complex data types are `type` and `discriminatedType` defined at the root of `MSpec`.
Be aware that the primary difference in the declaration of fields that refer to complex data types is the lack of a type length, as the size of a combined type might be flexible and may only be determined at runtime when data arrives.
In the `S7Message` example above, the `S7Parameter` type is defined in another part of the spec.
==== Field Types and their Syntax
===== array Field
An `array` field is exactly what you expect.
It generates a field which is not a single-value element but an array or list of elements.
[array {simple-type} {size} '{name}' {'count', 'length', 'terminated'} '{expression}']
[array {complex-type} '{name}' {'count', 'length', 'terminated'} '{expression}']
Array types can be both simple and complex typed and have a name.
An array field must specify the way its length is determined as well as an expression defining its length.
Possible values are:
- `count`: Exactly as many elements are parsed as the `expression` specifies.
- `length`: A given number of bytes is read. So if an element has been parsed and there are still bytes left, another element is parsed.
- `terminated`: The parser continues reading elements until it encounters a termination sequence.
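The three loop types can be sketched in Python as follows. This is a simplified illustration, not the generated parser: `parse_array`, `read_uint8` and the `(value, new_offset)` calling convention are hypothetical, and for `terminated` the expression is reduced to a single terminating byte value that is consumed once it is found.

```python
from typing import Callable, List, Tuple

# Hypothetical element parser: takes (data, offset), returns (value, new_offset).
ElementParser = Callable[[bytes, int], Tuple[int, int]]

def read_uint8(data: bytes, pos: int) -> Tuple[int, int]:
    return data[pos], pos + 1

def parse_array(data: bytes, pos: int, parse_element: ElementParser,
                loop_type: str, expression: int) -> Tuple[List[int], int]:
    items: List[int] = []
    if loop_type == "count":
        # Parse exactly `expression` elements.
        for _ in range(expression):
            value, pos = parse_element(data, pos)
            items.append(value)
    elif loop_type == "length":
        # Keep parsing elements until `expression` bytes have been consumed.
        end = pos + expression
        while pos < end:
            value, pos = parse_element(data, pos)
            items.append(value)
    elif loop_type == "terminated":
        # Keep parsing until the (assumed single-byte) terminator shows up.
        while data[pos] != expression:
            value, pos = parse_element(data, pos)
            items.append(value)
        pos += 1  # assumption: the terminator itself is consumed
    else:
        raise ValueError(f"unknown loop type {loop_type!r}")
    return items, pos

data = bytes([1, 2, 3, 0, 9])
by_count, _ = parse_array(data, 0, read_uint8, "count", 2)
by_length, _ = parse_array(data, 0, read_uint8, "length", 3)
by_terminator, _ = parse_array(data, 0, read_uint8, "terminated", 0)
```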
===== checksum Field
A checksum field can only operate on simple types.
[checksum {simple-type} {size} '{name}' '{checksum-expression}']
When parsing, the given simple type is parsed and the result is then compared to the value the `checksum-expression` provides.
If they don't match an exception is thrown.
When serializing, the `checksum-expression` is evaluated and the result is then output.
This field doesn't keep any data in memory.
See also:
- implicit field: A checksum field is similar to an implicit field, however the `checksum-expression` is evaluated at parsing time and an exception is thrown if the values don't match.
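The parse and serialize behaviour of a checksum field can be sketched like this in Python. The XOR checksum, the trailing one-byte position of the checksum and the function names are all assumptions chosen to keep the example small; a real protocol defines its own `checksum-expression`.

```python
from functools import reduce

def xor_checksum(body: bytes) -> int:
    """Hypothetical checksum-expression: XOR of all body bytes."""
    return reduce(lambda acc, b: acc ^ b, body, 0)

def serialize(body: bytes) -> bytes:
    # When serializing, the expression is evaluated and its result is written.
    return body + bytes([xor_checksum(body)])

def parse(message: bytes) -> bytes:
    if not message:
        raise ValueError("message too short")
    body, received = message[:-1], message[-1]
    # When parsing, the value read is compared with the evaluated expression.
    if received != xor_checksum(body):
        raise ValueError("checksum mismatch")
    # Only the body is returned; the checksum is not kept in the model.
    return body

wire = serialize(bytes([0x01, 0x02, 0x04]))
body = parse(wire)
```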
===== const Field
A const field simply reads a given simple type and compares to a given reference value.
[const {simple-type} {size} '{name}' '{reference}']
When parsing, it makes the parser throw an exception if the parsed value does not match.
When serializing, it simply outputs the expected constant.
This field doesn't keep any data in memory.
See also:
- implicit field: A const field is similar to an implicit field, however it compares the parsed input to the reference value and throws an exception if the values don't match.
===== discriminator Field
Discriminator fields are only used in `discriminatedType`s.
[discriminator {simple-type} {size} '{name}']
When parsing, a discriminator field just results in a locally available variable.
When serializing, it accesses the discriminated type's constants and uses these as output.
See also:
- implicit field: A discriminator field is similar to an implicit field, however it doesn't provide a serialization expression as it uses the discrimination constants of the type it belongs to.
- discriminated types
===== implicit Field
Implicit fields are fields that get their value implicitly from the data they contain.
[implicit {simple-type} {size} '{name}' '{serialization-expression}']
When parsing, an implicit field is available as a local variable and can be used by other expressions.
When serializing, the `serialization-expression` is evaluated and the resulting value is output.
This type of field is generally used for fields that handle numbers of elements or length values as these can be implicitly calculated at serialization time.
This field doesn't keep any data in memory.
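A small Python sketch of the idea: the length is written during serialization and used during parsing, but never stored on the model. The `Frame` class, its two-byte big-endian length prefix and the method names are hypothetical and only serve to illustrate the behaviour described above.

```python
import struct
from dataclasses import dataclass

@dataclass
class Frame:
    # Only the payload is stored; there is no `length` attribute on the model.
    payload: bytes

    def serialize(self) -> bytes:
        # Serialization expression, evaluated at write time: len(payload).
        return struct.pack(">H", len(self.payload)) + self.payload

    @classmethod
    def parse(cls, data: bytes) -> "Frame":
        # While parsing, the implicit value lives in a local variable only.
        (length,) = struct.unpack_from(">H", data, 0)
        return cls(payload=data[2:2 + length])

wire = Frame(b"hello").serialize()
frame = Frame.parse(wire)
```

Since the length is always derived from the payload, a caller cannot construct a `Frame` whose length field disagrees with its contents.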
===== manualArray Field
[manualArray {simple-type} {size} '{name}' {'count', 'length', 'terminated'} '{loop-expression}' '{serialization-expression}' '{deserialization-expression}' '{length-expression}']
[manualArray {complex-type} '{name}' {'count', 'length', 'terminated'} '{loop-expression}' '{serialization-expression}' '{deserialization-expression}' '{length-expression}']
===== manual Field
[manual {simple-type} {size} '{name}' '{serialization-expression}' '{deserialization-expression}' '{length-expression}']
[manual {complex-type} '{name}' '{serialization-expression}' '{deserialization-expression}' '{length-expression}']
===== optional Field
An optional field is a type of field that can also be `null`.
[optional {simple-type} {size} '{name}' '{optional-expression}']
[optional {complex-type} '{name}' '{optional-expression}']
When parsing, the `optional-expression` is evaluated. If it results in `false`, nothing is read and the field stays `null`; if it evaluates to `true`, the field is parsed like a `simple` field.
When serializing, if the field is `null` nothing is output, if it is not `null` it is serialized normally.
See also:
- simple field: In general `optional` fields are identical to `simple` fields except for the ability to be `null` or be skipped.
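The following Python sketch shows one way to picture an optional `uint 8` field whose presence depends on a flag read earlier. The function names and the flag-bit convention are assumptions for illustration; in a real spec the condition is whatever the `optional-expression` states.

```python
from typing import Optional, Tuple

def parse_optional_uint8(data: bytes, pos: int,
                         condition: bool) -> Tuple[Optional[int], int]:
    # The optional-expression is evaluated first; if false, nothing is read.
    if not condition:
        return None, pos
    return data[pos], pos + 1

def serialize_optional_uint8(value: Optional[int]) -> bytes:
    # None produces no output; anything else is written like a simple field.
    return b"" if value is None else bytes([value])

# Hypothetical message: one flags byte, then an optional value if bit 0 is set.
message = bytes([0x01, 0x2A])
flags = message[0]
value, end = parse_optional_uint8(message, 1, condition=bool(flags & 0x01))
```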
===== padding Field
A padding field outputs additional padding data if an expression evaluates to `true`.
[padding {simple-type} {size} '{padding-value}' '{padding-expression}']
When parsing, a `padding` field is just consumed without being made available as a property or local variable if the `padding-expression` evaluates to `true`.
If it doesn't, it is just skipped.
This field doesn't keep any data in memory.
===== reserved Field
Reserved fields are very similar to `const` fields, however they don't throw exceptions, but instead log messages if the values don't match.
The reason for this is that in general reserved fields have the given value until they start to be used.
If the field starts to be used this shouldn't break existing applications, but it should raise a flag as it might make sense to update the drivers.
[reserved {simple-type} {size} '{name}' '{reference}']
When parsing, the value is parsed and the result is compared to the reference value.
If the values don't match, a log message is sent.
This field doesn't keep any data in memory.
See also:
- `const` field
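The difference between the two checks can be sketched in Python as follows. The helper names, the logger name and the boolean return value of `check_reserved` are assumptions of this sketch; the only behaviour taken from the text is that a `const` mismatch raises while a `reserved` mismatch is merely logged.

```python
import logging

logger = logging.getLogger("mspec.sketch")  # hypothetical logger name

def check_const(actual: int, expected: int) -> None:
    # const: a mismatch is a hard error and aborts parsing.
    if actual != expected:
        raise ValueError(f"expected constant {expected:#x}, got {actual:#x}")

def check_reserved(actual: int, expected: int) -> bool:
    # reserved: a mismatch is only logged, parsing continues.
    if actual != expected:
        logger.warning("reserved field: expected %#x, got %#x", expected, actual)
        return False
    return True

check_const(0x32, 0x32)             # passes silently
still_unused = check_reserved(0x0001, 0x0000)  # logs a warning, returns False
```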
===== simple Field
Simple fields are the most common types of fields.
A `simple` field is directly mapped to a normally typed field.
[simple {simple-type} {size} '{name}']
[simple {complex-type} '{name}']
When parsing, the given type is parsed (can't be `null`) and saved in the corresponding model instance's property field.
When serializing it is serialized normally.
===== virtual Field
Virtual fields have no impact on the input or output.
They simply result in creating artificial get-methods in the generated model classes.
[virtual {simple-type} {size} '{name}' '{value-expression}']
[virtual {complex-type} '{name}' '{value-expression}']
Instead of being bound to a property, the return value of a `virtual` property is created by evaluating the `value-expression`.
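In Python terms, a virtual field resembles a read-only property computed from other fields. The `Temperature` class below, its `raw` field and the scaling by ten are invented for illustration; only the idea that the getter evaluates an expression and never touches the wire format comes from the description above.

```python
from dataclasses import dataclass

@dataclass
class Temperature:
    raw: int  # hypothetical simple field: tenths of a degree Celsius

    @property
    def celsius(self) -> float:
        # Value expression: evaluated on each access, never parsed or serialized.
        return self.raw / 10.0

    def serialize(self) -> bytes:
        # Only the real field ends up in the output.
        return self.raw.to_bytes(2, "big")

reading = Temperature(raw=215)
degrees = reading.celsius
wire = reading.serialize()
```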
===== typeSwitch Field
These types of fields can only occur in discriminated types.
A `discriminatedType` must contain *exactly one* `typeSwitch` field, as it defines the sub-types.
[typeSwitch '{argument-1}', '{argument-2}', ...
['{argument-1-value-1}' {subtype-1-name}
... Fields ...
]
['{argument-1-value-2}', '{argument-2-value-1}' {subtype-2-name}
... Fields ...
]
['{argument-1-value-3}', '{argument-2-value-2}' {subtype-3-name} [uint 8 'existing-attribute-1', uint 16 'existing-attribute-2']
... Fields ...
]
]
A type switch element must contain a list of at least one argument expression.
Only the last option can stay empty, which results in a default type.
Each sub-type declares a comma-separated list of concrete values.
It must contain at most as many elements as there are arguments declared for the type switch.
The matching type is found during parsing by starting with the first argument.
If it matches and there are no more values, the type is found; if more values are provided, they are compared to the other argument values.
If no type is found, an exception is thrown.
Each sub-type can declare fields using a subset of the field types (`discriminator` and `typeSwitch` can't be used here).
The third case in the above code snippet also passes named attributes to the sub-type.
Each name must be identical to an argument or named field parsed before the `typeSwitch`.
These arguments are then available for expressions or for passing on in the sub-types.
See also:
- `discriminatedType`
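The matching rules described above can be sketched in Python like this. `select_subtype` is a hypothetical helper, not part of any generated code, and checking the cases in declaration order with the first match winning is an assumption of the sketch. The case table reuses the values from the `S7Payload` type switch shown in the Parameters section.

```python
from typing import List, Sequence, Tuple

# Each case pairs a list of discriminator values with a sub-type name.
Case = Tuple[Sequence[int], str]

def select_subtype(arguments: Sequence[int], cases: List[Case]) -> str:
    for values, name in cases:
        # A case may list fewer values than there are arguments; only the
        # listed ones are compared, starting with the first argument.
        # An empty list matches everything and acts as the default case.
        if all(arg == val for arg, val in zip(arguments, values)):
            return name
    raise ValueError(f"no sub-type matches {list(arguments)}")

S7_PAYLOAD_CASES: List[Case] = [
    ([0xF0], "S7PayloadSetupCommunication"),
    ([0x04, 0x01], "S7PayloadReadVarRequest"),
    ([0x04, 0x03], "S7PayloadReadVarResponse"),
    ([0x05, 0x01], "S7PayloadWriteVarRequest"),
    ([0x05, 0x03], "S7PayloadWriteVarResponse"),
    ([0x00, 0x07], "S7PayloadUserData"),
]

subtype = select_subtype([0x04, 0x03], S7_PAYLOAD_CASES)
```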
===== Parameters
Sometimes it is necessary to pass along additional parameters.
If a complex type requires parameters, these are declared in the header of that type.
....
[discriminatedType 'S7Payload' [uint 8 'messageType', S7Parameter 'parameter']
[typeSwitch 'parameter.discriminatorValues[0]', 'messageType'
['0xF0' S7PayloadSetupCommunication]
['0x04','0x01' S7PayloadReadVarRequest]
['0x04','0x03' S7PayloadReadVarResponse
[arrayField S7VarPayloadDataItem 'items' count 'CAST(parameter, S7ParameterReadVarResponse).numItems']
]
['0x05','0x01' S7PayloadWriteVarRequest
[arrayField S7VarPayloadDataItem 'items' count 'COUNT(CAST(parameter, S7ParameterWriteVarRequest).items)']
]
['0x05','0x03' S7PayloadWriteVarResponse
[arrayField S7VarPayloadStatusItem 'items' count 'CAST(parameter, S7ParameterWriteVarResponse).numItems']
]
['0x00','0x07' S7PayloadUserData
]
]
]
....
Wherever a complex type is referenced, an additional list of parameters can therefore be passed on to that type.
An example of this, taken from the `S7Message` snippet above:
[field S7Payload 'payload' ['messageType', 'parameter']]