| Title: 4 - ASN/1 |
| NavPrev: 3-building.html |
| NavPrevText: 3 - Building |
| NavUp: ../internal-design-guide.html |
| NavUpText: Internal Design Guide |
| NavNext: 4.1-asn1-tlv.html |
| NavNextText: 4.1 - ASN/1 TLV |
| Notice: Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| . |
| http://www.apache.org/licenses/LICENSE-2.0 |
| . |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| # 4 - ASN/1 |
| |
| To be completed... |
| |
| |
| The **LDAP** protocol is based on an **ASN/1** description. We will not explain in detail what **ASN/1** is about: see [this page](https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One) for a very limited introduction or, if you feel the need to understand **ASN/1** in detail, read [Olivier Dubuisson's book on ASN.1](http://www.oss.com/asn1/resources/books-whitepapers-pubs/dubuisson-asn1-book.PDF) (This is probably the best reference !) |
| |
| Anyway, we only use a subset of **ASN/1**, as what we have to deal with is the **BER/DER** encoding (**BER** and **DER** stand for **B**asic **E**ncoding **R**ules and **D**istinguished **E**ncoding **R**ules). There are other possible encodings, like **PER**, **XER** or **CER**, but they are irrelevant for **LDAP**. |
| |
| What you need to know is that **ASN/1** is just a notation used to describe the messages exchanged between a client and a server. In order to use it, we need an encoder and a decoder on both sides : |
| |
| ![Client/Server communication](images/asn1-codec.png) |
| |
| ## ASN/1 implementation in Apache LDAP API |
| |
| It took a long time to get it right ! And it's not perfect :-) |
| |
| The very first iteration used a proprietary library (**IBM SNACC**), but that was before **ApacheDS** became a **TLP** ! The next iteration was based on a rewriting system, which was pretty slow. Then came **Snicker**, a _State Machine_ based decoder, which is what we currently use. We might switch to a faster implementation, like the one **Kerby** is using... |
| |
| ### ASN/1 messages |
| |
| Let's start with the basic information. |
| |
| An encoded ASN/1 message is a tuple containing two or three elements : a **T**ype, a **L**ength and optionally (i.e. if the length is not 0) a **V**alue. This tuple is called a **TLV**. Every message is a **TLV**. |
| |
| But a message can have a complex structure, so a **TLV** can itself encapsulate other **TLV**s. Actually, the **V** part can be a list of **TLV**s. This is recursive... |
| |
| A typical encoded message can therefore be represented this way : |
| |
| ::: |
| [TL [TLV] [TL [TLV] [TLV]]] |
| |
| Here, the message **TLV** value is a set of two **TLV**s, the second one being itself a composition of 2 **TLV**s. |
| |
| The **T** describes the type of the value, the **L** gives the length of this value (it can be 0) and of course the **V** is the value, which can itself be a **TLV**. |
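
To make the nesting concrete, here is a minimal sketch in Java of a recursive **TLV** reader. It is not the decoder used by the API: the `TlvSketch` class and its `describe` method are hypothetical, and the sketch assumes single-byte tags and single-byte (short form) lengths. It relies on the BER convention that bit `0x20` of the tag marks a constructed value, i.e. a **V** made of nested **TLV**s.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: not a class from the Apache LDAP API.
class TlvSketch {
    /** Renders the TLV at the buffer's current position, recursing into constructed values. */
    static String describe(ByteBuffer buffer) {
        int tag = buffer.get() & 0xFF;
        int length = buffer.get() & 0xFF;

        if (length > 0x7F) {
            throw new IllegalArgumentException("long-form lengths are not handled in this sketch");
        }
        if (length > buffer.remaining()) {
            throw new IllegalArgumentException("truncated value");
        }

        StringBuilder sb = new StringBuilder();
        sb.append(String.format("[T=%02X L=%d", tag, length));

        // Build a view restricted to the V part, then skip over it in the parent buffer
        ByteBuffer value = buffer.slice();
        value.limit(length);
        buffer.position(buffer.position() + length);

        if ((tag & 0x20) != 0) {
            // Constructed: the V part is a list of TLVs
            while (value.hasRemaining()) {
                sb.append(' ').append(describe(value));
            }
        } else if (length > 0) {
            // Primitive: dump the V bytes in hexadecimal
            sb.append(" V=");
            while (value.hasRemaining()) {
                sb.append(String.format("%02X", value.get() & 0xFF));
            }
        }

        return sb.append(']').toString();
    }
}
```

For instance, calling `TlvSketch.describe` on the bytes `30 06 02 01 01 04 01 41` (a SEQUENCE holding an INTEGER and an OCTET STRING) shows one outer **TLV** whose value contains two inner **TLV**s.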
| |
| ### Encoder/Decoder |
| |
| There are two aspects we have to deal with : |
| |
| * encoding messages |
| * decoding messages |
| |
| Those are two different things, and we don't use the same mechanism for both. **Decoding** is done using a _State Machine_, while **Encoding** is hard wired in each class implementing a message. |
| |
| As we said, it's not perfect : it's complex to implement, complex to extend with a new message, and complex to test. We don't have a compiler that generates the stubs to encode or decode messages. |
| |
| ### Decoder |
| |
| The _Decoder_'s job is to take a **byte[]** and transform it into an instance of a Java object. When we receive the **byte[]**, we don't know yet what kind of message we are dealing with, so the creation of the instance is deferred. |
| |
| We have built a generic decoder that takes some inputs and produces the result, based on those elements : |
| |
| * A _Grammar_ |
| * A _Container_ |
| * A _StateEnum_ |
| * A _Decorator_ |
| * and optionally a _Factory_ |
| |
| The _Grammar_ describes the transitions and actions of the state machine used to decode a message. Note that the actions can be stored in separate classes. |
| |
| The _Container_ is a wrapper around a message that is fed by the State Machine and that will contain the Java instance once fully decoded. It's initially empty. |
| |
| The _StateEnum_ is a Java enumeration listing all the possible _Grammar_ states. |
| |
| The _Decorator_ is a wrapper used to store a decoded message. |
| |
| The _Factory_ is used to create the message instance (it's optional) |
| |
| And of course, you have the message class that will be created and stored in the _Decorator_. |
| |
| So what we have is based on a **State Engine**, which means you have to describe |
| |
| |
| ### Encoder |
| |
| It's slightly simpler : we use the _Decorator_ to implement the encoding of a message. Two methods are necessary : |
| |
| * _int computeLength()_ : computes the _ByteBuffer_ size necessary to store the encoded message |
| * _ByteBuffer encode( ByteBuffer )_ : actually encodes the message into a _ByteBuffer_ |
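
The point of having two methods is that the length is computed first, so that the buffer can be allocated once with the right size before anything is written. Here is a minimal sketch of that idea for a message made of a single OCTET STRING. Only the two method names come from the description above: the `OctetStringSketch` class is hypothetical, it assumes a value shorter than 128 bytes (short form length), and the real decorators are more involved.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: real decorators in the API are more involved.
class OctetStringSketch {
    private static final byte OCTET_STRING_TAG = 0x04;

    private final byte[] value;

    OctetStringSketch(String text) {
        this.value = text.getBytes(StandardCharsets.UTF_8);

        if (value.length > 0x7F) {
            throw new IllegalArgumentException("long-form lengths are not handled in this sketch");
        }
    }

    /** First pass: 1 byte for the T, 1 byte for the L, then the V bytes. */
    int computeLength() {
        return 1 + 1 + value.length;
    }

    /** Second pass: write the T, the L and the V into the provided buffer. */
    ByteBuffer encode(ByteBuffer buffer) {
        buffer.put(OCTET_STRING_TAG);
        buffer.put((byte) value.length);
        buffer.put(value);

        return buffer;
    }
}
```

A caller would typically allocate the buffer from the computed length, as in `message.encode( ByteBuffer.allocate( message.computeLength() ) )`.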
| |
| ### The state machine |
| |
| So we decode a message using a state machine, which basically moves from one state to another, and optionally executes an action in between : |
| |
| ![State Machine transition](images/sm-transition.png) |
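
Before that, the principle can be sketched in a few lines of Java. This is a simplified illustration, not the API's grammar classes: all the names below (`StateMachineSketch`, `State`, `Container`, `Transition`, `GRAMMAR`, `step`) are hypothetical. It assumes that a transition is selected by the current state and the tag just read, and that the optional action runs before the state changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only: the names below are not the API's classes.
class StateMachineSketch {
    /** Plays the role of the state enumeration. */
    enum State { START, SEQUENCE_SEEN, ID_SEEN }

    /** Plays the role of the container: it holds the current state and what the actions produce. */
    static class Container {
        State state = State.START;
        final List<String> log = new ArrayList<>();
    }

    /** A transition: the state to move to, and an optional action (may be null). */
    static class Transition {
        final State next;
        final Consumer<Container> action;

        Transition(State next, Consumer<Container> action) {
            this.next = next;
            this.action = action;
        }
    }

    // Plays the role of the grammar: for each (current state, tag) pair, the transition to follow
    static final Map<String, Transition> GRAMMAR = new HashMap<>();

    static String key(State state, int tag) {
        return state + "/" + tag;
    }

    static {
        // A SEQUENCE tag (0x30) is expected first, with no action attached
        GRAMMAR.put(key(State.START, 0x30), new Transition(State.SEQUENCE_SEEN, null));
        // Then an INTEGER tag (0x02), whose action records that the id was read
        GRAMMAR.put(key(State.SEQUENCE_SEEN, 0x02),
            new Transition(State.ID_SEEN, container -> container.log.add("id read")));
    }

    /** Follows one transition: runs the action if there is one, then changes state. */
    static void step(Container container, int tag) {
        Transition transition = GRAMMAR.get(key(container.state, tag));

        if (transition == null) {
            throw new IllegalStateException("no transition from " + container.state + " on tag " + tag);
        }
        if (transition.action != null) {
            transition.action.accept(container);
        }

        container.state = transition.next;
    }
}
```

Feeding the tags `0x30` then `0x02` to `step` moves the container from `START` to `ID_SEEN`, while any tag the grammar does not expect in the current state is rejected.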
| |
| Now, let's see a real example. |