Title: 8 - Schema
NavPrev: 7-ldap-messages.html
NavPrevText: 7 - LDAP Messages
NavUp: ../internal-design-guide.html
NavUpText: Internal Design Guide
NavNext: 9-dn.html
NavNextText: 9 - DN
Notice: Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.
http://www.apache.org/licenses/LICENSE-2.0
.
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
# 8 - Schema
## Schema parsers
We use a set of classes to parse schema elements. There are 11 flavors of schema elements, 8 of them described in an **RFC** and 3 of them specific to ApacheDS:
* [AttributeType](https://tools.ietf.org/html/rfc4512#section-4.1.2)
* [DitContentRule](https://tools.ietf.org/html/rfc4512#section-4.1.6)
* [DitStructureRule](https://tools.ietf.org/html/rfc4512#section-4.1.7.1)
* [LDAPSyntax](https://tools.ietf.org/html/rfc4512#section-4.1.5)
* [MatchingRule](https://tools.ietf.org/html/rfc4512#section-4.1.3)
* [MatchingRuleUse](https://tools.ietf.org/html/rfc4512#section-4.1.4)
* [NameForm](https://tools.ietf.org/html/rfc4512#section-4.1.7.2)
* [ObjectClass](https://tools.ietf.org/html/rfc4512#section-4.1.1)
and
* LdapComparator
* Normalizer
* SyntaxChecker
We need to be able to parse those schema elements because they can be added to the server as a description (i.e., a String representing one of those schema elements as defined by the RFC). For the same reason, the **LDAP API** needs to validate those schema elements before sending them to an **LDAP server**, and to properly parse what it gets back from an **LDAP server**.
## Strict vs quirks mode
Here we have a problem: most LDAP server implementations violate the RFC, so we can't simply expect the String representing a schema element to be compliant with it. Some typical deviations are:
* OpenLDAP uses macros instead of OIDs. This is convenient, as it allows one to define the root OID with a name and reuse it in the associated schema elements
* AD and many other servers expect some specific characters to be accepted, like '_', ':', '#', ...
* Sometimes, values come without quotes where quotes are required
* etc.
We define the _strict mode_ as a mode which follows the **RFC** tightly, and the _quirks mode_ as a relaxed, more permissive version of the parser. One can select either the strict or the relaxed mode using a flag.
### Strict mode
The only thing we relax in strict mode is the order in which the various parts appear in a schema description: we don't expect them to be ordered as described in the RFC.
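To illustrate what accepting the parts in any order means, here is a minimal sketch. It is not the parser used by the API: the class `UnorderedParts`, its methods and the reduced keyword list are hypothetical, and it assumes that no quoted name inside a parenthesized list contains a closing parenthesis. It collects each part in a map keyed by its keyword, so two descriptions that differ only in ordering produce the same result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: collects the parts of a description in whatever order they appear. */
final class UnorderedParts {
    // Keywords followed by a value; any other token is treated as a bare flag (e.g. STRUCTURAL).
    private static final List<String> VALUED = Arrays.asList("NAME", "DESC", "SUP", "MUST", "MAY");

    private UnorderedParts() {
    }

    /** Splits the body into tokens, keeping quoted strings and parenthesized lists whole. */
    static List<String> tokenize(String body) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        int n = body.length();
        while (i < n) {
            char c = body.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int end;
            if (c == '\'') {
                end = body.indexOf('\'', i + 1);   // closing quote
            } else if (c == '(') {
                end = body.indexOf(')', i + 1);    // closing parenthesis of the list
            } else {
                end = i;                           // bare word: runs until the next whitespace
                while (end + 1 < n && !Character.isWhitespace(body.charAt(end + 1))) {
                    end++;
                }
            }
            if (end < 0) {
                throw new IllegalArgumentException("Unterminated token at position " + i);
            }
            tokens.add(body.substring(i, end + 1));
            i = end + 1;
        }
        return tokens;
    }

    /** Returns keyword -> raw value; the leading OID is stored under the key "OID". */
    static Map<String, String> parse(String description) {
        String t = description.trim();
        if (t.length() < 2 || t.charAt(0) != '(' || t.charAt(t.length() - 1) != ')') {
            throw new IllegalArgumentException("A description must be enclosed in parentheses");
        }
        List<String> tokens = tokenize(t.substring(1, t.length() - 1));
        if (tokens.isEmpty()) {
            throw new IllegalArgumentException("Missing OID");
        }
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("OID", tokens.get(0));
        for (int i = 1; i < tokens.size(); i++) {
            String keyword = tokens.get(i);
            if (parts.containsKey(keyword)) {
                throw new IllegalArgumentException("Duplicate part: " + keyword);
            }
            if (VALUED.contains(keyword)) {
                if (i + 1 >= tokens.size()) {
                    throw new IllegalArgumentException("Missing value for " + keyword);
                }
                parts.put(keyword, tokens.get(++i));
            } else {
                parts.put(keyword, "");            // flag without a value
            }
        }
        return parts;
    }
}
```

With this sketch, `( 2.5.6.6 NAME 'person' SUP top STRUCTURAL MUST ( sn $ cn ) )` and the same description with its parts shuffled yield equal maps.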
The various parts are defined using a few syntaxes :
* _NAME_: qdescrs
* _DESC_: qdstring
* _SUP_ (**ObjectClass**), _MUST_, _MAY_, _APPLIES_, _AUX_, _NOT_: oids
* _SUP_ (**AttributeType**), _EQUALITY_, _ORDERING_, _SUBSTR_, _FORM_, _OC_: oid
* _SYNTAX_ (**AttributeType**): noidlen
* _SYNTAX_ (**MatchingRule**): numericoid
* _SUP_ (**DitStructureRule**): ruleids
* _descr_: oid, qdescrs
* _qdescr_: qdescrs, qdescrlist
_qdescrs_ and _oids_ may contain one or many _qdescr_ and _oid_, respectively.
#### descr, strict
The _descr_ construct is used by _oid_ and _qdescrs_ (an _OID_ can be a name). The strict mode uses this grammar:
:::text
descr ::= keystring
keystring ::= leadkeychar keychar*
leadkeychar ::= ALPHA
keychar ::= ALPHA | DIGIT | HYPHEN
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
SQUOTE ::= '\''
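The grammar above translates almost directly into a character check. The following is a minimal sketch only; the class name `StrictDescr` and its methods are hypothetical and are not taken from the API.

```java
/** Hypothetical sketch of a strict-mode descr check. */
final class StrictDescr {
    private StrictDescr() {
    }

    /** ALPHA ::= ['A'..'Z'] | ['a'..'z'] */
    static boolean isAlpha(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** DIGIT ::= ['0'..'9'] */
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Returns true when s matches: descr ::= leadkeychar keychar* */
    static boolean isValid(String s) {
        // The first character must be a leadkeychar, i.e. ALPHA.
        if (s == null || s.isEmpty() || !isAlpha(s.charAt(0))) {
            return false;
        }
        // Every following character must be a keychar: ALPHA, DIGIT or HYPHEN.
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!isAlpha(c) && !isDigit(c) && c != '-') {
                return false;
            }
        }
        return true;
    }
}
```

For instance, `StrictDescr.isValid("cn")` is true, while `"2cn"` (leading digit) and `"user_name"` (underscore) are rejected.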
#### qdstring, strict
A _qdstring_ can contain any **UTF-8** character, except the single quote and the backslash, which must be escaped (as `\27` and `\5C` respectively). It is always surrounded by single quotes:
:::text
qdstring ::= SQUOTE dstring SQUOTE
dstring ::= ( QS | QQ | QUTF8 )*
QQ ::= ESC %x32 %x37
QS ::= ESC %x35 ( %x43 | %x63 )
ESC ::= %x5C
QUTF8 ::= QUTF1 | UTFMB
QUTF1 ::= %x00-26 | %x28-5B | %x5D-7F
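As an illustration of the escaping rules, here is a minimal sketch that decodes a strict _qdstring_ into its plain value. The class `StrictQdString` is hypothetical. It works on Java characters rather than on UTF-8 bytes, which is equivalent here because the quote and the backslash are both ASCII.

```java
/** Hypothetical sketch: decodes a strict-mode qdstring into its unescaped value. */
final class StrictQdString {
    private StrictQdString() {
    }

    static String decode(String quoted) {
        int n = quoted.length();
        if (n < 2 || quoted.charAt(0) != '\'' || quoted.charAt(n - 1) != '\'') {
            throw new IllegalArgumentException("A qdstring must be enclosed in single quotes");
        }
        StringBuilder out = new StringBuilder();
        int i = 1;
        // Walk the dstring part, i.e. everything between the two surrounding quotes.
        while (i < n - 1) {
            char c = quoted.charAt(i);
            if (c == '\'') {
                throw new IllegalArgumentException("Unescaped quote at position " + i);
            }
            if (c == '\\') {
                // An escape is ESC followed by exactly two characters: "27" (QQ) or "5C"/"5c" (QS).
                if (i + 2 >= n - 1) {
                    throw new IllegalArgumentException("Truncated escape at position " + i);
                }
                String hex = quoted.substring(i + 1, i + 3);
                if (hex.equals("27")) {
                    out.append('\'');
                } else if (hex.equalsIgnoreCase("5C")) {
                    out.append('\\');
                } else {
                    throw new IllegalArgumentException("Invalid escape at position " + i);
                }
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

For instance, decoding `'It\27s'` gives `It's`, while `'it's'` is rejected because of the unescaped quote.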
#### qdescr, strict
A _qdescr_ is a quoted name: the first character must be alphabetic, and the following characters must be alphabetic, digits or hyphens. Here is the **ABNF** for _qdescr_:
:::text
qdescr ::= SQUOTE descr SQUOTE
#### noidlen, strict
### Relaxed mode
#### qdstring, relaxed
There
#### descr, relaxed
The relaxed _descr_ accepts more characters, such as underscore, semicolon, dot, colon or sharp. The leading alphabetic character is no longer mandatory either. Here is the **ABNF** we will accept:
:::text
relaxed-descr ::= relaxed-keystring
relaxed-keystring ::= relaxed-keychar+
relaxed-keychar ::= ALPHA | DIGIT | HYPHEN | UNDERSCORE | SEMICOLON | DOT | COLON | SHARP
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
UNDERSCORE ::= '_'
SEMICOLON ::= ';'
COLON ::= ':'
DOT ::= '.'
SHARP ::= '#'
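A minimal sketch of the corresponding check follows; the class `RelaxedDescr` is hypothetical. Compared to the strict rule, there is no special case for the first character and the accepted character set is larger.

```java
/** Hypothetical sketch of a relaxed-mode descr check. */
final class RelaxedDescr {
    // HYPHEN, UNDERSCORE, SEMICOLON, DOT, COLON and SHARP.
    private static final String EXTRA_CHARS = "-_;.:#";

    private RelaxedDescr() {
    }

    /** Returns true when s matches: relaxed-descr ::= relaxed-keychar+ */
    static boolean isValid(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            boolean digit = c >= '0' && c <= '9';
            if (!alpha && !digit && EXTRA_CHARS.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Names such as `user_name` or `2cn`, which the strict rule rejects, are accepted here. Since digits and dots are allowed, a numeric OID also satisfies this rule.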
#### qdescr, relaxed
Compared to the strict mode, we will accept a non-quoted String, or a String using double quotes.
:::text
relaxed-qdescr ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr
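The three accepted forms can be handled by stripping an optional pair of matching quotes before checking the name. This is a minimal sketch; the class `RelaxedQdescr` is hypothetical, and rejecting mismatched quotes is our reading of the rule above.

```java
/** Hypothetical sketch: extracts the name from a relaxed-mode qdescr. */
final class RelaxedQdescr {
    private RelaxedQdescr() {
    }

    /** Accepts 'name', "name" or a bare name, and returns the name without quotes. */
    static String unquote(String token) {
        String s = token.trim();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("Empty qdescr");
        }
        char first = s.charAt(0);
        if (first == '\'' || first == '"') {
            // The closing quote must be of the same kind as the opening one.
            if (s.length() < 2 || s.charAt(s.length() - 1) != first) {
                throw new IllegalArgumentException("Mismatched quotes in " + token);
            }
            s = s.substring(1, s.length() - 1);
        }
        // relaxed-keychar+ : letters, digits, hyphen, underscore, semicolon, dot, colon, sharp.
        if (!s.matches("[A-Za-z0-9_;.:#-]+")) {
            throw new IllegalArgumentException("Invalid relaxed descr: " + token);
        }
        return s;
    }
}
```

Here `'cn'`, `"cn"` and `cn` all yield `cn`, whereas a token opened with a single quote and closed with a double quote is rejected.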
#### oid, relaxed
In relaxed mode, we will accept single-quoted and double-quoted OIDs and names. Here is the supported **ABNF**:
:::text
relaxed-oid ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr |
SQUOTE numericoid SQUOTE | DQUOTE numericoid DQUOTE | numericoid
#### noidlen, relaxed
Here, we allow a textual syntax name to be used, not only OIDs. For instance, something like _SYNTAX IA5String_ is allowed.
We also allow single-quoted and double-quoted OIDs.
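The following minimal sketch shows one way to read such a value. The class `RelaxedNoidlen` is hypothetical, and it assumes that, when quotes are present, they surround the whole value including the optional `{len}` suffix.

```java
/** Hypothetical sketch: parses a relaxed-mode noidlen such as IA5String or '1.2.3{64}'. */
final class RelaxedNoidlen {
    final String oidOrName;
    final int length;          // -1 when no {len} suffix is present

    private RelaxedNoidlen(String oidOrName, int length) {
        this.oidOrName = oidOrName;
        this.length = length;
    }

    static RelaxedNoidlen parse(String raw) {
        String s = raw.trim();
        // Strip an optional pair of matching single or double quotes.
        if (s.length() >= 2) {
            char first = s.charAt(0);
            if ((first == '\'' || first == '"') && s.charAt(s.length() - 1) == first) {
                s = s.substring(1, s.length() - 1);
            }
        }
        int length = -1;
        int brace = s.indexOf('{');
        if (brace >= 0) {
            String len = s.endsWith("}") ? s.substring(brace + 1, s.length() - 1) : "";
            if (!len.matches("[0-9]+")) {
                throw new IllegalArgumentException("Invalid length in " + raw);
            }
            length = Integer.parseInt(len);
            s = s.substring(0, brace);
        }
        if (s.isEmpty()) {
            throw new IllegalArgumentException("Missing OID or name in " + raw);
        }
        return new RelaxedNoidlen(s, length);
    }

    /** True for a dotted numeric OID; false for a textual name such as IA5String. */
    boolean isNumericOid() {
        return oidOrName.matches("(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*))+");
    }
}
```

With this sketch, `1.3.6.1.4.1.1466.115.121.1.15{64}` gives a numeric OID with a length of 64, and `IA5String` gives a name with no length. Resolving the name to the actual syntax is left to the schema lookup and is not shown.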