We use a set of classes to parse schema elements. There are 11 kinds of schema elements, 8 described in an RFC and 3 specific to ApacheDS:
and
We need to parse those schema elements because they can be added to the server as a description (i.e., a String representing one of those schema elements as defined by the RFC). For the same reason, the LDAP API needs to validate those schema elements before sending them to an LDAP server, and to properly parse what it gets back from an LDAP server.
There is a problem: most LDAP server implementations violate the RFC, so we cannot simply expect the String representing a schema element to be compliant. Some typical deviations are :
We define the strict mode as a mode that follows the RFC tightly, and the quirks mode as a relaxed, more permissive version of the parser. A flag selects either the strict or the relaxed mode.
The only thing we will relax is the order in which the various parts of a schema description appear : we don't expect them to be ordered as described in the RFC.
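The effect of the flag on ordering can be sketched as follows. This is only an illustration, not the parser's actual code: the class `KeywordOrder` and its method are hypothetical, the keyword list is the AttributeType order from RFC 4512, and extensions (`X-...`) are left out for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Apache LDAP API.
final class KeywordOrder {
    // Order of the AttributeType parts as given by RFC 4512 (extensions omitted).
    private static final List<String> RFC_ORDER = Arrays.asList(
        "NAME", "DESC", "OBSOLETE", "SUP", "EQUALITY", "ORDERING", "SUBSTR",
        "SYNTAX", "SINGLE-VALUE", "COLLECTIVE", "NO-USER-MODIFICATION", "USAGE");

    private KeywordOrder() {
    }

    // keywords: the part names in the order they were met in the description.
    // strict:   true to enforce the RFC order, false to accept any order.
    static boolean isAccepted(List<String> keywords, boolean strict) {
        int previous = -1;
        for (String keyword : keywords) {
            int position = RFC_ORDER.indexOf(keyword);
            if (position < 0) {
                return false; // unknown part, rejected in both modes
            }
            if (strict && position <= previous) {
                return false; // out of order (or repeated) in strict mode
            }
            previous = position;
        }
        return true; // this sketch does not detect repeated parts in relaxed mode
    }
}
```

With this sketch, `SYNTAX` followed by `NAME` is rejected when `strict` is true and accepted when it is false.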
The various parts are defined using a few syntaxes :
NAME: qdescrs
DESC: qdstring
SUP (ObjectClass), MUST, MAY, APPLIES, AUX, NOT: oids
SUP (AttributeType), EQUALITY, ORDERING, SUBSTR, FORM, OC: oid
SYNTAX (AttributeType): noidlen
SYNTAX (MatchingRule): numericoid
SUP (DitStructureRule): ruleids
descr: oid, qdescrs
qdescr: qdescrs, qdescrlist
A qdescrs may contain one or more qdescr, and an oids may contain one or more oid.
The descr construct is used by oid and qdescrs (an OID can be a name). The strict mode will use this grammar :
descr ::= keystring
keystring ::= leadkeychar keychar*
leadkeychar ::= ALPHA
keychar ::= ALPHA | DIGIT | HYPHEN
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
SQUOTE ::= '\''
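As a minimal sketch, the strict descr rule can be checked with a regular expression. The class name `StrictDescr` is hypothetical and only serves the illustration; the real parser may be implemented quite differently.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Apache LDAP API.
final class StrictDescr {
    // leadkeychar keychar* : one ASCII letter, then ASCII letters, digits or hyphens.
    private static final Pattern DESCR = Pattern.compile("[A-Za-z][A-Za-z0-9-]*");

    private StrictDescr() {
    }

    // Returns true if the whole string is a descr in strict mode.
    static boolean isDescr(String value) {
        return value != null && DESCR.matcher(value).matches();
    }
}
```

Under this check, `objectClass` and `x-my-attr` are valid, while `2cn` (leading digit) and `my_attr` (underscore) are not.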
A qdstring can contain any UTF-8 character, except the single quote and the backslash, which must be escaped (as \27 and \5C respectively). It is always surrounded by single quotes :
qdstring ::= SQUOTE dstring SQUOTE
dstring ::= ( QS | QQ | QUTF8 )*
QQ ::= ESC %x32 %x37
QS ::= ESC %x35 ( %x43 | %x63 )
QUTF8 ::= QUTF1 | UTFMB
QUTF1 ::= %x00-26 | %x28-5B | %x5D-7F
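The escaping rules of the qdstring grammar can be illustrated with a small decoder. This is a sketch under assumptions: the class `QdString` is hypothetical, ESC is taken to be the backslash as in RFC 4512, and the input is a Java String, so UTF-8 byte-level validation is not covered.

```java
// Hypothetical helper, not part of the Apache LDAP API.
final class QdString {
    private QdString() {
    }

    // Strips the surrounding single quotes and decodes \27 and \5C (or \5c).
    static String decode(String quoted) {
        if (quoted == null || quoted.length() < 2
                || quoted.charAt(0) != '\''
                || quoted.charAt(quoted.length() - 1) != '\'') {
            throw new IllegalArgumentException("qdstring must be surrounded by single quotes");
        }
        int end = quoted.length() - 1; // index of the closing quote
        StringBuilder decoded = new StringBuilder();
        int i = 1;
        while (i < end) {
            char c = quoted.charAt(i);
            if (c == '\'') {
                throw new IllegalArgumentException("unescaped quote at index " + i);
            }
            if (c != '\\') {
                decoded.append(c); // QUTF1 or UTFMB, copied as is
                i++;
            } else if (i + 3 <= end && quoted.startsWith("27", i + 1)) {
                decoded.append('\''); // QQ
                i += 3;
            } else if (i + 3 <= end
                    && (quoted.startsWith("5C", i + 1) || quoted.startsWith("5c", i + 1))) {
                decoded.append('\\'); // QS
                i += 3;
            } else {
                throw new IllegalArgumentException("invalid escape at index " + i);
            }
        }
        return decoded.toString();
    }
}
```

For instance, decoding `'O\27Brien'` yields `O'Brien`, while a lone backslash or an unescaped quote inside the string is rejected.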
qdescr is a quoted name, where the first character must be a letter, and the following characters must be letters, digits or hyphens. Here is the ABNF for qdescr :
qdescr ::= SQUOTE descr SQUOTE
There
The relaxed descr accepts more characters, such as underscore, semicolon, dot, colon or sharp. The leadkeychar is no longer mandatory either. Here is the ABNF we will accept :
relaxed-descr ::= relaxed-keystring
relaxed-keystring ::= relaxed-keychar+
relaxed-keychar ::= ALPHA | DIGIT | HYPHEN | UNDERSCORE | SEMICOLON | DOT | COLON | SHARP
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
UNDERSCORE ::= '_'
SEMICOLON ::= ';'
DOT ::= '.'
COLON ::= ':'
SHARP ::= '#'
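The relaxed rule can be sketched the same way as the strict one, with a wider character class and no constraint on the first character. The class name `RelaxedDescr` is hypothetical and used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Apache LDAP API.
final class RelaxedDescr {
    // One or more of: ASCII letter, digit, '_', ';', '.', ':', '#', '-'
    // (the hyphen is placed last so that it is taken literally).
    private static final Pattern RELAXED_DESCR = Pattern.compile("[A-Za-z0-9_;.:#-]+");

    private RelaxedDescr() {
    }

    // Returns true if the whole string is a descr in relaxed mode.
    static boolean isRelaxedDescr(String value) {
        return value != null && RELAXED_DESCR.matcher(value).matches();
    }
}
```

Names such as `my_attr`, `2cn` or `cn;lang-fr`, which the strict grammar rejects, pass this check. Empty strings and strings containing spaces or quotes still fail.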
Compared to the strict mode, we will accept a non-quoted String, or a String using double quotes.
relaxed-qdescr ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr
In relaxed mode, we will accept single-quoted and double-quoted OIDs and names. Here is the supported ABNF :
oid-relaxed ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr | SQUOTE numericoid SQUOTE | DQUOTE numericoid DQUOTE | numericoid
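A possible way to handle this rule is to strip an optional pair of matching quotes, then decide whether the remainder is a numeric OID or a name. The sketch below assumes the numericoid form of RFC 4512 (dot-separated numbers without leading zeros); the class `RelaxedOid` and its `Kind` enum are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Apache LDAP API.
final class RelaxedOid {
    enum Kind { NUMERIC_OID, NAME }

    // numericoid as in RFC 4512: number 1*( DOT number ), no leading zeros.
    private static final Pattern NUMERIC_OID =
        Pattern.compile("(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*))+");
    // relaxed-descr: one or more relaxed-keychar.
    private static final Pattern RELAXED_DESCR = Pattern.compile("[A-Za-z0-9_;.:#-]+");

    private RelaxedOid() {
    }

    static Kind classify(String value) {
        if (value == null) {
            throw new IllegalArgumentException("null oid");
        }
        String body = value;
        if (value.length() >= 2) {
            char first = value.charAt(0);
            char last = value.charAt(value.length() - 1);
            // Only a matching pair of single or double quotes is removed.
            if ((first == '\'' || first == '"') && first == last) {
                body = value.substring(1, value.length() - 1);
            }
        }
        if (NUMERIC_OID.matcher(body).matches()) {
            return Kind.NUMERIC_OID;
        }
        if (RELAXED_DESCR.matcher(body).matches()) {
            return Kind.NAME;
        }
        throw new IllegalArgumentException("not a valid relaxed oid: " + value);
    }
}
```

Because relaxed-descr also accepts digits and dots, the numeric form has to be tested first in this sketch; otherwise every numeric OID would be reported as a name.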
Here, we will allow textual syntax name to be used, not only OIDs. For instance, something like SYNTAX IA5String will be allowed.
We also allow quoted and double quoted OIDs.