| <?xml version="1.0" encoding="UTF-8"?> |
| <document> |
| <properties> |
| <author email="akarasulu@apache.org">Alex Karasulu</author> |
| <title>Partitions</title> |
| </properties> |
| |
| <body> |
| <section name="Partitions"> |
| <p> |
| Partitions are entry stores assigned to a naming context. The idea |
| behind a partition is that it stores a subset of the Directory |
| Information Base (DIB). Partitions can be implemented in any way so |
| long as they adhere to the partition interfaces. |
| </p> |
| |
| <subsection name="Status"> |
| <p> |
| Presently the server has a single partition implementation. This |
| implementation is used for both the system partition and user |
| partitions. It uses <a href="http://jdbm.sourceforge.net/">JDBM</a> |
| as the underlying B+Tree implementation for storing entries. |
| </p> |
| |
| <p> |
| Other implementations are possible. I'm particularly interested in |
| memory based partitions, either BTree based or based on something like |
| Prevayler. |
| </p> |
| |
| <p> |
| Partitions have simple interfaces that can be used to align any data |
| source to the LDAP data model, making it accessible via JNDI or via |
| LDAP over the wire. This makes the server very flexible as a bridge |
| that standardizes access to disparate data sources and formats. Dynamic |
| mapping based backends are also interesting. |
| </p> |
| </subsection> |
| |
| <subsection name="System Partition"> |
| <p> |
| The system partition is a special partition that is hardcoded to |
| hang off the <b>ou=system</b> naming context. It is always present |
| and contains the administrative and operational information the |
| server needs to operate, hence its name. |
| </p> |
| |
| <p> |
| The server's subsystems will use this partition to store information |
| critical to their operation. Things like triggers, stored procedures, |
| access control instructions and schema information can be maintained |
| here. |
| </p> |
| </subsection> |
| |
| <subsection name="Root Nexus"> |
| <p> |
| Several partitions can be assigned to different naming contexts within |
| the server so long as their names do not overlap, that is, no |
| partition's naming context is contained within another's. The root |
| nexus is a fake partition that does not really store entries. It maps |
| the entry storing partitions to their naming contexts and routes each |
| backing store call to the partition containing the entry associated |
| with the operation. |
| </p> |
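| |
| <p> |
| The routing rule described above can be sketched roughly as follows. |
| This is an illustration only, not the server's nexus: the class |
| <code>NexusSketch</code> and its methods are hypothetical, partitions |
| are represented by plain ids, and name comparison is delegated to the |
| JDK's <code>javax.naming.ldap.LdapName</code>. |
| </p> |
| <source> |
```java
import java.util.Arrays;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;

/** Hypothetical sketch of nexus-style routing; all names here are illustrative. */
class NexusSketch {
    /** One naming context and the id of the partition that stores its entries. */
    private static final class Mapping {
        final LdapName suffix;
        final String partitionId;

        Mapping(LdapName suffix, String partitionId) {
            this.suffix = suffix;
            this.partitionId = partitionId;
        }
    }

    private Mapping[] mappings = new Mapping[0];

    /** Assigns a partition to a naming context, rejecting overlapping contexts. */
    void register(String suffix, String partitionId) {
        LdapName name = parse(suffix);
        for (Mapping existing : mappings) {
            // LdapName counts components from the right, so startsWith() tests
            // whether one name ends in (is contained within) the other.
            if (name.startsWith(existing.suffix) || existing.suffix.startsWith(name)) {
                throw new IllegalArgumentException(suffix + " overlaps " + existing.suffix);
            }
        }
        mappings = Arrays.copyOf(mappings, mappings.length + 1);
        mappings[mappings.length - 1] = new Mapping(name, partitionId);
    }

    /** Returns the id of the partition containing the entry, or null if none matches. */
    String route(String entryDn) {
        LdapName dn = parse(entryDn);
        for (Mapping mapping : mappings) {
            if (dn.startsWith(mapping.suffix)) {
                return mapping.partitionId;
            }
        }
        return null;
    }

    private static LdapName parse(String dn) {
        try {
            return new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("not a valid distinguished name: " + dn, e);
        }
    }
}
```
| </source> |
| <p> |
| Because overlapping contexts are refused at registration time, at most |
| one mapping can match an entry's name, so the lookup never has to choose |
| between candidates. |
| </p> |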
| </subsection> |
| |
| <subsection name="User Partitions"> |
| <p> |
| User partitions are partitions added by users. When you download and |
| start using the server you may want to create a separate partition to |
| store the entries of your application. To us, user partitions |
| (sometimes also referred to as application partitions) are simply all |
| partitions other than the system partition. In the following section |
| we describe how a user partition can be created in the server. |
| </p> |
| </subsection> |
| </section> |
| |
| <section name="Adding User Partitions"> |
| <p> |
| Adding new application partitions to the server is a matter of |
| setting the right JNDI environment properties. These properties are |
| used in both standalone and embedded configurations. We will show |
| you by example how to configure partitions, first using properties |
| files and then programmatically. |
| </p> |
| |
| <subsection name="Using Properties Files"> |
| <p> |
| Properties files are not the best way to configure a large system |
| like an LDAP server. However, properties files are the JNDI standard |
| for pulling in configuration, and the server's JNDI provider tries to |
| honor this, hence the use of a properties file for configuration. |
| Below we have the configuration of two user defined partitions within |
| a properties file. These partitions are for the naming contexts |
| <code>dc=apache,dc=org</code> and <code>ou=test</code>. |
| </p> |
| |
| <source> |
| # all multivalued properties are space separated like the list of partitions here |
| server.db.partitions=apache test |
| |
| # apache partition configuration |
| server.db.partition.suffix.apache=dc=apache,dc=org |
| server.db.partition.indices.apache=ou cn objectClass uid |
| server.db.partition.attributes.apache.dc=apache |
| server.db.partition.attributes.apache.objectClass=top domain extensibleObject |
| |
| # test partition configuration |
| server.db.partition.suffix.test=ou=test |
| server.db.partition.indices.test=ou objectClass |
| server.db.partition.attributes.test.ou=test |
| server.db.partition.attributes.test.objectClass=top organizationalUnit extensibleObject |
| </source> |
| |
| <p> |
| Although somewhat ugly, the way we use properties for settings is |
| portable across JNDI LDAP providers. Hopefully we can build a tool |
| on top of this to save the user some hassle. Another approach may be |
| to generate these properties from XML or some other format that is |
| easier to write. For now it's the best means we have, not specific to |
| the server's provider, to inject settings through JNDI environment |
| Hashtables while still being able to load settings via properties |
| files. Properties from properties files are the common denominator. |
| Another, easier means to configure the server is to do so |
| programmatically. |
| </p> |
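| |
| <p> |
| As a rough sketch of how such a file ends up in a JNDI environment, the |
| snippet below loads the settings with <code>java.util.Properties</code> |
| and splits a multivalued property into its values. The class |
| <code>PartitionConfigLoader</code> is a hypothetical helper and not part |
| of the server. Where the file lives is left to the caller. |
| </p> |
| <source> |
```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for reading partition settings from a properties file. */
class PartitionConfigLoader {
    /**
     * Loads the settings. Properties extends Hashtable, so the result can be
     * handed to a JNDI InitialContext constructor as its environment.
     */
    static Properties load(Reader source) {
        Properties env = new Properties();
        try {
            env.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException("cannot read partition settings", e);
        }
        return env;
    }

    /** Splits a space separated, multivalued property into its values. */
    static String[] values(Properties env, String key) {
        String raw = env.getProperty(key, "").trim();
        return raw.isEmpty() ? new String[0] : raw.split("\\s+");
    }
}
```
| </source> |
| <p> |
| Note that only the first <code>=</code> on a line separates key from |
| value, so a suffix such as <code>dc=apache,dc=org</code> can be written |
| without escaping. |
| </p> |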
| <h3>Partition Id</h3> |
| <p> |
| Briefly we'll explain these properties and the scheme used. A |
| partition's properties are tied together as a set by the partition's |
| id. All partition ids are given as a space separated list in the |
| <b>server.db.partitions</b> property. Above it lists the ids of the two |
| partitions, <i>apache</i> and <i>test</i>. |
| </p> |
| <h3>Naming Context</h3> |
| <p> |
| Partitions need to know the naming context they will store entries |
| for. This naming context is also referred to as the suffix since all |
| entries in the partition have this common suffix. The suffix is a |
| distinguished name. The property key for the suffix of a partition is |
| composed of the property key base |
| <b>server.db.partition.suffix.</b> concatenated with the id of the |
| partition: <b>server.db.partition.suffix.</b><i>${id}</i>. For example, |
| if the partition id is foo, then the suffix key would be |
| <b>server.db.partition.suffix.foo</b>. |
| </p> |
| <h3>User Defined Indices</h3> |
| <p> |
| Partitions can have indices on attributes. Unlike OpenLDAP where you |
| can build specific types of indices, the server's indices are of a |
| single type. For each partition, a key is assembled from the |
| partition id and the property key base: |
| <b>server.db.partition.indices.</b><i>${id}</i>. So |
| again for foo the key for attribute indices would be |
| <b>server.db.partition.indices.foo</b>. Its value is a space separated |
| list of attributeType names to index. For example the apache |
| partition has indices built on <b>ou</b>, <b>cn</b>, |
| <b>objectClass</b> and <b>uid</b>. |
| </p> |
| <h3>Suffix Entry</h3> |
| <p> |
| When creating a context the root entry of the context corresponding |
| to the suffix of the partition must be created. This entry is |
| composed of single-valued and multi-valued attributes. We must |
| specify these attributes as well as their values. To do so we again |
| use a key composed of a base, but this time we append both the id |
| of the partition and the name of the attribute: |
| <b>server.db.partition.attributes.</b><i>${id}</i>.<i>${name}</i>. So |
| for partition foo and attribute bar the following key would be used: |
| <b>server.db.partition.attributes.foo.bar</b>. The value of the key |
| is a space separated list of values for the bar attribute. For |
| example the apache partition's suffix has an objectClass attribute |
| and its values are set to: top domain extensibleObject. |
| </p> |
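| |
| <p> |
| To make the key scheme concrete, here is a minimal sketch that composes |
| the keys described above and gathers a partition's suffix entry from its |
| attribute keys. The key bases are the ones listed in this section. The |
| class <code>PartitionKeys</code> and its methods are hypothetical, and |
| how the server itself reads these keys may differ. |
| </p> |
| <source> |
```java
import java.util.Properties;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;

/** Hypothetical helper that composes and reads the per-partition property keys. */
class PartitionKeys {
    static final String SUFFIX_BASE = "server.db.partition.suffix.";
    static final String INDICES_BASE = "server.db.partition.indices.";
    static final String ATTRIBUTES_BASE = "server.db.partition.attributes.";

    static String suffixKey(String id) {
        return SUFFIX_BASE + id;
    }

    static String indicesKey(String id) {
        return INDICES_BASE + id;
    }

    static String attributeKey(String id, String name) {
        return ATTRIBUTES_BASE + id + "." + name;
    }

    /** Collects the suffix entry of one partition from its attribute keys. */
    static Attributes suffixEntry(Properties props, String id) {
        String prefix = ATTRIBUTES_BASE + id + ".";
        Attributes entry = new BasicAttributes(true); // true: ignore attribute id case
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(prefix)) {
                continue; // belongs to another partition or another key base
            }
            // Whatever follows the prefix is the attribute name.
            Attribute attr = new BasicAttribute(key.substring(prefix.length()));
            // The value is a space separated list of attribute values.
            for (String value : props.getProperty(key).trim().split("\\s+")) {
                attr.add(value);
            }
            entry.put(attr);
        }
        return entry;
    }
}
```
| </source> |
| <p> |
| One consequence of the space separated scheme shows up here: an attribute |
| value that itself contains a space cannot be expressed through these |
| keys, at least not in this simple reading of them. |
| </p> |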
| </subsection> |
| |
| <subsection name="Programatically"> |
| <p> |
| This is simple: create a Hashtable and stuff it with those properties. |
| But that's a real pain. The other option is to set all the properties |
| that way except the ones for the suffix entry's attributes. We have |
| a shortcut where you can set an Attributes object within the Hashtable |
| and it will get picked up instead of the standard property |
| scheme above. |
| </p> |
| |
| <p> |
| Simply put the Attributes into the Hashtable using the following |
| key <b>server.db.partition.attributes.</b><i>${id}</i>. Below we show |
| how this can be done for two partitions with the ids <i>testing</i> |
| and <i>example</i>: |
| </p> |
| |
| <source> |
| import java.util.Hashtable; |
| import javax.naming.directory.BasicAttribute; |
| import javax.naming.directory.BasicAttributes; |
| // EnvKeys is supplied by the server's JNDI provider |
| |
| // JNDI environment that carries the partition settings |
| Hashtable extras = new Hashtable(); |
| |
| // suffix entry for the "testing" partition |
| BasicAttributes attrs = new BasicAttributes( true ); |
| BasicAttribute attr = new BasicAttribute( "objectClass" ); |
| attr.add( "top" ); |
| attr.add( "organizationalUnit" ); |
| attr.add( "extensibleObject" ); |
| attrs.put( attr ); |
| attr = new BasicAttribute( "ou" ); |
| attr.add( "testing" ); |
| attrs.put( attr ); |
| |
| // list both partition ids, then configure the "testing" partition |
| extras.put( EnvKeys.PARTITIONS, "testing example" ); |
| extras.put( EnvKeys.SUFFIX + "testing", "ou=testing" ); |
| extras.put( EnvKeys.INDICES + "testing", "ou objectClass" ); |
| extras.put( EnvKeys.ATTRIBUTES + "testing", attrs ); |
| |
| // suffix entry for the "example" partition |
| attrs = new BasicAttributes( true ); |
| attr = new BasicAttribute( "objectClass" ); |
| attr.add( "top" ); |
| attr.add( "domain" ); |
| attr.add( "extensibleObject" ); |
| attrs.put( attr ); |
| attr = new BasicAttribute( "dc" ); |
| attr.add( "example" ); |
| attrs.put( attr ); |
| |
| // configure the "example" partition |
| extras.put( EnvKeys.SUFFIX + "example", "dc=example" ); |
| extras.put( EnvKeys.INDICES + "example", "ou dc objectClass" ); |
| extras.put( EnvKeys.ATTRIBUTES + "example", attrs ); |
| </source> |
| |
| <p> |
| OK, that does not look any shorter. We'll add to this in the future. |
| Perhaps we will enable the use of configuration beans that can be used |
| with an SPI specific to the server. However, this starts making your |
| code server provider specific: you can no longer just change |
| properties and use the SUN provider to keep your code location |
| independent. |
| </p> |
| </subsection> |
| </section> |
| |
| <section name="Future Progress"> |
| <subsection name="Partition Nesting"> |
| <p> |
| Today we have some limitations to the way we can partition the DIB. |
| Namely, we can't have a partition within a partition, even though this |
| sometimes makes sense. Eventually we intend to enable this kind of |
| functionality using a special type of nexus which is both a router |
| and a backing store for entries. It's smart enough to know what to |
| route versus when to use its own database. Here's a <a href= |
| "http://issues.apache.org/jira/browse/DIREVE-23">JIRA improvement</a> |
| specifically aimed at achieving this goal. |
| </p> |
| </subsection> |
| |
| <subsection name="Partition Variety"> |
| <p> |
| Obviously we want as many different kinds of partitions as possible. |
| Some really cool ideas have floated around out there for a while. |
| Here's a list of theoretically possible partition types that might |
| be useful or just cool: |
| </p> |
| |
| <ul> |
| <li> |
| Partitions that use JDBC to store entries. These would probably |
| be way too slow. However they might be useful if some mapping |
| were to be used to represent an existing application's database |
| schema as an LDAP DIT. This would allow us to expose any database |
| data via LDAP. |
| </li> |
| |
| <li> |
| Partitions using other LDAP servers to store their entries. Why |
| do this when it introduces latency? Perhaps you want to proxy other |
| servers or make other servers behave like the server. |
| </li> |
| |
| <li> |
| A partition that serves out the Windows registry via LDAP. A |
| standard mechanism to map the Windows registry to an LDAP DIT is |
| pretty simple. This would be a neat way to expose client machine |
| registry management. |
| </li> |
| |
| <li> |
| A partition based on Sleepycat's JE. I was going to try this |
| and see how it performs against JDBM. |
| </li> |
| |
| <li> |
| A partition based on an in-memory BTree implementation. This would |
| be fast and really cool for storing things like schema info. It |
| would also be cool for staging data between memory and disk. |
| </li> |
| |
| <li> |
| A partition based on Prevayler. This is like an in-memory partition |
| but you can save it at the end of the day. This might be really |
| useful, especially for things like the system partition which almost |
| always need to be in memory. The system partition can do this by |
| using caches large enough to hold all of the entries in the |
| system partition. |
| </li> |
| </ul> |
| </subsection> |
| |
| <subsection name="Partitioning entries under a single context?"> |
| <p> |
| Other aspirations include entry partitioning within a container |
| context. Imagine having 250 million entries under |
| <code>ou=citizens,dc=census,dc=gov</code>. You don't want all 250 |
| million in one partition but would like to sub partition these entries |
| under the same context based on some attribute. Basically, the |
| attribute's value is used to hash entries across buckets, where the |
| buckets are other partitions, so that entries are partitioned within |
| a single context. Yeah, this is a bit wild but it would be useful in |
| several situations. |
| </p> |
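| |
| <p> |
| Since this is only an aspiration, the following is no more than a sketch |
| of the hashing idea under simple assumptions: a fixed list of bucket |
| partition ids and a single attribute value per entry. The class |
| <code>BucketSelector</code> and the bucket ids are made up for |
| illustration. |
| </p> |
| <source> |
```java
import java.util.Locale;

/** Hypothetical sketch: picks a bucket partition for an entry from one attribute value. */
class BucketSelector {
    private final String[] bucketIds;

    BucketSelector(String[] bucketIds) {
        if (bucketIds.length == 0) {
            throw new IllegalArgumentException("at least one bucket is required");
        }
        this.bucketIds = bucketIds.clone();
    }

    /** The same value always maps to the same bucket. */
    String bucketFor(String attributeValue) {
        // Normalize so values differing only in case or outer whitespace hash alike.
        String normalized = attributeValue.trim().toLowerCase(Locale.ROOT);
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(normalized.hashCode(), bucketIds.length);
        return bucketIds[index];
    }
}
```
| </source> |
| <p> |
| A plain modulo like this ties every entry's location to the number of |
| buckets, so adding a bucket later would move most entries. A real design |
| would have to deal with that. |
| </p> |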
| </subsection> |
| </section> |
| </body> |
| </document> |