| <?xml version="1.0"?> |
| <!-- |
| Copyright 2004-2005 The Apache Software Foundation |
| |
| Licensed under the Apache License, Version 2.0 (the "License"); |
| you may not use this file except in compliance with the License. |
| You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <document> |
| |
| <properties> |
| <title>Configuration Factory and Hierarchical Structured Data Howto</title> |
| <author email="oliver.heger@t-online.de">Oliver Heger</author> |
| </properties> |
| |
| <body> |
| <section name="Using XML based Configurations"> |
| <p> |
| This section explains how to use hierarchical |
| and structured XML datasets. |
| </p> |
| </section> |
| |
| |
| <section name="Hierarchical properties"> |
| <p> |
| The XML document we used in the section about ConfigurationFactory was quite simple. Because of their |
| tree-like nature, XML documents can represent data that is |
| structured in many ways. This section explains how to deal with |
| such structured documents. |
| </p> |
| <subsection name="Structured XML"> |
| <p> |
| Consider the following scenario: An application operates on |
| database tables and wants to load a definition of the database |
| schema from its configuration. An XML document provides this |
| information. It could look as follows: |
| </p> |
| <source><![CDATA[ |
| <?xml version="1.0" encoding="ISO-8859-1" ?> |
| |
| <database> |
| <tables> |
| <table tableType="system"> |
| <name>users</name> |
| <fields> |
| <field> |
| <name>uid</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>uname</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>firstName</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>lastName</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>email</name> |
| <type>java.lang.String</type> |
| </field> |
| </fields> |
| </table> |
| <table tableType="application"> |
| <name>documents</name> |
| <fields> |
| <field> |
| <name>docid</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>name</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>creationDate</name> |
| <type>java.util.Date</type> |
| </field> |
| <field> |
| <name>authorID</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>version</name> |
| <type>int</type> |
| </field> |
| </fields> |
| </table> |
| </tables> |
| </database> |
| ]]></source> |
| <p> |
| This XML is largely self-explanatory: there is an arbitrary number |
| of table elements, each of which has a name and a list of fields. |
| A field in turn consists of a name and a data type. |
| To access the data stored in this document it must be included |
| in the configuration definition file: |
| </p> |
| <source><![CDATA[ |
| <?xml version="1.0" encoding="ISO-8859-1" ?> |
| |
| <configuration> |
| <properties fileName="usergui.properties"/> |
| <xml fileName="gui.xml"/> |
| <xml fileName="tables.xml"/> |
| </configuration> |
| ]]></source> |
| <p> |
| The additional <code>xml</code> element causes the document |
| with the table definitions to be loaded. When we now want to |
| read some of the properties we face a problem: the syntax for |
| constructing configuration keys we learned so far is not |
| powerful enough to access all of the data stored in the tables |
| document. |
| </p> |
| <p> |
| Because the document contains a list of tables some properties |
| are defined more than once. E.g. the configuration key |
| <code>tables.table.name</code> refers to a <code>name</code> |
| element inside a <code>table</code> element inside a |
| <code>tables</code> element. This constellation happens to |
| occur twice in the tables document. |
| </p> |
| <p> |
| Multiple definitions of a property do not cause problems and are |
| supported by all classes of Configuration. If such a property |
| is queried using <code>getProperty()</code>, the method |
| recognizes that there are multiple values for that property and |
| returns a collection with all these values. So we could write |
| </p> |
| <source><![CDATA[ |
| Object prop = config.getProperty("tables.table.name"); |
| if(prop instanceof Collection) |
| { |
| System.out.println("Number of tables: " + ((Collection) prop).size()); |
| } |
| ]]></source> |
| <p> |
| An alternative to this code would be the <code>getList()</code> |
| method of <code>Configuration</code>. If a property is known to |
| have multiple values (as is the table name property in this example), |
| <code>getList()</code> retrieves all values at once. |
| <b>Note:</b> it is legal to call <code>getString()</code> |
| or one of the other getter methods on a property with multiple |
| values; it returns the first element of the list. |
| </p> |
| </subsection> |
| <subsection name="Accessing structured properties"> |
| <p> |
| We can now obtain a list with the names of all defined |
| tables. In the same way we can retrieve a list with the names |
| of all table fields: just pass the key |
| <code>tables.table.fields.field.name</code> to the |
| <code>getList()</code> method. In our example this list |
| would contain 10 elements, the names of all fields of all tables. |
| This is fine, but how do we know which field belongs to |
| which table? |
| </p> |
| <p> |
| When working with such hierarchical structures the configuration keys |
| used to query properties can have an extended syntax. Each component |
| of a key can be followed by a numerical value in parentheses that |
| determines the index of the affected property. So if we have two |
| <code>table</code> elements we can specify exactly which one we |
| want to address by appending the corresponding index. This is |
| best explained by some examples: |
| </p> |
| <p> |
| We will now provide some configuration keys and show the results |
| of a <code>getProperty()</code> call with these keys as arguments. |
| <dl> |
| <dt><code>tables.table(0).name</code></dt> |
| <dd> |
| Returns the name of the first table (all indices are 0 based), |
| in this example the string <em>users</em>. |
| </dd> |
| <dt><code>tables.table(0)[@tableType]</code></dt> |
| <dd> |
| Returns the value of the tableType attribute of the first |
| table (<em>system</em>). |
| </dd> |
| <dt><code>tables.table(1).name</code></dt> |
| <dd> |
| Analogous to the first example returns the name of the |
| second table (<em>documents</em>). |
| </dd> |
| <dt><code>tables.table(2).name</code></dt> |
| <dd> |
| Here the name of a third table is queried, but because there |
| are only two tables the result is <b>null</b>. The fact that a |
| <b>null</b> value is returned for invalid indices can be used |
| to find out how many values are defined for a certain property: |
| just increment the index in a loop as long as valid objects |
| are returned. |
| </dd> |
| <dt><code>tables.table(1).fields.field.name</code></dt> |
| <dd> |
| Returns a collection with the names of all fields that |
| belong to the second table. With keys of this kind it is |
| now possible to find out which fields belong to which table. |
| </dd> |
| <dt><code>tables.table(1).fields.field(2).name</code></dt> |
| <dd> |
| The additional index after field selects a certain field. |
| This expression represents the name of the third field in |
| the second table (<em>creationDate</em>). |
| </dd> |
| <dt><code>tables.table.fields.field(0).type</code></dt> |
| <dd> |
| This key may be a bit unusual but nevertheless completely |
| valid. It selects the data types of the first fields in all |
| tables. So here a collection would be returned with the |
| values [<em>long, long</em>]. |
| </dd> |
| </dl> |
| </p> |
| <p> |
| These examples should make the usage of indices quite clear. |
| Because each configuration key can contain an arbitrary number |
| of indices it is possible to navigate through complex structures of |
| XML documents; each XML element can be uniquely identified. |
| </p> |
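| <p> |
| The indexed key syntax can be illustrated with a small, self-contained |
| sketch. It is only a simplified model and not the code used by the |
| library: the names <code>IndexedKeySketch</code>, <code>Node</code> and |
| <code>query()</code> are hypothetical, attribute keys such as |
| <code>[@tableType]</code> are not covered, and an empty list is returned |
| where <code>getProperty()</code> would return <b>null</b>. The sample |
| tree is a shortened version of the tables document. |
| </p> |
| <source><![CDATA[ |
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of indexed key lookup.
// This is NOT the implementation used by the library.
class IndexedKeySketch {

    /** Minimal tree node: an element name, an optional text value, ordered children. */
    static final class Node {
        final String name;
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String name, String value) {
            this.name = name;
            this.value = value;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns the values of all nodes selected by a key such as "tables.table(1).name". */
    static List<String> query(Node root, String key) {
        List<Node> current = new ArrayList<>();
        current.add(root);
        for (String part : key.split("\\.")) {
            int open = part.indexOf('(');
            String name = open < 0 ? part : part.substring(0, open);
            // No index given: keep every matching child (marked here with -1).
            int index = open < 0 ? -1
                    : Integer.parseInt(part.substring(open + 1, part.length() - 1));
            List<Node> next = new ArrayList<>();
            for (Node parent : current) {
                List<Node> matches = new ArrayList<>();
                for (Node child : parent.children) {
                    if (child.name.equals(name)) {
                        matches.add(child);
                    }
                }
                if (index < 0) {
                    next.addAll(matches);
                } else if (index < matches.size()) {
                    next.add(matches.get(index)); // the index is applied per parent
                }
            }
            current = next;
        }
        List<String> values = new ArrayList<>();
        for (Node node : current) {
            if (node.value != null) {
                values.add(node.value);
            }
        }
        return values;
    }

    static Node field(String name, String type) {
        return new Node("field", null)
                .add(new Node("name", name))
                .add(new Node("type", type));
    }

    static Node table(String name, Node... fields) {
        Node fieldsNode = new Node("fields", null);
        for (Node f : fields) {
            fieldsNode.add(f);
        }
        return new Node("table", null).add(new Node("name", name)).add(fieldsNode);
    }

    /** Shortened version of the tables document from this section. */
    static Node sampleTree() {
        Node tables = new Node("tables", null)
                .add(table("users",
                        field("uid", "long"),
                        field("uname", "java.lang.String")))
                .add(table("documents",
                        field("docid", "long"),
                        field("name", "java.lang.String"),
                        field("creationDate", "java.util.Date")));
        return new Node("database", null).add(tables);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(query(root, "tables.table(1).fields.field(2).name"));
        System.out.println(query(root, "tables.table.fields.field(0).type"));
        // Count the tables by incrementing the index until nothing is returned.
        int count = 0;
        while (!query(root, "tables.table(" + count + ").name").isEmpty()) {
            count++;
        }
        System.out.println("Number of tables: " + count);
    }
}
```
| ]]></source> |
| <p> |
| In this sketch an index restricts the selection below each parent node |
| separately, which is why a key like |
| <code>tables.table.fields.field(0).type</code> yields one value per table. |
| </p> |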
| <p> |
| <b>Note:</b> Earlier versions of Configuration provided |
| two different Configuration classes for dealing with XML documents: |
| <code>XMLConfiguration</code>, which did not support the extended |
| query syntax described above, and <code>HierarchicalXMLConfiguration</code>, |
| which could operate on truly hierarchical structures. These classes |
| have now been merged into a single class with the name |
| <code>XMLConfiguration</code>, which now supports all types of XML |
| documents. So there is no longer any need to select one of the |
| XML configurations; <code>XMLConfiguration</code> is always the |
| right (and only) choice. The <code>&lt;hierarchicalXml&gt;</code> |
| XML element that was used in the configuration definition |
| files for <code>ConfigurationFactory</code> to create an instance |
| of <code>HierarchicalXMLConfiguration</code> is now deprecated. |
| </p> |
| </subsection> |
| <subsection name="Adding new properties"> |
| <p> |
| So far we have learned how to use indices to avoid ambiguities when |
| querying properties. The same problem occurs when adding new |
| properties to a structured configuration. As an example let's |
| assume we want to add a new field to the second table. New properties |
| can be added to a configuration using the <code>addProperty()</code> |
| method. Of course, we have to specify exactly where in the tree-like structure new |
| data is to be inserted. A statement like |
| </p> |
| <source><![CDATA[ |
| // Warning: This might cause trouble! |
| config.addProperty("tables.table.fields.field.name", "size"); |
| ]]></source> |
| <p> |
| would not be sufficient because it does not contain all needed |
| information. How is such a statement processed by the |
| <code>addProperty()</code> method? |
| </p> |
| <p> |
| <code>addProperty()</code> splits the provided key into its |
| single parts and navigates through the properties tree along the |
| corresponding element names. In this example it will start at the |
| root element and then find the <code>tables</code> element. The |
| next key part to be processed is <code>table</code>, but here a |
| problem occurs: the configuration contains two <code>table</code> |
| properties below the <code>tables</code> element. To get rid of |
| this ambiguity an index can be specified at this position in the |
| key that makes clear which of the two properties should be |
| followed. <code>tables.table(1).fields.field.name</code>, for example, |
| would select the second <code>table</code> property. If an index |
| is missing, <code>addProperty()</code> always follows the last |
| available element. In our example this would be the second |
| <code>table</code>, too. |
| </p> |
| <p> |
| The following parts of the key are processed in exactly the same |
| manner. Under the selected <code>table</code> property there is |
| exactly one <code>fields</code> property, so this step is not |
| problematic at all. In the next step the <code>field</code> part |
| has to be processed. At the current position in the properties tree |
| there are multiple <code>field</code> (sub) properties. So here we |
| have the same situation as for the <code>table</code> part. |
| Because no explicit index is defined the last <code>field</code> |
| property is selected. The last part of the key passed to |
| <code>addProperty()</code> (<code>name</code> in this example) |
| will always be added as new property at the position that has |
| been reached in the former processing steps. So in our example |
| the last <code>field</code> property of the second table would |
| be given a new <code>name</code> sub property and the resulting |
| structure would look like the following listing: |
| </p> |
| <source><![CDATA[ |
| ... |
| <table tableType="application"> |
| <name>documents</name> |
| <fields> |
| <field> |
| <name>docid</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>name</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>creationDate</name> |
| <type>java.util.Date</type> |
| </field> |
| <field> |
| <name>authorID</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>version</name> |
| <name>size</name> <== Newly added property |
| <type>int</type> |
| </field> |
| </fields> |
| </table> |
| </tables> |
| </database> |
| ]]></source> |
| <p> |
| This result is obviously not what was desired, but it demonstrates |
| how <code>addProperty()</code> works: the method follows an |
| existing branch in the properties tree and adds new leaves to it. |
| (If the passed-in key does not match a branch in the existing tree, |
| a new branch will be added. E.g. if we pass the key |
| <code>tables.table.data.first.test</code>, the existing tree can be |
| navigated up to the <code>data</code> part of the key. From here a |
| new branch is started with the remaining parts <code>data</code>, |
| <code>first</code> and <code>test</code>.) |
| </p> |
| <p> |
| If we want a different behavior, we must explicitly tell |
| <code>addProperty()</code> what to do. In our example with the |
| new field our intention was to create a new branch for the |
| <code>field</code> part in the key, so that a new <code>field</code> |
| property is added to the structure rather than adding sub properties |
| to the last existing <code>field</code> property. This can be |
| achieved by specifying the special index <code>(-1)</code> at the |
| corresponding position in the key as shown below: |
| </p> |
| <source><![CDATA[ |
| config.addProperty("tables.table(1).fields.field(-1).name", "size"); |
| config.addProperty("tables.table(1).fields.field.type", "int"); |
| ]]></source> |
| <p> |
| The first line in this fragment specifies that a new branch is |
| to be created for the <code>field</code> property (index -1). |
| In the second line no index is specified for the field, so the |
| last one is used - which happens to be the field that has just |
| been created. So these two statements add a fully defined field |
| to the second table. This is the default pattern for adding new |
| properties or whole hierarchies of properties: first create a new |
| branch in the properties tree and then populate its sub properties. |
| As an additional example let's add a completely new table definition |
| to our example configuration: |
| </p> |
| <source><![CDATA[ |
| // Add a new table element and define the name |
| config.addProperty("tables.table(-1).name", "versions"); |
| |
| // Add a new field to the new table |
| // (an index for the table is not necessary because the latest is used) |
| config.addProperty("tables.table.fields.field(-1).name", "id"); |
| config.addProperty("tables.table.fields.field.type", "int"); |
| |
| // Add another field to the new table |
| config.addProperty("tables.table.fields.field(-1).name", "date"); |
| config.addProperty("tables.table.fields.field.type", "java.sql.Date"); |
| ... |
| ]]></source> |
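| <p> |
| The navigation rules described in this subsection (follow the last |
| matching element when no index is given, start a new branch for the |
| index <code>(-1)</code> or when no matching element exists, and always |
| add the last key part as a new property) can be modeled roughly as |
| follows. This is a sketch under simplifying assumptions and not the |
| library's code: <code>AddPropertySketch</code>, <code>Node</code> and |
| <code>followLast()</code> are hypothetical names, the last key part is |
| assumed to carry no index, and an index that is out of range simply |
| causes an exception. |
| </p> |
| <source><![CDATA[ |
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the navigation rules described above.
// This is NOT the library's actual implementation.
class AddPropertySketch {

    static final class Node {
        final String name;
        String value;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        /** All direct children with the given name, in insertion order. */
        List<Node> childrenNamed(String childName) {
            List<Node> result = new ArrayList<>();
            for (Node child : children) {
                if (child.name.equals(childName)) {
                    result.add(child);
                }
            }
            return result;
        }
    }

    static void addProperty(Node root, String key, String value) {
        String[] parts = key.split("\\.");
        Node current = root;
        // Navigate along all key parts except the last one.
        for (int i = 0; i < parts.length - 1; i++) {
            String part = parts[i];
            int open = part.indexOf('(');
            String name = open < 0 ? part : part.substring(0, open);
            Integer index = open < 0 ? null
                    : Integer.valueOf(part.substring(open + 1, part.length() - 1));
            List<Node> matches = current.childrenNamed(name);
            if (matches.isEmpty() || (index != null && index == -1)) {
                // No existing branch, or index -1: start a new branch here.
                Node created = new Node(name);
                current.children.add(created);
                current = created;
            } else if (index == null) {
                // No index: follow the last available element.
                current = matches.get(matches.size() - 1);
            } else {
                // Explicit index: follow exactly that element.
                current = matches.get(index);
            }
        }
        // The last key part is always added as a new property.
        Node leaf = new Node(parts[parts.length - 1]);
        leaf.value = value;
        current.children.add(leaf);
    }

    /** Helper for inspection: follows the last child with each given name. */
    static Node followLast(Node root, String... names) {
        Node current = root;
        for (String name : names) {
            List<Node> matches = current.childrenNamed(name);
            current = matches.get(matches.size() - 1);
        }
        return current;
    }

    /** Builds two tables with one field each, then adds a field to the second table. */
    static Node demo() {
        Node root = new Node("database");
        addProperty(root, "tables.table(-1).name", "users");
        addProperty(root, "tables.table.fields.field(-1).name", "uid");
        addProperty(root, "tables.table(-1).name", "documents");
        addProperty(root, "tables.table.fields.field(-1).name", "docid");
        // New branch for the field, then populate its sub properties.
        addProperty(root, "tables.table(1).fields.field(-1).name", "size");
        addProperty(root, "tables.table(1).fields.field.type", "int");
        return root;
    }

    public static void main(String[] args) {
        Node root = demo();
        Node fields = followLast(root, "tables", "table", "fields");
        System.out.println("Fields in second table: " + fields.childrenNamed("field").size());
    }
}
```
| ]]></source> |
| <p> |
| Calling the sketched <code>addProperty()</code> with the key |
| <code>tables.table.fields.field.name</code> instead would attach a second |
| <code>name</code> to the last existing field, which mirrors the |
| undesired result shown earlier in this subsection. |
| </p> |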
| <p> |
| For more information about adding properties to a hierarchical |
| configuration also have a look at the javadocs for |
| <code>HierarchicalConfiguration</code>. |
| </p> |
| </subsection> |
| </section> |
| |
| <section name="Union configuration"> |
| <p> |
| In an earlier section about the configuration definition file for |
| <code>ConfigurationFactory</code> it was stated that configuration |
| files included first can override properties in configuration files |
| included later, and an example use case for this behaviour was given. |
| There may be times when there are other requirements. |
| </p> |
| <p> |
| Let's continue the example of the application that processes |
| database tables and reads the definitions of the affected tables from |
| its configuration. Now consider that this application grows larger and |
| must be maintained by a team of developers. Each developer works on |
| a separate set of tables. In such a scenario it would be problematic |
| if the definitions for all tables were kept in a single file. This file |
| can be expected to change very often and thus can become |
| a bottleneck for team development when it is almost constantly checked |
| out. It would be much better if each developer had an associated file |
| with table definitions and all this information could be linked together |
| at the end. |
| </p> |
| <p> |
| <code>ConfigurationFactory</code> provides support for such a use case, |
| too. It is possible to specify in the configuration definition file that |
| from a set of configuration sources a logical union configuration is to be |
| constructed. Then all properties defined in the provided sources are |
| collected and can be accessed as if they had been defined in a single |
| source. To demonstrate this feature let us assume that a developer of |
| the database application has defined a specific XML file with a table |
| definition named <code>tasktables.xml</code>: |
| </p> |
| <source><![CDATA[ |
| <?xml version="1.0" encoding="ISO-8859-1" ?> |
| |
| <config> |
| <table tableType="application"> |
| <name>tasks</name> |
| <fields> |
| <field> |
| <name>taskid</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>name</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>description</name> |
| <type>java.lang.String</type> |
| </field> |
| <field> |
| <name>responsibleID</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>creatorID</name> |
| <type>long</type> |
| </field> |
| <field> |
| <name>startDate</name> |
| <type>java.util.Date</type> |
| </field> |
| <field> |
| <name>endDate</name> |
| <type>java.util.Date</type> |
| </field> |
| </fields> |
| </table> |
| </config> |
| ]]></source> |
| <p> |
| This file defines the structure of an additional table, which should be |
| added to the existing table definitions. To achieve this the |
| configuration definition file has to be changed: a new section is added |
| that contains the include elements of all configuration sources that |
| are to be combined. |
| </p> |
| <source><![CDATA[ |
| <?xml version="1.0" encoding="ISO-8859-1" ?> |
| <!-- Configuration definition file that demonstrates the |
| override and additional sections --> |
| |
| <configuration> |
| <override> |
| <properties fileName="usergui.properties"/> |
| <xml fileName="gui.xml"/> |
| </override> |
| |
| <additional> |
| <xml fileName="tables.xml"/> |
| <xml fileName="tasktables.xml" at="tables"/> |
| </additional> |
| </configuration> |
| ]]></source> |
| <p> |
| Compared to the older versions of this file a couple of changes have been |
| made. One major difference is that the elements for including configuration |
| sources are no longer direct children of the root element, but are now |
| contained in either an <code>override</code> or <code>additional</code> |
| section. The names of these sections already imply their purpose. |
| </p> |
| <p> |
| The <code>override</code> section is not strictly necessary. Elements in |
| this section are treated as if they were children of the root element, i.e. |
| properties in the included configuration sources override properties in |
| sources included later. So the <code>override</code> tags could have |
| been omitted, but for the sake of clarity it is recommended to use them |
| when there is also an <code>additional</code> section. |
| </p> |
| <p> |
| It is the <code>additional</code> section that introduces a new behaviour. |
| All configuration sources listed here are combined into a union configuration. |
| In our example we have put two <code>xml</code> elements in this area |
| that load the available files with database table definitions. The syntax |
| of elements in the <code>additional</code> section is analogous to the |
| syntax described so far. The only difference is an additionally supported |
| <code>at</code> attribute that specifies the position in the logical union |
| configuration where the included properties are to be added. In this |
| example we set the <code>at</code> attribute of the second element to |
| <em>tables</em>. This is because the file starts with a <code>table</code> |
| element, but to be compatible with the other table definition file its content should be |
| accessible under the key <code>tables.table</code>. |
| </p> |
| <p> |
| After these modifications have been performed the configuration obtained |
| from the <code>ConfigurationFactory</code> will allow access to three |
| database tables. A call of <code>config.getString("tables.table(2).name");</code> |
| will result in a value of <em>tasks</em>. In an analogous way it is possible |
| to retrieve the fields of the third table. |
| </p> |
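| <p> |
| The effect of the <code>at</code> attribute can be pictured with a |
| deliberately flattened model in which every source is just a map from |
| keys to value lists. This is only an illustration and not how the |
| library builds its union configuration: <code>UnionSketch</code> and |
| <code>addSource()</code> are hypothetical names, and only the table |
| names of the example files are represented. |
| </p> |
| <source><![CDATA[ |
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, flattened model of a union configuration.
// This is NOT the library's actual implementation.
class UnionSketch {

    /** Adds all properties of a source to the union, optionally below the prefix 'at'. */
    static void addSource(Map<String, List<String>> union,
                          Map<String, List<String>> source,
                          String at) {
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            String key = (at == null) ? entry.getKey() : at + "." + entry.getKey();
            // Values for the same key are collected in inclusion order.
            union.computeIfAbsent(key, k -> new ArrayList<>()).addAll(entry.getValue());
        }
    }

    static Map<String, List<String>> demo() {
        // tables.xml: keys already start with "tables".
        Map<String, List<String>> tables = new LinkedHashMap<>();
        tables.put("tables.table.name", Arrays.asList("users", "documents"));

        // tasktables.xml: keys start with "table", so it is mounted at "tables".
        Map<String, List<String>> taskTables = new LinkedHashMap<>();
        taskTables.put("table.name", Arrays.asList("tasks"));

        Map<String, List<String>> union = new LinkedHashMap<>();
        addSource(union, tables, null);
        addSource(union, taskTables, "tables");
        return union;
    }

    public static void main(String[] args) {
        System.out.println(demo().get("tables.table.name"));
    }
}
```
| ]]></source> |
| <p> |
| In this model the third entry under <code>tables.table.name</code> is the |
| table from <code>tasktables.xml</code>, which corresponds to the key |
| <code>tables.table(2).name</code> used above. |
| </p> |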
| <p> |
| Note that it is also possible to override properties defined in an |
| <code>additional</code> section. This can be done by placing a |
| configuration source in the <code>override</code> section that defines |
| properties that are also defined in one of the sources listed in the |
| <code>additional</code> section. The example does not make use of that. |
| Note also that the order of the <code>override</code> and |
| <code>additional</code> sections in a configuration definition file does |
| not matter. Sources in an <code>override</code> section are always treated with |
| higher priority (otherwise they could not override the values of other |
| sources). |
| </p> |
| </section> |
| </body> |
| |
| </document> |