<?xml version="1.0"?>

<document>

 <properties>
  <title>Configuration Factory and Hierarchical Structured Data Howto</title>
  <author email="oliver.heger@t-online.de">Oliver Heger</author>
 </properties>

<body>		
	<section name="Using XML based Configurations">
		<p>
			This section explains how to use hierarchical
			and structured XML datasets.
    	</p>
    </section>	
		
	
	<section name="Hierarchical properties">
		<p>
			The XML document we used in the section about composite configuration was quite simple. Because of their
			tree-like nature, XML documents can represent data that is
			structured in many ways. This section explains how to deal with
			such structured documents.
		</p>
		<subsection name="Structured XML">
			<p>
				Consider the following scenario: An application operates on
				database tables and wants to load a definition of the database
				schema from its configuration. An XML document provides this
				information. It could look as follows:
			</p>
   			<source>
<![CDATA[
<?xml version="1.0" encoding="ISO-8859-1" ?>

<database>
  <tables>
    <table tableType="system">
      <name>users</name>
      <fields>
        <field>
          <name>uid</name>
          <type>long</type>
        </field>
        <field>
          <name>uname</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>firstName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>lastName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>email</name>
          <type>java.lang.String</type>
        </field>
      </fields>
    </table>
    <table tableType="application">
      <name>documents</name>
      <fields>
        <field>
          <name>docid</name>
          <type>long</type>
        </field>
        <field>
          <name>name</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>creationDate</name>
          <type>java.util.Date</type>
        </field>
        <field>
          <name>authorID</name>
          <type>long</type>
        </field>
        <field>
          <name>version</name>
          <type>int</type>
        </field>
      </fields>
    </table>
  </tables>
</database>
]]>
			</source>
			<p>
				This XML is largely self-explanatory: there is an arbitrary number
				of table elements, each of which has a name and a list of fields.
				A field in turn consists of a name and a data type.
				To access the data stored in this document, it must be included
				in the configuration definition file:
			</p>
   			<source>
<![CDATA[
<?xml version="1.0" encoding="ISO-8859-1" ?>

<configuration>
  <properties fileName="usergui.properties"/>
  <xml fileName="gui.xml"/>
  <xml fileName="tables.xml"/>
</configuration>
]]>
			</source>
			<p>
				The additional <code>xml</code> element causes the document
				with the table definitions to be loaded. When we now want to
				read some of the properties, we face a problem: the syntax for
				constructing configuration keys we have learned so far is not
				powerful enough to access all of the data stored in the tables
				document.
			</p>
			<p>
				Because the document contains a list of tables, some properties
				are defined more than once. For example, the configuration key
				<code>tables.table.name</code> refers to a <code>name</code>
				element inside a <code>table</code> element inside a
				<code>tables</code> element. This combination occurs twice
				in the tables document.
			<p>
				Multiple definitions of a property do not cause problems and are
				supported by all Configuration classes. If such a property
				is queried using <code>getProperty()</code>, the method
				recognizes that there are multiple values for that property and
				returns a collection with all these values. So we could write:
   			<source>
<![CDATA[
Object prop = config.getProperty("tables.table.name");
if(prop instanceof Collection)
{
	System.out.println("Number of tables: " + ((Collection) prop).size());
}
]]>
			</source>
			<p>
				An alternative to this code is the <code>getList()</code>
				method of <code>Configuration</code>. If a property is known to
				have multiple values (as the table name property does in this example),
				<code>getList()</code> retrieves all values at once.
				<b>Note:</b> it is legal to call <code>getString()</code>
				or one of the other getter methods on a property with multiple
				values; such a call returns the first element of the list.
			</p>
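			<p>
				As a rough sketch of how these calls fit together, the following
				class loads the configuration through <code>ConfigurationFactory</code>
				and lists all table names. It assumes that Commons Configuration 1.x
				(package <code>org.apache.commons.configuration</code>) is on the class
				path and that the configuration definition file shown above is stored
				as <code>config.xml</code> next to the files it references. The file
				name and the class name <code>ListTableNames</code> are chosen only
				for this illustration.
			</p>
   			<source>
<![CDATA[
import java.util.Iterator;
import java.util.List;

import org.apache.commons.configuration.Configuration;
import org.apache.commons.configuration.ConfigurationFactory;

public class ListTableNames
{
	public static void main(String[] args) throws Exception
	{
		// "config.xml" is an assumed name for the configuration definition file
		ConfigurationFactory factory = new ConfigurationFactory("config.xml");
		Configuration config = factory.getConfiguration();

		// getList() returns all values stored under the key
		List names = config.getList("tables.table.name");
		System.out.println("Number of tables: " + names.size());
		for (Iterator it = names.iterator(); it.hasNext();)
		{
			System.out.println("Table: " + it.next());
		}

		// getString() on a property with multiple values returns the first one
		System.out.println("First table: " + config.getString("tables.table.name"));
	}
}
]]>
			</source>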
		</subsection>
		<subsection name="Accessing structured properties">
			<p>
				So far we can obtain a list with the names of all defined
				tables. In the same way we can retrieve a list with the names
				of all table fields: just pass the key
				<code>tables.table.fields.field.name</code> to the
				<code>getList()</code> method. In our example this list
				would contain 10 elements, the names of all fields of all tables.
				This is fine, but how do we know which field belongs to
				which table?
			</p>
			<p>
				The answer is that with our current approach we have no chance to
				obtain this knowledge! If XML documents are loaded this way,
				their exact structure is lost. Though all field names are found
				and stored, the information about which field belongs to which table
				is not saved. Fortunately, Configuration provides a way of
				dealing with structured XML documents. To enable this feature,
				the configuration definition file has to be slightly altered.
				It becomes:
			</p>
   			<source>
<![CDATA[
<?xml version="1.0" encoding="ISO-8859-1" ?>

<configuration>
  <properties fileName="usergui.properties"/>
  <xml fileName="gui.xml"/>
  <hierarchicalXml fileName="tables.xml"/>
</configuration>
]]>
			</source>
			<p>
				Note that one <code>xml</code> element was replaced by a
				<code>hierarchicalXml</code> element. This element tells the configuration
				factory to use the class <code>HierarchicalXMLConfiguration</code>
				instead of the default class for processing XML documents.
				As the name implies, this class preserves the
				hierarchy of XML documents and thus keeps their structure.
			</p>
			<p>
				When working with such hierarchical properties, the configuration keys
				used to query properties support an extended syntax. Each component
				of a key can be followed by a numerical value in parentheses that
				determines the index of the affected property. This is best explained
				by some examples:
			</p>
			<p>
				We will now provide some configuration keys and show the results
				of a <code>getProperty()</code> call with these keys as arguments.
				<dl>
					<dt><code>tables.table(0).name</code></dt>
					<dd>
						Returns the name of the first table (all indices are 0 based),
						in this example the string <em>users</em>.
					</dd>
					<dt><code>tables.table(0)[@tableType]</code></dt>
					<dd>
						Returns the value of the tableType attribute of the first
						table (<em>system</em>).
					</dd>
					<dt><code>tables.table(1).name</code></dt>
					<dd>
						Analogous to the first example, this returns the name of the
						second table (<em>documents</em>).
					</dd>
					<dt><code>tables.table(2).name</code></dt>
					<dd>
						Here the name of a third table is queried, but because there
						are only two tables the result is <b>null</b>. The fact that a
						<b>null</b> value is returned for invalid indices can be used
						to find out how many values are defined for a certain property:
						just increment the index in a loop as long as valid objects
						are returned.
					</dd>
					<dt><code>tables.table(1).fields.field.name</code></dt>
					<dd>
						Returns a collection with the names of all fields that
						belong to the second table. With keys of this kind it is
						now possible to find out which fields belong to which table.
					</dd>
					<dt><code>tables.table(1).fields.field(2).name</code></dt>
					<dd>
						The additional index after field selects a certain field.
						This expression represents the name of the third field in
						the second table (<em>creationDate</em>).
					</dd>
					<dt><code>tables.table.fields.field(0).type</code></dt>
					<dd>
						This key may be a bit unusual but nevertheless completely
						valid. It selects the data types of the first fields in all
						tables. So here a collection would be returned with the
						values [<em>long, long</em>].
					</dd>
				</dl>
			</p>
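			<p>
				To make the index semantics listed above more concrete, here is a small
				sketch that resolves such keys against a DOM tree using only JDK classes.
				It is an illustration under stated assumptions, not the way
				<code>HierarchicalXMLConfiguration</code> is implemented: the class
				<code>IndexedKeySketch</code>, its methods and the shortened
				<code>SAMPLE</code> document are hypothetical, attribute keys such as
				<code>[@tableType]</code> are not handled, and a missing value is reported
				as an empty list rather than <b>null</b>.
			</p>
   			<source>
<![CDATA[
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class IndexedKeySketch
{
	/** Shortened version of the tables document (two fields per table). */
	public static final String SAMPLE =
		"<database><tables>"
		+ "<table tableType=\"system\"><name>users</name><fields>"
		+ "<field><name>uid</name><type>long</type></field>"
		+ "<field><name>uname</name><type>java.lang.String</type></field>"
		+ "</fields></table>"
		+ "<table tableType=\"application\"><name>documents</name><fields>"
		+ "<field><name>docid</name><type>long</type></field>"
		+ "<field><name>name</name><type>java.lang.String</type></field>"
		+ "</fields></table>"
		+ "</tables></database>";

	/** Parses an XML string and returns its root element. */
	public static Element parse(String xml) throws Exception
	{
		return DocumentBuilderFactory.newInstance().newDocumentBuilder()
			.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
			.getDocumentElement();
	}

	/** Resolves a key such as "tables.table(1).name" below the root element. */
	public static List<String> resolve(Element root, String key)
	{
		List<Element> current = new ArrayList<>();
		current.add(root);
		for (String part : key.split("\\."))
		{
			String name = part;
			int index = -1; // -1: no index given, keep all matching children
			int paren = part.indexOf('(');
			if (paren >= 0)
			{
				name = part.substring(0, paren);
				index = Integer.parseInt(
					part.substring(paren + 1, part.indexOf(')', paren)));
			}
			List<Element> next = new ArrayList<>();
			for (Element parent : current)
			{
				List<Element> matches = children(parent, name);
				if (index < 0)
				{
					next.addAll(matches);
				}
				else if (index < matches.size())
				{
					next.add(matches.get(index)); // the index applies per parent
				}
			}
			current = next;
		}
		List<String> values = new ArrayList<>();
		for (Element e : current)
		{
			values.add(e.getTextContent().trim());
		}
		return values;
	}

	/** Returns the direct child elements of parent with the given name. */
	private static List<Element> children(Element parent, String name)
	{
		List<Element> result = new ArrayList<>();
		for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling())
		{
			if (n instanceof Element && n.getNodeName().equals(name))
			{
				result.add((Element) n);
			}
		}
		return result;
	}

	/** Counts tables by incrementing the index until nothing is found. */
	public static int countTables(Element root)
	{
		int count = 0;
		while (!resolve(root, "tables.table(" + count + ").name").isEmpty())
		{
			count++;
		}
		return count;
	}

	public static void main(String[] args) throws Exception
	{
		Element root = parse(SAMPLE);
		System.out.println(resolve(root, "tables.table(1).name"));
		System.out.println(resolve(root, "tables.table.fields.field(0).type"));
		System.out.println("Number of tables: " + countTables(root));
	}
}
]]>
			</source>
			<p>
				The <code>countTables()</code> method mirrors the counting technique
				described for <code>tables.table(2).name</code>: the index is incremented
				in a loop as long as a value is found.
			</p>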
			<p>
				These examples should make the usage of indices quite clear.
				Because each configuration key can contain an arbitrary number
				of indices, it is possible to navigate through complex structures of
				XML documents; each XML element can be uniquely identified.
				To conclude this section:
				for simple XML documents that define only a few simple properties
				and do not have a complex structure, the default XML configuration
				class is suitable. If documents are more complex and their structure
				is important, the hierarchy-aware class should be used, which is
				enabled by the <code>hierarchicalXml</code> element as
				shown in the example configuration definition file above.
			</p>
		</subsection>
	</section>
	
	<section name="Union configuration">
		<p>
			In an earlier section about the configuration definition file for
			<code>ConfigurationFactory</code> it was stated that configuration
			files included first can override properties in configuration files
			included later, and an example use case for this behaviour was given.
			There may be times when there are other requirements.
		</p>
		<p>
			Let's continue the example of the application that processes
			database tables and reads the definitions of the affected tables from
			its configuration. Now consider that this application grows larger and
			must be maintained by a team of developers. Each developer works on
			a separate set of tables. In such a scenario it would be problematic
			if the definitions for all tables were kept in a single file. This file
			would probably need to be changed very often and could thus become
			a bottleneck for team development because it is checked out almost
			all the time. It would be much better if each developer had a separate file
			with table definitions and all this information could be linked together
			at the end.
		</p>
		<p>
			<code>ConfigurationFactory</code> provides support for such a use case,
			too. It is possible to specify in the configuration definition file that
			from a set of configuration sources a logical union configuration is to be
			constructed. Then all properties defined in the provided sources are
			collected and can be accessed as if they had been defined in a single
			source. To demonstrate this feature let us assume that a developer of
			the database application has defined a specific XML file with a table
			definition named <code>tasktables.xml</code>:
		</p>
   		<source>
<![CDATA[
<?xml version="1.0" encoding="ISO-8859-1" ?>

<config>
  <table tableType="application">
    <name>tasks</name>
    <fields>
      <field>
        <name>taskid</name>
        <type>long</type>
      </field>
      <field>
        <name>name</name>
        <type>java.lang.String</type>
      </field>
      <field>
        <name>description</name>
        <type>java.lang.String</type>
      </field>
      <field>
        <name>responsibleID</name>
        <type>long</type>
      </field>
      <field>
        <name>creatorID</name>
        <type>long</type>
      </field>
      <field>
        <name>startDate</name>
        <type>java.util.Date</type>
      </field>
      <field>
        <name>endDate</name>
        <type>java.util.Date</type>
      </field>
    </fields>
  </table>
</config>
]]>
		</source>
		<p>
			This file defines the structure of an additional table, which should be
			added to the existing table definitions. To achieve this, the
			configuration definition file has to be changed: a new section is added
			that contains the include elements of all configuration sources which
			are to be combined.
		</p>
		<source>
<![CDATA[
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- Configuration definition file that demonstrates the
     override and additional sections -->

<configuration>
  <override>
    <properties fileName="usergui.properties"/>
    <xml fileName="gui.xml"/>
  </override>
  
  <additional>
    <hierarchicalXml fileName="tables.xml"/>
    <hierarchicalXml fileName="tasktables.xml" at="tables"/>
  </additional>
</configuration>
]]>
		</source>
		<p>
			Compared to the older versions of this file, a couple of changes have been
			made. One major difference is that the elements for including configuration
			sources are no longer direct children of the root element, but are now
			contained in either an <code>override</code> or an <code>additional</code>
			section. The names of these sections already imply their purpose.
		</p>
		<p>
			The <code>override</code> section is not strictly necessary. Elements in
			this section are treated as if they were children of the root element, i.e.
			properties in the included configuration sources override properties in
			sources included later. So the <code>override</code> tags could have
			been omitted, but for the sake of clarity it is recommended to use them
			when there is also an <code>additional</code> section.
		</p>
		<p>
			It is the <code>additional</code> section that introduces a new behaviour.
			All configuration sources listed here are combined into a union configuration.
			In our example we have put two <code>hierarchicalXml</code> elements in this area
			that load the available files with database table definitions. The syntax
			of elements in the <code>additional</code> section is analogous to the
			syntax described so far. The only difference is an additionally supported
			<code>at</code> attribute that specifies the position in the logical union
			configuration where the included properties are to be added. In this
			example we set the <code>at</code> attribute of the second element to
			<em>tables</em>. This is because the file starts with a <code>table</code>
			element, but to be compatible with the other table definition file its content
			should be accessible under the key <code>tables.table</code>.
		</p>
		<p>
			After these modifications have been made, the configuration obtained
			from the <code>ConfigurationFactory</code> allows access to three
			database tables. A call to <code>config.getString("tables.table(2).name");</code>
			returns the value <em>tasks</em>. In an analogous way it is possible
			to retrieve the fields of the third table.
		</p>
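		<p>
			The following sketch shows what such an access could look like. It assumes
			that Commons Configuration 1.x is on the class path and that the
			configuration definition file with the <code>override</code> and
			<code>additional</code> sections is stored as <code>config.xml</code>.
			This file name and the class name <code>UnionAccess</code> are
			assumptions made for the illustration.
		</p>
		<source>
<![CDATA[
import java.util.List;

import org.apache.commons.configuration.Configuration;
import org.apache.commons.configuration.ConfigurationFactory;

public class UnionAccess
{
	public static void main(String[] args) throws Exception
	{
		// "config.xml" is an assumed name for the configuration definition file
		ConfigurationFactory factory = new ConfigurationFactory("config.xml");
		Configuration config = factory.getConfiguration();

		// The table from tasktables.xml was added at "tables", so it is
		// addressed as the third table (index 2) of the union configuration
		String tableName = config.getString("tables.table(2).name");
		List fieldNames = config.getList("tables.table(2).fields.field.name");

		System.out.println("Table " + tableName + " has fields " + fieldNames);
	}
}
]]>
		</source>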
		<p>
			Note that it is also possible to override properties defined in an
			<code>additional</code> section. This can be done by placing a
			configuration source in the <code>override</code> section that defines
			properties that are also defined in one of the sources listed in the
			<code>additional</code> section. The example does not make use of that.
			Note also that the order of the <code>override</code> and
			<code>additional</code> sections in a configuration definition file does
			not matter. Sources in an <code>override</code> section are always treated with
			higher priority (otherwise they could not override the values of other
			sources).
		</p>
	</section>
</body>

</document>