<?xml version="1.0" encoding="UTF-8" standalone="no"?> | |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
<html> | |
<head> | |
<title>ASF: <xsl:key> / key() / IdKeyPattern</title> | |
<meta http-equiv="content-type" content="text/html; charset=UTF-8" /> | |
<meta http-equiv="Content-Style-Type" content="text/css" /> | |
<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" /> | |
</head> | |
<!-- | |
* Licensed to the Apache Software Foundation (ASF) under one | |
* or more contributor license agreements. See the NOTICE file | |
* distributed with this work for additional information | |
* regarding copyright ownership. The ASF licenses this file | |
* to you under the Apache License, Version 2.0 (the "License"); | |
* you may not use this file except in compliance with the License. | |
* You may obtain a copy of the License at | |
* | |
* http://www.apache.org/licenses/LICENSE-2.0 | |
* | |
* Unless required by applicable law or agreed to in writing, software | |
* distributed under the License is distributed on an "AS IS" BASIS, | |
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |
* See the License for the specific language governing permissions and | |
* limitations under the License. | |
--> | |
<body> | |
<div id="title"> | |
<table class="HdrTitle"> | |
<tbody> | |
<tr> | |
<th rowspan="2"> | |
<a href="../index.html"> | |
<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" /> | |
</a> | |
</th> | |
<th text-align="center" width="75%"> | |
<a href="index.html">XSLTC Design</a> | |
</th> | |
</tr> | |
<tr> | |
<td valign="middle"><xsl:key> / key() / IdKeyPattern</td> | |
</tr> | |
</tbody> | |
</table> | |
<table class="HdrButtons" align="center" border="1"> | |
<tbody> | |
<tr> | |
<td> | |
<a href="http://www.apache.org">Apache Foundation</a> | |
</td> | |
<td> | |
<a href="http://xalan.apache.org">Xalan Project</a> | |
</td> | |
<td> | |
<a href="http://xerces.apache.org">Xerces Project</a> | |
</td> | |
<td> | |
<a href="http://www.w3.org/TR">Web Consortium</a> | |
</td> | |
<td> | |
<a href="http://www.oasis-open.org/standards">Oasis Open</a> | |
</td> | |
</tr> | |
</tbody> | |
</table> | |
</div> | |
<div id="navLeft"> | |
<ul> | |
<li> | |
<a href="index.html">Overview</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsltc_compiler.html">Compiler design</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsl_whitespace_design.html">Whitespace</a> | |
</li> | |
<li> | |
<a href="xsl_sort_design.html">xsl:sort</a> | |
</li> | |
<li>Keys<br /> | |
</li> | |
<li> | |
<a href="xsl_comment_design.html">Comment design</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsl_lang_design.html">lang()</a> | |
</li> | |
<li> | |
<a href="xsl_unparsed_design.html">Unparsed entities</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsl_if_design.html">If design</a> | |
</li> | |
<li> | |
<a href="xsl_choose_design.html">Choose|When|Otherwise design</a> | |
</li> | |
<li> | |
<a href="xsl_include_design.html">Include|Import design</a> | |
</li> | |
<li> | |
<a href="xsl_variable_design.html">Variable|Param design</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsltc_runtime.html">Runtime</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsltc_dom.html">Internal DOM</a> | |
</li> | |
<li> | |
<a href="xsltc_namespace.html">Namespaces</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="xsltc_trax.html">Translet & TrAX</a> | |
</li> | |
<li> | |
<a href="xsltc_predicates.html">XPath Predicates</a> | |
</li> | |
<li> | |
<a href="xsltc_iterators.html">Xsltc Iterators</a> | |
</li> | |
<li> | |
<a href="xsltc_native_api.html">Xsltc Native API</a> | |
</li> | |
<li> | |
<a href="xsltc_trax_api.html">Xsltc TrAX API</a> | |
</li> | |
<li> | |
<a href="xsltc_performance.html">Performance Hints</a> | |
</li> | |
</ul> | |
</div> | |
<div id="content"> | |
<h2><xsl:key> / key() / IdKeyPattern</h2> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Contents</h3> | |
<ul> | |
<li> | |
<a href="#functionality">Functionality</a> | |
</li> | |
<li> | |
<a href="#implementation">Implementation</a> | |
</li> | |
</ul> | |
<a name="functionality"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Functionality</h3> | |
<p>The <code><xsl:key></code> element is a top-level element that can be | |
used to define a named index of nodes from the source XML tree(s). The | |
element has three attributes:</p> | |
<ul> | |
<li> | |
<code>name</code> - the name of the index | |
</li> | |
<li> | |
<code>match</code> - a pattern that defines the nodeset we want | |
indexed | |
</li> | |
<li> | |
<code>use</code> - an expression that defines the value to be used | |
as the index key value. | |
</li> | |
</ul> | |
<p>A named index can be accessed using either the <code>key()</code> function | |
or a KeyPattern. Both these methods address the index using its defined name | |
(the "name" attribute above) and a parameter defining one or more lookup | |
values for the index. The function or pattern returns a node set containing | |
all nodes in the index whose key value match the parameter's value(s):</p> | |
<blockquote class="source"> | |
<pre> | |
<xsl:key name="book-author" match="book" use="author"/> | |
: | |
: | |
<xsl:for-each select="key('book-author', 'Mikhail Bulgakov')"> | |
<xsl:value-of select="author"/> | |
<xsl:text>: </xsl:text> | |
<xsl:value-of select="author"/> | |
<xsl:text>&#xa;</xsl:text> | |
</xsl:for-each></pre> | |
</blockquote> | |
<p>The KeyPattern can be used within an index definition to create unions | |
and intersections of node sets:</p> | |
<blockquote class="source"> | |
<pre> | |
<xsl:key name="cultcies" match="farmer | fisherman" use="name"/></pre> | |
</blockquote> | |
<p>This could have been done using regular <code><xsl:for-each></code> | |
and <code><xsl:select></code> elements. However, if your stylesheet | |
accesses the same selection of nodes over and over again, the transformation | |
will be much more efficient using pre-indexed keys as shown above.</p> | |
<a name="implementation"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Implementation</h3> | |
<a name="indexing"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h4>Indexing</h4> | |
<p>The <code>AbstractTranslet</code> class has a global hashtable that holds | |
an index for each named key in the stylesheet (hashing on the "name" | |
attribute of <code><xsl:key></code>). <code>AbstractTranslet</code> | |
has a couple of public methods for inserting and retrieving data from this | |
hashtable:</p> | |
<blockquote class="source"> | |
<pre> | |
public void buildKeyIndex(String keyName, int nodeID, String value); | |
public KeyIndex KeyIndex getKeyIndex(String keyName);</pre> | |
</blockquote> | |
<p>The <code>Key</code> class compiles code that traverses the input DOM and | |
extracts nodes that match some given parameters (the <code>"match"</code> | |
attribute of the <code><xsl:key></code> element). A new element is | |
inserted into the named key's index. The nodes' DOM index and the value | |
translated from the <code>"use"</code> attribute of the | |
<code><xsl:key></code> element are stored in the new entry in the | |
index.</p> | |
<p>Something similar is done for indexing IDs. This index is generated from | |
the <code>ID</code> and <code>IDREF</code> fields in the input document's | |
DTD. This means that the code for generating this index cannot be generated | |
at compile-time, which again means that the code has to be generic enough | |
to handle all DTDs. The class that handles this is the | |
<code>org.apache.xalan.xsltc.dom.DTDMonitor</code> class. This class | |
implements the <code>org.xml.sax.XMLReader</code> and | |
<code>org.xml.sax.DTDHandler</code> interfaces. A client application using | |
the native API must instanciate a <code>DTDMonitor</code> and pass it to the | |
translet code - if, and only if, it wants IDs indexed (one can improve | |
performance by omitting this step). This is descrived in the | |
<a href="xsltc_native_api.html">XSLTC Native API reference</a>. The | |
<code>DTDMonitor</code> class will use the same indexing as the code | |
generated by the <code>Key</code> class. The index for ID's is called | |
"##id". We assume that no stylesheets will contain a key with this name.</p> | |
<p>The index itself is implemented in the | |
<code>org.apache.xalan.xsltc.dom.KeyIndex</code> class. The index has an | |
hashtable with all the values from the matching nodes (the part of the node | |
used to generate this value is the one specified in the <code>"use"</code> | |
attribute). For every matching value there is a bit-array (implemented in | |
the <code>org.apache.xalan.xsltc.BitArray</code> class), holding a list of | |
all node indexes for which this value gives a match:</p> | |
<p> | |
<img src="key_relations.gif" alt="key_relations.gif" /> | |
</p> | |
<p> | |
<b> | |
<i>Figure 1: Indexing tables</i> | |
</b> | |
</p> | |
<p>The <code>KeyIndex</code> class implements the <code>NodeIterator</code> | |
interface, so that it can be returned directly by the implementation of the | |
<code>key()</code> function. This is how the index generated by | |
<code><xsl:key></code> and the node-set returned by the | |
<code>key()</code> and KeyPattern are tied together. You can see how this is | |
done in the <code>translate()</code> method of the <code>KeyCall</code> | |
class.</p> | |
<p>The <code>key()</code> function can be called in two ways:</p> | |
<blockquote class="source"> | |
<pre> | |
key('key-name','value') | |
key('key-name','node-set')</pre> | |
</blockquote> | |
<p>The first parameter is always the name of the key. We use this value to | |
lookup our index from the _keyIndexes hashtable in AbstractTranslet:</p> | |
<blockquote class="source"> | |
<pre> | |
il.append(classGen.aloadThis()); | |
_name.translate(classGen, methodGen); | |
il.append(new INVOKEVIRTUAL(getKeyIndex));</pre> | |
</blockquote> | |
<p>This compiles into a call to | |
<code>AbstractTranslet.getKeyIndex(String name)</code>, and it leaves a | |
<code>KeyIndex</code> object on the stack. What we then need to do it to | |
initialise the <code>KeyIndex</code> to give us nodes with the requested | |
value. This is done by leaving the <code>KeyIndex</code> object on the stack | |
and pushing the <code>"value"</code> parameter to <code>key()</code>, before | |
calling <code>lookup()</code> on the index:</p> | |
<blockquote class="source"> | |
<pre> | |
il.append(DUP); // duplicate the KeyIndex obejct before return | |
_value.translate(classGen, methodGen); | |
il.append(new INVOKEVIRTUAL(lookup));</pre> | |
</blockquote> | |
<p>This compiles into a call to <code>KeyIndex.lookup(String value)</code>. | |
This will initialise the <code>KeyIndex</code> object to return nodes that | |
match the given value, so the <code>KeyIndex</code> object can be left on | |
the stack when we return. This because the <code>KeyIndex</code> object | |
implements the <code>NodeIterator</code> interface.</p> | |
<p>This matter is a bit more complex when the second parameter of | |
<code>key()</code> is a node-set. In this case we need to traverse the nodes | |
in the set and do a lookup for each node in the set. What I do is this:</p> | |
<ul> | |
<li> | |
construct a <code>KeyIndex</code> object that will hold the | |
return node-set | |
</li> | |
<li> | |
find the named <code>KeyIndex</code> object from the hashtable in | |
AbstractTranslet | |
</li> | |
<li> | |
get an iterator for the node-set and do the folowing loop:</li> | |
<ul> | |
<li>get string value for current node</li> | |
<li>do lookup in KeyIndex object for the named index</li> | |
<li>merge the resulting node-set into the return node-set</li> | |
</ul> | |
<li> | |
leave the return node-set on stack when done | |
</li> | |
</ul> | |
<a name="improvements"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h4>Possible indexing improvements</h4> | |
<p>The indexing implementation can be very, very memeory exhaustive as there | |
is one <code>BitArray</code> allocated for each value for every index. This | |
is particulary bad for the index used for IDs ('##id'), where a single value | |
should only map to one, single node. This means that a whole | |
<code>BitArray</code> is used just to contain one node. The | |
<code>KeyIndex</code> class should be updated to hold the first node for a | |
value in an <code>Integer</code> object, and then replace that with a | |
<code>BitArray</code> or a <code>Vector</code> only is a second node is | |
added to the value. Here is an outline for <code>KeyIndex</code>:</p> | |
<blockquote class="source"> | |
<pre> | |
public void add(Object value, int node) { | |
Object container; | |
if ((container = (BitArray)_index.get(value)) == null) { | |
_index.put(value, new Integer(node)); | |
} | |
else { | |
// Check if there is _one_ node for this value | |
if (container instanceof Integer) { | |
int first = ((Integer)container | |
_nodes = new BitArray(_arraySize); | |
_nodes.setMask(node & 0xff000000); | |
_nodes.setBit(first & 0x00ffffff); | |
_nodes.setBit(node & 0x00ffffff); | |
_index.put(value, _nodes); | |
} | |
// Otherwise add node to axisting bit array | |
else { | |
_nodex = (BitArray)container; | |
_nodes.setBit(node & 0x00ffffff); | |
} | |
} | |
}</pre> | |
</blockquote> | |
<p>Other methods inside the <code>KeyIndex</code> should be updated to | |
reflect this.</p> | |
<a name="patterns"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h4>Id and Key patterns</h4> | |
<p>As already mentioned, the challenge with the <code>id()</code> and | |
<code>key()</code> patterns is that they have no specific node type. | |
Patterns are normally grouped based on their pattern's "kernel" node type. | |
This is done in the <code>org.apache.xalan.xsltc.compiler.Mode</code> class. | |
The <code>Mode</code> class has a method, <code>flatternAlaternative</code>, | |
that does this grouping, and all templates with a common kernel node type | |
will be inserted into a "test sequence". A test sequence is a set templates | |
with the same kernel node type. The <code>TestSeq</code> class generates | |
code that will figure out which pattern, amongst several patterns with the | |
same kernel node type, that matches a certain node. This is used by the | |
<code>Mode</code> class when generating the <code>applyTemplates</code> | |
method in the translet. A test sequence is also generated for all templates | |
whose pattern does not have a kernel node type. This is the case for all | |
Id and KeyPatterns. This test sequence, if necessary, is put before the | |
big <code>switch()</code> in the <code>applyTemplates()</code> mehtod. This | |
test has to be done for every single node that is traversed, causing the | |
transformation to slow down siginificantly. This is why we do <b>not</b> | |
recommend using this type of patterns with XSLTC.</p> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
</div> | |
<div id="footer">Copyright © 1999-2012 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Fri 07/13/2012</div> | |
</div> | |
</body> | |
</html> |