xalan-j/xsltc/xsltc_predicates.html - xalan-java - Git at Google

 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html>
 <head>
 <title>ASF: XSLTC Predicate Handling</title>
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 <meta http-equiv="Content-Style-Type" content="text/css" />
 <link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
 </head>
 <!--
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements. See the NOTICE file
  * distributed with this work for additional information
  * regarding copyright ownership. The ASF licenses this file
  * to you under the Apache License, Version 2.0 (the  "License");
  * you may not use this file except in compliance with the License.
  * You may obtain a copy of the License at
  *
  *     http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing, software
  * distributed under the License is distributed on an "AS IS" BASIS,
  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  * See the License for the specific language governing permissions and
  * limitations under the License.
  -->
 <body>
 <div id="title">
 <table class="HdrTitle">
 <tbody>
 <tr>
 <th rowspan="2">
 <a href="../index.html">
 <img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
 </a>
 </th>
 <th text-align="center" width="75%">
 <a href="index.html">XSLTC Design</a>
 </th>
 </tr>
 <tr>
 <td valign="middle">XSLTC Predicate Handling</td>
 </tr>
 </tbody>
 </table>
 <table class="HdrButtons" align="center" border="1">
 <tbody>
 <tr>
 <td>
 <a href="http://www.apache.org">Apache Foundation</a>
 </td>
 <td>
 <a href="http://xalan.apache.org">Xalan Project</a>
 </td>
 <td>
 <a href="http://xerces.apache.org">Xerces Project</a>
 </td>
 <td>
 <a href="http://www.w3.org/TR">Web Consortium</a>
 </td>
 <td>
 <a href="http://www.oasis-open.org/standards">Oasis Open</a>
 </td>
 </tr>
 </tbody>
 </table>
 </div>
 <div id="navLeft">
 <ul>
 <li>
 <a href="index.html">Overview</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsltc_compiler.html">Compiler design</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsl_whitespace_design.html">Whitespace</a>
 </li>
 <li>
 <a href="xsl_sort_design.html">xsl:sort</a>
 </li>
 <li>
 <a href="xsl_key_design.html">Keys</a>
 </li>
 <li>
 <a href="xsl_comment_design.html">Comment design</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsl_lang_design.html">lang()</a>
 </li>
 <li>
 <a href="xsl_unparsed_design.html">Unparsed entities</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsl_if_design.html">If design</a>
 </li>
 <li>
 <a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
 </li>
 <li>
 <a href="xsl_include_design.html">Include|Import design</a>
 </li>
 <li>
 <a href="xsl_variable_design.html">Variable|Param design</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsltc_runtime.html">Runtime</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsltc_dom.html">Internal DOM</a>
 </li>
 <li>
 <a href="xsltc_namespace.html">Namespaces</a>
 </li></ul><hr /><ul>
 <li>
 <a href="xsltc_trax.html">Translet &amp; TrAX</a>
 </li>
 <li>XPath Predicates<br />
 </li>
 <li>
 <a href="xsltc_iterators.html">Xsltc Iterators</a>
 </li>
 <li>
 <a href="xsltc_native_api.html">Xsltc Native API</a>
 </li>
 <li>
 <a href="xsltc_trax_api.html">Xsltc TrAX API</a>
 </li>
 <li>
 <a href="xsltc_performance.html">Performance Hints</a>
 </li>
 </ul>
 </div>
 <div id="content">
 <h2>XSLTC Predicate Handling</h2>

   <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h3>Definition</h3>

   <p>According to Michael Kay's "XSLT Programmer's Reference" page
   736, a predicate is "An expression used to filter which nodes are
   selected by a particular step in a path expression, or to select a subset of
   the nodes in a node-set. A Boolean expression selects the nodes for which the
   predicate is true; a numeric expression selects the node at the position
   given by the value of the expression, for example '[1]' selects the first
   node.". Note that a predicate containing a boolean expression can
   return zero, one or more nodes, while a predicate containing a numeric
   expression can return only zero or one node.</p>


   <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h3>Examples</h3>

   <p>I'll list a few examples that I can refer back to later on in this
   document. All examples will use this XML document:</p>
 <blockquote class="source">
 <pre>
     &lt;?xml version="1.0"?&gt;
     &lt;doc&gt;
       &lt;foo location="Drumcondra"&gt;
         &lt;bar name="Cat and Cage"/&gt;
         &lt;bar name="Fagan's"/&gt;
         &lt;bar name="Gravedigger's"/&gt;
         &lt;bar name="Ivy House"/&gt;
       &lt;foo&gt;
       &lt;foo location="Town"&gt;
         &lt;bar name="Peter's Pub"/&gt;
         &lt;bar name="Grogan's"/&gt;
         &lt;bar name="Hogans's"/&gt;
         &lt;bar name="Brogan's"/&gt;
       &lt;/foo&gt;
     &lt;/doc&gt;</pre>
 </blockquote>

   <p>Here are some examples of a predicate with boolean expressions:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="//bar[contains(@name,'ogan')]"&gt;
     &lt;xsl:for-each select="//bar[parent::*/@location = 'Drumcondra']"&gt;
     &lt;xsl:for-each select="//bar[@name = 'Cat and Cage']"&gt;</pre>
 </blockquote>

   <p>The first two select more than one node, while the last selects only one.
   The last expression could select more nodes if the input document was
   different. Now, here are a few examples of predicates with numeric
   expressions:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:value-of select="//bar[1]"&gt;
     &lt;xsl:value-of select="/doc/foo[2]/bar[1]"&gt;
     &lt;xsl:value-of select="/doc/foo[2]/bar"&gt;</pre>
 </blockquote>
   <p>The last expression will return more than one node, but the step that
   contains the predicate returns only one (the second <code>&lt;foo&gt;</code>
   element).</p>

   <p>The above are the basic types of predicates. These can be grouped to create
   a predicate pipeline, where the first predicate reduces the node-set that the
   second predicate filters, and so on. Here are some examples:</p>
 <blockquote class="source">
 <pre>
     A: &lt;for-each select="//bar[contains(@name,'ogan')][2]"&gt;
     C: &lt;for-each select="//bar[2][contains(@name,'ogan')]"&gt;
     B: &lt;for-each select="//bar[position() &gt; 3][2]"&gt;</pre>
 </blockquote>

   <p>It is easier to figure out which nodes these expressions should return if
   one goes through the steps and predicates one by one. In expression
   <code>A:</code> we first get all <code>&lt;bar&gt;</code> elements from the
   whole document. Then the first predicate selects from that node-set only
   those elements that have a <code>@name</code> attribute that contains
   "ogan", and we're left with these elements:</p>
 <blockquote class="source">
 <pre>
         &lt;bar name="Grogan's"&gt;
         &lt;bar name="Hogans's"&gt;
         &lt;bar name="Brogan's"&gt;</pre>
 </blockquote>
   <p>And finally, the last predicate then selects the second of those
   elements:</p>
 <blockquote class="source">
 <pre>
         &lt;bar name="Hogans's"&gt;</pre>
 </blockquote>

   <p>Expression <code>B:</code> contains the same predicates as <code>A:</code>,
   but the resulting node set if completely different. We start off with the same
   set of <code>&lt;bar&gt;</code> elements, but we apply the
   <code>"[2]"</code> predicate first, and end up with this
   element:</p>
 <blockquote class="source">
 <pre>
         &lt;bar name="Fagan's"&gt;</pre>
 </blockquote>

   <p>Fagan's is the bar where the Irish Taoiseach (prime minister) drinks his
   pints, but its name does not contain the string "<code>ogan</code>",
   so the resulting node-set is empty.</p>

   <p>The third expressions also starts off with all <code>&lt;bar&gt;</code>
   elements, applies the predicate "<code>[position() &gt; 3]</code>",
   and reduces the node set to these:</p>
 <blockquote class="source">
 <pre>
         &lt;bar name="Ivy House"&gt;
         &lt;bar name="Peter's Pub"&gt;
         &lt;bar name="Grogan's"&gt;
         &lt;bar name="Hogans's"&gt;
         &lt;bar name="Brogan's"&gt;</pre>
 </blockquote>
   <p>The last predicate "<code>[2]</code>" is applied to this node-set
   and set is further reduced to:</p>
 <blockquote class="source">
 <pre>
         &lt;bar name="Peter's Pub"&gt;</pre>
 </blockquote>


   <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h3>Categories</h3>

   <p>From the examples in the last chapter we can try to categorize predicate
   chains/pipelines to simplify our implementation. We can speed up processing
   significantly if we can avoid using a data-structure (iterator) to represent
   the intermediate step between predicates. The goal of setting up these
   categories is to pinpoint those cases where an intermediate iterator has
   to be used and when it can be avoided.</p>

     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Single predicate expressions</h4>

     <p>Expressions containing just a single predicate have no intermediate step
     and there is no need for any extra iterator. The expression inside the
     predicate can be applied directly to the original iterator. We call this
     category <b>SIMPLE_CONTEXT</b>.</p>


     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Expressions containing only non-position predicates</h4>

     <p>Predicate-order is significant when the predicate-chain contains one or
     more predicate with an expression similar to
     "<code>position() &gt; 3</code>" or "<code>2</code>". This
     is because the <code>position()</code> and <code>last()</code> explicitly
     refer to the intermediate step between applying each predicate. The
     expression:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="//bar[contains(@name,'ogan')][parent::*/@location = 'Town']"&gt;</pre>
 </blockquote>
     <p>has two predicates that can be applied in any order and still produce the
     desired node-set. Such predicates can be merged to:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="//bar[contains(@name,'ogan') &amp; (parent::*/@location = 'Town')]"&gt;</pre>
 </blockquote>
     <p>We call this category NO_CONTEXT.</p>


     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Expressions containing position predicates</h4>

     <p>A predicate-chain, whose predicates' expressions contain any use of the
     <code>position()</code> or <code>last()</code> functions require some way
     of representing the intermediate step in an iterator. The first predicate
     is applied to the original node-set, and the resulting node-set must then
     be stored in some other iterator, from which the second predicate can get
     the current position from the iterator's <code>getPosition()</code> and
     <code>getLast()</code> methods. We call this category
     GENERAL_CONTEXT</p>


     <a name="exception">‌</a>
     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Expressions containing one position predicate</h4>

     <p>There is one expection from the GENERAL_CONTEXT category. If the
     predicate-chain contains only one position-predicate, and that predicate is
     the very first one, then that predicate can call the iterator that contains
     the first node-set directly. Just look:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="//bar[2][parent::*/@location = 'Drumcondra']"&gt;</pre>
 </blockquote>
     <p>The <code>[2]</code> predicate can be applied to the original iterator
     for the <code>//bar</code> step. And so can the
     <code>[parent::*/@location = 'Drumcondra']</code> predicate as well. This
     is only the case when the position predicate is first in the predicate
     chain. These types of predicate chains belong in the
      NO_CONTEXT  category.</p>


   <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h3>Design details</h3>

     <p>Predicates are handled quite differently in step expressions and step
     patterns. Step expressions are not implemented with the various contexts in
     mind and use a specialised iterator to wrap the code for each predicate.
     Step patterns are more complicated and CPU (or should I say JVM?)
     exhaustive. Step patterns containing predicates are analysed to determine
     context type and compiled accordingly.</p>

     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Predicates and Step expressions</h4>

     <p>The basic behaviour for a predicate is to compile a <b>filter</b>. This
     filter is an auxiliary class that implements the
     <code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code> interface. The
     <code>Step</code> or <code>StepPattern</code> that uses the predicate will
     create a <code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code>. This
     iterator contains the nodes that pass through the predicate. The compiled
     filter is used by the iterator to determine which nodes that should be
     included. The <code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code>
     interface contains only a single method:</p>
 <blockquote class="source">
 <pre>
     public interface CurrentNodeListFilter {
     public abstract boolean test(int node, int position, int last, int current,
                                  AbstractTranslet translet, NodeIterator iter);
     }</pre>
 </blockquote>

     <p>The code that is compiled into the <code>test()</code> method is the
     code for the predicate's expression. The <code>Predicate</code> class
     compiles the filter class and a <code>test()</code> method skeleton, while
     some sub-class of the <code>Expression</code> class compiles the actual
     code that goes into this method.</p>

     <p>The iterator is initialised with a filter that implements this interface:
     </p>
 <blockquote class="source">
 <pre>
     public CurrentNodeListIterator(NodeIterator source,
 				   CurrentNodeListFilter filter,
 				   int currentNode,
 				   AbstractTranslet translet) {
 	this(source, !source.isReverse(), filter, currentNode, translet);
     }</pre>
 </blockquote>

     <p>The iterator will use its source iterator to provide it with the initial
     node-set. Each node that is returned from this set is passed through the
     filter before returned by the <code>next()</code> method. Note that the
     source iterator can also be a current node-list iterator (if two or more
     predicates are chained together).</p>


     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Optimisations in Step expressions</h4>

     <h5>Node-value iterators</h5>

     <p>Some simple predicates that test for node values are handled by the
     <code>NodeValueIterator</code> class at runtime. These are:</p>
 <blockquote class="source">
 <pre>
     A: foo[@attr = &lt;value&gt;]
     B: foo[bar = &lt;value&gt;]
     C: foo/bar[. = &lt;value&gt;]</pre>
 </blockquote>

     <p>The first case is handled by creating an iterator that represents
     <code>foo/@attr</code>, then passing this iterator and a test-value to
     a <code>NodeValueIterator</code>. The <b>&lt;value&gt;</b> is an
     expression that is compiled and passed to the iterator as a string. It
     does <b>not</b> have to be a literal string as the string value is
     found at runtime. The last two cases are similarly handled by creating an
     iterator for <code>foo/bar</code> and passing that and the test-value to
     a <code>NodeValueIterator</code>.</p>


     <h5>Nth descendant iterators</h5>

     <p>The <code>Step</code> class is also optimised for position-predicates
     that are applied to descendant iterators:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="//bar[3]"&gt;</pre>
 </blockquote>

     <p>Such step/predicate combinations are handled by the internal DOM's
     inner class <code>NthDescendantIterator</code>.</p>


     <h5>Nth position iterators</h5>

     <p>Similarly, the <code>Step</code> class is optimised for
     position-predicates that are applied to basic steps:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:for-each select="bar[3]"&gt;</pre>
 </blockquote>

     <p>Such step/predicate combinations are handled by the internal DOM's
     inner class <code>NthPositionIterator</code>.</p>


     <h5>Node test</h5>

     <p>The predicate class contains a method that tells you if it is a boolean
     test:</p>
 <blockquote class="source">
 <pre>
     public boolean isBooleanTest();</pre>
 </blockquote>

     <p>This can be, but it currently is not, used by the <code>Step</code> class
     to compile in optimised code. Some work to be done here!</p>


     <a name="step-patterns">‌</a>
     <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 <h4>Predicates and StepPatterns</h4>

     <p>Using predicates in patterns is slow on any XSLT processor, and XSLTC
     is no exception. This is why the predicate context is carefully analysed
     by the <code>StepPattern</code> class, so that the compiled code is
     specialised to handle the specific predicate(s) in use. First we should
     consider the basic step pattern.</p>

     <h5>Basic pattern handling</h5>

     <p>All patterns are grouped (by the <code>Mode</code> class) according to
     their <b>
 <i>kernel</i>
 </b>-node type. The kernel node-type is node-type of the
     <b>last</b> step in a pattern:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:template match="foo/bar/baz"&gt;  ...  &lt;xsl:template&gt;</pre>
 </blockquote>

     <p>In this case the type for elements <code>&lt;baz&gt;</code> is the
     kernel type. This step is <b>not</b> compiled as a step pattern. The node
     type is passed to the <code>Mode</code> class and is used to place the
     remainder of the pattern code inside the big <code>switch()</code> statement
     in the translet's <code>applyTemplates()</code> method. The whole pattern
     is then <b>reduced</b> to:</p>
 <blockquote class="source">
 <pre>
     match="foo/bar"</pre>
 </blockquote>

     <p>The <code>StepPattern</code> representing the <code>&lt;bar&gt;</code>
     element test is compiled under the appropriate <code>case:</code> section
     of the <code>switch()</code> statement. The code compiled for the step
     pattern is basically just a call to the DOM's <code>getType()</code>
     method and a test for the desired node type. There are two special cases
     for:</p>
 <blockquote class="source">
 <pre>
     &lt;xsl:template match="foo/*/baz"&gt;  ...  &lt;xsl:template&gt;
     &lt;xsl:template match="foo/*@[2]"&gt;  ...  &lt;xsl:template&gt;</pre>
 </blockquote>

     <p>In the first case we call <code>isElement()</code> and in the second case
     we call <code>isAttribute()</code>, instead of <code>getType()</code>.</p>


     <h5>Patterns with predicates</h5>

     <p>The <code>typeCheck()</code> method of the <code>StepPattern</code>
     invokes a method that analyses the predicates of the step pattern and
     determines their context. (Note that this method needs to be updated to
     handle the exception to the GENERAL_CONTEXT metioned in the section
     <a href="#exception">Expressions containing one position predicate</a> earlier in this document.) The <code>translate()</code> method of the
     <code>StepPattern</code> class contains a <code>switch()</code> statement
     that calls methods that are tailored for compiling code for the various
     predicate contexts.</p>


 <p align="right" size="2">
 <a href="#content">(top)</a>
 </p>
 </div>
 <div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
 </div>
 </body>
 </html>
	<?xml version="1.0" encoding="UTF-8" standalone="no"?>
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	<html>
	<head>
	<title>ASF: XSLTC Predicate Handling</title>
	<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
	<meta http-equiv="Content-Style-Type" content="text/css" />
	<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
	</head>
	<!--
	* Licensed to the Apache Software Foundation (ASF) under one
	* or more contributor license agreements. See the NOTICE file
	* distributed with this work for additional information
	* regarding copyright ownership. The ASF licenses this file
	* to you under the Apache License, Version 2.0 (the "License");
	* you may not use this file except in compliance with the License.
	* You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*
	* Unless required by applicable law or agreed to in writing, software
	* distributed under the License is distributed on an "AS IS" BASIS,
	* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	* See the License for the specific language governing permissions and
	* limitations under the License.
	-->
	<body>
	<div id="title">
	<table class="HdrTitle">
	<tbody>
	<tr>
	<th rowspan="2">
	<a href="../index.html">
	<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
	</a>
	</th>
	<th text-align="center" width="75%">
	<a href="index.html">XSLTC Design</a>
	</th>
	</tr>
	<tr>
	<td valign="middle">XSLTC Predicate Handling</td>
	</tr>
	</tbody>
	</table>
	<table class="HdrButtons" align="center" border="1">
	<tbody>
	<tr>
	<td>
	<a href="http://www.apache.org">Apache Foundation</a>
	</td>
	<td>
	<a href="http://xalan.apache.org">Xalan Project</a>
	</td>
	<td>
	<a href="http://xerces.apache.org">Xerces Project</a>
	</td>
	<td>
	<a href="http://www.w3.org/TR">Web Consortium</a>
	</td>
	<td>
	<a href="http://www.oasis-open.org/standards">Oasis Open</a>
	</td>
	</tr>
	</tbody>
	</table>
	</div>
	<div id="navLeft">
	<ul>
	<li>
	<a href="index.html">Overview</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsltc_compiler.html">Compiler design</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsl_whitespace_design.html">Whitespace</a>
	</li>
	<li>
	<a href="xsl_sort_design.html">xsl:sort</a>
	</li>
	<li>
	<a href="xsl_key_design.html">Keys</a>
	</li>
	<li>
	<a href="xsl_comment_design.html">Comment design</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsl_lang_design.html">lang()</a>
	</li>
	<li>
	<a href="xsl_unparsed_design.html">Unparsed entities</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsl_if_design.html">If design</a>
	</li>
	<li>
	<a href="xsl_choose_design.html">Choose\|When\|Otherwise design</a>
	</li>
	<li>
	<a href="xsl_include_design.html">Include\|Import design</a>
	</li>
	<li>
	<a href="xsl_variable_design.html">Variable\|Param design</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsltc_runtime.html">Runtime</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsltc_dom.html">Internal DOM</a>
	</li>
	<li>
	<a href="xsltc_namespace.html">Namespaces</a>
	</li></ul><hr /><ul>
	<li>
	<a href="xsltc_trax.html">Translet & TrAX</a>
	</li>
	<li>XPath Predicates<br />
	</li>
	<li>
	<a href="xsltc_iterators.html">Xsltc Iterators</a>
	</li>
	<li>
	<a href="xsltc_native_api.html">Xsltc Native API</a>
	</li>
	<li>
	<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
	</li>
	<li>
	<a href="xsltc_performance.html">Performance Hints</a>
	</li>
	</ul>
	</div>
	<div id="content">
	<h2>XSLTC Predicate Handling</h2>

	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h3>Definition</h3>

	<p>According to Michael Kay's "XSLT Programmer's Reference" page
	736, a predicate is "An expression used to filter which nodes are
	selected by a particular step in a path expression, or to select a subset of
	the nodes in a node-set. A Boolean expression selects the nodes for which the
	predicate is true; a numeric expression selects the node at the position
	given by the value of the expression, for example '[1]' selects the first
	node.". Note that a predicate containing a boolean expression can
	return zero, one or more nodes, while a predicate containing a numeric
	expression can return only zero or one node.</p>



	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h3>Examples</h3>

	<p>I'll list a few examples that I can refer back to later on in this
	document. All examples will use this XML document:</p>
	<blockquote class="source">
	<pre>
	<?xml version="1.0"?>
	<doc>
	<foo location="Drumcondra">
	<bar name="Cat and Cage"/>
	<bar name="Fagan's"/>
	<bar name="Gravedigger's"/>
	<bar name="Ivy House"/>
	<foo>
	<foo location="Town">
	<bar name="Peter's Pub"/>
	<bar name="Grogan's"/>
	<bar name="Hogans's"/>
	<bar name="Brogan's"/>
	</foo>
	</doc></pre>
	</blockquote>

	<p>Here are some examples of a predicate with boolean expressions:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="//bar[contains(@name,'ogan')]">
	<xsl:for-each select="//bar[parent::*/@location = 'Drumcondra']">
	<xsl:for-each select="//bar[@name = 'Cat and Cage']"></pre>
	</blockquote>

	<p>The first two select more than one node, while the last selects only one.
	The last expression could select more nodes if the input document was
	different. Now, here are a few examples of predicates with numeric
	expressions:</p>
	<blockquote class="source">
	<pre>
	<xsl:value-of select="//bar[1]">
	<xsl:value-of select="/doc/foo[2]/bar[1]">
	<xsl:value-of select="/doc/foo[2]/bar"></pre>
	</blockquote>
	<p>The last expression will return more than one node, but the step that
	contains the predicate returns only one (the second <code><foo></code>
	element).</p>

	<p>The above are the basic types of predicates. These can be grouped to create
	a predicate pipeline, where the first predicate reduces the node-set that the
	second predicate filters, and so on. Here are some examples:</p>
	<blockquote class="source">
	<pre>
	A: <for-each select="//bar[contains(@name,'ogan')][2]">
	C: <for-each select="//bar[2][contains(@name,'ogan')]">
	B: <for-each select="//bar[position() > 3][2]"></pre>
	</blockquote>

	<p>It is easier to figure out which nodes these expressions should return if
	one goes through the steps and predicates one by one. In expression
	<code>A:</code> we first get all <code><bar></code> elements from the
	whole document. Then the first predicate selects from that node-set only
	those elements that have a <code>@name</code> attribute that contains
	"ogan", and we're left with these elements:</p>
	<blockquote class="source">
	<pre>
	<bar name="Grogan's">
	<bar name="Hogans's">
	<bar name="Brogan's"></pre>
	</blockquote>
	<p>And finally, the last predicate then selects the second of those
	elements:</p>
	<blockquote class="source">
	<pre>
	<bar name="Hogans's"></pre>
	</blockquote>

	<p>Expression <code>B:</code> contains the same predicates as <code>A:</code>,
	but the resulting node set if completely different. We start off with the same
	set of <code><bar></code> elements, but we apply the
	<code>"[2]"</code> predicate first, and end up with this
	element:</p>
	<blockquote class="source">
	<pre>
	<bar name="Fagan's"></pre>
	</blockquote>

	<p>Fagan's is the bar where the Irish Taoiseach (prime minister) drinks his
	pints, but its name does not contain the string "<code>ogan</code>",
	so the resulting node-set is empty.</p>

	<p>The third expressions also starts off with all <code><bar></code>
	elements, applies the predicate "<code>[position() > 3]</code>",
	and reduces the node set to these:</p>
	<blockquote class="source">
	<pre>
	<bar name="Ivy House">
	<bar name="Peter's Pub">
	<bar name="Grogan's">
	<bar name="Hogans's">
	<bar name="Brogan's"></pre>
	</blockquote>
	<p>The last predicate "<code>[2]</code>" is applied to this node-set
	and set is further reduced to:</p>
	<blockquote class="source">
	<pre>
	<bar name="Peter's Pub"></pre>
	</blockquote>



	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h3>Categories</h3>

	<p>From the examples in the last chapter we can try to categorize predicate
	chains/pipelines to simplify our implementation. We can speed up processing
	significantly if we can avoid using a data-structure (iterator) to represent
	the intermediate step between predicates. The goal of setting up these
	categories is to pinpoint those cases where an intermediate iterator has
	to be used and when it can be avoided.</p>

	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Single predicate expressions</h4>

	<p>Expressions containing just a single predicate have no intermediate step
	and there is no need for any extra iterator. The expression inside the
	predicate can be applied directly to the original iterator. We call this
	category <b>SIMPLE_CONTEXT</b>.</p>



	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Expressions containing only non-position predicates</h4>

	<p>Predicate-order is significant when the predicate-chain contains one or
	more predicate with an expression similar to
	"<code>position() > 3</code>" or "<code>2</code>". This
	is because the <code>position()</code> and <code>last()</code> explicitly
	refer to the intermediate step between applying each predicate. The
	expression:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="//bar[contains(@name,'ogan')][parent::*/@location = 'Town']"></pre>
	</blockquote>
	<p>has two predicates that can be applied in any order and still produce the
	desired node-set. Such predicates can be merged to:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="//bar[contains(@name,'ogan') & (parent::*/@location = 'Town')]"></pre>
	</blockquote>
	<p>We call this category NO_CONTEXT.</p>



	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Expressions containing position predicates</h4>

	<p>A predicate-chain, whose predicates' expressions contain any use of the
	<code>position()</code> or <code>last()</code> functions require some way
	of representing the intermediate step in an iterator. The first predicate
	is applied to the original node-set, and the resulting node-set must then
	be stored in some other iterator, from which the second predicate can get
	the current position from the iterator's <code>getPosition()</code> and
	<code>getLast()</code> methods. We call this category
	GENERAL_CONTEXT</p>



	<a name="exception">‌</a>
	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Expressions containing one position predicate</h4>

	<p>There is one expection from the GENERAL_CONTEXT category. If the
	predicate-chain contains only one position-predicate, and that predicate is
	the very first one, then that predicate can call the iterator that contains
	the first node-set directly. Just look:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="//bar[2][parent::*/@location = 'Drumcondra']"></pre>
	</blockquote>
	<p>The <code>[2]</code> predicate can be applied to the original iterator
	for the <code>//bar</code> step. And so can the
	<code>[parent::*/@location = 'Drumcondra']</code> predicate as well. This
	is only the case when the position predicate is first in the predicate
	chain. These types of predicate chains belong in the
	NO_CONTEXT category.</p>





	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h3>Design details</h3>

	<p>Predicates are handled quite differently in step expressions and step
	patterns. Step expressions are not implemented with the various contexts in
	mind and use a specialised iterator to wrap the code for each predicate.
	Step patterns are more complicated and CPU (or should I say JVM?)
	exhaustive. Step patterns containing predicates are analysed to determine
	context type and compiled accordingly.</p>

	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Predicates and Step expressions</h4>

	<p>The basic behaviour for a predicate is to compile a <b>filter</b>. This
	filter is an auxiliary class that implements the
	<code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code> interface. The
	<code>Step</code> or <code>StepPattern</code> that uses the predicate will
	create a <code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code>. This
	iterator contains the nodes that pass through the predicate. The compiled
	filter is used by the iterator to determine which nodes that should be
	included. The <code>org.apache.xalan.xsltc.dom.CurrentNodeListFilter</code>
	interface contains only a single method:</p>
	<blockquote class="source">
	<pre>
	public interface CurrentNodeListFilter {
	public abstract boolean test(int node, int position, int last, int current,
	AbstractTranslet translet, NodeIterator iter);
	}</pre>
	</blockquote>

	<p>The code that is compiled into the <code>test()</code> method is the
	code for the predicate's expression. The <code>Predicate</code> class
	compiles the filter class and a <code>test()</code> method skeleton, while
	some sub-class of the <code>Expression</code> class compiles the actual
	code that goes into this method.</p>

	<p>The iterator is initialised with a filter that implements this interface:
	</p>
	<blockquote class="source">
	<pre>
	public CurrentNodeListIterator(NodeIterator source,
	CurrentNodeListFilter filter,
	int currentNode,
	AbstractTranslet translet) {
	this(source, !source.isReverse(), filter, currentNode, translet);
	}</pre>
	</blockquote>

	<p>The iterator will use its source iterator to provide it with the initial
	node-set. Each node that is returned from this set is passed through the
	filter before returned by the <code>next()</code> method. Note that the
	source iterator can also be a current node-list iterator (if two or more
	predicates are chained together).</p>



	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Optimisations in Step expressions</h4>

	<h5>Node-value iterators</h5>

	<p>Some simple predicates that test for node values are handled by the
	<code>NodeValueIterator</code> class at runtime. These are:</p>
	<blockquote class="source">
	<pre>
	A: foo[@attr = <value>]
	B: foo[bar = <value>]
	C: foo/bar[. = <value>]</pre>
	</blockquote>

	<p>The first case is handled by creating an iterator that represents
	<code>foo/@attr</code>, then passing this iterator and a test-value to
	a <code>NodeValueIterator</code>. The <b><value></b> is an
	expression that is compiled and passed to the iterator as a string. It
	does <b>not</b> have to be a literal string as the string value is
	found at runtime. The last two cases are similarly handled by creating an
	iterator for <code>foo/bar</code> and passing that and the test-value to
	a <code>NodeValueIterator</code>.</p>



	<h5>Nth descendant iterators</h5>

	<p>The <code>Step</code> class is also optimised for position-predicates
	that are applied to descendant iterators:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="//bar[3]"></pre>
	</blockquote>

	<p>Such step/predicate combinations are handled by the internal DOM's
	inner class <code>NthDescendantIterator</code>.</p>



	<h5>Nth position iterators</h5>

	<p>Similarly, the <code>Step</code> class is optimised for
	position-predicates that are applied to basic steps:</p>
	<blockquote class="source">
	<pre>
	<xsl:for-each select="bar[3]"></pre>
	</blockquote>

	<p>Such step/predicate combinations are handled by the internal DOM's
	inner class <code>NthPositionIterator</code>.</p>



	<h5>Node test</h5>

	<p>The predicate class contains a method that tells you if it is a boolean
	test:</p>
	<blockquote class="source">
	<pre>
	public boolean isBooleanTest();</pre>
	</blockquote>

	<p>This can be, but it currently is not, used by the <code>Step</code> class
	to compile in optimised code. Some work to be done here!</p>





	<a name="step-patterns">‌</a>
	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	<h4>Predicates and StepPatterns</h4>

	<p>Using predicates in patterns is slow on any XSLT processor, and XSLTC
	is no exception. This is why the predicate context is carefully analysed
	by the <code>StepPattern</code> class, so that the compiled code is
	specialised to handle the specific predicate(s) in use. First we should
	consider the basic step pattern.</p>

	<h5>Basic pattern handling</h5>

	<p>All patterns are grouped (by the <code>Mode</code> class) according to
	their <b>
	<i>kernel</i>
	</b>-node type. The kernel node-type is node-type of the
	<b>last</b> step in a pattern:</p>
	<blockquote class="source">
	<pre>
	<xsl:template match="foo/bar/baz"> ... <xsl:template></pre>
	</blockquote>

	<p>In this case the type for elements <code><baz></code> is the
	kernel type. This step is <b>not</b> compiled as a step pattern. The node
	type is passed to the <code>Mode</code> class and is used to place the
	remainder of the pattern code inside the big <code>switch()</code> statement
	in the translet's <code>applyTemplates()</code> method. The whole pattern
	is then <b>reduced</b> to:</p>
	<blockquote class="source">
	<pre>
	match="foo/bar"</pre>
	</blockquote>

	<p>The <code>StepPattern</code> representing the <code><bar></code>
	element test is compiled under the appropriate <code>case:</code> section
	of the <code>switch()</code> statement. The code compiled for the step
	pattern is basically just a call to the DOM's <code>getType()</code>
	method and a test for the desired node type. There are two special cases
	for:</p>
	<blockquote class="source">
	<pre>
	<xsl:template match="foo/*/baz"> ... <xsl:template>
	<xsl:template match="foo/*@[2]"> ... <xsl:template></pre>
	</blockquote>

	<p>In the first case we call <code>isElement()</code> and in the second case
	we call <code>isAttribute()</code>, instead of <code>getType()</code>.</p>



	<h5>Patterns with predicates</h5>

	<p>The <code>typeCheck()</code> method of the <code>StepPattern</code>
	invokes a method that analyses the predicates of the step pattern and
	determines their context. (Note that this method needs to be updated to
	handle the exception to the GENERAL_CONTEXT metioned in the section
	<a href="#exception">Expressions containing one position predicate</a> earlier in this document.) The <code>translate()</code> method of the
	<code>StepPattern</code> class contains a <code>switch()</code> statement
	that calls methods that are tailored for compiling code for the various
	predicate contexts.</p>







	<p align="right" size="2">
	<a href="#content">(top)</a>
	</p>
	</div>
	<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
	</div>
	</body>
	</html>