<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>ASF: XSLTC Performance</title>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
</head>
<!--
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
-->
<body>
<div id="title">
<table class="HdrTitle">
<tbody>
<tr>
<th rowspan="2">
<a href="../index.html">
<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
</a>
</th>
<th text-align="center" width="75%">
<a href="index.html">XSLTC Design</a>
</th>
</tr>
<tr>
<td valign="middle">XSLTC Performance</td>
</tr>
</tbody>
</table>
<table class="HdrButtons" align="center" border="1">
<tbody>
<tr>
<td>
<a href="http://www.apache.org">Apache Foundation</a>
</td>
<td>
<a href="http://xalan.apache.org">Xalan Project</a>
</td>
<td>
<a href="http://xerces.apache.org">Xerces Project</a>
</td>
<td>
<a href="http://www.w3.org/TR">Web Consortium</a>
</td>
<td>
<a href="http://www.oasis-open.org/standards">Oasis Open</a>
</td>
</tr>
</tbody>
</table>
</div>
<div id="navLeft">
<ul>
<li>
<a href="index.html">Overview</a>
</li></ul><hr /><ul>
<li>
<a href="xsltc_compiler.html">Compiler design</a>
</li></ul><hr /><ul>
<li>
<a href="xsl_whitespace_design.html">Whitespace</a>
</li>
<li>
<a href="xsl_sort_design.html">xsl:sort</a>
</li>
<li>
<a href="xsl_key_design.html">Keys</a>
</li>
<li>
<a href="xsl_comment_design.html">Comment design</a>
</li></ul><hr /><ul>
<li>
<a href="xsl_lang_design.html">lang()</a>
</li>
<li>
<a href="xsl_unparsed_design.html">Unparsed entities</a>
</li></ul><hr /><ul>
<li>
<a href="xsl_if_design.html">If design</a>
</li>
<li>
<a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
</li>
<li>
<a href="xsl_include_design.html">Include|Import design</a>
</li>
<li>
<a href="xsl_variable_design.html">Variable|Param design</a>
</li></ul><hr /><ul>
<li>
<a href="xsltc_runtime.html">Runtime</a>
</li></ul><hr /><ul>
<li>
<a href="xsltc_dom.html">Internal DOM</a>
</li>
<li>
<a href="xsltc_namespace.html">Namespaces</a>
</li></ul><hr /><ul>
<li>
<a href="xsltc_trax.html">Translet &amp; TrAX</a>
</li>
<li>
<a href="xsltc_predicates.html">XPath Predicates</a>
</li>
<li>
<a href="xsltc_iterators.html">Xsltc Iterators</a>
</li>
<li>
<a href="xsltc_native_api.html">Xsltc Native API</a>
</li>
<li>
<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
</li>
<li>Performance Hints<br />
</li>
</ul>
</div>
<div id="content">
<h2>XSLTC Performance</h2>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Introduction</h3>
<p>
<b>XSLT is not a programming language!</b> Just so you remember.
XSLT is a declarative language and can be used by you to describe
<b>what</b> you want put in your output document and
<b>what</b> you want this output to look like. It does not describe
<b>how</b> these tasks should be carried out. That is the job of the XSLT
processor. This document is <b>not</b> a "<b>
<i>programmer's guide to XSLT</i>
</b>"
and should not be considered as such. Every XSLT processor has its own
characteristics and ways of handling XSL elements and XPath expressions. This
document will give you some insight into the XSLTC internals, so that you
can channel your stylesheets through XSLTC's shortest and most efficient
code paths.</p>
<p>XSLTC's performance has always been one of its key selling points.
(I should probably find a better term here, since we're giving XSLTC away
for free.) But there are some specific patterns and expressions that XSLTC
does not handle much better than interpretive XSLT processors do, and
this document is an attempt to pinpoint these and to outline alternatives.
</p>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Contents</h3>
<ul>
<li>
<a href="#pred">Avoid using predicates in '*' patterns</a>
</li>
<li>
<a href="#idkey">Avoid using id/key-patterns</a>
</li>
<li>
<a href="#union">Avoid union expressions where possible</a>
</li>
<li>
<a href="#sort">Sort stored node-sets once</a>
</li>
<li>
<a href="#cache">Cache input documents</a>
</li>
<li>
<a href="#trax">TrAX vs. native API</a>
</li>
</ul>
<a name="pred"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Avoid using predicates in wildcard patterns</h3>
<p>XSLTC gains its speed from the simple dispatch loop in the translet's
<code>applyTemplates()</code> method. This method uses a simple
<code>switch()</code> statement to choose the desired template based on
the current node's node type (an integer). By adding a pattern with a
wildcard (no type) and a predicate, XSLTC is forced to evaluate the
predicate for every single node.</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="*[2]"&gt;</pre>
</blockquote>
<p>The above pattern should be avoided by selecting the desired node when
using <code>&lt;xsl:apply-templates&gt;</code>. Use named templates or
modes to make sure you trigger the correct template. For example, this
stylesheet:</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="/"&gt;
&lt;xsl:apply-templates select="bar"/&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match="*[2]"/&gt;
&lt;xsl:template match="*"/&gt;</pre>
</blockquote>
<p>can be replaced by:</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="/"&gt;
&lt;xsl:apply-templates select="bar"/&gt;
&lt;xsl:apply-templates select="bar[2]" mode="second"/&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match="*" mode="second"/&gt;
&lt;xsl:template match="*"/&gt;</pre>
</blockquote>
<p>This change will only improve performance if the stylesheet is fairly
large and has a good few templates (10 or more). Also note that the order
of the output is changed by this approach, so if the order is significant
you'll have to stick to the original stylesheet.</p>
<p>
<b>Important note:</b> This type of pattern is referred to as a
type-less pattern, as it does not match any specific node type. Such
patterns generally degrade the performance of XSLTC, because a type-less
pattern must be evaluated for every single node in the input
document.</p>
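<p>As a rough sketch of the difference, and assuming (hypothetically) that
the node of interest is always a <code>&lt;bar&gt;</code> element, naming the
element type in the pattern should let the translet test the predicate only
for nodes of that type. Note that the two patterns are not equivalent:
<code>bar[2]</code> matches the second <code>&lt;bar&gt;</code> child of its
parent, whereas <code>*[2]</code> matches the second element child of any
type.</p>
<blockquote class="source">
<pre>
&lt;!-- Type-less: the predicate is evaluated for every element node --&gt;
&lt;xsl:template match="*[2]"/&gt;

&lt;!-- Typed (hypothetical element name): only bar nodes are tested --&gt;
&lt;xsl:template match="bar[2]"/&gt;</pre>
</blockquote>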
<a name="idkey"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Avoid using id/key-patterns</h3>
<p>Id and key patterns can be used to trigger a template if the current
node has a specific id or has a specific value in a key's index:</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="id('some-value')"/&gt;
&lt;xsl:template match="key('key-name', 'some-value')"/&gt;</pre>
</blockquote>
<p>Looking up a value/node pair in an index requires very little processing
time. However, this is also a type-less pattern that can match any type
of node. This degrades XSLTC's performance, just like wildcard patterns
with predicates (see the section above).</p>
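<p>One possible alternative, sketched here under the assumption that the
keyed nodes are <code>&lt;item&gt;</code> elements indexed by a
<code>code</code> attribute (the element, attribute and mode names are made
up for illustration), is to do the lookup in the <code>select</code>
expression and use a mode to reach a typed template:</p>
<blockquote class="source">
<pre>
&lt;xsl:key name="key-name" match="item" use="@code"/&gt;

&lt;xsl:template match="/"&gt;
  &lt;!-- The index lookup happens here, in the select expression --&gt;
  &lt;xsl:apply-templates select="key('key-name', 'some-value')" mode="keyed"/&gt;
&lt;/xsl:template&gt;

&lt;!-- Typed pattern: chosen by node type, with no id/key test per node --&gt;
&lt;xsl:template match="item" mode="keyed"/&gt;</pre>
</blockquote>
<p>As with the mode-based rewrite of predicate patterns, this changes where
in the output the matched nodes are processed, so it only applies when that
order is not significant.</p>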
<a name="union"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Avoid union expressions where possible</h3>
<p>Union expressions provide an easy, all-in-one way of applying templates
to sets of nodes:</p>
<blockquote class="source">
<pre>
&lt;xsl:apply-templates select="foo|bar|baz"/&gt;</pre>
</blockquote>
<p>The union iterator that is used to implement union expressions is
unfortunately not very efficient. If node order is not important, then
you can benefit from breaking the union up into several elements:</p>
<blockquote class="source">
<pre>
&lt;xsl:apply-templates select="foo"/&gt;
&lt;xsl:apply-templates select="bar"/&gt;
&lt;xsl:apply-templates select="baz"/&gt;</pre>
</blockquote>
<p>But remember that this will give you all <code>&lt;foo&gt;</code>
elements first, then all <code>&lt;bar&gt;</code> elements, and so on.
This is not always desirable. You may want to handle these elements in
the order in which they appear in the input document.</p>
<p>
<b>Important note:</b> This does <b>not</b> apply to union patterns.
Using unions in patterns actually makes smaller and more efficient code,
as only one copy of the template body has to be compiled. Use:</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="foo|bar|baz"/&gt;</pre>
</blockquote>
<p>instead of:</p>
<blockquote class="source">
<pre>
&lt;xsl:template match="foo"/&gt;
&lt;xsl:template match="bar"/&gt;
&lt;xsl:template match="baz"/&gt;</pre>
</blockquote>
<a name="sort"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Sort stored node-sets once</h3>
<p>This item is very obvious, but nevertheless easy to forget in some
complicated cases. If you put a result-tree fragment inside a variable, and
you want the nodes in a specific, sorted order, then sort the nodes as you
create the variable and not when you use it. Instead of:</p>
<blockquote class="source">
<pre>
&lt;xsl:variable name="bars"&gt;
&lt;xsl:copy-of select="//foo/bar"/&gt;
&lt;/xsl:variable&gt;
&lt;xsl:template match="/"&gt;
&lt;xsl:text&gt;List of bar's in sorted order:&amp;#xa;&lt;/xsl:text&gt;
&lt;xsl:for-each select="$bars"&gt;
&lt;xsl:sort select="@name"/&gt;
&lt;xsl:value-of select="@name"/&gt;
&lt;xsl:text&gt;&amp;#xa;&lt;/xsl:text&gt;
&lt;/xsl:for-each&gt;
&lt;/xsl:template&gt;</pre>
</blockquote>
<p>A better way, and with most XSLT processors the only legal way, is to
sort the result tree when creating it:</p>
<blockquote class="source">
<pre>
&lt;xsl:variable name="bars"&gt;
&lt;xsl:for-each select="//foo/bar"&gt;
&lt;xsl:sort select="@name"/&gt;
&lt;xsl:copy-of select="."/&gt;
&lt;/xsl:for-each&gt;
&lt;/xsl:variable&gt;
&lt;xsl:template match="/"&gt;
&lt;xsl:text&gt;List of bar's in sorted order:&amp;#xa;&lt;/xsl:text&gt;
&lt;xsl:for-each select="$bars"&gt;
&lt;xsl:value-of select="@name"/&gt;
&lt;xsl:text&gt;&amp;#xa;&lt;/xsl:text&gt;
&lt;/xsl:for-each&gt;
&lt;/xsl:template&gt;</pre>
</blockquote>
<p>It is very common to sort node-sets returned by the id() and key()
functions. Instead of doing this sorting over and over again, one should
use a variable and store the node set in the desired sort order, and read
the node set from the variable whenever used.</p>
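<p>A minimal sketch of this, assuming a hypothetical key named
<code>by-group</code> over <code>&lt;bar&gt;</code> elements that carry a
<code>group</code> attribute: the lookup and the sort are done once, when the
variable is created, and the stored fragment is then copied wherever it is
needed.</p>
<blockquote class="source">
<pre>
&lt;xsl:key name="by-group" match="bar" use="@group"/&gt;

&lt;!-- Look up and sort once, when the variable is created --&gt;
&lt;xsl:variable name="group-a-sorted"&gt;
  &lt;xsl:for-each select="key('by-group', 'a')"&gt;
    &lt;xsl:sort select="@name"/&gt;
    &lt;xsl:copy-of select="."/&gt;
  &lt;/xsl:for-each&gt;
&lt;/xsl:variable&gt;

&lt;xsl:template match="/"&gt;
  &lt;list&gt;
    &lt;!-- Re-use the stored, already sorted nodes --&gt;
    &lt;xsl:copy-of select="$group-a-sorted"/&gt;
  &lt;/list&gt;
&lt;/xsl:template&gt;</pre>
</blockquote>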
<a name="cache"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>Cache the input document</h3>
<p>All XSLT processors use an internal DOM-like structure, and XSLTC is no
exception. The internal DOM is tailored for the XSLTC design and can be
navigated efficiently by the translet. Building the internal DOM is a
rather slow process, and in very many cases takes more time than the
actual transformation. This is a general rule, and does not only apply to
XSLTC. It is advisable, and common in most large-scale XSLT-based
applications, to create a cache for the input documents. Not only does this
prevent CPU- and memory-intensive DOM creation, but it also prevents several
translets from having their own private copies of common input documents.
Both XSLTC's internal API and TrAX implementation provide ways of
implementing a decent input document cache:</p>
<ul>
<li>See <a href="#trax-cache">below</a> for a description of how
to do this using the TrAX interface.</li>
<li>The <a href="xsltc_native_api.html#document-locator">native API
documentation</a> contains a section on using the internal
<code>org.apache.xalan.xsltc.compiler.SourceLoader</code> interface.</li>
</ul>
<a name="trax"></a>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
<h3>TrAX vs. native API</h3>
<h5>TrAX performance benefits</h5>
<p>If XSLTC's two-step approach to XSLT processing suits your application
then there is no reason why you should not use the TrAX API. The API fits
in very nicely with XSLTC's internals and processing model. In fact, you may
even benefit from using TrAX in cases where your stylesheet is compiled
into a large number of auxiliary classes. The most obvious benefit is that
the translet class and auxiliary classes are all bundled inside the
<code>Templates</code> object. Performance can also improve because
XSLTC caches all auxiliary classes inside the <code>Templates</code>
object, preventing the class loader from being invoked more than necessary.
This is just theory and no tests have been done, but you should see a
performance improvement when using XSLTC and TrAX in such cases.</p>
<h5>Treat Templates objects as compiled translets</h5>
<p>When using TrAX, the <code>Templates</code> object should be considered
the result of a compilation. With XSLTC this is the actual case - the
<code>Templates</code> object contains the translet Java class(es). With
other XSLT processors the <code>Templates</code> object directly or indirectly
contains data structures that represent all or parts of the input stylesheet.
The bottom line is: Create your <code>Templates</code> object once, cache
and re-use it as often as possible.</p>
<a name="trax-cache"></a>
<h5>Input document caching</h5>
<p>An extension to the TrAX API allows input documents to be cached. The
extension is an implementation of the TrAX <code>Source</code> interface,
which can be used to wrap XSLTC's internal DOM structures. This is described in
detail in the <a href="xsltc_trax_api.html">XSLTC TrAX API reference</a>.
</p>
<p>If you do choose to implement a DOM cache, you should have your cache
implement the <code>javax.xml.transform.URIResolver</code> interface so
that documents loaded by the <code>document()</code> function are also read
from your cache.</p>
<p align="right" size="2">
<a href="#content">(top)</a>
</p>
</div>
<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
</div>
</body>
</html>