| <HTML><HEAD><SCRIPT language="JavaScript" src="resources/script.js" type="text/javascript"></SCRIPT><TITLE><xsl:strip/preserve-space></TITLE></HEAD><BODY alink="#ff0000" bgcolor="#ffffff" leftmargin="4" link="#0000ff" marginheight="4" marginwidth="4" text="#000000" topmargin="4" vlink="#0000aa"><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD align="left" height="60" rowspan="3" valign="top" width="135"><IMG alt="logo" border="0" height="60" hspace="0" src="resources/logo.gif" vspace="0" width="135"></TD><TD align="left" colspan="4" height="5" valign="top" width="456"><IMG alt="line" border="0" height="5" hspace="0" src="resources/line.gif" vspace="0" width="456"></TD><TD align="left" height="60" rowspan="3" valign="top" width="29"><IMG alt="right" border="0" height="60" hspace="0" src="resources/right.gif" vspace="0" width="29"></TD></TR><TR><TD align="left" bgcolor="#0086b2" colspan="4" height="35" valign="top" width="456"><IMG alt="" border="0" height="35" hspace="0" src="graphics/xsl_whitespace_design-header.jpg" vspace="0" width="456"></TD></TR><TR><TD align="left" height="20" valign="top" width="168"><IMG alt="bottom" border="0" height="20" hspace="0" src="resources/bottom.gif" vspace="0" width="168"></TD><TD align="left" height="20" valign="top" width="96"><A href="http://xml.apache.org/" onMouseOut="rolloverOff('xml');" onMouseOver="rolloverOn('xml');" target="new"><IMG alt="http://xml.apache.org/" border="0" height="20" hspace="0" name="xml" onLoad="rolloverLoad('xml','resources/button-xml-hi.gif','resources/button-xml-lo.gif');" src="resources/button-xml-lo.gif" vspace="0" width="96"></A></TD><TD align="left" height="20" valign="top" width="96"><A href="http://www.apache.org/" onMouseOut="rolloverOff('asf');" onMouseOver="rolloverOn('asf');" target="new"><IMG alt="http://www.apache.org/" border="0" height="20" hspace="0" name="asf" onLoad="rolloverLoad('asf','resources/button-asf-hi.gif','resources/button-asf-lo.gif');" 
src="resources/button-asf-lo.gif" vspace="0" width="96"></A></TD><TD align="left" height="20" valign="top" width="96"><A href="http://www.w3.org/" onMouseOut="rolloverOff('w3c');" onMouseOver="rolloverOn('w3c');" target="new"><IMG alt="http://www.w3.org/" border="0" height="20" hspace="0" name="w3c" onLoad="rolloverLoad('w3c','resources/button-w3c-hi.gif','resources/button-w3c-lo.gif');" src="resources/button-w3c-lo.gif" vspace="0" width="96"></A></TD></TR></TABLE><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD align="left" valign="top" width="120"><IMG alt="join" border="0" height="14" hspace="0" src="resources/join.gif" vspace="0" width="120"><BR> |
| |
| <A href="index.html" onMouseOut="rolloverOff('side-index');" onMouseOver="rolloverOn('side-index');"><IMG alt="Overview" border="0" height="12" hspace="0" name="side-index" onLoad="rolloverLoad('side-index','graphics/index-label-2.jpg','graphics/index-label-3.jpg');" src="graphics/index-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <A href="xsltc_compiler.html" onMouseOut="rolloverOff('side-xsltc_compiler');" onMouseOver="rolloverOn('side-xsltc_compiler');"><IMG alt="Compiler design" border="0" height="12" hspace="0" name="side-xsltc_compiler" onLoad="rolloverLoad('side-xsltc_compiler','graphics/xsltc_compiler-label-2.jpg','graphics/xsltc_compiler-label-3.jpg');" src="graphics/xsltc_compiler-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <IMG alt="Whitespace" border="0" height="12" hspace="0" src="graphics/xsl_whitespace_design-label-1.jpg" vspace="0" width="120"><BR> |
| |
| <A href="xsl_sort_design.html" onMouseOut="rolloverOff('side-xsl_sort_design');" onMouseOver="rolloverOn('side-xsl_sort_design');"><IMG alt="xsl:sort" border="0" height="12" hspace="0" name="side-xsl_sort_design" onLoad="rolloverLoad('side-xsl_sort_design','graphics/xsl_sort_design-label-2.jpg','graphics/xsl_sort_design-label-3.jpg');" src="graphics/xsl_sort_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <A href="xsl_key_design.html" onMouseOut="rolloverOff('side-xsl_key_design');" onMouseOver="rolloverOn('side-xsl_key_design');"><IMG alt="Keys" border="0" height="12" hspace="0" name="side-xsl_key_design" onLoad="rolloverLoad('side-xsl_key_design','graphics/xsl_key_design-label-2.jpg','graphics/xsl_key_design-label-3.jpg');" src="graphics/xsl_key_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <A href="xsl_comment_design.html" onMouseOut="rolloverOff('side-xsl_comment_design');" onMouseOver="rolloverOn('side-xsl_comment_design');"><IMG alt="Comment design" border="0" height="12" hspace="0" name="side-xsl_comment_design" onLoad="rolloverLoad('side-xsl_comment_design','graphics/xsl_comment_design-label-2.jpg','graphics/xsl_comment_design-label-3.jpg');" src="graphics/xsl_comment_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <A href="xsl_lang_design.html" onMouseOut="rolloverOff('side-xsl_lang_design');" onMouseOver="rolloverOn('side-xsl_lang_design');"><IMG alt="lang()" border="0" height="12" hspace="0" name="side-xsl_lang_design" onLoad="rolloverLoad('side-xsl_lang_design','graphics/xsl_lang_design-label-2.jpg','graphics/xsl_lang_design-label-3.jpg');" src="graphics/xsl_lang_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <A href="xsl_unparsed_design.html" onMouseOut="rolloverOff('side-xsl_unparsed_design');" onMouseOver="rolloverOn('side-xsl_unparsed_design');"><IMG alt="Unparsed entities" border="0" height="12" hspace="0" name="side-xsl_unparsed_design" onLoad="rolloverLoad('side-xsl_unparsed_design','graphics/xsl_unparsed_design-label-2.jpg','graphics/xsl_unparsed_design-label-3.jpg');" src="graphics/xsl_unparsed_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| <A href="xsl_if_design.html" onMouseOut="rolloverOff('side-xsl_if_design');" onMouseOver="rolloverOn('side-xsl_if_design');"><IMG alt="If design" border="0" height="12" hspace="0" name="side-xsl_if_design" onLoad="rolloverLoad('side-xsl_if_design','graphics/xsl_if_design-label-2.jpg','graphics/xsl_if_design-label-3.jpg');" src="graphics/xsl_if_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsl_choose_design.html" onMouseOut="rolloverOff('side-xsl_choose_design');" onMouseOver="rolloverOn('side-xsl_choose_design');"><IMG alt="Choose|When|Otherwise design" border="0" height="12" hspace="0" name="side-xsl_choose_design" onLoad="rolloverLoad('side-xsl_choose_design','graphics/xsl_choose_design-label-2.jpg','graphics/xsl_choose_design-label-3.jpg');" src="graphics/xsl_choose_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsl_include_design.html" onMouseOut="rolloverOff('side-xsl_include_design');" onMouseOver="rolloverOn('side-xsl_include_design');"><IMG alt="Include|Import design" border="0" height="12" hspace="0" name="side-xsl_include_design" onLoad="rolloverLoad('side-xsl_include_design','graphics/xsl_include_design-label-2.jpg','graphics/xsl_include_design-label-3.jpg');" src="graphics/xsl_include_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsl_variable_design.html" onMouseOut="rolloverOff('side-xsl_variable_design');" onMouseOver="rolloverOn('side-xsl_variable_design');"><IMG alt="Variable|Param design" border="0" height="12" hspace="0" name="side-xsl_variable_design" onLoad="rolloverLoad('side-xsl_variable_design','graphics/xsl_variable_design-label-2.jpg','graphics/xsl_variable_design-label-3.jpg');" src="graphics/xsl_variable_design-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <A href="xsltc_runtime.html" onMouseOut="rolloverOff('side-xsltc_runtime');" onMouseOver="rolloverOn('side-xsltc_runtime');"><IMG alt="Runtime" border="0" height="12" hspace="0" name="side-xsltc_runtime" onLoad="rolloverLoad('side-xsltc_runtime','graphics/xsltc_runtime-label-2.jpg','graphics/xsltc_runtime-label-3.jpg');" src="graphics/xsltc_runtime-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <A href="xsltc_dom.html" onMouseOut="rolloverOff('side-xsltc_dom');" onMouseOver="rolloverOn('side-xsltc_dom');"><IMG alt="Internal DOM" border="0" height="12" hspace="0" name="side-xsltc_dom" onLoad="rolloverLoad('side-xsltc_dom','graphics/xsltc_dom-label-2.jpg','graphics/xsltc_dom-label-3.jpg');" src="graphics/xsltc_dom-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <A href="xsltc_namespace.html" onMouseOut="rolloverOff('side-xsltc_namespace');" onMouseOver="rolloverOn('side-xsltc_namespace');"><IMG alt="Namespaces" border="0" height="12" hspace="0" name="side-xsltc_namespace" onLoad="rolloverLoad('side-xsltc_namespace','graphics/xsltc_namespace-label-2.jpg','graphics/xsltc_namespace-label-3.jpg');" src="graphics/xsltc_namespace-label-3.jpg" vspace="0" width="120"></A><BR> |
| |
| <IMG alt="separator" border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> |
| |
| <A href="xsltc_trax.html" onMouseOut="rolloverOff('side-xsltc_trax');" onMouseOver="rolloverOn('side-xsltc_trax');"><IMG alt="Translet & TrAX" border="0" height="12" hspace="0" name="side-xsltc_trax" onLoad="rolloverLoad('side-xsltc_trax','graphics/xsltc_trax-label-2.jpg','graphics/xsltc_trax-label-3.jpg');" src="graphics/xsltc_trax-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsltc_predicates.html" onMouseOut="rolloverOff('side-xsltc_predicates');" onMouseOver="rolloverOn('side-xsltc_predicates');"><IMG alt="XPath Predicates" border="0" height="12" hspace="0" name="side-xsltc_predicates" onLoad="rolloverLoad('side-xsltc_predicates','graphics/xsltc_predicates-label-2.jpg','graphics/xsltc_predicates-label-3.jpg');" src="graphics/xsltc_predicates-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsltc_iterators.html" onMouseOut="rolloverOff('side-xsltc_iterators');" onMouseOver="rolloverOn('side-xsltc_iterators');"><IMG alt="Xsltc Iterators" border="0" height="12" hspace="0" name="side-xsltc_iterators" onLoad="rolloverLoad('side-xsltc_iterators','graphics/xsltc_iterators-label-2.jpg','graphics/xsltc_iterators-label-3.jpg');" src="graphics/xsltc_iterators-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsltc_native_api.html" onMouseOut="rolloverOff('side-xsltc_native_api');" onMouseOver="rolloverOn('side-xsltc_native_api');"><IMG alt="Xsltc Native API" border="0" height="12" hspace="0" name="side-xsltc_native_api" onLoad="rolloverLoad('side-xsltc_native_api','graphics/xsltc_native_api-label-2.jpg','graphics/xsltc_native_api-label-3.jpg');" src="graphics/xsltc_native_api-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsltc_trax_api.html" onMouseOut="rolloverOff('side-xsltc_trax_api');" onMouseOver="rolloverOn('side-xsltc_trax_api');"><IMG alt="Xsltc TrAX API" border="0" height="12" hspace="0" name="side-xsltc_trax_api" onLoad="rolloverLoad('side-xsltc_trax_api','graphics/xsltc_trax_api-label-2.jpg','graphics/xsltc_trax_api-label-3.jpg');" src="graphics/xsltc_trax_api-label-3.jpg" vspace="0" width="120"></A><BR> |
| <A href="xsltc_performance.html" onMouseOut="rolloverOff('side-xsltc_performance');" onMouseOver="rolloverOn('side-xsltc_performance');"><IMG alt="Performance Hints" border="0" height="12" hspace="0" name="side-xsltc_performance" onLoad="rolloverLoad('side-xsltc_performance','graphics/xsltc_performance-label-2.jpg','graphics/xsltc_performance-label-3.jpg');" src="graphics/xsltc_performance-label-3.jpg" vspace="0" width="120"></A><BR> |
| <IMG alt="close" border="0" height="14" hspace="0" src="resources/close.gif" vspace="0" width="120"><BR></TD><TD align="left" valign="top" width="500"><TABLE border="0" cellpadding="3" cellspacing="0"><TR><TD> |
| |
| <UL> |
| <LI><A href="#functionality">Functionality</A></LI> |
| <LI><A href="#identify">Identifying strippable whitespace nodes</A></LI> |
| <LI><A href="#which">Determining which nodes to strip</A></LI> |
| <LI><A href="#strip">Stripping nodes</A></LI> |
| <LI><A href="#filter">Filtering whitespace nodes</A></LI> |
| </UL> |
| |
| <A name="functionality"><!--anchor--></A> |
| <TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="666699" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG alt="" border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Functionality</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> |
| |
| <P>The <CODE><FONT face="courier, monospaced"><xsl:strip-space></FONT></CODE> and <CODE><FONT face="courier, monospaced"><xsl:preserve-space></FONT></CODE> |
| elements are used to control the way whitespace nodes in the source XML |
| document are handled. These elements have no impact on whitespace in the XSLT |
| stylesheet. Both elements can occur only as top-level elements, possibly more |
| than once, and both are always empty.</P> |
| |
| <P>Both elements take a single attribute, "elements", which contains a |
| whitespace-separated list of element names whose whitespace nodes should be |
| stripped from, or preserved in, the source document. Each name can take one of |
| these three forms (NameTest format):</P> |
| |
| <UL> |
| <LI> |
| Whitespace nodes in all elements: |
| <CODE><FONT face="courier, monospaced">elements="*"</FONT></CODE> |
| </LI> |
| <LI> |
| Whitespace nodes in all elements of a given namespace: |
| <CODE><FONT face="courier, monospaced">elements="<namespace>:*"</FONT></CODE> |
| </LI> |
| <LI> |
| Whitespace nodes in one specific element: <CODE><FONT face="courier, monospaced">elements="<qname>"</FONT></CODE> |
| </LI> |
| </UL> |
| |
| </FONT></TD></TR></TABLE><BR><A name="identify"><!--anchor--></A> |
| <TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="666699" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG alt="" border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Identifying strippable whitespace nodes</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> |
| |
| <P>The DOM detects all text nodes and assigns them the type <CODE><FONT face="courier, monospaced">TEXT</FONT></CODE>. |
| All text nodes are scanned to detect whitespace-only nodes. A text-node is |
| considered a whitespace node only if it consists entirely of characters from |
| the set { 0x09, 0x0a, 0x0d, 0x20 }. The DOM implementation class has a static |
| method used to detect such nodes:</P> |
| |
| <DIV align="right"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> |
| private static final boolean isWhitespaceChar(char c) { |
| return c == 0x20 || c == 0x0A || c == 0x0D || c == 0x09; |
| } |
| </PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> |
| |
| <P>The characters are tested in order of how likely they are to occur, with the space character first.</P> |
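| |
| <P>As a rough illustration of how this per-character test extends to a whole |
| text node, the sketch below scans a character range and reports whether it is |
| whitespace-only. The class <CODE><FONT face="courier, monospaced">WhitespaceScan</FONT></CODE> and the method |
| <CODE><FONT face="courier, monospaced">isWhitespaceOnly()</FONT></CODE> are hypothetical names chosen for this example |
| and are not taken from the XSLTC sources:</P> |
| |
```java
// Hypothetical helper for illustration only; not the actual XSLTC DOM code.
final class WhitespaceScan {

    // Same test as isWhitespaceChar() in the text: space is checked first
    // because it is assumed to be the most common whitespace character.
    static boolean isWhitespaceChar(char c) {
        return c == 0x20 || c == 0x0A || c == 0x0D || c == 0x09;
    }

    // Returns true only if every character in text[start .. start+length-1]
    // is whitespace. An empty range is reported as whitespace-only.
    static boolean isWhitespaceOnly(char[] text, int start, int length) {
        for (int i = start; i < start + length; i++) {
            if (!isWhitespaceChar(text[i])) {
                return false; // stop at the first non-whitespace character
            }
        }
        return true;
    }
}
```
| |
| <P>The scan stops at the first non-whitespace character, so ordinary text |
| nodes are usually rejected after a single comparison.</P> |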
| |
| <P> The DOM has a bit-array that is used to tag text-nodes as strippable |
| whitespace nodes:</P> |
| |
| <DIV align="right"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE>private int[] _whitespace;</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> |
| |
| <P>There are two methods in the DOM implementation class for accessing this |
| bit-array: <CODE><FONT face="courier, monospaced">markWhitespace(node)</FONT></CODE> and <CODE><FONT face="courier, monospaced">isWhitespace(node)</FONT></CODE>. |
| The array is resized together with all other arrays in the DOM by the |
| <CODE><FONT face="courier, monospaced">DOM.resizeArrays()</FONT></CODE> method. The bits in the array are set in the |
| <CODE><FONT face="courier, monospaced">DOM.maybeCreateTextNode()</FONT></CODE> method. This method must know whether |
| the current node is located under an element with an |
| <CODE><FONT face="courier, monospaced">xml:space="<value>"</FONT></CODE> attribute in the DOM, in which |
| case it is not a strippable whitespace node.</P> |
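| |
| <P>The following is a minimal sketch of how a bit-array stored in an |
| <CODE><FONT face="courier, monospaced">int[]</FONT></CODE> can back the two accessor methods. It assumes that nodes are |
| identified by non-negative integer ids. The class name <CODE><FONT face="courier, monospaced">WhitespaceBits</FONT></CODE> |
| is hypothetical, and the array is grown inside <CODE><FONT face="courier, monospaced">markWhitespace()</FONT></CODE> only to |
| keep the example self-contained; it does not reproduce the resizing strategy |
| of the real DOM:</P> |
| |
```java
import java.util.Arrays;

// Hypothetical stand-in for the DOM's whitespace bit-array.
final class WhitespaceBits {

    // One bit per node id, 32 node ids per int.
    private int[] whitespace = new int[1];

    // Tags the given node id as a strippable whitespace node.
    void markWhitespace(int node) {
        int word = node >>> 5;                 // node / 32
        if (word >= whitespace.length) {
            // Assumption: grow on demand; the real DOM resizes elsewhere.
            int newLength = Math.max(word + 1, whitespace.length * 2);
            whitespace = Arrays.copyOf(whitespace, newLength);
        }
        whitespace[word] |= 1 << (node & 31);  // set bit (node % 32)
    }

    // Returns true if the node id was previously marked.
    boolean isWhitespace(int node) {
        int word = node >>> 5;
        return word < whitespace.length
            && (whitespace[word] & (1 << (node & 31))) != 0;
    }
}
```
| |
| <P>Packing the flags this way costs one bit per node rather than one |
| <CODE><FONT face="courier, monospaced">boolean</FONT></CODE> per node, which matters for large documents.</P> |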
| |
| <P>An auxiliary class, WhitespaceHandler, is used for this purpose. The class |
| works much like a stack, onto which you "push" a new strip/preserve setting |
| together with the node in which this setting was determined. This means that |
| every time the DOM builder encounters an <CODE><FONT face="courier, monospaced">xml:space</FONT></CODE> attribute |
| it will invoke a method on an instance of the WhitespaceHandler class to |
| signal that a new preserve/strip setting has been encountered. This is done |
| in the <CODE><FONT face="courier, monospaced">makeAttributeNode()</FONT></CODE> method. The whitespace handler stores the |
| new setting and pushes the current element node on its stack. When the |
| DOM builder closes up an element (in <CODE><FONT face="courier, monospaced">endElement()</FONT></CODE>), it invokes |
| another method of the whitespace handler to check if the strip/preserve |
| setting is still valid. If the setting is now invalid (we're closing the |
| element whose node id is on the top of the stack) the handler inverts the |
| setting and pops the element node id off the stack. The previous |
| strip/preserve setting is then valid again, and the id of the node where this setting |
| was defined is on the top of the stack.</P> |
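| |
| <P>The push/invert/pop behaviour described above can be sketched as follows. |
| This is an illustration under stated assumptions, not the actual |
| WhitespaceHandler class: the class and method names are hypothetical, node ids |
| are assumed to be integers, and a node is only pushed when the setting really |
| changes, which is what makes "inverting" on pop restore the previous setting:</P> |
| |
```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a stack-based xml:space tracker.
final class WhitespaceHandlerSketch {

    // Current setting; false means whitespace may be stripped.
    private boolean preserve = false;

    // Ids of the elements at which the setting was changed.
    private final Deque<Integer> nodes = new ArrayDeque<>();

    // Called when an xml:space attribute is found on element `node`.
    void xmlSpace(int node, String value) {
        boolean requested = "preserve".equals(value);
        if (requested != preserve) {
            // Record only real changes, so that each stack entry
            // corresponds to exactly one flip of the setting.
            preserve = requested;
            nodes.push(node);
        }
    }

    // Called when the builder closes element `node`.
    void endElement(int node) {
        if (!nodes.isEmpty() && nodes.peek() == node) {
            nodes.pop();
            preserve = !preserve; // back to the enclosing setting
        }
    }

    boolean isPreserving() {
        return preserve;
    }
}
```
| |
| <P>Because the stack holds only the elements that changed the setting, closing |
| any other element costs a single comparison against the top of the stack.</P> |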
| |
| </FONT></TD></TR></TABLE><BR><A name="which"><!--anchor--></A> |
| <TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="666699" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG alt="" border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Determining which nodes to strip</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> |
| |
| <P>A text node is never stripped unless it contains only whitespace |
| characters (Unicode characters 0x09, 0x0A, 0x0D and 0x20). Stripping a text |
| node means that the node disappears from the DOM, so that it is never |
| included in the output and is ignored by all functions such as |
| <CODE><FONT face="courier, monospaced">count()</FONT></CODE>. A text node is preserved if any of the following apply:</P> |
| |
| <UL> |
| <LI> |
| the element name of the parent of the text node is in the set of |
| elements listed in <CODE><FONT face="courier, monospaced"><xsl:preserve-space></FONT></CODE> |
| </LI> |
| <LI> |
| the text node contains at least one non-whitespace character |
| </LI> |
| <LI> |
| an ancestor of the whitespace text node has an attribute of |
| <CODE><FONT face="courier, monospaced">xml:space="preserve"</FONT></CODE>, and no closer ancestor has an |
| attribute of <CODE><FONT face="courier, monospaced">xml:space="default"</FONT></CODE>. |
| </LI> |
| </UL> |
| |
| <P>Otherwise, the text node is stripped. Initially the set of |
| whitespace-preserving element names contains all element names, so the |
| default behaviour is to preserve all whitespace text nodes.</P> |
| |
| <P>This seems simple enough, but resolving conflicts between matching |
| <CODE><FONT face="courier, monospaced"><xsl:strip-space></FONT></CODE> and <CODE><FONT face="courier, monospaced"><xsl:preserve-space></FONT></CODE> |
| elements requires a lot of thought. Our first consideration is import |
| precedence; the match with the highest import precedence is always chosen. |
| Import precedence is determined by the order in which the compared elements |
| are visited. (In this case those elements are the top-level |
| <CODE><FONT face="courier, monospaced"><xsl:strip-space></FONT></CODE> and <CODE><FONT face="courier, monospaced"><xsl:preserve-space></FONT></CODE> |
| elements.) This example is taken from the XSLT recommendation:</P> |
| |
| <UL> |
| <LI>stylesheet A imports stylesheets B and C in that order;</LI> |
| <LI>stylesheet B imports stylesheet D;</LI> |
| <LI>stylesheet C imports stylesheet E.</LI> |
| </UL> |
| |
| <P>Then the order of import precedence (lowest first) is D, B, E, C, A.</P> |
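| |
| <P>This ordering is what a post-order walk of the import tree produces: each |
| stylesheet is listed after everything it imports, and later imports are listed |
| after earlier ones. The sketch below shows the idea; the class |
| <CODE><FONT face="courier, monospaced">ImportPrecedence</FONT></CODE> and the map-based representation of the import tree |
| are assumptions made for this example, and circular imports are not handled:</P> |
| |
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper; stylesheets are identified by name only.
final class ImportPrecedence {

    // Returns all stylesheets reachable from `root`, lowest precedence first.
    static List<String> lowestFirst(String root,
                                    Map<String, List<String>> imports) {
        List<String> order = new ArrayList<>();
        visit(root, imports, order);
        return order;
    }

    // Post-order walk: imported stylesheets first, then the importer itself.
    private static void visit(String sheet,
                              Map<String, List<String>> imports,
                              List<String> order) {
        List<String> imported = imports.get(sheet);
        if (imported == null) {
            imported = Collections.emptyList();
        }
        for (String child : imported) {
            visit(child, imports, order);
        }
        order.add(sheet);
    }
}
```
| |
| <P>With the import tree from the example (A imports B and C, B imports D, C |
| imports E), <CODE><FONT face="courier, monospaced">lowestFirst("A", imports)</FONT></CODE> returns D, B, E, C, A.</P> |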
| |
| <P>Our next consideration is the priority of NameTests (XPath spec):</P> |
| <UL> |
| <LI> |
| <CODE><FONT face="courier, monospaced">elements="<qname>"</FONT></CODE> has priority 0 |
| </LI> |
| <LI> |
| <CODE><FONT face="courier, monospaced">elements="<namespace>:*"</FONT></CODE> has priority -0.25 |
| </LI> |
| <LI> |
| <CODE><FONT face="courier, monospaced">elements="*"</FONT></CODE> has priority -0.5 |
| </LI> |
| </UL> |
| |
| <P>It is considered an error if the decision is still ambiguous after this, |
| and it is up to the implementors to decide what the appropriate action is.</P> |
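| |
| <P>The two-step comparison can be sketched as below: import precedence is |
| compared first, NameTest priority second, and a remaining tie between a strip |
| rule and a preserve rule is reported as an error. All names here |
| (<CODE><FONT face="courier, monospaced">SpaceRule</FONT></CODE>, <CODE><FONT face="courier, monospaced">SpaceRules</FONT></CODE>, <CODE><FONT face="courier, monospaced">shouldStrip()</FONT></CODE>) are hypothetical. |
| Two simplifications are assumed: prefixes are compared textually rather than |
| by namespace URI, and throwing an exception is just one possible reaction to |
| an ambiguous match:</P> |
| |
```java
import java.util.List;

// Hypothetical model of one name from a strip-space or preserve-space element.
final class SpaceRule {
    final String nameTest;      // "*", "prefix:*" or a qname
    final boolean strip;        // true = strip-space, false = preserve-space
    final int importPrecedence; // higher value wins

    SpaceRule(String nameTest, boolean strip, int importPrecedence) {
        this.nameTest = nameTest;
        this.strip = strip;
        this.importPrecedence = importPrecedence;
    }

    // NameTest priorities as listed in the text.
    double priority() {
        if (nameTest.equals("*")) return -0.5;
        if (nameTest.endsWith(":*")) return -0.25;
        return 0.0;
    }

    // Simplification: textual prefix comparison, no namespace URI lookup.
    boolean matches(String qname) {
        if (nameTest.equals("*")) return true;
        if (nameTest.endsWith(":*")) {
            String prefixWithColon = nameTest.substring(0, nameTest.length() - 1);
            return qname.startsWith(prefixWithColon);
        }
        return nameTest.equals(qname);
    }
}

final class SpaceRules {

    // Returns true if whitespace text nodes under `qname` should be stripped.
    // With no matching rule the whitespace is preserved (the default).
    static boolean shouldStrip(String qname, List<SpaceRule> rules) {
        SpaceRule best = null;
        boolean ambiguous = false;
        for (SpaceRule rule : rules) {
            if (!rule.matches(qname)) continue;
            int cmp = (best == null) ? 1 : compare(rule, best);
            if (cmp > 0) {
                best = rule;       // strictly better match found
                ambiguous = false; // earlier ties no longer matter
            } else if (cmp == 0 && rule.strip != best.strip) {
                ambiguous = true;  // equal rank, opposite instructions
            }
        }
        if (ambiguous) {
            throw new IllegalStateException("Ambiguous whitespace rules for " + qname);
        }
        return best != null && best.strip;
    }

    // Import precedence first, NameTest priority second.
    private static int compare(SpaceRule a, SpaceRule b) {
        if (a.importPrecedence != b.importPrecedence) {
            return Integer.compare(a.importPrecedence, b.importPrecedence);
        }
        return Double.compare(a.priority(), b.priority());
    }
}
```
| |
| <P>Note that import precedence dominates: a wildcard rule in an importing |
| stylesheet overrides a more specific rule in an imported one.</P> |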
| |
| <P>Despite all this complexity, the normal usage of these elements is quite |
| simple; either preserve all whitespace nodes but one type:</P> |
| |
| <DIV align="right"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE><xsl:strip-space elements="foo"/></PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> |
| |
| <P>or strip all whitespace nodes but one type:</P> |
| |
| <DIV align="right"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> |
| <xsl:strip-space elements="*"/> |
| <xsl:preserve-space elements="foo"/></PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> |
| |
| </FONT></TD></TR></TABLE><BR><A name="strip"><!--anchor--></A> |
| <TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="666699" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG alt="" border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Stripping nodes</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> |
| |
| <P>The ultimate goal of our design would be to totally screen all stripped |
| nodes from the translet; to either physically remove them from the DOM or to |
| make it appear as if they are not there. The first approach will cause |
| problems in cases where multiple translets access the same DOM. In the future |
| we wish to let translets run within servlets / JSPs with a common DOM cache. |
| This DOM cache will keep copies of DOMs in memory to prevent the same XML |
| file from being downloaded and parsed several times. This is a scenario we |
| might see:</P> |
| |
| <P><IMG align="right" alt="DOMInterface.gif" border="0" hspace="4" src="images/DOMInterface.gif" vspace="4"><BR clear="all"></P> |
| <P><I>Figure 1: Multiple translets accessing a common pool of DOMs</I></P> |
| |
| <P>The three translets running on this host access a common pool of 4 DOMs. |
| The DOMs are accessed through a common DOM interface. Translets accessing |
| a single DOM will have a DOMAdapter and a single DOMImpl object behind this |
| interface, while translets accessing several DOMs will be given a MultiDOM |
| and a set of DOMImpl objects.</P> |
| |
| <P>The translet to the left may want to strip some nodes from the shared DOM |
| in the cache, while the other translets may want to preserve all whitespace |
| nodes. Our initial thought then is to keep the DOM as it is and somehow |
| shield the left-hand translet from all the whitespace nodes it does not want to |
| process. There are a few ways in which we can accomplish this:</P> |
| |
| <UL> |
| <LI> |
| The translet can, prior to starting to traverse the DOM, send a reference |
| to the tables containing information on which nodes we want stripped to |
| the DOM interface. The DOM interface is then responsible for hiding all |
| stripped whitespace nodes from the iterators and the translet. A problem |
| with this approach is that we want to omit the DOM interface layer if |
| the translet is only accessing a single DOM. The DOM interface layer will |
| only be instantiated by the translet if the stylesheet contains a call |
| to the <CODE><FONT face="courier, monospaced">document()</FONT></CODE> function.<BR><BR> |
| </LI> |
| <LI> |
| The translet can provide its iterators with information on which nodes it |
| does not want to see. The translet is still shielded from unwanted |
| whitespace nodes, but it has the hassle of passing extra information over |
| to most iterators it ever instantiates. Note that not all iterators |
| need to be aware of whitespace nodes in this case. If you have a look at |
| the figure again you will see that only the first-level iterator (that is, |
| the one closest to the DOM or DOM interface) will have to strip off |
| whitespace nodes. However, there may be several iterators that operate |
| directly on the DOM (invoked by code handling XSL functions such as |
| <CODE><FONT face="courier, monospaced">count()</FONT></CODE>) and every single one of those will need to be told |
| which whitespace nodes the translet does not want to see.<BR><BR> |
| </LI> |
| <LI> |
| The third approach will take advantage of the fact that not all |
| translets will want to strip whitespace nodes. The most effective way of |
| removing unwanted whitespace nodes is to do it once and for all, before |
| the actual traversal of the DOM starts. This can be done by making a |
| clone of the DOM with exclusive-access rights for this translet only. We |
| still gain performance from the cache because we do not have to pay the |
| cost of the delay caused by downloading and parsing the XML source file. |
| The cost we have to pay is the time needed for the actual cloning and the |
| extra memory we use.<BR><BR> |
| Normally one would imagine the translet (or the wrapper class that |
| invokes the translet) calls the DOM cache with just a URL and receives |
| a reference to an instantiated DOM. The cache will either have built |
| this DOM on demand or just passed back a reference to an existing tree. |
| In this case the DOM cache would need an extra call that a translet would use |
| to clone a DOM, passing the existing DOM reference to the cache and |
| receiving a new reference to the cloned DOM. The translet can then do |
| whatever it wants with this DOM (the cache need not even keep a reference |
| to this tree). |
| </LI> |
| </UL> |
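| |
| <P>The third approach can be sketched roughly as follows. This is only an |
| illustration: the names <CODE><FONT face="courier, monospaced">DomCacheSketch</FONT></CODE>, <CODE><FONT face="courier, monospaced">DomTreeSketch</FONT></CODE>, |
| <CODE><FONT face="courier, monospaced">get()</FONT></CODE> and <CODE><FONT face="courier, monospaced">cloneFor()</FONT></CODE> are hypothetical, and the "parsing" is faked |
| with a fixed string instead of a real download and parse.</P> |

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a parsed document; a real DOM holds a node tree.
final class DomTreeSketch {
    final String sourceUrl;
    final StringBuilder content;

    DomTreeSketch(String sourceUrl, String content) {
        this.sourceUrl = sourceUrl;
        this.content = new StringBuilder(content);
    }

    // Deep copy, so that changes to the copy never reach the original.
    DomTreeSketch copy() {
        return new DomTreeSketch(sourceUrl, content.toString());
    }
}

// Hypothetical cache: one shared tree per URL, plus an explicit clone call.
final class DomCacheSketch {
    private final Map<String, DomTreeSketch> trees = new HashMap<>();
    int parseCount = 0; // counts how often the (faked) parse actually ran

    // Returns the shared tree for this URL, building it on demand.
    synchronized DomTreeSketch get(String url) {
        DomTreeSketch tree = trees.get(url);
        if (tree == null) {
            tree = parse(url);
            trees.put(url, tree);
        }
        return tree;
    }

    // The extra call: returns a private copy that the cache does not retain.
    synchronized DomTreeSketch cloneFor(DomTreeSketch shared) {
        return shared.copy();
    }

    // Placeholder for downloading and parsing the XML source.
    private DomTreeSketch parse(String url) {
        parseCount++;
        return new DomTreeSketch(url, "<doc> <foo/> </doc>");
    }
}
```

| <P>A translet that wants to strip whitespace would call <CODE><FONT face="courier, monospaced">get()</FONT></CODE> and then |
| <CODE><FONT face="courier, monospaced">cloneFor()</FONT></CODE>, and modify only its private copy. The source is still parsed |
| only once, at the price of the copy itself and the extra memory.</P> |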
| |
| <P>We are lucky enough to be able to combine the first two approaches. All |
| iterators that directly access the DOM (axis iterators) are instantiated by |
| calls to the DOM interface layer (the DOM class). The actual iterators are |
| created in the DOM implementation layer (the DOMImpl class). So, we can pass |
| references to the preserve/strip whitespace tables to the DOM, and the DOM |
| will make sure that all axis iterators return node sets with respect to these |
| tables.</P> |
| </FONT></TD></TR></TABLE><BR><A name="filter"><!--anchor--></A> |
| <TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="666699" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG alt="" border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Filtering whitespace nodes</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> |
| |
| <P>For each axis iterator and for <CODE><FONT face="courier, monospaced">DOM.makeStringValue()</FONT></CODE> and |
| <CODE><FONT face="courier, monospaced">DOM.stringValueAux()</FONT></CODE> we must apply a filter for eliminating all |
| unwanted whitespace nodes. To achieve this we must build a very efficient |
| predicate for determining if the current node should be stripped or not. This |
| predicate is built by <CODE><FONT face="courier, monospaced">Whitespace.compilePredicate()</FONT></CODE>. This method is |
| static and builds a predicate for a vector of WhitespaceRule objects. (The |
| WhitespaceRule class is defined within the Whitespace class.) Each |
| WhitespaceRule object contains information for a single element listed |
| in an <CODE><FONT face="courier, monospaced">&lt;xsl:strip/preserve-space&gt;</FONT></CODE> element, and is broken down |
| into the following information:</P> |
| |
| <UL> |
| <LI>the namespace (can be the default namespace)</LI> |
| <LI>the element name or "<CODE><FONT face="courier, monospaced">*</FONT></CODE>"</LI> |
| <LI>the type of rule; NS:EL, NS:<CODE><FONT face="courier, monospaced">*</FONT></CODE> or <CODE><FONT face="courier, monospaced">*</FONT></CODE></LI> |
| <LI>the priority of the rule (based on import precedence and type)</LI> |
| <LI>the action; either strip or preserve</LI> |
| </UL> |
| |
| <P>The Vector of WhitespaceRules is arranged in order of priority and |
| redundant rules are removed. A predicate method is then compiled into the |
| translet as:</P> |
| |
| <DIV align="right"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> |
| public boolean stripSpace(int node); |
| </PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG alt="" border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> |
| |
| <P>Unfortunately this method cannot be declared static.</P> |
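| |
| <P>To illustrate how a prioritized rule list can drive such a predicate, here |
| is a minimal interpreted sketch. It is not the code that |
| <CODE><FONT face="courier, monospaced">Whitespace.compilePredicate()</FONT></CODE> generates: the classes |
| <CODE><FONT face="courier, monospaced">RuleSketch</FONT></CODE> and <CODE><FONT face="courier, monospaced">PredicateSketch</FONT></CODE> are hypothetical, the priority is |
| assumed to be a plain integer, and the predicate takes the name of the parent |
| element rather than a node index.</P> |

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one whitespace rule; field names are assumptions.
final class RuleSketch {
    static final int STRIP = 0;
    static final int PRESERVE = 1;

    final String namespace; // "" for the default namespace, null for rule type "*"
    final String element;   // local element name, or "*"
    final int priority;     // higher value wins (assumed encoding)
    final int action;       // STRIP or PRESERVE

    RuleSketch(String namespace, String element, int priority, int action) {
        this.namespace = namespace;
        this.element = element;
        this.priority = priority;
        this.action = action;
    }

    // True if this rule applies to the element with the given namespace and name.
    boolean matches(String ns, String local) {
        if (namespace == null) return true;          // rule type "*"
        if (!namespace.equals(ns)) return false;     // namespace must agree
        return "*".equals(element) || element.equals(local); // NS:* or NS:EL
    }
}

// Hypothetical predicate that interprets the sorted rule list at run time.
final class PredicateSketch {
    private final List<RuleSketch> rules;

    PredicateSketch(List<RuleSketch> input) {
        rules = new ArrayList<>(input);
        // Highest priority first, so the first matching rule decides.
        rules.sort(Comparator.comparingInt((RuleSketch r) -> r.priority).reversed());
    }

    // Should whitespace-only text under the given parent element be stripped?
    boolean stripSpace(String ns, String local) {
        for (RuleSketch r : rules) {
            if (r.matches(ns, local)) {
                return r.action == RuleSketch.STRIP;
            }
        }
        return false; // XSLT default: whitespace is preserved
    }
}
```

| <P>A compiled predicate would presumably unroll this loop into direct |
| comparisons, which is where the efficiency comes from; the loop above only |
| shows the matching order.</P> |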
| |
| <P>When the Stylesheet object compiles the <CODE><FONT face="courier, monospaced">topLevel()</FONT></CODE> method of the |
| translet it checks for the existence of the <CODE><FONT face="courier, monospaced">stripSpace()</FONT></CODE> method. If |
| this method exists the <CODE><FONT face="courier, monospaced">topLevel()</FONT></CODE> will be compiled to pass the |
| translet to the DOM as a StripWhitespaceFilter (the translet implements this |
| interface when the <CODE><FONT face="courier, monospaced">stripSpace()</FONT></CODE> method is compiled).</P> |
| |
| <P>All axis iterators and the <CODE><FONT face="courier, monospaced">DOM.makeStringValue()</FONT></CODE> and |
| <CODE><FONT face="courier, monospaced">DOM.stringValueAux()</FONT></CODE> methods check for the existence of this filter |
| (it is kept in a global variable in the DOM implementation class) and take |
| the appropriate actions. The methods in the DOM for returning axis iterators |
| will place a StrippingIterator on top of the axis iterator if the filter is |
| present, and the two methods just mentioned will return empty strings for |
| whitespace nodes that should be stripped.</P> |
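| |
| <P>The wrapping described above can be sketched as follows. All types here |
| are simplified, hypothetical stand-ins (the real iterator, filter and DOM |
| interfaces differ), and the axis iterator simply walks an array of node |
| indices.</P> |

```java
// Hypothetical, simplified iterator contract: next() returns END when exhausted.
interface NodeIteratorSketch {
    int END = -1;
    int next();
}

// Hypothetical filter contract, mirroring the stripSpace(int node) predicate.
interface StripFilterSketch {
    boolean stripSpace(int node);
}

// Stand-in for an axis iterator: walks a fixed array of node indices.
final class ArrayAxisIterator implements NodeIteratorSketch {
    private final int[] nodes;
    private int pos = 0;

    ArrayAxisIterator(int[] nodes) {
        this.nodes = nodes;
    }

    public int next() {
        return pos < nodes.length ? nodes[pos++] : END;
    }
}

// Sits on top of an axis iterator and skips every node the filter rejects.
final class StrippingIteratorSketch implements NodeIteratorSketch {
    private final NodeIteratorSketch source;
    private final StripFilterSketch filter;

    StrippingIteratorSketch(NodeIteratorSketch source, StripFilterSketch filter) {
        this.source = source;
        this.filter = filter;
    }

    public int next() {
        int node;
        while ((node = source.next()) != END) {
            if (!filter.stripSpace(node)) {
                return node; // first node that survives the filter
            }
        }
        return END;
    }
}

// Stand-in for the DOM: wraps axis iterators only when a filter is installed.
final class DomSketch {
    private StripFilterSketch filter; // null means nothing is stripped

    void setFilter(StripFilterSketch filter) {
        this.filter = filter;
    }

    NodeIteratorSketch children(int[] nodes) {
        NodeIteratorSketch axis = new ArrayAxisIterator(nodes);
        return filter == null ? axis : new StrippingIteratorSketch(axis, filter);
    }
}
```

| <P>Translets that install no filter get the bare axis iterator, so they pay |
| nothing for a feature they do not use.</P> |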
| |
| </FONT></TD></TR></TABLE><BR> |
| </TD></TR></TABLE></TD></TR></TABLE><BR><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD bgcolor="#0086b2"><IMG alt="dot" height="1" src="resources/dot.gif" width="1"></TD></TR><TR><TD align="center"><FONT color="#0086b2" size="-1"><I> |
| Copyright © 2004 The Apache Software Foundation. |
| All Rights Reserved. |
| </I></FONT></TD></TR></TABLE></BODY></HTML> |