| <?xml version="1.0"?> |
| <!DOCTYPE spec SYSTEM "../../style/dtd/spec.dtd"> |
| <spec> |
| <title>Transformation API For XML (TrAX)</title> |
| <frontmatter> |
| <pubdate>November 12, 2000</pubdate> |
| <copyright>Copyright 2000 Java Community Process (Sun Microsystems, |
| Inc.)</copyright> |
| <author><firstname>Scott</firstname> |
| <surname>Boag</surname> |
| <orgname>IBM Research</orgname> |
| <address> |
| <email>sboag@lotus.com</email> |
| </address> |
| </author></frontmatter> |
| <introduction> |
| <title>Introduction</title> |
| <para>This overview describes the set of APIs contained in |
| <ulink url="package-summary.html">javax.xml.transform</ulink>, <ulink url="package-summary.html">javax.xml.transform.stream</ulink>, <ulink url="package-summary.html">javax.xml.transform.dom</ulink>, and <ulink url="package-summary.html">javax.xml.transform.sax</ulink>. For the sake of brevity, these interfaces are referred to |
| as TrAX (Transformation API for XML). </para> |
| <para>There is a broad need for Java applications to be able to transform XML |
| and related tree-shaped data structures. In fact, XML is not normally very |
| useful to an application without going through some sort of transformation, |
| unless the semantic structure is used directly as data. Almost all XML-related |
| applications need to perform transformations. Transformations may be described |
| by Java code, Perl code, <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink> |
| Stylesheets, other types of script, or by proprietary formats. A transformation |
| takes one or more inputs, each of which may be a URL, an XML stream, a DOM tree, SAX |
| events, or a proprietary format or data structure. The output types are |
| much the same as the input types, but different inputs may need to be |
| combined with different outputs.</para> |
| <para>The great challenge of a transformation API is how to deal with all the |
| possible combinations of inputs and outputs, without becoming specialized for |
| any of the given types.</para> |
| <para>The Java community will greatly benefit from a common API that will |
| allow them to understand and apply a single model, write to consistent |
| interfaces, and apply the transformations polymorphically. TrAX attempts to |
| define a model that is clean and generic, yet fills general application |
| requirements across a wide variety of uses. </para> |
| <sect2> |
| <title>General Terminology</title> |
| <para>This section will explain some general terminology used in this |
| document. Technical terminology will be explained in the Model section. In many |
| cases, the general terminology overlaps with the technical terminology.</para> |
| <variablelist> |
| <varlistentry> |
| <term>Tree</term> |
| <listitem>This term, as used within this document, describes an |
| abstract structure that consists of nodes or events that may be produced by |
| XML. A Tree physically may be a DOM tree, a series of well-balanced parse |
| events (such as those coming from a SAX2 ContentHandler), a series of requests |
| (the result of which can describe a tree), or a stream of marked-up |
| characters.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Source Tree(s)</term> |
| <listitem>One or more trees that are the inputs to the |
| transformation.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Result Tree(s)</term> |
| <listitem>One or more trees that are the output of the |
| transformation.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Transformation</term> |
| <listitem>The process of consuming a stream or tree to produce |
| another stream or tree.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Identity (or Copy) Transformation</term> |
| <listitem>The process of transformation from a source to a result, |
| making as few structural changes as possible and no informational changes. The |
| term is somewhat loosely used, as the process is really a copy from one |
| "format" (such as a DOM tree, stream, or set of SAX events) to |
| another.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Serialization</term> |
| <listitem>The process of taking a tree and turning it into a stream. In |
| some sense, a serialization is a specialized transformation.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Parsing</term> |
| <listitem>The process of taking a stream and turning it into a tree. In |
| some sense, parsing is a specialized transformation.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Transformer</term> |
| <listitem>A Transformer is the object that executes the transformation. |
| </listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Transformation instructions</term> |
| <listitem>Describes the transformation. A form of code, script, or |
| simply a declaration or series of declarations.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Stylesheet</term> |
| <listitem>The same as "transformation instructions," except it is |
| likely to be used in conjunction with <ulink |
| url="http://www.w3.org/TR/xslt">XSLT</ulink>.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Templates</term> |
| <listitem>Another form of "transformation instructions." In the TrAX |
| interface, this term is used to describe processed or compiled transformation |
| instructions. The Source flows through a Templates object to be formed into the |
| Result.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>Processor</term> |
| <listitem>A general term for the thing that may both process the |
| transformation instructions, and perform the transformation.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>DOM</term> |
| <listitem>Document Object Model, specifically referring to the |
| <termref link-url="http://www.w3.org/TR/DOM-Level-2 ">Document Object Model |
| (DOM) Level 2 Specification</termref>.</listitem> |
| </varlistentry> |
| <varlistentry> |
| <term>SAX</term> |
| <listitem>Simple API for XML, specifically referring to the |
| <termref link-url="http://www.megginson.com/SAX/SAX2">SAX 2.0 |
| release</termref>.</listitem> |
| </varlistentry> |
| </variablelist> |
| </sect2></introduction> |
| <requirements> |
| <title>Requirements</title> |
| <para>The following requirements have been determined from broad experience |
| with XML projects from the various members participating on the JCP.</para> |
| <orderedlist> |
| <listitem id="requirement-simple">TrAX must provide a clean, simple |
| interface for simple uses.</listitem> |
| <listitem id="requirement-general">TrAX must be powerful enough to be |
| applied to a wide range of uses, such as e-commerce, content management, |
| server content delivery, and client applications.</listitem> |
| <listitem id="requirement-optimizeable">A processor that implements a TrAX |
| interface must be optimizable. Performance is a critical issue for most |
| transformation use cases.</listitem> |
| <listitem id="requirement-compiled-model">As a specialization of the above |
| requirement, a TrAX processor must be able to support a compiled model, so that |
| a single set of transformation instructions can be compiled, optimized, and |
| applied to a large set of input sources.</listitem> |
| <listitem id="requirement-independence">TrAX must not be dependent on any |
| given type of transformation instructions. For instance, it must remain |
| independent of <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink>.</listitem> |
| <listitem id="requirement-from-dom">TrAX must be able to allow processors |
| to transform DOM trees.</listitem> |
| <listitem id="requirement-to-dom">TrAX must be able to allow processors to |
| produce DOM trees.</listitem> |
| <listitem id="requirement-from-sax">TrAX must allow processors to transform |
| SAX events.</listitem> |
| <listitem id="requirement-to-sax">TrAX must allow processors to produce SAX |
| events.</listitem> |
| <listitem id="requirement-from-stream">TrAX must allow processors to |
| transform streams of XML.</listitem> |
| <listitem id="requirement-to-stream">TrAX must allow processors to produce |
| XML, HTML, and other types of streams.</listitem> |
| <listitem id="requirement-combo-input-output">TrAX must allow processors to |
| implement the various combinations of inputs and outputs within a single |
| processor.</listitem> |
| <listitem id="requirement-limited-input-output">TrAX must allow processors |
| to implement only a limited set of inputs. For instance, it should be possible |
| to write a processor that implements the TrAX interfaces and that only |
| processes DOM trees, not streams or SAX events.</listitem> |
| <listitem id="requirement-proprietary-data-structures">TrAX should allow a |
| processor to implement transformations of proprietary data structures. For |
| instance, it should be possible to implement a processor that provides TrAX |
| interfaces and performs transformations of JDOM trees.</listitem> |
| <listitem id="requirement-serialization-props">TrAX must allow the setting |
| of serialization properties, without constraint as to what the details of those |
| properties are.</listitem> |
| <listitem id="requirement-setting-parameters">TrAX must allow the setting |
| of parameters to the transformation instructions.</listitem> |
| <listitem id="requirement-namespaced-properties">TrAX must support the |
| setting of parameters and properties as XML Namespaced items (i.e., qualified |
| names).</listitem> |
| <listitem id="requirement-relative-url-resolution">TrAX must support URL |
| resolution from within the transformation, and have it return the needed data |
| structure.</listitem> |
| <listitem id="requirement-error-reporting">TrAX must have a mechanism for |
| reporting errors and warnings to the calling application.</listitem> |
| </orderedlist> </requirements> |
| <model> |
| <title>Model</title> |
| <para>This section defines the abstract model for TrAX, apart from the details |
| of the interfaces.</para> |
| <para>A TrAX <termref |
| link-url="pattern-TransformerFactory">TransformerFactory</termref> is an object |
| that processes transformation instructions, and produces |
| <termref link-url="pattern-Templates">Templates</termref> (in the technical |
| terminology). A <termref link-url="pattern-Templates">Templates</termref> |
| object provides a <termref |
| link-url="pattern-Transformer">Transformer</termref>, which transforms one or |
| more <termref link-url="pattern-Source">Source</termref>s into one or more |
| <termref link-url="pattern-Result">Result</termref>s.</para> |
| <para>To use the TrAX interface, you create a |
| <termref link-url="pattern-TransformerFactory">TransformerFactory</termref>, |
| which may directly provide a <termref |
| link-url="pattern-Transformer">Transformer</termref>, or which can provide |
| <termref link-url="pattern-Templates">Templates</termref> from a variety of |
| <termref link-url="pattern-Source">Source</termref>s. The |
| <termref link-url="pattern-Templates">Templates</termref> object is a processed |
| or compiled representation of the transformation instructions, and provides a |
| <termref link-url="pattern-Transformer">Transformer</termref>. The |
| <termref link-url="pattern-Transformer">Transformer</termref> processes a |
| <termref link-url="pattern-Source">Source</termref> according to the |
| instructions found in the <termref |
| link-url="pattern-Templates">Templates</termref>, and produces a |
| <termref link-url="pattern-Result">Result</termref>.</para> |
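| <para>The flow described above (factory, then Templates, then one Transformer per |
| use) can be sketched as follows. This is only an illustration under stated |
| assumptions: the class name <code>TemplatesReuse</code>, its helper methods, and the |
| inline stylesheet are hypothetical and not part of TrAX; only the |
| <code>javax.xml.transform</code> calls come from the API.</para> |

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatesReuse {
    // Hypothetical stylesheet, used only for illustration:
    // it writes the name of the document element as text.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='name(/*)'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Process the transformation instructions once.
    static Templates compile() throws TransformerException {
        TransformerFactory tfactory = TransformerFactory.newInstance();
        return tfactory.newTemplates(new StreamSource(new StringReader(XSL)));
    }

    // Obtain a fresh Transformer for each transformation; the Templates
    // object itself can be shared.
    static String apply(Templates templates, String xml)
            throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile();
        System.out.println(apply(templates, "<a/>"));
        System.out.println(apply(templates, "<b/>"));
    }
}
```

| <para>Creating a new Transformer per call keeps the sketch in line with the |
| thread-safety notes in the pattern descriptions: the Templates object is shared, |
| while each Transformer stays with a single caller.</para> |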
| <para>The process of transformation from a tree, either in the form of an |
| object model, or in the form of parse events, into a stream, is known as |
| <termref>serialization</termref>. We believe this is the most suitable term for |
| this process, despite the overlap with Java object serialization.</para> |
| <patterns module="TRaX"> <pattern><pattern-name |
| id="pattern-Processor">Processor</pattern-name><intent>Generic concept for the |
| set of objects that implement the TrAX interfaces.</intent> |
| <responsibilities>Create compiled transformation instructions, transform |
| sources, and manage transformation parameters and |
| properties.</responsibilities><thread-safety>Only the Templates object can be |
| used concurrently in multiple threads. The rest of the processor does not do |
| synchronized blocking, and so may not be used to perform multiple concurrent |
| operations.</thread-safety></pattern><pattern> |
| <pattern-name id="pattern-TransformerFactory">TransformerFactory</pattern-name> |
| <intent>Serve as a vendor-neutral Processor interface for |
| <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink> and similar |
| processors.</intent> <responsibilities>Serve as a factory for a concrete |
| implementation of a TransformerFactory, serve as a direct factory for |
| Transformer objects, serve as a factory for Templates objects, and manage |
| processor specific features.</responsibilities> <thread-safety>A |
| TransformerFactory may not perform multiple concurrent |
| operations.</thread-safety> </pattern> <pattern> |
| <pattern-name id="pattern-Templates">Templates</pattern-name> <intent>The |
| runtime representation of the transformation instructions.</intent> |
| <responsibilities>A data bag for transformation instructions; act as a factory |
| for Transformers.</responsibilities> <thread-safety>Threadsafe for concurrent |
| usage over multiple threads once construction is complete.</thread-safety> |
| </pattern> <pattern> <pattern-name |
| id="pattern-Transformer">Transformer</pattern-name> <intent>Act as a per-thread |
| execution context for transformations, act as an interface for performing the |
| transformation.</intent><responsibilities>Perform the |
| transformation.</responsibilities> <thread-safety>Only one instance per thread |
| is safe.</thread-safety> <notes>The Transformer is bound to the Templates |
| object that created it.</notes> </pattern> <pattern> |
| <pattern-name id="pattern-Source">Source</pattern-name> <intent>Serve as a |
| single vendor-neutral object for multiple types of input.</intent> |
| <responsibilities>Act as simple data holder for System IDs, DOM nodes, streams, |
| etc.</responsibilities> <thread-safety>Threadsafe concurrently over multiple |
| threads for read-only operations; must be synchronized for edit |
| operations.</thread-safety> </pattern><pattern> |
| <pattern-name id="pattern-Result">Result</pattern-name> |
| <potential-alternate-name>ResultTarget</potential-alternate-name> <intent>Serve |
| as a single object for multiple types of output, so there can be simple process |
| method signatures.</intent> <responsibilities>Act as simple data holder for |
| output stream, DOM node, ContentHandler, etc.</responsibilities> |
| <thread-safety>Threadsafe concurrently over multiple threads for read-only, |
| must be synchronized for edit.</thread-safety> </pattern> </patterns></model> |
| <sect1 id="package"> |
| <title>javax.xml.transform</title> |
| <para>This package defines the generic APIs for processing transformation instructions, |
| and performing a transformation from source to result. For an overview, see |
| <ulink url="trax.html">Transformation API for XML (TrAX)</ulink>. The TrAX |
| interfaces have no dependencies on SAX or the DOM standard, and try to make as |
| few assumptions as possible about the details of the source and result of a |
| transformation. TrAX achieves this by defining |
| <plink>javax.xml.transform.Source</plink> and |
| <plink>javax.xml.transform.Result</plink> interfaces.</para> |
| <para>To define concrete classes for the user, TrAX defines specializations |
| of the interfaces found at the TrAX root level. These interfaces are found in |
| <plink>javax.xml.transform.sax</plink>, <plink>javax.xml.transform.dom</plink>, |
| and <plink>javax.xml.transform.stream</plink>.</para> |
| <para>The following illustrates a simple transformation from input URI to |
| result stream.</para> |
| <programlisting> // Create a transform factory instance. |
| TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Create a transformer for the stylesheet. |
| Transformer transformer |
| = tfactory.newTransformer(new StreamSource(xslID)); |
| |
| // Transform the source XML to System.out. |
| transformer.transform( new StreamSource(sourceID), |
| new StreamResult(System.out)); |
| </programlisting> |
| <sect2> |
| <title>Creating Objects</title> |
| <para>TrAX allows a concrete |
| <plink>javax.xml.transform.TransformerFactory</plink> object to be created from |
| the static method |
| <plink>javax.xml.transform.TransformerFactory#newInstance</plink>. The |
| "javax.xml.transform.TransformerFactory" system property determines which |
| factory implementation to instantiate. This property names a concrete subclass |
| of the TransformerFactory abstract class. If this system property is not |
| defined, a platform default is used.</para> |
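| <para>As a rough sketch of the lookup just described, the following code reports |
| which factory class was chosen. The class name <code>FactoryLookup</code> and its |
| method are hypothetical. The error type caught here is the one found in the |
| released <code>javax.xml.transform</code> package.</para> |

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

class FactoryLookup {
    // Name of the system property described in the text.
    static final String FACTORY_PROPERTY =
        "javax.xml.transform.TransformerFactory";

    // Returns a short description of the factory that newInstance() selected.
    static String describeFactory() {
        String configured = System.getProperty(FACTORY_PROPERTY);
        try {
            TransformerFactory tfactory = TransformerFactory.newInstance();
            String origin = (configured == null)
                ? "platform default" : "system property";
            return origin + ": " + tfactory.getClass().getName();
        } catch (TransformerFactoryConfigurationError e) {
            // The configured class could not be found or instantiated.
            return "configuration error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFactory());
    }
}
```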
| </sect2> |
| <sect2> |
| <title>Specification of Inputs and Outputs</title> |
| <para>TrAX defines two interface objects called |
| <plink>javax.xml.transform.Source</plink> and |
| <plink>javax.xml.transform.Result</plink>. In order to pass Source and Result |
| objects to the TrAX interfaces, concrete classes must be used. TrAX defines |
| three concrete representations for each of these objects: |
| <plink>javax.xml.transform.stream.StreamSource</plink> and |
| <plink>javax.xml.transform.stream.StreamResult</plink>, |
| <plink>javax.xml.transform.sax.SAXSource</plink> and |
| <plink>javax.xml.transform.sax.SAXResult</plink>, and |
| <plink>javax.xml.transform.dom.DOMSource</plink> and |
| <plink>javax.xml.transform.dom.DOMResult</plink>. Each of these objects defines |
| a FEATURE string (which is in the form of a URL), which can be passed into |
| <plink>javax.xml.transform.TransformerFactory#getFeature</plink> to see if the |
| given type of Source or Result object is supported. For instance, to test if a |
| DOMSource and a StreamResult are supported, you can apply the following |
| test.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| if (tfactory.getFeature(DOMSource.FEATURE) && tfactory.getFeature(StreamResult.FEATURE)) |
| { |
| ... |
| }</programlisting> |
| </sect2> |
| <sect2> |
| <title id="qname-delimiter">Qualified Name representation</title> |
| <para><ulink url="http://www.w3.org/TR/REC-xml-names">Namespaces</ulink> |
| present something of a problem area when dealing with XML objects. Qualified |
| Names appear in XML markup as prefixed names. But the prefixes themselves do |
| not hold identity. Rather, it is the URIs that they contextually map to that |
| hold the identity. Therefore, when passing a Qualified Name like "xyz:foo" |
| among Java programs, one must provide a means to map "xyz" to a namespace. |
| </para> |
| <para>One solution has been to create a "QName" object that holds the |
| namespace URI, as well as the prefix and local name, but this is not always an |
| optimal solution, as when, for example, you want to use unique strings as keys |
| in a dictionary object. Not having a string representation also makes it |
| difficult to specify a namespaced identity outside the context of an XML |
| document.</para> |
| <para>In order to pass namespaced values to transformations, for instance |
| as a set of properties to the Serializer, this specification defines that a |
| String "qname" object parameter be passed as a two-part string, the namespace URI |
| enclosed in curly braces ({}), followed by the local name. If the qname has a |
| null URI, then the String object only contains the local name. An application |
| can safely check for a non-null URI by testing to see if the first character of |
| the name is a '{' character.</para> |
| <para>For example, if a URI and local name were obtained from an element |
| defined with <xyz:foo xmlns:xyz="http://xyz.foo.com/yada/baz.html"/>, |
| then the TrAX Qualified Name would be "{http://xyz.foo.com/yada/baz.html}foo". |
| Note that the prefix is lost.</para> |
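| <para>A minimal sketch of this string convention follows. The class |
| <code>TraxQName</code> and its methods are hypothetical helpers, not part of TrAX, |
| and treating an empty URI the same as a null URI is an assumption made here for |
| convenience.</para> |

```java
class TraxQName {
    // Build the "{namespaceURI}localName" form; a null URI yields
    // just the local name. (Empty is treated like null: an assumption.)
    static String toTraxName(String namespaceUri, String localName) {
        if (namespaceUri == null || namespaceUri.isEmpty()) {
            return localName;
        }
        return "{" + namespaceUri + "}" + localName;
    }

    // Returns the namespace URI, or null if the name is not qualified.
    static String namespaceOf(String traxName) {
        if (!traxName.startsWith("{")) {
            return null;
        }
        return traxName.substring(1, closingBrace(traxName));
    }

    // Returns the local part of the name.
    static String localNameOf(String traxName) {
        if (!traxName.startsWith("{")) {
            return traxName;
        }
        return traxName.substring(closingBrace(traxName) + 1);
    }

    private static int closingBrace(String traxName) {
        int close = traxName.indexOf('}');
        if (close < 0) {
            throw new IllegalArgumentException(
                "Unterminated namespace URI: " + traxName);
        }
        return close;
    }

    public static void main(String[] args) {
        String name = toTraxName("http://xyz.foo.com/yada/baz.html", "foo");
        System.out.println(name);
        System.out.println(namespaceOf(name));
        System.out.println(localNameOf(name));
    }
}
```

| <para>Because the result is an ordinary String, it can be used directly as a key |
| in a map or a Properties object.</para> |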
| </sect2> |
| <sect2> |
| <title>Result Tree Serialization</title> |
| <para>Serialization of the result tree to a stream can be controlled with |
| the <plink>javax.xml.transform.Transformer#setOutputProperties</plink> and the |
| <plink>javax.xml.transform.Transformer#setOutputProperty</plink> methods. |
| Strings that match the <ulink url="http://www.w3.org/TR/xslt#output">XSLT |
| specification for xsl:output attributes</ulink> can be referenced from the |
| <plink>javax.xml.transform.OutputKeys</plink> class. Other strings can be |
| specified as well. If the transformer does not recognize an output key, a |
| <plink>java.lang.IllegalArgumentException</plink> is thrown, |
| <emphasis>unless</emphasis> the key name is <link |
| linkend="qname-delimiter">namespace qualified</link>. Output key names that are |
| qualified by a namespace are ignored or passed on to the serializer |
| mechanism.</para> |
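| <para>The following sketch sets individual output properties through |
| <code>OutputKeys</code> constants and probes how a transformer reacts to |
| unrecognized keys. The class <code>OutputPropertySketch</code>, its methods, and |
| both probe keys (including the <code>http://example.com/hypothetical</code> |
| namespace) are hypothetical. Whether a given key is accepted depends on the |
| processor, so no particular outcome is claimed for the probes.</para> |

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputPropertySketch {
    // Returns true if the transformer accepted the key,
    // false if it rejected it with IllegalArgumentException.
    static boolean trySet(Transformer t, String key, String value) {
        try {
            t.setOutputProperty(key, value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Copy the XML text to a String using standard xsl:output keys.
    static String serialize(String xml) throws TransformerException {
        Transformer serializer =
            TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new StreamSource(new StringReader(xml)),
                             new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Hypothetical keys: one unqualified, one namespace qualified.
        System.out.println("unqualified accepted: "
            + trySet(t, "hypothetical-key", "1"));
        System.out.println("qualified accepted: "
            + trySet(t, "{http://example.com/hypothetical}key", "1"));
        System.out.println(serialize("<a>b</a>"));
    }
}
```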
| <para>If all that is desired is the simple identity transformation of a |
| source to a result, then <plink>javax.xml.transform.TransformerFactory</plink> |
| provides a |
| <plink>javax.xml.transform.TransformerFactory#newTransformer()</plink> method |
| with no arguments. This method creates a Transformer that effectively copies |
| the source to the result. This method may be used to create a DOM from SAX |
| events or to create an XML or HTML stream from a DOM or SAX events. The |
| following example illustrates the serialization of a DOM node to an XML |
| stream.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| Transformer serializer = tfactory.newTransformer(); |
| Properties oprops = new Properties(); |
| // Serialize as XML; OutputKeys holds the standard xsl:output names. |
| oprops.setProperty(OutputKeys.METHOD, "xml"); |
| oprops.setProperty(OutputKeys.INDENT, "yes"); |
| serializer.setOutputProperties(oprops); |
| serializer.transform(new DOMSource(doc), |
| new StreamResult(System.out));</programlisting> |
| </sect2> |
| <sect2> |
| <title>Exceptions and Error Reporting</title> |
| <para>The TrAX APIs throw three types of specialized exceptions. A |
| <plink>javax.xml.transform.TransformerFactoryConfigurationError</plink> is parallel to |
| the <plink>javax.xml.parsers.FactoryConfigurationError</plink>, and is thrown |
| when a configuration problem with the TransformerFactory exists. This error |
| will typically be thrown when the transformation factory class specified with |
| the "javax.xml.transform.TransformerFactory" system property cannot be found or |
| instantiated.</para> |
| <para>A <plink>javax.xml.transform.TransformerConfigurationException</plink> |
| may be thrown if for any reason a Transformer cannot be created. A |
| TransformerConfigurationException may be thrown if there is a syntax error in |
| the transformation instructions, for example when |
| <plink>javax.xml.transform.TransformerFactory#newTransformer</plink> is |
| called.</para> |
| <para><plink>javax.xml.transform.TransformerException</plink> is a general |
| exception that occurs during the course of a transformation. A transformer |
| exception may wrap another exception, and if any of the |
| <plink>javax.xml.transform.TransformerException#printStackTrace()</plink> |
| methods are called on it, it will produce a list of stack dumps, starting from |
| the most recent. The transformer exception also provides a |
| <plink>javax.xml.transform.SourceLocator</plink> object which indicates where |
| in the source tree or transformation instructions the error occurred. |
| <plink>javax.xml.transform.TransformerException#getMessageAndLocation()</plink> |
| may be called to get an error message with location info, and |
| <plink>javax.xml.transform.TransformerException#getLocationAsString()</plink> |
| may be called to get just the location string.</para> |
| <para>Transformation warnings and errors are normally first sent to a |
| <plink>javax.xml.transform.ErrorListener</plink>, at which point the |
| implementor may decide to report the error or warning, and may decide to throw |
| an exception for a non-fatal error. The error listener may be set via |
| <plink>javax.xml.transform.TransformerFactory#setErrorListener</plink> for |
| reporting errors that have to do with syntax errors in the transformation |
| instructions, or via |
| <plink>javax.xml.transform.Transformer#setErrorListener</plink> to report |
| errors that occur during the transformation. The error listener on both objects |
| should always be valid and non-null, whether set by the user or a default |
| implementation provided by the processor.</para> |
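| <para>One possible error listener is sketched below. It records every report and |
| rethrows only fatal errors. The class name <code>CollectingErrorListener</code> and |
| its <code>messages</code> field are hypothetical, and the policy of continuing |
| after non-fatal errors is just one choice an implementor might make.</para> |

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class CollectingErrorListener implements ErrorListener {
    // Every report received so far, in order of arrival.
    final List<String> messages = new ArrayList<>();

    @Override
    public void warning(TransformerException e) {
        messages.add("warning: " + e.getMessageAndLocation());
    }

    @Override
    public void error(TransformerException e) {
        // Recoverable error: record it and let processing continue.
        messages.add("error: " + e.getMessageAndLocation());
    }

    @Override
    public void fatalError(TransformerException e)
            throws TransformerException {
        // Record the report, then stop processing by rethrowing.
        messages.add("fatal: " + e.getMessageAndLocation());
        throw e;
    }

    public static void main(String[] args) {
        CollectingErrorListener listener = new CollectingErrorListener();
        TransformerFactory tfactory = TransformerFactory.newInstance();
        tfactory.setErrorListener(listener);
        try {
            // Deliberately malformed stylesheet (unclosed start tag).
            tfactory.newTransformer(
                new StreamSource(new StringReader("<xsl:stylesheet")));
        } catch (TransformerConfigurationException e) {
            System.out.println("not created: " + e.getMessage());
        }
        listener.messages.forEach(System.out::println);
    }
}
```

| <para>The same listener instance could also be passed to |
| <code>Transformer.setErrorListener</code> to collect reports raised during the |
| transformation itself.</para> |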
| </sect2> |
| <sect2> |
| <title>Resolution of URIs within a transformation</title> |
| <para>TrAX provides a way for URIs referenced from within the stylesheet |
| instructions or within the transformation to be resolved by the calling |
| application. This can be done by creating a class that implements the |
| URIResolver interface, with its one method, |
| <plink>javax.xml.transform.URIResolver#resolve</plink>, and using an instance of this class to |
| set the URI resolution for the transformation instructions or transformation |
| with <plink>javax.xml.transform.TransformerFactory#setURIResolver</plink> or |
| <plink>javax.xml.transform.Transformer#setURIResolver</plink>. The |
| URIResolver.resolve method takes two String arguments, the URI found in the |
| stylesheet instructions or built as part of the transformation process, and the |
| base URI in effect when the URI passed as the first argument was encountered. |
| The returned <plink>javax.xml.transform.Source</plink> object must be usable by |
| the transformer, as specified in its implemented features.</para> |
| <para>The following example illustrates the use of the URI resolver to |
| resolve URIs to DOM nodes, in a transformation whose input is entirely |
| DOM-based.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| if (tfactory.getFeature(DOMSource.FEATURE) && tfactory.getFeature(StreamResult.FEATURE)) |
| { |
| DocumentBuilderFactory dfactory = |
| DocumentBuilderFactory.newInstance(); |
| dfactory.setNamespaceAware(true); // Always, required for XSLT |
| DocumentBuilder docBuilder = dfactory.newDocumentBuilder(); |
| |
| // Set up to resolve URLs that correspond to our inc1.xsl, |
| // to a DOM node. Use an anonymous class for the URI resolver. |
| final Node xslInc1 = docBuilder.parse("xsl/inc1/inc1.xsl"); |
| final Node xslInc2 = docBuilder.parse("xsl/inc2/inc2.xsl"); |
| tfactory.setURIResolver(new URIResolver() { |
| public Source resolve(String href, String base) |
| throws TransformerException |
| { |
| // ignore base because we're lazy, or we don't care. |
| return (href.equals("inc1/inc1.xsl")) |
| ? new DOMSource(xslInc1) : |
| (href.equals("inc2/inc2.xsl")) |
| ? new DOMSource(xslInc2) : null; |
| }}); |
| |
| // The TransformerFactory will call the anonymous URI |
| // resolver set above when it encounters |
| // <xsl:include href="inc1/inc1.xsl"/> |
| Templates templates |
| = tfactory.newTemplates(new DOMSource(docBuilder.parse(xslID), xslID)); |
| |
| // Get a transformer from the templates. |
| Transformer transformer = templates.newTransformer(); |
| |
| // Set up to resolve URLs that correspond to our foo2.xml, to |
| // a DOM node. Use an anonymous class for the URI resolver. |
| // Be sure to return the same DOM tree every time for the |
| // given URI. |
| final Node xmlSubdir1Foo2Node = docBuilder.parse("xml/subdir1/foo2.xml"); |
| transformer.setURIResolver(new URIResolver() { |
| public Source resolve(String href, String base) |
| throws TransformerException |
| { |
| // ignore base because we're lazy, or we don't care. |
| return (href.equals("subdir1/foo2.xml")) |
| ? new DOMSource(xmlSubdir1Foo2Node) : null; |
| }}); |
| |
| // Now the transformer will call our anonymous URI resolver |
| // when it encounters the document('subdir1/foo2.xml') invocation. |
| transformer.transform(new DOMSource(docBuilder.parse(sourceID), sourceID), |
| new StreamResult(System.out)); |
| } |
| </programlisting> |
| </sect2> |
| <sect2 id="specialized-packages"> |
| <title>Specialized Packages</title> |
| <sect3> |
| <title>javax.xml.transform.stream</title> |
| <para>This package implements stream- and URI-specific transformation APIs. |
| </para> |
| <para>The <plink>javax.xml.transform.stream.StreamSource</plink> class |
| provides methods for specifying <plink>java.io.InputStream</plink> input, |
| <plink>java.io.Reader</plink> input, and URL input in the form of strings. Even |
| if an input stream or reader is specified as the source, |
| <plink>javax.xml.transform.stream.StreamSource#setSystemId</plink> should still |
| be called, so that the transformer can know from where it should resolve |
| relative URIs. The public identifier is always optional: if the application |
| writer includes one, it will be provided as part of the |
| <plink>javax.xml.transform.SourceLocator</plink> information.</para> |
| <para>The <plink>javax.xml.transform.stream.StreamResult</plink> class |
| provides methods for specifying <plink>java.io.OutputStream</plink>, |
| <plink>java.io.Writer</plink>, or an output system ID, as the output of the |
| transformation result.</para> |
| <para>Normally streams should be used rather than readers or writers, for |
| both the Source and Result, since readers and writers already have the encoding |
| established to and from the internal Unicode format. However, there are times |
| when it is useful to write to a character stream, such as when using a |
| StringWriter in order to write to a String, or in the case of reading source |
| XML from a StringReader.</para> |
| <para>The following code fragment illustrates the use of the stream Source |
| and Result objects.</para> |
| <programlisting> // Create a TransformerFactory instance. |
| TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| InputStream xslIS = new BufferedInputStream(new FileInputStream(xslID)); |
| StreamSource xslSource = new StreamSource(xslIS); |
| // Note that if we don't do this, relative URLs cannot be resolved correctly! |
| xslSource.setSystemId(xslID); |
| |
| // Create a transformer for the stylesheet. |
| Transformer transformer = tfactory.newTransformer(xslSource); |
| |
| InputStream xmlIS = new BufferedInputStream(new FileInputStream(sourceID)); |
| StreamSource xmlSource = new StreamSource(xmlIS); |
| // Note that if we don't do this, relative URLs cannot be resolved correctly! |
| xmlSource.setSystemId(sourceID); |
| |
| // Transform the source XML to System.out. |
| transformer.transform( xmlSource, new StreamResult(System.out)); |
| </programlisting> |
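| <para>The character-stream case mentioned above (reading from a StringReader and |
| writing to a StringWriter) might look like the following sketch. The class |
| <code>StringTransform</code>, the inline stylesheet, and the system ID |
| <code>file:///hypothetical/in-memory.xsl</code> are assumptions made for |
| illustration only.</para> |

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StringTransform {
    // Hypothetical stylesheet: writes the text of /greeting.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='/greeting'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transform XML held in a String and return the result as a String.
    static String transform(String xsl, String xml)
            throws TransformerException {
        TransformerFactory tfactory = TransformerFactory.newInstance();

        // A Reader carries no location, so a system ID is supplied
        // alongside it (a hypothetical one here) for relative URIs.
        StreamSource xslSource = new StreamSource(
            new StringReader(xsl), "file:///hypothetical/in-memory.xsl");
        Transformer transformer = tfactory.newTransformer(xslSource);

        // A StringWriter collects the result as characters.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSL, "<greeting>hello</greeting>"));
    }
}
```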
| </sect3> |
| <sect3> |
| <title>javax.xml.transform.sax</title> |
| <para>This package implements SAX2-specific transformation APIs. It provides |
| classes which allow input from <plink>org.xml.sax.ContentHandler</plink> |
| events, and also classes that produce org.xml.sax.ContentHandler events. It |
| also provides methods to set the input source as an |
| <plink>org.xml.sax.XMLReader</plink>, or to use an |
| <plink>org.xml.sax.InputSource</plink> as the source. It also allows the |
| creation of an <plink>org.xml.sax.XMLFilter</plink>, which enables |
| transformations to "pull" from other transformations, and lets the transformer |
| be used polymorphically as an <plink>org.xml.sax.XMLReader</plink>.</para> |
| <para>The <plink>javax.xml.transform.sax.SAXSource</plink> class allows the |
| setting of an <plink>org.xml.sax.XMLReader</plink> to be used for "pulling" |
| parse events, and an <plink>org.xml.sax.InputSource</plink> that may be used to |
| specify the SAX source.</para> |
| <para>The <plink>javax.xml.transform.sax.SAXResult</plink> class allows the |
| setting of an <plink>org.xml.sax.ContentHandler</plink> to be the receiver of |
| SAX2 events from the transformation. The following code fragment illustrates |
| the use of the SAXSource and SAXResult objects.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Does this factory support SAX features? |
| if (tfactory.getFeature(SAXSource.FEATURE) &amp;&amp; tfactory.getFeature(SAXResult.FEATURE)) |
| { |
| // Get a transformer. |
| Transformer transformer |
| = tfactory.newTransformer(new StreamSource(xslID)); |
| |
| // Create a reader for parsing the source document. |
| XMLReader reader = XMLReaderFactory.createXMLReader(); |
| |
| transformer.transform(new SAXSource(reader, new InputSource(sourceID)), |
| new SAXResult(new ExampleContentHandler())); |
| } |
| </programlisting> |
| <para>The <plink>javax.xml.transform.sax.SAXTransformerFactory</plink> extends |
| <plink>javax.xml.transform.TransformerFactory</plink> to provide factory |
| methods for creating <plink>javax.xml.transform.sax.TemplatesHandler</plink>, |
| <plink>javax.xml.transform.sax.TransformerHandler</plink>, and |
| <plink>org.xml.sax.XMLReader</plink> instances.</para> |
| <para>To obtain a <plink>javax.xml.transform.sax.SAXTransformerFactory</plink>, |
| the caller must cast the <plink>javax.xml.transform.TransformerFactory</plink> |
| instance returned from |
| <plink>javax.xml.transform.TransformerFactory#newInstance</plink>. For |
| example:</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Does this factory support the SAXTransformerFactory feature? |
| if (tfactory.getFeature(SAXTransformerFactory.FEATURE)) |
| { |
| // If so, we can safely cast. |
| SAXTransformerFactory stfactory = ((SAXTransformerFactory) tfactory); |
| |
| // A TransformerHandler is a ContentHandler that will listen for |
| // SAX events, and transform them to the result. |
| TransformerHandler handler |
| = stfactory.newTransformerHandler(new StreamSource(xslID)); |
| // ... |
| } |
| </programlisting> |
| <para>The <plink>javax.xml.transform.sax.TransformerHandler</plink> interface |
| allows a transformation to be created from SAX2 parse events, which is a "push" |
| model rather than the "pull" model that normally occurs for a transformation. |
| Normal parse events are received through the |
| <plink>org.xml.sax.ContentHandler</plink> interface, lexical events such as |
| startCDATA and endCDATA are received through the |
| <plink>org.xml.sax.ext.LexicalHandler</plink> interface, and events that signal |
| the start or end of disabling output escaping are received via |
| <plink>org.xml.sax.ContentHandler#processingInstruction</plink>, with the |
| target parameter being |
| <plink>javax.xml.transform.Result#PI_DISABLE_OUTPUT_ESCAPING</plink> and |
| <plink>javax.xml.transform.Result#PI_ENABLE_OUTPUT_ESCAPING</plink>. If |
| parameters, output properties, or other features need to be set on the |
| TransformerHandler, a <plink>javax.xml.transform.Transformer</plink> reference |
| will need to be obtained from |
| <plink>javax.xml.transform.sax.TransformerHandler#getTransformer</plink>, and |
| the methods invoked on that reference. The following illustrates the feeding |
| of SAX events from an <plink>org.xml.sax.XMLReader</plink> to a |
| Transformer.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Does this factory support SAX features? |
| if (tfactory.getFeature(SAXTransformerFactory.FEATURE)) |
| { |
| // If so, we can safely cast. |
| SAXTransformerFactory stfactory = ((SAXTransformerFactory) tfactory); |
| |
| // A TransformerHandler is a ContentHandler that will listen for |
| // SAX events, and transform them to the result. |
| TransformerHandler handler |
| = stfactory.newTransformerHandler(new StreamSource(xslID)); |
| |
| // Set the result handling to be a serialization to System.out. |
| handler.setResult(new StreamResult(System.out)); |
| |
| handler.getTransformer().setParameter("a-param", |
| "hello to you!"); |
| |
| // Create a reader, and set its content handler to be the TransformerHandler. |
| XMLReader reader = XMLReaderFactory.createXMLReader(); |
| reader.setContentHandler(handler); |
| |
| // It's a good idea for the parser to send lexical events. |
| // The TransformerHandler is also a LexicalHandler. |
| reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler); |
| |
| // Parse the source XML, and send the parse events to the TransformerHandler. |
| reader.parse(sourceID); |
| } |
| </programlisting> |
| <para>The <plink>javax.xml.transform.sax.TemplatesHandler</plink> interface |
| allows the creation of <plink>javax.xml.transform.Templates</plink> objects |
| from SAX2 parse events. Once the <plink>org.xml.sax.ContentHandler</plink> |
| events are complete, the Templates object may be obtained from |
| <plink>javax.xml.transform.sax.TemplatesHandler#getTemplates</plink>. Note that |
| <plink>javax.xml.transform.sax.TemplatesHandler#setSystemId</plink> should |
| normally be called in order to establish a base system ID from which relative |
| URLs may be resolved. The following code fragment illustrates the creation of a |
| Templates object from SAX2 events sent from an XMLReader.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Does this factory support SAX features? |
| if (tfactory.getFeature(SAXTransformerFactory.FEATURE)) |
| { |
| // If so, we can safely cast. |
| SAXTransformerFactory stfactory = ((SAXTransformerFactory) tfactory); |
| |
| // Have the factory create a special ContentHandler that will |
| // create a Templates object. |
| TemplatesHandler handler = stfactory.newTemplatesHandler(); |
| |
| // If you don't do this, the TemplatesHandler won't know how to |
| // resolve relative URLs. |
| handler.setSystemId(xslID); |
| |
| // Create a reader, and set its content handler to be the TemplatesHandler. |
| XMLReader reader = XMLReaderFactory.createXMLReader(); |
| reader.setContentHandler(handler); |
| |
| // Parse the stylesheet, and send the parse events to the TemplatesHandler. |
| reader.parse(xslID); |
| |
| // Get the Templates reference from the handler. |
| Templates templates = handler.getTemplates(); |
| |
| // Ready to transform. |
| Transformer transformer = templates.newTransformer(); |
| transformer.transform(new StreamSource(sourceID), new StreamResult(System.out)); |
| } |
| </programlisting> |
| <para>The |
| <plink>javax.xml.transform.sax.SAXTransformerFactory#newXMLFilter</plink> |
| method allows the creation of an <plink>org.xml.sax.XMLFilter</plink>, which |
| encapsulates the SAX2 notion of a "pull" transformation. The following |
| illustrates several transformations chained together. Each filter points to a |
| parent <plink>org.xml.sax.XMLReader</plink>, and the final transformation is |
| caused by invoking <plink>org.xml.sax.XMLReader#parse</plink> on the final |
| reader in the chain.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Does this factory support SAX features? |
| if (tfactory.getFeature(SAXTransformerFactory.FEATURE)) |
| { |
| Templates stylesheet1 = tfactory.newTemplates(new StreamSource(xslID_1)); |
| Transformer transformer1 = stylesheet1.newTransformer(); |
| |
| SAXTransformerFactory stf = (SAXTransformerFactory)tfactory; |
| XMLReader reader = XMLReaderFactory.createXMLReader(); |
| |
| XMLFilter filter1 = stf.newXMLFilter(new StreamSource(xslID_1)); |
| XMLFilter filter2 = stf.newXMLFilter(new StreamSource(xslID_2)); |
| XMLFilter filter3 = stf.newXMLFilter(new StreamSource(xslID_3)); |
| |
| // filter1 will use a SAX parser as its reader. |
| filter1.setParent(reader); |
| |
| // filter2 will use filter1 as its reader. |
| filter2.setParent(filter1); |
| |
| // filter3 will use filter2 as its reader. |
| filter3.setParent(filter2); |
| |
| filter3.setContentHandler(new ExampleContentHandler()); |
| // filter3.setContentHandler(new org.xml.sax.helpers.DefaultHandler()); |
| |
| // Now, when you call filter3 to parse, it will set |
| // itself as the ContentHandler for filter2, and |
| // call filter2.parse, which will set itself as the |
| // content handler for filter1, and call filter1.parse, |
| // which will set itself as the content handler for the |
| // SAX parser, and call reader.parse(new InputSource(sourceID)). |
| filter3.parse(new InputSource(sourceID)); |
| }</programlisting> |
| </sect3> |
| <sect3> |
| <title>javax.xml.transform.dom</title> |
| <para>This package implements DOM-specific transformation APIs.</para> |
| <para>The <plink>javax.xml.transform.dom.DOMSource</plink> class allows the |
| client of the TrAX implementation to specify a DOM |
| <plink>org.w3c.dom.Node</plink> as the source of the input tree. The model of |
| how the Transformer deals with the DOM tree in terms of mismatches with the |
| <ulink url="http://www.w3.org/TR/xslt#data-model">XSLT data model</ulink> or |
| other data models is beyond the scope of this document. Any of the nodes |
| derived from <plink>org.w3c.dom.Node</plink> are legal input.</para> |
| <para>The <plink>javax.xml.transform.dom.DOMResult</plink> class allows an |
| <plink>org.w3c.dom.Node</plink> to be specified to which result DOM nodes will |
| be appended. If an output node is not specified, the transformer will use |
| <plink>javax.xml.parsers.DocumentBuilder#newDocument</plink> to create an |
| output <plink>org.w3c.dom.Document</plink> node. If a node is specified, it |
| should be one of the following: <plink>org.w3c.dom.Document</plink>, |
| <plink>org.w3c.dom.Element</plink>, or |
| <plink>org.w3c.dom.DocumentFragment</plink>. Specification of any other node |
| type is implementation dependent and undefined by this API. If the result is an |
| <plink>org.w3c.dom.Document</plink>, the output of the transformation must have |
| a single element root to set as the document element.</para> |
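| <para>As a minimal sketch of the case where an output node is specified, the |
| following appends the result of an identity transformation to an existing |
| <plink>org.w3c.dom.Element</plink>. The class name DomResultDemo, the method |
| name transformInto, and the element names are hypothetical and chosen only for |
| illustration; the exact behavior may vary between implementations.</para> |
| <programlisting><![CDATA[ |
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomResultDemo {

    // Hypothetical helper: runs an identity transformation and appends
    // the result nodes to a caller-visible wrapper element.
    public static Element transformInto() throws Exception {
        DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
        dfactory.setNamespaceAware(true);
        DocumentBuilder builder = dfactory.newDocumentBuilder();

        // Build the node that will receive the result.
        Document target = builder.newDocument();
        Element wrapper = target.createElement("wrapper");
        target.appendChild(wrapper);

        // A Transformer created without a stylesheet copies its input.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(
            new StreamSource(new StringReader("<greeting>hello</greeting>")),
            new DOMResult(wrapper));

        return wrapper;
    }

    public static void main(String[] args) throws Exception {
        Element wrapper = transformInto();
        // The result tree is now a child of the wrapper element.
        System.out.println(wrapper.getFirstChild().getNodeName());
    }
}
```
| ]]></programlisting> |
| <para>Because the target here is an Element rather than a Document, the |
| single-root restriction described above does not apply to the result.</para> |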
| <para>A <plink>javax.xml.transform.dom.DOMLocator</plink> may be passed |
| to <plink>javax.xml.transform.TransformerException</plink> objects, and |
| retrieved by casting the result of the |
| <plink>javax.xml.transform.TransformerException#getLocator()</plink> method. |
| An implementation is not required to supply a DOMLocator rather than a plain |
| <plink>javax.xml.transform.SourceLocator</plink> (though line numbers and the |
| like do not make much sense for a DOM), so the result of getLocator must always |
| be tested with instanceof before it is cast.</para> |
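| <para>The instanceof test might be sketched as follows. The class LocatorDemo |
| and its describe method are hypothetical names introduced for this |
| illustration, and the anonymous DOMLocator exists only so that the sketch can |
| run without a real transformation error.</para> |
| <programlisting><![CDATA[ |
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.SourceLocator;
import javax.xml.transform.TransformerException;
import javax.xml.transform.dom.DOMLocator;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class LocatorDemo {

    // Hypothetical helper: describes where an error occurred, using the
    // DOM node when the locator turns out to be a DOMLocator.
    public static String describe(TransformerException e) {
        SourceLocator locator = e.getLocator();
        if (locator instanceof DOMLocator) {
            Node node = ((DOMLocator) locator).getOriginatingNode();
            return "node " + (node == null ? "unknown" : node.getNodeName());
        }
        if (locator != null) {
            return "line " + locator.getLineNumber()
                + ", column " + locator.getColumnNumber();
        }
        return "unknown location";
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        final Node element = doc.createElement("item");

        // Stand-in locator for the sketch; a real one would come from
        // the TrAX implementation.
        DOMLocator domLocator = new DOMLocator() {
            public Node getOriginatingNode() { return element; }
            public String getPublicId() { return null; }
            public String getSystemId() { return null; }
            public int getLineNumber() { return -1; }
            public int getColumnNumber() { return -1; }
        };

        System.out.println(
            describe(new TransformerException("with DOM locator", domLocator)));
        System.out.println(
            describe(new TransformerException("without locator")));
    }
}
```
| ]]></programlisting> |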
| <para>The following example performs a transformation using DOM nodes as input |
| for the TransformerFactory, as input for the Transformer, and as the output of |
| the transformation.</para> |
| <programlisting> TransformerFactory tfactory = TransformerFactory.newInstance(); |
| |
| // Make sure the TransformerFactory supports the DOM feature. |
| if (tfactory.getFeature(DOMSource.FEATURE) &amp;&amp; tfactory.getFeature(DOMResult.FEATURE)) |
| { |
| // Use javax.xml.parsers to create our DOMs. |
| DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance(); |
| dfactory.setNamespaceAware(true); // do this always for XSLT |
| DocumentBuilder docBuilder = dfactory.newDocumentBuilder(); |
| |
| // Create the Templates from a DOM. |
| Node xslDOM = docBuilder.parse(xslID); |
| DOMSource dsource = new DOMSource(xslDOM, xslID); |
| Templates templates = tfactory.newTemplates(dsource); |
| |
| // Create the source tree in the form of a DOM. |
| Node sourceNode = docBuilder.parse(sourceID); |
| |
| // Create a DOMResult that the transformation will fill in. |
| DOMResult dresult = new DOMResult(); |
| |
| // And transform from the source DOM tree to a result DOM tree. |
| Transformer transformer = templates.newTransformer(); |
| transformer.transform(new DOMSource(sourceNode, sourceID), dresult); |
| |
| // The root of the result tree may now be obtained from |
| // the DOMResult object. |
| Node out = dresult.getNode(); |
| |
| // Serialize it to System.out for diagnostics. |
| Transformer serializer = tfactory.newTransformer(); |
| serializer.transform(new DOMSource(out), new StreamResult(System.out)); |
| }</programlisting> |
| </sect3> |
| </sect2> |
| </sect1> |
| </spec> |