|  | <?xml version="1.0"?> | 
|  |  | 
|  | <spec> | 
|  | <title>Transformation API For XML (TrAX)</title> | 
|  | <frontmatter> | 
|  | <pubdate>November 12, 2000</pubdate> | 
|  | <author><firstname>Scott</firstname> | 
|  | <surname>Boag</surname> | 
|  | <orgname>IBM Research</orgname> | 
|  | <address> | 
|  | <email>Scott_Boag@us.ibm.com</email> | 
|  | </address> | 
|  | </author></frontmatter> | 
|  | <introduction> | 
|  | <title>Introduction</title> | 
|  | <para>This overview describes the set of APIs contained in | 
|  | <ulink url="http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/package-summary.html">javax.xml.transform</ulink>, <ulink url="http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/stream/package-summary.html">javax.xml.transform.stream</ulink>, <ulink url="http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/dom/package-summary.html">javax.xml.transform.dom</ulink>, and <ulink url="http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/sax/package-summary.html">javax.xml.transform.sax</ulink>. For the sake of brevity, these interfaces are referred to | 
|  | as TrAX (Transformation API for XML). </para> | 
|  | <para>There is a broad need for Java applications to be able to transform XML | 
|  | and related tree-shaped data structures. In fact, XML is not normally very | 
|  | useful to an application without going through some sort of transformation, | 
|  | unless the semantic structure is used directly as data. Almost all XML-related | 
|  | applications need to perform transformations. Transformations may be described | 
|  | by Java code, Perl code, <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink> | 
|  | Stylesheets, other types of script, or proprietary formats. A transformation | 
|  | may take one or more inputs, each of which may be a URL, an XML stream, a DOM | 
|  | tree, SAX events, or a proprietary format or data structure. The output types | 
|  | are much the same as the input types, but different inputs may need to be | 
|  | combined with different outputs.</para> | 
|  | <para>The great challenge of a transformation API is how to deal with all the | 
|  | possible combinations of inputs and outputs, without becoming specialized for | 
|  | any of the given types.</para> | 
|  | <para>The Java community will greatly benefit from a common API that will | 
|  | allow them to understand and apply a single model, write to consistent | 
|  | interfaces, and apply the transformations polymorphically. TrAX attempts to | 
|  | define a model that is clean and generic, yet fills general application | 
|  | requirements across a wide variety of uses. </para> | 
|  | <terminology> | 
|  | <title>General Terminology</title> | 
|  | <para>This section will explain some general terminology used in this | 
|  | document. Technical terminology will be explained in the Model section. In many | 
|  | cases, the general terminology overlaps with the technical terminology.</para> | 
|  | <variablelist> | 
|  | <varlistentry> | 
|  | <term>Tree</term> | 
|  | <listitem>This term, as used within this document, describes an | 
|  | abstract structure that consists of nodes or events that may be produced by | 
|  | XML. A Tree physically may be a DOM tree, a series of well-balanced parse | 
|  | events (such as those coming from a SAX2 ContentHandler), a series of requests | 
|  | (the result of which can describe a tree), or a stream of marked-up | 
|  | characters.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Source Tree(s)</term> | 
|  | <listitem>One or more trees that are the inputs to the | 
|  | transformation.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Result Tree(s)</term> | 
|  | <listitem>One or more trees that are the output of the | 
|  | transformation.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Transformation</term> | 
|  | <listitem>The process of consuming a stream or tree to produce | 
|  | another stream or tree.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Identity (or Copy) Transformation</term> | 
|  | <listitem>The process of transformation from a source to a result, | 
|  | making as few structural changes as possible and no informational changes. The | 
|  | term is used somewhat loosely, as the process is really a copy from one | 
|  | "format" (such as a DOM tree, stream, or set of SAX events) to | 
|  | another.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Serialization</term> | 
|  | <listitem>The process of taking a tree and turning it into a stream. In | 
|  | some sense, a serialization is a specialized transformation.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Parsing</term> | 
|  | <listitem>The process of taking a stream and turning it into a tree. In | 
|  | some sense, parsing is a specialized transformation.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Transformer</term> | 
|  | <listitem>A Transformer is the object that executes the transformation. | 
|  | </listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Transformation instructions</term> | 
|  | <listitem>Describes the transformation. A form of code, script, or | 
|  | simply a declaration or series of declarations.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Stylesheet</term> | 
|  | <listitem>The same as "transformation instructions," except it is | 
|  | likely to be used in conjunction with <ulink | 
|  | url="http://www.w3.org/TR/xslt">XSLT</ulink>.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Templates</term> | 
|  | <listitem>Another form of "transformation instructions." In the TrAX | 
|  | interface, this term is used to describe processed or compiled transformation | 
|  | instructions. The Source flows through a Templates object to be formed into the | 
|  | Result.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>Processor</term> | 
|  | <listitem>A general term for the thing that may both process the | 
|  | transformation instructions, and perform the transformation.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>DOM</term> | 
|  | <listitem>Document Object Model, specifically referring to the | 
|  | <termref link-url="http://www.w3.org/TR/DOM-Level-2">Document Object Model | 
|  | (DOM) Level 2 Specification</termref>.</listitem> | 
|  | </varlistentry> | 
|  | <varlistentry> | 
|  | <term>SAX</term> | 
|  | <listitem>Simple API for XML, specifically referring to the | 
|  | <termref link-url="http://www.megginson.com/SAX/SAX2">SAX 2.0 | 
|  | release</termref>.</listitem> | 
|  | </varlistentry> | 
|  | </variablelist> | 
|  | </terminology></introduction> | 
|  | <requirements> | 
|  | <title>Requirements</title> | 
|  | <para>The following requirements have been determined from broad experience | 
|  | with XML projects from the various members participating on the JCP.</para> | 
|  | <orderedlist> | 
|  | <listitem id="requirement-simple">TrAX must provide a clean, simple | 
|  | interface for simple uses.</listitem> | 
|  | <listitem id="requirement-general">TrAX must be powerful enough to be | 
|  | applied to a wide range of uses, such as e-commerce, content management, | 
|  | server content delivery, and client applications.</listitem> | 
|  | <listitem id="requirement-optimizeable">A processor that implements a TrAX | 
|  | interface must be optimizeable. Performance is a critical issue for most | 
|  | transformation use cases.</listitem> | 
|  | <listitem id="requirement-compiled-model">As a specialization of the above | 
|  | requirement, a TrAX processor must be able to support a compiled model, so that | 
|  | a single set of transformation instructions can be compiled, optimized, and | 
|  | applied to a large set of input sources.</listitem> | 
|  | <listitem id="requirement-independence">TrAX must not be dependent on any | 
|  | given type of transformation instructions. For instance, it must remain | 
|  | independent of <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink>.</listitem> | 
|  | <listitem id="requirement-from-dom">TrAX must be able to allow processors | 
|  | to transform DOM trees.</listitem> | 
|  | <listitem id="requirement-to-dom">TrAX must be able to allow processors to | 
|  | produce DOM trees.</listitem> | 
|  | <listitem id="requirement-from-sax">TrAX must allow processors to transform | 
|  | SAX events.</listitem> | 
|  | <listitem id="requirement-to-sax">TrAX must allow processors to produce SAX | 
|  | events.</listitem> | 
|  | <listitem id="requirement-from-stream">TrAX must allow processors to | 
|  | transform streams of XML.</listitem> | 
|  | <listitem id="requirement-to-stream">TrAX must allow processors to produce | 
|  | XML, HTML, and other types of streams.</listitem> | 
|  | <listitem id="requirement-combo-input-output">TrAX must allow processors to | 
|  | implement the various combinations of inputs and outputs within a single | 
|  | processor.</listitem> | 
|  | <listitem id="requirement-limited-input-output">TrAX must allow processors | 
|  | to implement only a limited set of inputs. For instance, it should be possible | 
|  | to write a processor that implements the TrAX interfaces and that only | 
|  | processes DOM trees, not streams or SAX events.</listitem> | 
|  | <listitem id="requirement-proprietary-data-structures">TrAX should allow a | 
|  | processor to implement transformations of proprietary data structures. For | 
|  | instance, it should be possible to implement a processor that provides TrAX | 
|  | interfaces and performs transformations of JDOM trees.</listitem> | 
|  | <listitem id="requirement-serialization-props">TrAX must allow the setting | 
|  | of serialization properties, without constraint as to what the details of those | 
|  | properties are.</listitem> | 
|  | <listitem id="requirement-setting-parameters">TrAX must allow the setting | 
|  | of parameters to the transformation instructions.</listitem> | 
|  | <listitem id="requirement-namespaced-properties">TrAX must support the | 
|  | setting of parameters and properties as XML Namespaced items (i.e., qualified | 
|  | names).</listitem> | 
|  | <listitem id="requirement-relative-url-resolution">TrAX must support URL | 
|  | resolution from within the transformation, and have it return the needed data | 
|  | structure.</listitem> | 
|  | <listitem id="requirement-error-reporting">TrAX must have a mechanism for | 
|  | reporting errors and warnings to the calling application.</listitem> | 
|  | </orderedlist> </requirements> | 
|  | <model> | 
|  | <title>Model</title> | 
|  | <para>This section defines the abstract model for TrAX, apart from the details | 
|  | of the interfaces.</para> | 
|  | <para>A TrAX <termref | 
|  | link-url="pattern-TransformerFactory">TransformerFactory</termref> is an object | 
|  | that processes transformation instructions, and produces | 
|  | <termref link-url="pattern-Templates">Templates</termref> (in the technical | 
|  | terminology). A <termref link-url="pattern-Templates">Templates</termref> | 
|  | object provides a <termref | 
|  | link-url="pattern-Transformer">Transformer</termref>, which transforms one or | 
|  | more <termref link-url="pattern-Source">Source</termref>s into one or more | 
|  | <termref link-url="pattern-Result">Result</termref>s.</para> | 
|  | <para>To use the TrAX interface, you create a | 
|  | <termref link-url="pattern-TransformerFactory">TransformerFactory</termref>, | 
|  | which may directly provide a <termref | 
|  | link-url="pattern-Transformer">Transformer</termref>, or which can provide | 
|  | <termref link-url="pattern-Templates">Templates</termref> from a variety of | 
|  | <termref link-url="pattern-Source">Source</termref>s. The | 
|  | <termref link-url="pattern-Templates">Templates</termref> object is a processed | 
|  | or compiled representation of the transformation instructions, and provides a | 
|  | <termref link-url="pattern-Transformer">Transformer</termref>. The | 
|  | <termref link-url="pattern-Transformer">Transformer</termref> processes a | 
|  | <termref link-url="pattern-Source">Source</termref> according to the | 
|  | instructions found in the <termref | 
|  | link-url="pattern-Templates">Templates</termref>, and produces a | 
|  | <termref link-url="pattern-Result">Result</termref>.</para> | 
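<para>The flow described above can be sketched in Java as follows. This is a minimal illustration rather than a normative example: the class name <code>TraxModelSketch</code>, the inline stylesheet, and the input document are hypothetical, and an XSLT processor is assumed to be available through <code>TransformerFactory.newInstance()</code>.</para>
<para><![CDATA[
```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TraxModelSketch {
    // Hypothetical transformation instructions (an XSLT stylesheet), for illustration only.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@name'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws TransformerException {
        // 1. Create the TransformerFactory.
        TransformerFactory factory = TransformerFactory.newInstance();
        // 2. Process the transformation instructions into a Templates object.
        Templates templates = factory.newTemplates(
            new StreamSource(new StringReader(STYLESHEET)));
        // 3. Ask the Templates object for a Transformer.
        Transformer transformer = templates.newTransformer();
        // 4. Transform a Source into a Result.
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // prints "Hello, TrAX!"
        System.out.println(transform("<greeting name='TrAX'/>"));
    }
}
```
]]></para>
<para>When the transformation instructions are applied only once, the intermediate Templates object can be skipped by calling <code>factory.newTransformer(Source)</code> directly.</para>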
|  | <para>The process of transformation from a tree, either in the form of an | 
|  | object model, or in the form of parse events, into a stream, is known as | 
|  | <termref>serialization</termref>. We believe this is the most suitable term for | 
|  | this process, despite the overlap with Java object serialization.</para> | 
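<para>As a rough sketch of serialization in this sense, a Transformer obtained without any transformation instructions performs an identity transformation, which can copy a DOM tree into a character stream. The class name <code>SerializationSketch</code> and the small document built here are hypothetical, and the exact formatting of the output may vary between processors.</para>
<para><![CDATA[
```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SerializationSketch {
    // Build a small in-memory DOM tree to act as the Source (illustrative content).
    static Document buildTree() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        Element root = doc.createElement("doc");
        root.setAttribute("id", "1");
        root.appendChild(doc.createTextNode("text"));
        doc.appendChild(root);
        return doc;
    }

    // Serialize the tree: an identity Transformer copies the DOMSource to a StreamResult.
    static String serialize(Document doc) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Serialization properties are set as output properties on the Transformer.
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildTree()));
    }
}
```
]]></para>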
|  | <patterns module="TRaX"> <pattern><pattern-name | 
|  | id="pattern-Processor">Processor</pattern-name><intent>Generic concept for the | 
|  | set of objects that implement the TrAX interfaces.</intent> | 
|  | <responsibilities>Create compiled transformation instructions, transform | 
|  | sources, and manage transformation parameters and | 
|  | properties.</responsibilities><thread-safety>Only the Templates object can be | 
|  | used concurrently in multiple threads. The rest of the processor does not do | 
|  | synchronized blocking, and so may not be used to perform multiple concurrent | 
|  | operations.</thread-safety></pattern><pattern> | 
|  | <pattern-name id="pattern-TransformerFactory">TransformerFactory</pattern-name> | 
|  | <intent>Serve as a vendor-neutral Processor interface for | 
|  | <ulink url="http://www.w3.org/TR/xslt">XSLT</ulink> and similar | 
|  | processors.</intent> <responsibilities>Serve as a factory for a concrete | 
|  | implementation of a TransformerFactory, serve as a direct factory for | 
|  | Transformer objects, serve as a factory for Templates objects, and manage | 
|  | processor-specific features.</responsibilities> <thread-safety>A | 
|  | TransformerFactory may not perform multiple concurrent | 
|  | operations.</thread-safety> </pattern> <pattern> | 
|  | <pattern-name id="pattern-Templates">Templates</pattern-name> <intent>The | 
|  | runtime representation of the transformation instructions.</intent> | 
|  | <responsibilities>A data bag for transformation instructions; act as a factory | 
|  | for Transformers.</responsibilities> <thread-safety>Threadsafe for concurrent | 
|  | usage over multiple threads once construction is complete.</thread-safety> | 
|  | </pattern> <pattern> <pattern-name | 
|  | id="pattern-Transformer">Transformer</pattern-name> <intent>Act as a per-thread | 
|  | execution context for transformations, and as an interface for performing the | 
|  | transformation.</intent><responsibilities>Perform the | 
|  | transformation.</responsibilities> <thread-safety>Only one instance per thread | 
|  | is safe.</thread-safety> <notes>The Transformer is bound to the Templates | 
|  | object that created it.</notes> </pattern> <pattern> | 
|  | <pattern-name id="pattern-Source">Source</pattern-name> <intent>Serve as a | 
|  | single vendor-neutral object for multiple types of input.</intent> | 
|  | <responsibilities>Act as simple data holder for System IDs, DOM nodes, streams, | 
|  | etc.</responsibilities> <thread-safety>Threadsafe concurrently over multiple | 
|  | threads for read-only operations; must be synchronized for edit | 
|  | operations.</thread-safety> </pattern><pattern> | 
|  | <pattern-name id="pattern-Result">Result</pattern-name> | 
|  | <potential-alternate-name>ResultTarget</potential-alternate-name> <intent>Serve | 
|  | as a single object for multiple types of output, so there can be simple process | 
|  | method signatures.</intent> <responsibilities>Act as simple data holder for | 
|  | output stream, DOM node, ContentHandler, etc.</responsibilities> | 
|  | <thread-safety>Threadsafe concurrently over multiple threads for read-only | 
|  | operations; must be synchronized for edit operations.</thread-safety> </pattern> </patterns> | 
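<para>The thread-safety notes in the patterns above suggest one way to structure concurrent use: compile the transformation instructions once into a shared Templates object, and create a separate Transformer inside each task. The following is a sketch under those assumptions; the class name <code>SharedTemplatesSketch</code>, the stylesheet, the pool size, and the input documents are all hypothetical.</para>
<para><![CDATA[
```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SharedTemplatesSketch {
    // Hypothetical stylesheet, for illustration only.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/item'>item=<xsl:value-of select='.'/></xsl:template>"
      + "</xsl:stylesheet>";

    static List<String> transformAll(List<String> inputs) throws Exception {
        // The factory is used from a single thread; only the Templates object is shared.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(STYLESHEET)));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String xml : inputs) {
                Callable<String> task = () -> {
                    // Each task creates its own Transformer; Transformers are never shared.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                        new StreamSource(new StringReader(xml)), new StreamResult(out));
                    return out.toString();
                };
                pending.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transformAll(Arrays.asList("<item>a</item>", "<item>b</item>")));
    }
}
```
]]></para>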
|  | </model> | 
|  | </spec> |