| <?xml version="1.0" standalone="no"?> |
| <!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> |
| |
| <s1 title="DOM Programming Guide"> |
| |
| <anchor name="DOMProgGuide"/> |
| <anchor name="JAVAandCPP"/> |
| <s2 title="Java and C++ DOM comparisons"> |
| <p>The C++ DOM API is very similar in design and use to the |
| Java DOM API bindings. As a consequence, converting |
| existing Java code that uses the DOM to C++ is a |
| straightforward process. |
| </p> |
| <p> |
| This section outlines the differences between Java and C++ bindings. |
| </p> |
| </s2> |
| |
| <anchor name="AccessAPI"/> |
| <s2 title="Accessing the API from application code"> |
| |
| <source> |
| // C++ |
| #include <xercesc/dom/DOM.hpp></source> |
| |
| <source>// Java |
| import org.w3c.dom.*;</source> |
| |
| <p>The header file <xercesc/dom/DOM.hpp> includes all the |
| individual headers for the DOM API classes.</p> |
| |
| </s2> |
| |
| <anchor name="ClassNames"/> |
| <s2 title="Class Names"> |
| <p>The C++ class names are prefixed with "DOM_". The intent is |
| to prevent conflicts between DOM class names and other names |
| that may already be in use by an application or other |
| libraries that a DOM-based application must link with.</p> |
| |
| <p>The use of C++ namespaces would also have solved this |
| conflict problem, but for the fact that many compilers do not |
| yet support them.</p> |
| |
| <source>DOM_Document myDocument; // C++ |
| DOM_Node aNode; |
| DOM_Text someText;</source> |
| |
| <source>Document myDocument; // Java |
| Node aNode; |
| Text someText;</source> |
| |
| <p>If you wish to use the Java class names in C++, then you need |
| to typedef them in C++. This is not advisable for the general |
| case - conflicts really do occur - but can be very useful when |
| converting a body of existing Java code to C++.</p> |
| |
| <source>typedef DOM_Document Document; |
| typedef DOM_Node Node; |
| |
| Document myDocument; // Now C++ usage is |
| // indistinguishable from Java |
| Node aNode;</source> |
| </s2> |
| |
| |
| <anchor name="ObjMemMgmt"/> |
| <s2 title="Objects and Memory Management"> |
| <p>The C++ DOM implementation uses automatic memory management, |
| implemented using reference counting. As a result, the C++ |
| code for most DOM operations is very similar to the equivalent |
| Java code, right down to the use of factory methods in the DOM |
| document class for nearly all object creation, and the lack of |
| any explicit object deletion.</p> |
| |
| <p>Consider the following code snippets:</p> |
| |
| <source>// This is C++ |
| DOM_Node aNode; |
| aNode = someDocument.createElement("ElementName"); |
| DOM_Node docRootNode = someDocument.getDocumentElement(); |
| docRootNode.appendChild(aNode);</source> |
| |
| <source>// This is Java |
| Node aNode; |
| aNode = someDocument.createElement("ElementName"); |
| Node docRootNode = someDocument.getDocumentElement(); |
| docRootNode.appendChild(aNode);</source> |
| |
| <p>The Java and the C++ are identical on the surface, except for |
| the class names, and this similarity remains true for most DOM |
| code. </p> |
| |
| <p>However, Java and C++ handle objects in somewhat different |
| ways, making it important to understand a little bit of what |
| is going on beneath the surface.</p> |
| |
| <p>In Java, the variable <code>aNode</code> is an object reference, |
| essentially a pointer. It refers to no object at first, and references |
| an object only after the assignment statement in the second |
| line of the code.</p> |
| |
| <p>In C++, the variable <code>aNode</code> is, from the C++ language's |
| perspective, an actual live object. It is constructed when the |
| first line of the code executes, and DOM_Node::operator = () |
| executes at the second line. The C++ class DOM_Node is |
| essentially a form of smart pointer; it implements much of |
| the behavior of a Java object reference variable, and |
| delegates the DOM behaviors to an implementation class that |
| lives behind the scenes.</p> |
| |
| <p>Key points to remember when using the C++ DOM classes:</p> |
| |
| <ul> |
| <li>Create them as local variables, or as member variables of |
| some other class. Never "new" a DOM object into the heap or |
| make an ordinary C pointer variable to one, as this will |
| greatly confuse the automatic memory management. </li> |
| |
| <li>The "real" DOM objects (nodes, attributes, CData |
| sections, and so on) live on the heap and are created with the |
| create... methods on class DOM_Document. DOM_Node and the |
| other DOM classes serve as reference variables to the |
| underlying heap objects.</li> |
| |
| <li>The visible DOM classes may be freely copied (assigned), |
| passed as parameters to functions, or returned by value from |
| functions.</li> |
| |
| <li>Memory management of the underlying DOM heap objects is |
| automatic, implemented by means of reference counting. So long |
| as some part of a document can be reached, directly or |
| indirectly, via reference variables that are still alive in |
| the application program, the corresponding document data will |
| stay alive in the heap. When all possible paths of access have |
| been closed off (all of the application's DOM objects have |
| gone out of scope) the heap data itself will be automatically |
| deleted. </li> |
| |
| <li>There are restrictions on the ability to subclass the DOM |
| classes. </li> |
| |
| </ul> |
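<p>The handle-and-heap-object arrangement described in the list above can be sketched in a few lines of standalone C++. This is only an illustration of the idea, not the library's implementation: <code>NodeImpl</code>, <code>NodeHandle</code> and <code>handleDemo</code> are hypothetical names, and <code>std::shared_ptr</code> is assumed as the reference-counting mechanism.</p>

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the hidden implementation object on the heap.
struct NodeImpl {
    std::string name;
    explicit NodeImpl(std::string n) : name(std::move(n)) {}
};

// Hypothetical handle class playing the role DOM_Node plays in the text.
// std::shared_ptr supplies the reference counting (an assumption of this sketch).
class NodeHandle {
public:
    NodeHandle() = default;  // like "DOM_Node aNode;": a live object that refers to nothing
    explicit NodeHandle(std::string name)
        : impl_(std::make_shared<NodeImpl>(std::move(name))) {}

    bool isNull() const { return impl_ == nullptr; }
    long refCount() const { return impl_.use_count(); }

    // Identity comparison: do both handles refer to the same heap object?
    bool operator==(const NodeHandle& other) const { return impl_ == other.impl_; }

private:
    std::shared_ptr<NodeImpl> impl_;
};

// Walks through the life cycle described in the list above and
// returns the reference count left at the end.
inline long handleDemo() {
    NodeHandle a;                    // constructed, but null
    assert(a.isNull());
    a = NodeHandle("ElementName");   // assignment makes it refer to a heap object
    assert(a.refCount() == 1);
    {
        NodeHandle b = a;            // copying the handle shares the heap object
        assert(b == a);
        assert(a.refCount() == 2);
    }                                // b goes out of scope, so the count drops
    return a.refCount();
}
```

<p>Because the handle is a small value type, passing it by value or returning it from a function only adjusts the count; the heap object is released when the last handle goes away.</p>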
| |
| </s2> |
| |
| <anchor name="DOMString"/> |
| <s2 title="DOMString"> |
| <p>Class DOMString provides the mechanism for passing string |
| data to and from the DOM API. DOMString is not intended to be |
| a completely general string class, but rather to meet the |
| specific needs of the DOM API.</p> |
| |
| <p>The design derives from two primary sources: from the DOM's |
| CharacterData interface and from class <code>java.lang.String</code>.</p> |
| |
| <p>Main features are:</p> |
| |
| <ul> |
| <li>It stores Unicode text.</li> |
| |
| <li>Automatic memory management, using reference counting.</li> |
| |
| <li>DOMStrings are mutable - characters can be inserted, |
| deleted or appended.</li> |
| |
| </ul> |
| |
| <p>When a string is passed into a method of the DOM (when |
| setting the value of a Node, for example), the string is cloned |
| so that any subsequent alteration or reuse of the string by |
| the application will not alter the document contents. |
| Similarly, when strings from the document are returned to an |
| application via the DOM API, the string is cloned so that the |
| document cannot be inadvertently altered by subsequent edits |
| to the string.</p> |
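<p>The effect of cloning on the way in and on the way out can be illustrated with a small standalone sketch. It does not use the DOM classes: <code>TextNode</code> and <code>cloneDemo</code> are hypothetical, and <code>std::string</code>, which already has value semantics, stands in for a cloned DOMString.</p>

```cpp
#include <string>
#include <utility>

// Hypothetical node type; std::string stands in for DOMString here.
class TextNode {
public:
    // Copy-in: the parameter is taken by value, so the node owns its own copy.
    void setNodeValue(std::string value) { value_ = std::move(value); }

    // Copy-out: returning by value hands the caller an independent copy.
    std::string getNodeValue() const { return value_; }

private:
    std::string value_;
};

// Returns true when edits made by the caller never reach the node's contents.
inline bool cloneDemo() {
    TextNode node;
    std::string s = "Hello";
    node.setNodeValue(s);
    s += " World";  // edit the caller's string after passing it in
    bool inputIsolated = (node.getNodeValue() == "Hello");

    std::string out = node.getNodeValue();
    out += "!";     // edit the string that came back from the node
    bool outputIsolated = (node.getNodeValue() == "Hello");

    return inputIsolated && outputIsolated;
}
```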
| |
| <note>The ICU classes are a more general solution to Unicode |
| character handling for C++ applications. ICU is an open |
| source Unicode library, available at the <jump |
| href="http://oss.software.ibm.com/icu/">IBM |
| developerWorks website</jump>.</note> |
| |
| </s2> |
| |
| <anchor name="EqualityTesting"/> |
| <s2 title="Equality Testing"> |
| <p>The DOMString equality operators (and all of the rest of the |
| DOM class conventions) are modeled after the Java |
| equivalents. The equals() method compares the content of the |
| string, while the == operator checks whether the string |
| reference variables (the application program variables) refer |
| to the same underlying string in memory. This is also true of |
| DOM_Node, DOM_Element, etc., in that operator == tells whether |
| the variables in the application are referring to the same |
| actual node or not. It's all very Java-like.</p> |
| |
| <ul> |
| <li>bool operator == () is true if the DOMString variables |
| refer to the same underlying storage. </li> |
| |
| <li>bool equals() is true if the strings contain the same |
| characters. </li> |
| |
| </ul> |
| <p>Here is an example of how the equality operators work: </p> |
| <source>DOMString a = "Hello"; |
| DOMString b = a; |
| DOMString c = a.clone(); |
| bool r1 = (b == a); // true: same underlying string |
| bool r2 = (a == c); // false: c is a separate copy |
| bool r3 = a.equals(c); // true: same characters |
| b.appendData(" World"); // edits the shared string in place |
| bool r4 = (b == a); // Still true, and the string's |
| // value is "Hello World" |
| bool r5 = a.equals(c); // false. a is "Hello World"; |
| // c is still "Hello".</source> |
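<p>The distinction between identity (<code>==</code>) and content (<code>equals()</code>) can also be tried out without the DOM library. The following is a minimal sketch under stated assumptions: <code>SharedString</code> is a hypothetical class built on <code>std::shared_ptr</code>, and its member names merely echo the DOMString ones.</p>

```cpp
#include <memory>
#include <string>

// Hypothetical reference-style string: copies of a SharedString share one buffer.
class SharedString {
public:
    SharedString(const char* text) : data_(std::make_shared<std::string>(text)) {}

    // Deep copy: new storage holding the same characters.
    SharedString clone() const { return SharedString(data_->c_str()); }

    // In-place edit, visible through every handle that shares the storage.
    void appendData(const char* text) { data_->append(text); }

    // Identity: true when both handles refer to the same underlying storage.
    bool operator==(const SharedString& other) const { return data_ == other.data_; }

    // Content: true when both strings contain the same characters.
    bool equals(const SharedString& other) const { return *data_ == *other.data_; }

    const std::string& str() const { return *data_; }

private:
    std::shared_ptr<std::string> data_;
};

// Returns true when identity and content comparison behave as described above.
inline bool equalityDemo() {
    SharedString a = "Hello";
    SharedString b = a;          // shares storage with a
    SharedString c = a.clone();  // separate storage, same characters
    bool before = (b == a) && !(a == c) && a.equals(c);

    b.appendData(" World");      // a sees the change because storage is shared
    bool after = (b == a) && !a.equals(c) && a.str() == "Hello World";

    return before && after;
}
```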
| </s2> |
| |
| <anchor name="Downcasting"/> |
| <s2 title="Downcasting"> |
| <p>Application code sometimes must cast an object reference from |
| DOM_Node to one of the classes deriving from DOM_Node, |
| such as DOM_Element. The syntax for doing this in C++ is |
| different from that in Java.</p> |
| |
| <source>// This is C++ |
| DOM_Node aNode = someFunctionReturningNode(); |
| DOM_Element el = (DOM_Element &) aNode;</source> |
| |
| <source>// This is Java |
| Node aNode = someFunctionReturningNode(); |
| Element el = (Element) aNode;</source> |
| |
| <p>The C++ cast is not type-safe; the Java cast is checked for |
| compatible types at runtime. If necessary, a type-check can |
| be made in C++ using the node type information: </p> |
| |
| <source>// This is C++ |
| |
| DOM_Node aNode = someFunctionReturningNode(); |
| DOM_Element el; // by default, el will == null. |
| |
| if (aNode.getNodeType() == DOM_Node::ELEMENT_NODE) |
| el = (DOM_Element &) aNode; |
| else |
| { |
| // aNode does not refer to an element. |
| // Do something to recover here. |
| }</source> |
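<p>The same check-then-cast pattern can be wrapped in a helper so that the test is never forgotten. The sketch below is standalone and uses a hypothetical miniature hierarchy (<code>Node</code>, <code>Element</code>, <code>Text</code>, <code>asElement</code>) rather than the DOM classes; only the idea of testing the node type before casting is taken from the text above.</p>

```cpp
// Hypothetical miniature node hierarchy; the names echo the DOM
// but these are not the library's classes.
struct Node {
    enum NodeType { ELEMENT_NODE = 1, TEXT_NODE = 3 };
    virtual ~Node() = default;
    virtual NodeType getNodeType() const = 0;
};

struct Element : Node {
    NodeType getNodeType() const override { return ELEMENT_NODE; }
};

struct Text : Node {
    NodeType getNodeType() const override { return TEXT_NODE; }
};

// Returns the node as an Element, or nullptr when the type tag says
// the node is something else. The cast is only reached after the check.
inline Element* asElement(Node& node) {
    if (node.getNodeType() == Node::ELEMENT_NODE)
        return static_cast<Element*>(&node);
    return nullptr;
}
```

<p>A caller then tests the returned pointer instead of repeating the node type comparison at every cast site.</p>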
| |
| </s2> |
| |
| <anchor name="Subclassing"/> |
| <s2 title="Subclassing"> |
| <p>The C++ DOM classes, DOM_Node, DOM_Attr, DOM_Document, etc., |
| are not designed to be subclassed by an application |
| program. </p> |
| |
| <p>As an alternative, the DOM_Node class provides a User Data |
| field for use by applications as a hook for extending nodes by |
| referencing additional data or objects. See the API |
| description for DOM_Node for details.</p> |
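<p>The user data approach can be pictured as an untyped slot on each node that the application fills with a pointer to its own data. The sketch below is a standalone illustration, not the DOM_Node interface: the class <code>Node</code>, the accessor names <code>setUserData</code> and <code>getUserData</code>, and the <code>LayoutInfo</code> structure are all assumptions made for the example, so the API description remains the authority on the real signatures.</p>

```cpp
// Hypothetical node with an untyped user-data slot.
// The node neither owns nor interprets what the pointer refers to.
class Node {
public:
    void setUserData(void* p) { userData_ = p; }
    void* getUserData() const { return userData_; }

private:
    void* userData_ = nullptr;
};

// Application-specific data to associate with a node (hypothetical).
struct LayoutInfo {
    int line;
    int column;
};

// Attaches a LayoutInfo to a node, reads it back, and returns its line number.
inline int userDataDemo() {
    LayoutInfo info{12, 4};  // must outlive every use made through the node
    Node node;
    node.setUserData(&info);

    // The application is responsible for casting back to the right type.
    LayoutInfo* back = static_cast<LayoutInfo*>(node.getUserData());
    return back->line;
}
```

<p>Compared with subclassing, this keeps the node classes closed while still letting an application hang its own state off each node; the cost is that the application manages the lifetime and type of that state itself.</p>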
| </s2> |
| |
| </s1> |