| <?xml version="1.0" standalone="no"?> |
| <!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> |
| |
| <s1 title="Programming Guide"> |
| |
| <p>This page has sections on the following topics:</p> |
| <ul> |
| <li><link anchor="SAX1ProgGuide">SAX Programming Guide</link></li> |
| <ul> |
| <li><link anchor="ConstructParser">Constructing a parser</link></li> |
| <li><link anchor="UsingSAX1API">Using the SAX API</link></li> |
| </ul> |
| <li><link anchor="SAX2ProgGuide">SAX2 Programming Guide</link></li> |
| <ul> |
| <li><link anchor="ConstructParser2">Constructing an XML Reader</link></li> |
| <li><link anchor="UsingSAX2API">Using the SAX2 API</link></li> |
| <li><link anchor="SAX2Features">Supported Features</link></li> |
| </ul> |
| <li><link anchor="DOMProgGuide">DOM Programming Guide</link></li> |
| <ul> |
| <li><link anchor="JAVAandCPP">Comparision of Java and C++ DOM's</link></li> |
| <ul> |
| <li><link anchor="AccessAPI">Accessing the API from application code</link></li> |
| <li><link anchor="ClassNames">Class Names</link></li> |
| <li><link anchor="ObjMemMgmt">Objects and Memory Management</link></li> |
| </ul> |
| <li><link anchor="DOMString">DOMString</link></li> |
| <ul> |
| <li><link anchor="EqualityTesting">Equality Testing</link></li> |
| </ul> |
| <li><link anchor="Downcasting">Downcasting</link></li> |
| <li><link anchor="Subclassing">Subclassing</link></li> |
| </ul> |
| <li><link anchor="IDOMProgGuide">Experimental IDOM Programming Guide</link></li> |
| <ul> |
| <li><link anchor="ConstructIDOMParser">Constructing a parser</link></li> |
| <li><link anchor="DOMandIDOM">Comparision of C++ DOM and IDOM</link></li> |
| <ul> |
| <li><link anchor="Motivation">Motivation behind new design</link></li> |
| <li><link anchor="IDOMClassNames">Class Names</link></li> |
| <li><link anchor="IDOMObjMemMgmt">Objects and Memory Management</link></li> |
| <li><link anchor="DOMStringXMCh">DOMString vs. XMLCh</link></li> |
| </ul> |
| </ul> |
| </ul> |
| |
| |
| <anchor name="SAX1ProgGuide"/> |
| <s2 title="SAX1 Programming Guide"> |
| |
| <anchor name="ConstructParser"/> |
| <s3 title="Constructing a parser"> |
| <p>In order to use &XercesCName; to parse XML files, you will |
| need to create an instance of the SAXParser class. The example |
| below shows the code you need in order to create an instance |
| of SAXParser. The DocumentHandler and ErrorHandler instances |
| required by the SAX API are provided using the HandlerBase |
| class supplied with &XercesCName;.</p> |
| |
| <source>int main (int argc, char* args[]) { |
| |
| try { |
| XMLPlatformUtils::Initialize(); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "Error during initialization! :\n" |
| << toCatch.getMessage() << "\n"; |
| return 1; |
| } |
| |
| char* xmlFile = "x1.xml"; |
| SAXParser* parser = new SAXParser(); |
| parser->setDoValidation(true); // optional. |
| parser->setDoNamespaces(true); // optional |
| |
| DocumentHandler* docHandler = new HandlerBase(); |
| ErrorHandler* errHandler = (ErrorHandler*) docHandler; |
| parser->setDocumentHandler(docHandler); |
| parser->setErrorHandler(errHandler); |
| |
| try { |
| parser->parse(xmlFile); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "\nFile not found: '" << xmlFile << "'\n" |
| << "Exception message is: \n" |
| << toCatch.getMessage() << "\n" ; |
| return -1; |
| } |
| }</source> |
| </s3> |
| |
| <anchor name="UsingSAX1API"/> |
| <s3 title="Using the SAX API"> |
| <p>The SAX API for XML parsers was originally developed for |
| Java. Please be aware that there is no standard SAX API for |
| C++, and that use of the &XercesCName; SAX API does not |
| guarantee client code compatibility with other C++ XML |
| parsers.</p> |
| |
| <p>The SAX API presents a callback based API to the parser. An |
| application that uses SAX provides an instance of a handler |
| class to the parser. When the parser detects XML constructs, |
| it calls the methods of the handler class, passing them |
| information about the construct that was detected. The most |
| commonly used handler classes are DocumentHandler which is |
| called when XML constructs are recognized, and ErrorHandler |
| which is called when an error occurs. The header files for the |
| various SAX handler classes are in |
| '<&XercesCInstallDir;>/include/sax'</p> |
| |
| <p>As a convenience, &XercesCName; provides the class |
| HandlerBase, which is a single class which is publicly derived |
| from all the Handler classes. HandlerBase's default |
| implementation of the handler callback methods is to do |
| nothing. A convenient way to get started with &XercesCName; is |
| to derive your own handler class from HandlerBase and override |
| just those methods in HandlerBase which you are interested in |
| customizing. This simple example shows how to create a handler |
| which will print element names, and print fatal error |
| messages. The source code for the sample applications show |
| additional examples of how to write handler classes.</p> |
| |
| <p>This is the header file MySAXHandler.hpp:</p> |
| <source>#include <sax/HandlerBase.hpp> |
| |
| class MySAXHandler : public HandlerBase { |
| public: |
| void startElement(const XMLCh* const, AttributeList&); |
| void fatalError(const SAXParseException&); |
| };</source> |
| |
| <p>This is the implementation file MySAXHandler.cpp:</p> |
| |
| <source>#include "MySAXHandler.hpp" |
| #include <iostream.h> |
| |
| MySAXHandler::MySAXHandler() |
| { |
| } |
| |
| MySAXHandler::startElement(const XMLCh* const name, |
| AttributeList& attributes) |
| { |
| // transcode() is an user application defined function which |
| // converts unicode strings to usual 'char *'. Look at |
| // the sample program SAXCount for an example implementation. |
| cout << "I saw element: " << transcode(name) << endl; |
| } |
| |
| MySAXHandler::fatalError(const SAXParseException& exception) |
| { |
| cout << "Fatal Error: " << transcode(exception.getMessage()) |
| << " at line: " << exception.getLineNumber() |
| << endl; |
| }</source> |
| |
| <p>The XMLCh and AttributeList types are supplied by |
| &XercesCName; and are documented in the include |
| files. Examples of their usage appear in the source code to |
| the sample applications.</p> |
| </s3> |
| </s2> |
| |
| <anchor name="SAX2ProgGuide"/> |
| <s2 title="SAX2 Programming Guide"> |
| |
| <anchor name="ConstructParser2"/> |
| <s3 title="Constructing an XML Reader"> |
| <p>In order to use &XercesCName; to parse XML files, you will |
| need to create an instance of the SAX2XMLReader class. The example |
| below shows the code you need in order to create an instance |
| of SAX2XMLReader. The ContentHandler and ErrorHandler instances |
| required by the SAX API are provided using the DefaultHandler |
| class supplied with &XercesCName;.</p> |
| |
| <source>int main (int argc, char* args[]) { |
| |
| try { |
| XMLPlatformUtils::Initialize(); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "Error during initialization! :\n" |
| << toCatch.getMessage() << "\n"; |
| return 1; |
| } |
| |
| char* xmlFile = "x1.xml"; |
| SAX2XMLReader* parser = XMLReaderFactory::createXMLReader(); |
| parser->setFeature(XMLString::transcode("http://xml.org/sax/features/validation", true) // optional |
| parser->setFeature(XMLString::transcode("http://xml.org/sax/features/namespaces", true) // optional |
| |
| ContentHandler* contentHandler = new DefaultHandler(); |
| ErrorHandler* errHandler = (ErrorHandler*) contentHandler; |
| parser->setContentHandler(docHandler); |
| parser->setErrorHandler(errHandler); |
| |
| try { |
| parser->parse(xmlFile); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "\nFile not found: '" << xmlFile << "'\n" |
| << "Exception message is: \n" |
| << toCatch.getMessage() << "\n" ; |
| return -1; |
| } |
| }</source> |
| </s3> |
| |
| <anchor name="UsingSAX2API"/> |
| <s3 title="Using the SAX2 API"> |
| <p>The SAX2 API for XML parsers was originally developed for |
| Java. Please be aware that there is no standard SAX2 API for |
| C++, and that use of the &XercesCName; SAX2 API does not |
| guarantee client code compatibility with other C++ XML |
| parsers.</p> |
| |
| <p>The SAX2 API presents a callback based API to the parser. An |
| application that uses SAX2 provides an instance of a handler |
| class to the parser. When the parser detects XML constructs, |
| it calls the methods of the handler class, passing them |
| information about the construct that was detected. The most |
| commonly used handler classes are ContentHandler which is |
| called when XML constructs are recognized, and ErrorHandler |
| which is called when an error occurs. The header files for the |
| various SAX2 handler classes are in |
| '<&XercesCInstallDir;>/include/sax2'</p> |
| |
| <p>As a convenience, &XercesCName; provides the class |
| DefaultHandler, which is a single class which is publicly derived |
| from all the Handler classes. DefaultHandler's default |
| implementation of the handler callback methods is to do |
| nothing. A convenient way to get started with &XercesCName; is |
| to derive your own handler class from DefaultHandler and override |
| just those methods in HandlerBase which you are interested in |
| customizing. This simple example shows how to create a handler |
| which will print element names, and print fatal error |
| messages. The source code for the sample applications show |
| additional examples of how to write handler classes.</p> |
| |
| <p>This is the header file MySAX2Handler.hpp:</p> |
| <source>#include <sax2/DefaultHandler.hpp> |
| |
| class MySAX2Handler : public DefaultHandler { |
| public: |
| void startElement( |
| const XMLCh* const uri, |
| const XMLCh* const localname, |
| const XMLCh* const qname, |
| const Attributes& attrs |
| ); |
| void fatalError(const SAXParseException&); |
| };</source> |
| |
| <p>This is the implementation file MySAX2Handler.cpp:</p> |
| |
| <source>#include "MySAX2Handler.hpp" |
| #include <iostream.h> |
| |
| MySAX2Handler::MySAX2Handler() |
| { |
| } |
| |
| MySAX2Handler::startElement(const XMLCh* const uri, |
| const XMLCh* const localname, |
| const XMLCh* const qname, |
| const Attributes& attrs) |
| { |
| // transcode() is an user application defined function which |
| // converts unicode strings to usual 'char *'. Look at |
| // the sample program SAX2Count for an example implementation. |
| cout << "I saw element: " << transcode(qname) << endl; |
| } |
| |
| MySAX2Handler::fatalError(const SAXParseException& exception) |
| { |
| cout << "Fatal Error: " << transcode(exception.getMessage()) |
| << " at line: " << exception.getLineNumber() |
| << endl; |
| }</source> |
| |
| <p>The XMLCh and Attributes types are supplied by |
| &XercesCName; and are documented in the include |
| files. Examples of their usage appear in the source code to |
| the sample applications.</p> |
| </s3> |
| |
| <anchor name="SAX2Features"/> |
| <s3 title="Xerces SAX2 Supported Features"> |
| |
| <p>The behavior of the SAX2XMLReader is dependant on the values of the following features. |
| All of the features below can be set using the <code>SAX2XMLReader::setFeature(XMLCh*,bool)</code> function. |
| None of these features can be modified in the middle of a parse, or an exception will be thrown.</p> |
| |
| <table> |
| <tr><td colspan="2"><em>http://xml.org/sax/features/namespaces</em></td></tr> |
| <tr><td><em>true:</em></td><td> Perform Namespace processing (default)</td></tr> |
| <tr><td><em>false:</em></td><td> Optionally do not perform Namespace processing</td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://xml.org/sax/features/namespace-prefixes</em></td></tr> |
| <tr><td><em>true:</em></td><td> Report the orignal prefixed names and attributes used for Namespace declarations (default)</td></tr> |
| <tr><td><em>false:</em></td><td> Do not report attributes used for Namespace declarations, and optionally do not report original prefixed names. </td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://xml.org/sax/features/validation</em></td></tr> |
| <tr><td><em>true:</em></td><td> Report all validation errors. (default)</td></tr> |
| <tr><td><em>false:</em></td><td> Do not report validation errors. </td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://apache.org/xml/features/validation/dynamic</em></td></tr> |
| <tr><td><em>true:</em></td><td> The parser will validate the document only if a grammar is specified. (http://xml.org/sax/features/validation must be true)</td></tr> |
| <tr><td><em>false:</em></td><td> Validation is determined by the state of the http://xml.org/sax/features/validation feature (default)</td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://apache.org/xml/features/validation/schema</em></td></tr> |
| <tr><td><em>true:</em></td><td> Enable the parser's schema support. (default) </td></tr> |
| <tr><td><em>false:</em></td><td> Disable the parser's schema support. </td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://apache.org/xml/features/validation/reuse-grammar</em></td></tr> |
| <tr><td><em>true:</em></td><td> The parser will reuse grammar information from previous parses in subsequent parses. </td></tr> |
| <tr><td><em>false:</em></td><td> The parser will not reuse any grammar information. (default)</td></tr> |
| </table> |
| |
| <p/> |
| |
| <table> |
| <tr><td colspan="2"><em>http://apache.org/xml/features/validation/reuse-validator</em> (deprecated)</td></tr> |
| <tr><td><em>true:</em></td><td> The parser will reuse grammar information from previous parses in subsequent parses. </td></tr> |
| <tr><td><em>false:</em></td><td> The parser will not reuse any grammar information. (default)</td></tr> |
| </table> |
| |
| </s3> |
| </s2> |
| |
| <anchor name="DOMProgGuide"/> |
| <s2 title="DOM Programming Guide"> |
| |
| <anchor name="JAVAandCPP"/> |
| <s3 title="Java and C++ DOM comparisons"> |
| <p>The C++ DOM API is very similar in design and use, to the |
| Java DOM API bindings. As a consequence, conversion of |
| existing Java code that makes use of the DOM to C++ is a |
| straight forward process. |
| </p> |
| <p> |
| This section outlines the differences between Java and C++ bindings. |
| </p> |
| </s3> |
| |
| <anchor name="AccessAPI"/> |
| <s3 title="Accessing the API from application code"> |
| |
| <source> |
| // C++ |
| #include <dom/DOM.hpp></source> |
| |
| <source>// Java |
| import org.w3c.dom.*</source> |
| |
| <p>The header file <dom/DOM.hpp> includes all the |
| individual headers for the DOM API classes. </p> |
| |
| </s3> |
| |
| <anchor name="ClassNames"/> |
| <s3 title="Class Names"> |
| <p>The C++ class names are prefixed with "DOM_". The intent is |
| to prevent conflicts between DOM class names and other names |
| that may already be in use by an application or other |
| libraries that a DOM based application must link with.</p> |
| |
| <p>The use of C++ namespaces would also have solved this |
| conflict problem, but for the fact that many compilers do not |
| yet support them.</p> |
| |
| <source>DOM_Document myDocument; // C++ |
| DOM_Node aNode; |
| DOM_Text someText;</source> |
| |
| <source>Document myDocument; // Java |
| Node aNode; |
| Text someText;</source> |
| |
| <p>If you wish to use the Java class names in C++, then you need |
| to typedef them in C++. This is not advisable for the general |
| case - conflicts really do occur - but can be very useful when |
| converting a body of existing Java code to C++.</p> |
| |
| <source>typedef DOM_Document Document; |
| typedef DOM_Node Node; |
| |
| Document myDocument; // Now C++ usage is |
| // indistinguishable from Java |
| Node aNode;</source> |
| </s3> |
| |
| |
| <anchor name="ObjMemMgmt"/> |
| <s3 title="Objects and Memory Management"> |
| <p>The C++ DOM implementation uses automatic memory management, |
| implemented using reference counting. As a result, the C++ |
| code for most DOM operations is very similar to the equivalent |
| Java code, right down to the use of factory methods in the DOM |
| document class for nearly all object creation, and the lack of |
| any explicit object deletion.</p> |
| |
| <p>Consider the following code snippets </p> |
| |
| <source>// This is C++ |
| DOM_Node aNode; |
| aNode = someDocument.createElement("ElementName"); |
| DOM_Node docRootNode = someDoc.getDocumentElement(); |
| docRootNode.AppendChild(aNode);</source> |
| |
| <source>// This is Java |
| Node aNode; |
| aNode = someDocument.createElement("ElementName"); |
| Node docRootNode = someDoc.getDocumentElement(); |
| docRootNode.AppendChild(aNode);</source> |
| |
| <p>The Java and the C++ are identical on the surface, except for |
| the class names, and this similarity remains true for most DOM |
| code. </p> |
| |
| <p>However, Java and C++ handle objects in somewhat different |
| ways, making it important to understand a little bit of what |
| is going on beneath the surface.</p> |
| |
| <p>In Java, the variable <code>aNode</code> is an object reference , |
| essentially a pointer. It is initially == null, and references |
| an object only after the assignment statement in the second |
| line of the code.</p> |
| |
| <p>In C++ the variable <code>aNode</code> is, from the C++ language's |
| perspective, an actual live object. It is constructed when the |
| first line of the code executes, and DOM_Node::operator = () |
| executes at the second line. The C++ class DOM_Node |
| essentially a form of a smart-pointer; it implements much of |
| the behavior of a Java Object Reference variable, and |
| delegates the DOM behaviors to an implementation class that |
| lives behind the scenes. </p> |
| |
| <p>Key points to remember when using the C++ DOM classes:</p> |
| |
| <ul> |
| <li>Create them as local variables, or as member variables of |
| some other class. Never "new" a DOM object into the heap or |
| make an ordinary C pointer variable to one, as this will |
| greatly confuse the automatic memory management. </li> |
| |
| <li>The "real" DOM objects - nodes, attributes, CData |
| sections, whatever, do live on the heap, are created with the |
| create... methods on class DOM_Document. DOM_Node and the |
| other DOM classes serve as reference variables to the |
| underlying heap objects.</li> |
| |
| <li>The visible DOM classes may be freely copied (assigned), |
| passed as parameters to functions, or returned by value from |
| functions.</li> |
| |
| <li>Memory management of the underlying DOM heap objects is |
| automatic, implemented by means of reference counting. So long |
| as some part of a document can be reached, directly or |
| indirectly, via reference variables that are still alive in |
| the application program, the corresponding document data will |
| stay alive in the heap. When all possible paths of access have |
| been closed off (all of the application's DOM objects have |
| gone out of scope) the heap data itself will be automatically |
| deleted. </li> |
| |
| <li>There are restrictions on the ability to subclass the DOM |
| classes. </li> |
| |
| </ul> |
| |
| </s3> |
| |
| <anchor name="DOMString"/> |
| <s3 title="DOMString"> |
| <p>Class DOMString provides the mechanism for passing string |
| data to and from the DOM API. DOMString is not intended to be |
| a completely general string class, but rather to meet the |
| specific needs of the DOM API.</p> |
| |
| <p>The design derives from two primary sources: from the DOM's |
| CharacterData interface and from class <code>java.lang.string</code>.</p> |
| |
| <p>Main features are:</p> |
| |
| <ul> |
| <li>It stores Unicode text.</li> |
| |
| <li>Automatic memory management, using reference counting.</li> |
| |
| <li>DOMStrings are mutable - characters can be inserted, |
| deleted or appended.</li> |
| |
| </ul> |
| <p></p> |
| |
| <p>When a string is passed into a method of the DOM, when |
| setting the value of a Node, for example, the string is cloned |
| so that any subsequent alteration or reuse of the string by |
| the application will not alter the document contents. |
| Similarly, when strings from the document are returned to an |
| application via the DOM API, the string is cloned so that the |
| document can not be inadvertently altered by subsequent edits |
| to the string.</p> |
| |
| <note>The ICU classes are a more general solution to UNICODE |
| character handling for C++ applications. ICU is an Open |
| Source Unicode library, available at the <jump |
| href="http://oss.software.ibm.com/icu/">IBM |
| DeveloperWorks website</jump>.</note> |
| |
| </s3> |
| |
| <anchor name="EqualityTesting"/> |
| <s3 title="Equality Testing"> |
| <p>The DOMString equality operators (and all of the rest of the |
| DOM class conventions) are modeled after the Java |
| equivalents. The equals() method compares the content of the |
| string, while the == operator checks whether the string |
| reference variables (the application program variables) refer |
| to the same underlying string in memory. This is also true of |
| DOM_Node, DOM_Element, etc., in that operator == tells whether |
| the variables in the application are referring to the same |
| actual node or not. It's all very Java-like </p> |
| |
| <ul> |
| <li>bool operator == () is true if the DOMString variables |
| refer to the same underlying storage. </li> |
| |
| <li>bool equals() is true if the strings contain the same |
| characters. </li> |
| |
| </ul> |
| <p>Here is an example of how the equality operators work: </p> |
| <source>DOMString a = "Hello"; |
| DOMString b = a; |
| DOMString c = a.clone(); |
| if (b == a) // This is true |
| if (a == c) // This is false |
| if (a.equals(c)) // This is true |
| b = b + " World"; |
| if (b == a) // Still true, and the string's |
| // value is "Hello World" |
| if (a.equals(c)) // false. a is "Hello World"; |
| // c is still "Hello".</source> |
| </s3> |
| |
| <anchor name="Downcasting"/> |
| <s3 title="Downcasting"> |
| <p>Application code sometimes must cast an object reference from |
| DOM_Node to one of the classes deriving from DOM_Node, |
| DOM_Element, for example. The syntax for doing this in C++ is |
| different from that in Java.</p> |
| |
| <source>// This is C++ |
| DOM_Node aNode = someFunctionReturningNode(); |
| DOM_Element el = (Element &) aNode;</source> |
| |
| <source>// This is Java |
| Node aNode = someFunctionReturningNode(); |
| Element el = (Element) aNode;</source> |
| |
| <p>The C++ cast is not type-safe; the Java cast is checked for |
| compatible types at runtime. If necessary, a type-check can |
| be made in C++ using the node type information: </p> |
| |
| <source>// This is C++ |
| |
| DOM_Node aNode = someFunctionReturningNode(); |
| DOM_Element el; // by default, el will == null. |
| |
| if (anode.getNodeType() == DOM_Node::ELEMENT_NODE) |
| el = (Element &) aNode; |
| else |
| // aNode does not refer to an element. |
| // Do something to recover here.</source> |
| |
| </s3> |
| |
| <anchor name="Subclassing"/> |
| <s3 title="Subclassing"> |
| <p>The C++ DOM classes, DOM_Node, DOM_Attr, DOM_Document, etc., |
| are not designed to be subclassed by an application |
| program. </p> |
| |
| <p>As an alternative, the DOM_Node class provides a User Data |
| field for use by applications as a hook for extending nodes by |
| referencing additional data or objects. See the API |
| description for DOM_Node for details.</p> |
| </s3> |
| |
| </s2> |
| |
| <anchor name="IDOMProgGuide"/> |
| <s2 title="Experimental IDOM Programming Guide"> |
| <p>The experimental IDOM API is a new design of the C++ DOM API. |
| Please note that this experimental IDOM API is only a prototype |
| and is subject to change.</p> |
| |
| <anchor name="ConstructIDOMParser"/> |
| <s3 title="Constructing a parser"> |
| <p>In order to use &XercesCName; to parse XML files using IDOM, you |
| will need to create an instance of the IDOMParser class. The example |
| below shows the code you need in order to create an instance of the |
| IDOMParser.</p> |
| |
| <source> |
| int main (int argc, char* args[]) { |
| |
| try { |
| XMLPlatformUtils::Initialize(); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "Error during initialization! :\n" |
| << toCatch.getMessage() << "\n"; |
| return 1; |
| } |
| |
| char* xmlFile = "x1.xml"; |
| IDOMParser* parser = new IDOMParser(); |
| parser->setValidationScheme(IDOMParser::Val_Always); // optional. |
| parser->setDoNamespaces(true); // optional |
| |
| ErrorHandler* errHandler = (ErrorHandler*) new HandlerBase(); |
| parser->setErrorHandler(errHandler); |
| |
| try { |
| parser->parse(xmlFile); |
| } |
| catch (const XMLException& toCatch) { |
| cout << "\nFile not found: '" << xmlFile << "'\n" |
| << "Exception message is: \n" |
| << toCatch.getMessage() << "\n" ; |
| return -1; |
| } |
| |
| return 0; |
| } |
| </source> |
| </s3> |
| |
| <anchor name="DOMandIDOM"/> |
| <s3 title="Comparision of C++ DOM and IDOM"> |
| <p> |
| This section outlines the differences between the C++ DOM and IDOM APIs. |
| </p> |
| </s3> |
| |
| <anchor name="Motivation"/> |
| <s3 title="Motivation behind new design"> |
| <p> |
| The performance of the C++ DOM has not been as good as it |
| might be, especially for use in server style applications. |
| The DOM's reference counted automatic memory management has |
| been the biggest time consumer. The situation becomes worse |
| when running multi-threaded applications. |
| </p> |
| <p> |
| The experimental C++ IDOM is a new alternative to the C++ DOM, and aims at |
| meeting the following requirements: |
| </p> |
| <ul> |
| <li>Reduced memory footprint.</li> |
| <li>Fast.</li> |
| <li>Good scalability on multiprocessor systems.</li> |
| <li>More C++ like and less Java like.</li> |
| </ul> |
| </s3> |
| |
| <anchor name="IDOMClassNames"/> |
| <s3 title="Class Names"> |
| <p> |
| The IDOM class names are prefixed with "IDOM_". The intent is |
| to prevent conflicts between IDOM class names and DOM class names |
| that may already be in use by an application or other |
| libraries that a DOM based application must link with. |
| </p> |
| |
| |
| <source> |
| IDOM_Document* myDocument; // IDOM |
| IDOM_Node* aNode; |
| IDOM_Text* someText; |
| </source> |
| |
| <source> |
| DOM_Document myDocument; // DOM |
| DOM_Node aNode; |
| DOM_Text someText; |
| </source> |
| </s3> |
| |
| <anchor name="IDOMObjMemMgmt"/> |
| <s3 title="Objects and Memory Management"> |
| <p>The C++ IDOM implementation no longer uses reference counting for |
| automatic memory management. The storage for a DOM document is |
| associated with the document node object. Applications would use |
| normal C++ pointers to directly access the implementation objects |
| for Nodes in IDOM C++, while they would use object references in |
| DOM C++. |
| </p> |
| |
| <p>Consider the following code snippets</p> |
| |
| |
| <source> |
| // IDOM C++ |
| IDOM_Node* aNode; |
| IDOM_Node* docRootNode; |
| aNode = someDocument->createElement("ElementName"); |
| docRootNode = someDocument->getDocumentElement(); |
| docRootNode->appendChild(aNode); |
| </source> |
| |
| <source> |
| // DOM C++ |
| DOM_Node aNode; |
| DOM_Node docRootNode; |
| aNode = someDocument.createElement("ElementName"); |
| docRootNode = someDocument.getDocumentElement(); |
| docRootNode.appendChild(aNode); |
| </source> |
| |
| |
| <p>The IDOM C++ uses an independent storage allocator per document. |
| The advantage here is that allocation would require no synchronization |
| in most cases (based on the the same threading model that we |
| have now - one thread active per document, but any number of |
| documents running in parallel with separate threads). |
| </p> |
| |
| <p>The allocator does not support a delete operation at all - all |
| allocated memory would persist for the life of the document, and |
| then the larger blocks would be returned to the system without separately |
| deleting all of the individual nodes and strings within the document. |
| </p> |
| |
| <p>The C++ DOM and IDOM are similar in the use of factory methods in the |
| document class for all object creation. They differ in the object deletion |
| mechanism. |
| </p> |
| |
| <p>In C++ DOM, there is no explicit object deletion. The deallocation of |
| memory is automatically taken care of by the reference counting. |
| </p> |
| |
| <p>In C++ IDOM, there is an implict and explict object deletion. When parsing |
| a document using an IDOMParser, the storage allocated will be automatically |
| deleted when the parser instance is deleted (implicit). If a user is |
| manually building a DOM tree in memory using the document factory methods, |
| then the user needs to explicilty delete the document object to free all |
| allocated memory. |
| </p> |
| |
| <p>Consider the following code snippets: </p> |
| |
| <source> |
| // C++ IDOM - explicit deletion |
| IDOM_Document* myDocument; |
| IDOM_Node* aNode; |
| myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(); |
| aNode = myDocument->createElement("ElementName"); |
| myDocument->appendChild(aNode); |
| delete myDocument; |
| </source> |
| |
| <source> |
| // C++ DOM - implicit deletion |
| IDOM_Document myDocument; |
| DOM_Node aNode; |
| myDocument = DOM_DOMImplementation::getImplementation().createDocument(); |
| aNode = myDocument.createElement("ElementName"); |
| myDocument.appendChild(aNode); |
| </source> |
| |
| <p>Key points to remember when using the C++ IDOM classes:</p> |
| |
| <ul> |
| <li>The DOM objects are accessed via C++ pointers.</li> |
| |
| <li>The DOM objects - nodes, attributes, CData |
| sections, etc., are created with the factory methods |
| (create...) in the document class.</li> |
| |
| <li>If you are manually building a DOM tree in memory, you |
| need to explicitly delete the document object. |
| Memory management will be automatically taken care of by |
| the IDOM parser when parsing an instance document.</li> |
| |
| </ul> |
| </s3> |
| |
| <anchor name="DOMStringXMCh"/> |
| <s3 title="DOMString vs. XMLCh"> |
| <p>The IDOM C++ no longer uses DOMString to pass string data to |
| and from the DOM API. Instead, the IDOM C++ uses plain, null-terminated |
| (XMLCh *) utf-16 strings. The (XMLCh*) utf-16 type string is much |
| simpler with lower overhead. All the string data would remain in |
| memory until the document object is deleted.</p> |
| |
| <source> |
| //C++ IDOM |
| const XMLCh* nodeValue = aNode->getNodeValue(); |
| </source> |
| |
| <source> |
| //C++ DOM |
| DOMString nodeValue = aNode.getNodeValue(); |
| </source> |
| </s3> |
| |
| </s2> |
| |
| </s1> |