Posted to commits@xerces.apache.org by tn...@apache.org on 2002/05/21 20:23:01 UTC
cvs commit: xml-xerces/c/doc program-dom.xml
tng 02/05/21 11:23:01
Modified: c/doc program-dom.xml
Log:
Documentation Update: DOM Programming Guide now talks about the new DOM.
Revision Changes Path
1.4 +178 -247 xml-xerces/c/doc/program-dom.xml
Index: program-dom.xml
===================================================================
RCS file: /home/cvs/xml-xerces/c/doc/program-dom.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- program-dom.xml 20 Feb 2002 21:01:56 -0000 1.3
+++ program-dom.xml 21 May 2002 18:23:01 -0000 1.4
@@ -2,264 +2,195 @@
<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd">
<s1 title="DOM Programming Guide">
-
- <anchor name="DOMProgGuide"/>
- <anchor name="JAVAandCPP"/>
- <s2 title="Java and C++ DOM comparisons">
- <p>The C++ DOM API is very similar in design and use, to the
- Java DOM API bindings. As a consequence, conversion of
- existing Java code that makes use of the DOM to C++ is a
- straight forward process.
- </p>
- <p>
- This section outlines the differences between Java and C++ bindings.
+ <anchor name="Objectives"/>
+ <s2 title="Design Objectives">
+ <p>The C++ DOM implementation is based on the
+ <jump href="ApacheDOMC++Binding.html">Apache Recommended DOM C++ binding</jump>.</p>
+ <p>The design aims to meet the following requirements:
</p>
+ <ul>
+ <li>Reduced memory footprint.</li>
+ <li>Fast, especially in server-style and multi-threaded applications.</li>
+ <li>Good scalability on multiprocessor systems.</li>
+ <li>More C++-like and less Java-like.</li>
+ </ul>
</s2>
- <anchor name="AccessAPI"/>
- <s2 title="Accessing the API from application code">
+ <anchor name="ConstructXercesDOMParser"/>
+ <s2 title="Constructing a XercesDOMParser">
+ <p>To parse XML files into a DOM tree with &XercesCName;, create an
+ instance of the XercesDOMParser class. The example below initializes
+ the library, creates a XercesDOMParser, installs an error handler,
+ and parses a file.</p>
+
+ <source>
+#include <xercesc/parsers/XercesDOMParser.hpp>
+#include <xercesc/dom/DOM.hpp>
+#include <xercesc/sax/HandlerBase.hpp>
+#include <xercesc/util/XMLString.hpp>
+
+int main (int argc, char* args[]) {
+
+ try {
+ XMLPlatformUtils::Initialize();
+ }
+ catch (const XMLException& toCatch) {
+ char* message = XMLString::transcode(toCatch.getMessage());
+ cout << "Error during initialization! :\n"
+ << message << "\n";
+ delete [] message;
+ return 1;
+ }
+
+ const char* xmlFile = "x1.xml";
+ XercesDOMParser* parser = new XercesDOMParser();
+ parser->setValidationScheme(XercesDOMParser::Val_Always); // optional.
+ parser->setDoNamespaces(true); // optional
+
+ ErrorHandler* errHandler = (ErrorHandler*) new HandlerBase();
+ parser->setErrorHandler(errHandler);
+
+ try {
+ parser->parse(xmlFile);
+ }
+ catch (const XMLException& toCatch) {
+ char* message = XMLString::transcode(toCatch.getMessage());
+ cout << "Exception message is: \n"
+ << message << "\n";
+ delete [] message;
+ return -1;
+ }
+ catch (const DOMException& toCatch) {
+ char* message = XMLString::transcode(toCatch.getMessage());
+ cout << "Exception message is: \n"
+ << message << "\n";
+ delete [] message;
+ return -1;
+ }
+ catch (...) {
+ cout << "Unexpected Exception \n" ;
+ return -1;
+ }
+
+ delete parser;
+ delete errHandler;
+ return 0;
+}
+ </source>
+ </s2>
+
+ <anchor name="UsingDOMAPI"/>
+ <s2 title="Using DOM API">
+ <anchor name="AccessAPI"/>
+ <s3 title="Accessing API from application code">
<source>
-// C++
#include <xercesc/dom/DOM.hpp></source>
-<source>// Java
-import org.w3c.dom.*</source>
-
<p>The header file <xercesc/dom/DOM.hpp> includes all the
individual headers for the DOM API classes. </p>
- </s2>
-
- <anchor name="ClassNames"/>
- <s2 title="Class Names">
- <p>The C++ class names are prefixed with "DOM_". The intent is
- to prevent conflicts between DOM class names and other names
- that may already be in use by an application or other
- libraries that a DOM based application must link with.</p>
-
- <p>The use of C++ namespaces would also have solved this
- conflict problem, but for the fact that many compilers do not
- yet support them.</p>
-
-<source>DOM_Document myDocument; // C++
-DOM_Node aNode;
-DOM_Text someText;</source>
-
-<source>Document myDocument; // Java
-Node aNode;
-Text someText;</source>
-
- <p>If you wish to use the Java class names in C++, then you need
- to typedef them in C++. This is not advisable for the general
- case - conflicts really do occur - but can be very useful when
- converting a body of existing Java code to C++.</p>
-
-<source>typedef DOM_Document Document;
-typedef DOM_Node Node;
-
-Document myDocument; // Now C++ usage is
- // indistinguishable from Java
-Node aNode;</source>
- </s2>
-
-
- <anchor name="ObjMemMgmt"/>
- <s2 title="Objects and Memory Management">
- <p>The C++ DOM implementation uses automatic memory management,
- implemented using reference counting. As a result, the C++
- code for most DOM operations is very similar to the equivalent
- Java code, right down to the use of factory methods in the DOM
- document class for nearly all object creation, and the lack of
- any explicit object deletion.</p>
-
- <p>Consider the following code snippets </p>
-
-<source>// This is C++
-DOM_Node aNode;
-aNode = someDocument.createElement("ElementName");
-DOM_Node docRootNode = someDoc.getDocumentElement();
-docRootNode.AppendChild(aNode);</source>
-
-<source>// This is Java
-Node aNode;
-aNode = someDocument.createElement("ElementName");
-Node docRootNode = someDoc.getDocumentElement();
-docRootNode.AppendChild(aNode);</source>
-
- <p>The Java and the C++ are identical on the surface, except for
- the class names, and this similarity remains true for most DOM
- code. </p>
-
- <p>However, Java and C++ handle objects in somewhat different
- ways, making it important to understand a little bit of what
- is going on beneath the surface.</p>
-
- <p>In Java, the variable <code>aNode</code> is an object reference ,
- essentially a pointer. It is initially == null, and references
- an object only after the assignment statement in the second
- line of the code.</p>
-
- <p>In C++ the variable <code>aNode</code> is, from the C++ language's
- perspective, an actual live object. It is constructed when the
- first line of the code executes, and DOM_Node::operator = ()
- executes at the second line. The C++ class DOM_Node
- essentially a form of a smart-pointer; it implements much of
- the behavior of a Java Object Reference variable, and
- delegates the DOM behaviors to an implementation class that
- lives behind the scenes. </p>
-
- <p>Key points to remember when using the C++ DOM classes:</p>
-
- <ul>
- <li>Create them as local variables, or as member variables of
- some other class. Never "new" a DOM object into the heap or
- make an ordinary C pointer variable to one, as this will
- greatly confuse the automatic memory management. </li>
-
- <li>The "real" DOM objects - nodes, attributes, CData
- sections, whatever, do live on the heap, are created with the
- create... methods on class DOM_Document. DOM_Node and the
- other DOM classes serve as reference variables to the
- underlying heap objects.</li>
-
- <li>The visible DOM classes may be freely copied (assigned),
- passed as parameters to functions, or returned by value from
- functions.</li>
-
- <li>Memory management of the underlying DOM heap objects is
- automatic, implemented by means of reference counting. So long
- as some part of a document can be reached, directly or
- indirectly, via reference variables that are still alive in
- the application program, the corresponding document data will
- stay alive in the heap. When all possible paths of access have
- been closed off (all of the application's DOM objects have
- gone out of scope) the heap data itself will be automatically
- deleted. </li>
-
- <li>There are restrictions on the ability to subclass the DOM
- classes. </li>
-
- </ul>
-
- </s2>
-
- <anchor name="DOMString"/>
- <s2 title="DOMString">
- <p>Class DOMString provides the mechanism for passing string
- data to and from the DOM API. DOMString is not intended to be
- a completely general string class, but rather to meet the
- specific needs of the DOM API.</p>
-
- <p>The design derives from two primary sources: from the DOM's
- CharacterData interface and from class <code>java.lang.string</code>.</p>
-
- <p>Main features are:</p>
+ </s3>
- <ul>
- <li>It stores Unicode text.</li>
-
- <li>Automatic memory management, using reference counting.</li>
-
- <li>DOMStrings are mutable - characters can be inserted,
- deleted or appended.</li>
-
- </ul>
- <p></p>
-
- <p>When a string is passed into a method of the DOM, when
- setting the value of a Node, for example, the string is cloned
- so that any subsequent alteration or reuse of the string by
- the application will not alter the document contents.
- Similarly, when strings from the document are returned to an
- application via the DOM API, the string is cloned so that the
- document can not be inadvertently altered by subsequent edits
- to the string.</p>
-
- <note>The ICU classes are a more general solution to UNICODE
- character handling for C++ applications. ICU is an Open
- Source Unicode library, available at the <jump
- href="http://oss.software.ibm.com/icu/">IBM
- DeveloperWorks website</jump>.</note>
-
- </s2>
-
- <anchor name="EqualityTesting"/>
- <s2 title="Equality Testing">
- <p>The DOMString equality operators (and all of the rest of the
- DOM class conventions) are modeled after the Java
- equivalents. The equals() method compares the content of the
- string, while the == operator checks whether the string
- reference variables (the application program variables) refer
- to the same underlying string in memory. This is also true of
- DOM_Node, DOM_Element, etc., in that operator == tells whether
- the variables in the application are referring to the same
- actual node or not. It's all very Java-like </p>
-
- <ul>
- <li>bool operator == () is true if the DOMString variables
- refer to the same underlying storage. </li>
-
- <li>bool equals() is true if the strings contain the same
- characters. </li>
-
- </ul>
- <p>Here is an example of how the equality operators work: </p>
-<source>DOMString a = "Hello";
-DOMString b = a;
-DOMString c = a.clone();
-if (b == a) // This is true
-if (a == c) // This is false
-if (a.equals(c)) // This is true
-b = b + " World";
-if (b == a) // Still true, and the string's
- // value is "Hello World"
-if (a.equals(c)) // false. a is "Hello World";
- // c is still "Hello".</source>
- </s2>
-
- <anchor name="Downcasting"/>
- <s2 title="Downcasting">
- <p>Application code sometimes must cast an object reference from
- DOM_Node to one of the classes deriving from DOM_Node,
- DOM_Element, for example. The syntax for doing this in C++ is
- different from that in Java.</p>
-
-<source>// This is C++
-DOM_Node aNode = someFunctionReturningNode();
-DOM_Element el = (DOM_Element &) aNode;</source>
-
-<source>// This is Java
-Node aNode = someFunctionReturningNode();
-Element el = (Element) aNode;</source>
-
- <p>The C++ cast is not type-safe; the Java cast is checked for
- compatible types at runtime. If necessary, a type-check can
- be made in C++ using the node type information: </p>
-
-<source>// This is C++
-
-DOM_Node aNode = someFunctionReturningNode();
-DOM_Element el; // by default, el will == null.
-
-if (anode.getNodeType() == DOM_Node::ELEMENT_NODE)
- el = (DOM_Element &) aNode;
-else
- // aNode does not refer to an element.
- // Do something to recover here.</source>
-
- </s2>
-
- <anchor name="Subclassing"/>
- <s2 title="Subclassing">
- <p>The C++ DOM classes, DOM_Node, DOM_Attr, DOM_Document, etc.,
- are not designed to be subclassed by an application
- program. </p>
-
- <p>As an alternative, the DOM_Node class provides a User Data
- field for use by applications as a hook for extending nodes by
- referencing additional data or objects. See the API
- description for DOM_Node for details.</p>
+ <anchor name="DOMClassNames"/>
+ <s3 title="Class Names">
+ <p>
+ The DOM class names are prefixed with "DOM", e.g. "DOMNode". The intent is
+ to prevent conflicts between DOM class names and other names
+ that may already be in use by an application or other
+ libraries that a DOM based application must link with.</p>
+
+ <source>
+ DOMDocument* myDocument;
+ DOMNode* aNode;
+ DOMText* someText;
+ </source>
+
+ </s3>
+
+ <anchor name="DOMObjMgmt"/>
+ <s3 title="Objects Management">
+ <p>Applications use ordinary C++ pointers to access the
+ implementation objects for nodes in the C++ DOM directly.
+ </p>
+
+ <p>Consider the following code snippet:</p>
+
+
+ <source>
+ DOMNode* aNode;
+ DOMNode* docRootNode;
+ aNode = someDocument->createElement(L"ElementName");
+ docRootNode = someDocument->getDocumentElement();
+ docRootNode->appendChild(aNode);
+ </source>
+
+ </s3>
+
+
+ <anchor name="DOMMemMgmt"/>
+ <s3 title="Memory Management">
+ <p>The C++ DOM implementation requires the application to call release()
+ on any orphaned node it no longer needs, so that the node's resources can be freed.
+ Please see <jump href="ApacheDOMC++Binding.html#release">
+ Apache Recommended DOM C++ binding</jump> for details.</p>
+ <p>For example:</p>
+ <source>
+ //
+ // Create a small document tree
+ //
+
+ {
+ DOMDocument* doc = DOMImplementation::getImplementation()->createDocument(L"", L"root", 0);
+
+ DOMElement* root = doc->getDocumentElement();
+
+ DOMElement* e1 = doc->createElement(L"FirstElement");
+ root->appendChild(e1);
+
+ DOMElement* e2 = doc->createElement(L"SecondElement");
+ root->appendChild(e2);
+
+ DOMText* textNode = doc->createTextNode(L"aTextNode");
+ e1->appendChild(textNode);
+
+ // call release() to free the resources associated with the range once it is no longer needed
+ DOMRange* range = doc->createRange();
+ range->release();
+
+ // removedElement is now an orphaned node; call release() to free its resources
+ DOMNode* removedElement = root->removeChild(e2);
+ removedElement->release();
+
+ // no need to release the returned list; it is owned by the implementation
+ DOMNodeList* nodeList = doc->getElementsByTagName(L"*");
+
+ // done with the document, call release() to release the entire document resources
+ doc->release();
+ }
+ </source>
+ </s3>
+
+ <anchor name="XMLCh"/>
+ <s3 title="String Type">
+ <p>The C++ DOM uses plain, null-terminated UTF-16 strings (XMLCh*)
+ as its string type. This representation has low overhead.
+ All string data remains in memory until the document object is released.</p>
+
+ <source>
+ //C++ DOM
+ const XMLCh* nodeValue = aNode->getNodeValue();
+ </source>
+
+ </s3>
+ </s2>
+ <anchor name="Deprecated"/>
+ <s2 title="Deprecated - Java-like DOM">
+ <p>Earlier releases of &XercesCName; provided a set of C++ DOM interfaces that is
+ very similar in design and use to the Java DOM API bindings.
+ That interface is now deprecated.
+ See this <jump href="program-deprecateddom.html">document</jump> for its programming details.
+ </p>
</s2>
</s1>