You are viewing a plain text version of this content. The canonical link for it is here.
Posted to c-dev@xerces.apache.org by Michael Williams <MW...@shunra.co.il> on 2004/10/18 15:56:37 UTC

getChildNodes - wrote efficient list implementation

Hi Everybody,
 
I think I've written a fairly decent implementation for the DOMNodeList
to provide real array access.  Would anyone be interested in taking a
look at the code and basically checking my work?  It would be nice to be
able to see it wind up in the codebase.  I built against xerces v. 2.5
please e-mail if me for the source code.  It is in a zip file with the
proper directory structure containing only the new files and files that
have changed.   
 
Thanks,
Michael Williams
Shunra Software
 
Problem:
 
The DOMNode in xerces has a method called getChildNodes, which returns a
DOMNodeList of child elements.  This DOMNodeList basically provides a
thin wrapper for the internal linked list, and is not efficient when
doing element traversals, and direct array-type lookups with the
DOMNodeList.item call.
 
Solution:
 
The solution to this lies in creating a proper implementation of the
xerces DomNodeList class.  We will extend this class to implement the
following method
 
virtual DOMNode  *itemByTagNameNS(XMLSize_t index, const XMLCh
*namespaceURI, const XMLCh *localName) const {return 0;};
Retrieve an item in a specific collection by index, tag name, and
namespace
 
virtual XMLSize_t getLengthByTagNameNS(const XMLCh *namespaceURI, const
XMLCh *localName) const {return 0;};
Get the length of a specific collection by index, tag name, and
namespace
 
virtual bool insertElementAtByTagNameNS(DOMNode *, XMLSize_t index);
Insert an item into the list by index of the collection of elements with
matching namespace URI and localname.  Returns false if the index < 0 or
greater than the number of elements of the appropriate type (namespace
and element name match).
 
virtual DOMNode *removeElementAtByTagNameNS(XMLSize_t index, const XMLCh
*namespaceURI, const XMLCh *localName)
Remove an element by index of the collection of elements with matching
Namespace URI and localname.  Returns the removed node upon success, or
false if the index < 0 or greater than the number of elements of the
appropriate type.
 
virtual bool insertElementAt(DOMNode *, XMLSize_t index) = 0;
Inserts an element into the list of child elements by index.  Returns
false if the index < 0 or greater than the number of child elements.
 
virtual DOMNode *removeElementAt(XMLSize_t index) = 0;
Remove an element by index of the collection of elements with matching
Namespace URI and localname.  Returns the removed node upon success, or
false if the index < 0 or greater than the number of child elements.
 
Comments regarding element navigation:
 
All methods of navigating elements will probably yield about the same
level of performance, and linked list navigation or array navigation can
be mixed without any adverse performance effects.
 
Comments regarding element deletion and insertion:
 
Choose one of the following three methods for deleting and inserting
children elements into an element, and for the same element, do not mix
the methods.  Otherwise performance will most likely suffer.
 
1. Inserting and deleting by linked list methods (original xerces
methods)
2. Inserting and deleting into the vector of ALL children (not using
namespace and nodename identifiers)
3. Inserting and deleting by NS and nodename.
 
The reason is that because algorithms to insert or delete from a linked
list can be very efficient, but then to try to find the element to be
deleted in an array will yield slow performance, because there is no
good way to know the array index of the element beforehand.
 
Also, the algorithms to partition an elements children into a general
child array, and arrays by namespace and element name, will not yield
good performance if these two sets of arrays (partitioned and general)
are both used in the same element for inserting and deleting elements.
 
 


---------------------------------------------------------------------------------------------------------------
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the Shunra Software Ltd. system manager at Administrator@shunra.co.il
---------------------------------------------------------------------------------------------------------------


Re: getChildNodes - wrote efficient list implementation

Posted by Alberto Massari <am...@progress.com>.
Hi Michael,
the proper way to submit code for inclusion in Xerces is by filing a 
bug/PCR in Jira, attaching the code/patch you wrote 
(http://nagoya.apache.org/jira/secure/BrowseProject.jspa?id=10510). Be 
careful to read (or have your lawyers do that for you) the terms of the 
Apache license and then check the check box "Grant license to ASF for 
inclusion in ASF works", otherwise your code cannot even be taken into 
consideration.

Alberto

P.S. Please also remove your disclaimer from your messages to the list; it 
doesn't apply in this case, as the recipient of your message is a public 
mailing list archived on public web sites.

At 15.56 18/10/2004 +0200, Michael Williams wrote:
>Hi Everybody,
>
>I think I've written a fairly decent implementation for the DOMNodeList to 
>provide real array access.  Would anyone be interested in taking a look at 
>the code and basically checking my work?  It would be nice to be able to 
>see it wind up in the codebase.  I built against xerces v. 2.5 please 
>e-mail if me for the source code.  It is in a zip file with the proper 
>directory structure containing only the new files and files that have 
>changed.
>
>Thanks,
>Michael Williams
>Shunra Software
>
>Problem:
>
>The DOMNode in xerces has a method called getChildNodes, which returns a 
>DOMNodeList of child elements.  This DOMNodeList basically provides a thin 
>wrapper for the internal linked list, and is not efficient when doing 
>element traversals, and direct array-type lookups with the 
>DOMNodeList.item call.
>
>Solution:
>
>The solution to this lies in creating a proper implementation of the 
>xerces DomNodeList class.  We will extend this class to implement the 
>following method
>
>virtual DOMNode  *itemByTagNameNS(XMLSize_t index, const XMLCh 
>*namespaceURI, const XMLCh *localName) const {return 0;};
>Retrieve an item in a specific collection by index, tag name, and namespace
>
>virtual XMLSize_t getLengthByTagNameNS(const XMLCh *namespaceURI, const 
>XMLCh *localName) const {return 0;};
>Get the length of a specific collection by index, tag name, and namespace
>
>virtual bool insertElementAtByTagNameNS(DOMNode *, XMLSize_t index);
>Insert an item into the list by index of the collection of elements with 
>matching namespace URI and localname.  Returns false if the index < 0 or 
>greater than the number of elements of the appropriate type (namespace and 
>element name match).
>
>virtual DOMNode *removeElementAtByTagNameNS(XMLSize_t index, const XMLCh 
>*namespaceURI, const XMLCh *localName)
>Remove an element by index of the collection of elements with matching 
>Namespace URI and localname.  Returns the removed node upon success, or 
>false if the index < 0 or greater than the number of elements of the 
>appropriate type.
>
>virtual bool insertElementAt(DOMNode *, XMLSize_t index) = 0;
>Inserts an element into the list of child elements by index.  Returns 
>false if the index < 0 or greater than the number of child elements.
>
>virtual DOMNode *removeElementAt(XMLSize_t index) = 0;
>Remove an element by index of the collection of elements with matching 
>Namespace URI and localname.  Returns the removed node upon success, or 
>false if the index < 0 or greater than the number of child elements.
>
>Comments regarding element navigation:
>
>All methods of navigating elements will probably yield about the same 
>level of performance, and linked list navigation or array navigation can 
>be mixed without any adverse performance effects.
>
>Comments regarding element deletion and insertion:
>
>Choose one of the following three methods for deleting and inserting 
>children elements into an element, and for the same element, do not mix 
>the methods.  Otherwise performance will most likely suffer.
>
>1. Inserting and deleting by linked list methods (original xerces methods)
>2. Inserting and deleting into the vector of ALL children (not using 
>namespace and nodename identifiers)
>3. Inserting and deleting by NS and nodename.
>
>The reason is that because algorithms to insert or delete from a linked 
>list can be very efficient, but then to try to find the element to be 
>deleted in an array will yield slow performance, because there is no good 
>way to know the array index of the element beforehand.
>
>Also, the algorithms to partition an elements children into a general 
>child array, and arrays by namespace and element name, will not yield good 
>performance if these two sets of arrays (partitioned and general) are both 
>used in the same element for inserting and deleting elements.
>
>
>
>
>---------------------------------------------------------------------------------------------------------------
>This email and any files transmitted with it are confidential and
>intended solely for the use of the individual or entity to whom they
>are addressed. If you have received this email in error please notify
>the Shunra Software Ltd. system manager at Administrator@shunra.co.il
>---------------------------------------------------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org