Posted to c-dev@xerces.apache.org by MA...@dstsystems.com on 2003/06/06 18:20:50 UTC

A question about DOMTreeWalker::nextSibling




I'm running into a problem in my DOMElement wrapper class that's confusing
me.  I have a method, getElementList, that fills a provided vector with
Elements that are subelements of the current one and have a given name.
The caller can either search the entire subtree or limit the search to
immediate children only.  To search the entire subtree, I use the
DOMTreeWalker::nextNode method.  Otherwise, I use
DOMTreeWalker::nextSibling.  Here's the code for the method:

void Element::getElementList( vector< Element * > &list,
                              const XMLCh *name,
                              bool searchChildren )
{
  DOMElementSearchFilter filt( name );
  DOMTreeWalker *tree = mDoc->createTreeWalker( mElement,
                                                DOMNodeFilter::SHOW_ELEMENT,
                                                &filt,
                                                false );
  DOMNode * ( DOMTreeWalker::*next )();
  DOMNode *curr;
  if ( searchChildren )
  {
    next = &DOMTreeWalker::nextNode;
    curr = (tree->*next)();
  }
  else
  {
    next = &DOMTreeWalker::nextSibling;
    curr = tree->firstChild();
  }
  while ( curr )
  {
    list.push_back( getElementObject( static_cast< DOMElement *>(curr) ) );
    curr = (tree->*next)();
  }

}

The filter provided rejects any non-DOMElement node, and skips any
DOMElement node that does not have the provided name.
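
Roughly, the filter logic looks like the sketch below.  This is a
simplified illustration, not the real class: Node, FilterResult and
ElementNameFilter are stand-in names so the snippet compiles without
Xerces, and the numeric result codes are the ones the DOM Traversal spec
assigns to FILTER_ACCEPT, FILTER_REJECT and FILTER_SKIP.

```cpp
#include <cassert>
#include <string>

// Stand-ins (assumptions for illustration) for the Xerces DOMNode type and
// the DOMNodeFilter result codes.
enum FilterResult { FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3 };

struct Node {
    bool isElement;    // true for element nodes, false for text, comments, ...
    std::string name;  // element name, e.g. "bb"
};

// Mirrors the behaviour described above: reject anything that is not an
// element, skip elements with a different name, accept the rest.
class ElementNameFilter {
public:
    explicit ElementNameFilter(const std::string &name) : mName(name) {}

    FilterResult acceptNode(const Node *node) const {
        assert(node != nullptr);
        if (!node->isElement)
            return FILTER_REJECT;
        if (node->name != mName)
            return FILTER_SKIP;
        return FILTER_ACCEPT;
    }

private:
    std::string mName;
};
```

With this sketch, looking for "bb" means the cc element gets FILTER_SKIP,
not FILTER_REJECT.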

My problem is that nextSibling also seems to be checking children.  When I
set a breakpoint in my filter and process this XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<X>
    <AA></AA>
    <bb></bb>
    <cc>
        <aaa></aaa>
        <bbb>
            <aaaa></aaaa>
            <bbbb></bbbb>
            <cccc></cccc>
        </bbb>
    </cc>
    <dd></dd>
    <ee></ee>
</X>

looking for elements named "bb", it does not do what I expect.  On the first
call to nextSibling (when bb is current) I see the cc node (OK, it's
skipped), and then the aaa node.  Since aaa is not a sibling of bb, I
don't understand why I'm seeing it.  From the DOM spec I got:

nextSibling
Moves the TreeWalker to the next sibling of the current node, and returns
the new node. If the current node has no visible next sibling, returns
null, and retains the current node.

The definition of sibling from the spec is what you'd expect:
Two nodes are siblings if and only if they have the same parent node.

So I don't see why element aaa was ever considered as a possible next
sibling.

This is a problem because the algorithm being used to search the tree is
recursive.  We have a test case where we're looking for elements in a
large (600K+) document.  The elements exist as children of the document
root, near the beginning of the document.  Unfortunately, after the last one
is found, the next attempt to find one runs out of stack space and
terminates the program.  I have been able to get around the problem for the
time being by passing the searchChildren flag and the current DOMElement
object to the filter constructor, and rejecting any node whose parent is
not the node passed into the filter.  That still means I look at all the
grandchildren (aaa and bbb in the example above), and I don't know if that
will always be sufficient.
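
In outline, the workaround filter does something like the following.
Again this is only a sketch with stand-in types (Node, FilterResult and
ChildOnlyNameFilter are made-up names so it compiles without Xerces); the
parent pointer plays the role of DOMNode::getParentNode.

```cpp
#include <cassert>
#include <string>

// Stand-ins (assumptions for illustration) for the Xerces types.
enum FilterResult { FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3 };

struct Node {
    bool isElement;
    std::string name;
    const Node *parent;  // stands in for getParentNode()
};

// Same name filter as before, plus the workaround: when only immediate
// children are wanted, reject every node whose parent is not the search root.
class ChildOnlyNameFilter {
public:
    ChildOnlyNameFilter(const std::string &name, const Node *root,
                        bool searchChildren)
        : mName(name), mRoot(root), mSearchChildren(searchChildren) {}

    FilterResult acceptNode(const Node *node) const {
        assert(node != nullptr);
        if (!node->isElement)
            return FILTER_REJECT;
        // searchChildren == false means "immediate children only".
        if (!mSearchChildren && node->parent != mRoot)
            return FILTER_REJECT;
        if (node->name != mName)
            return FILTER_SKIP;
        return FILTER_ACCEPT;
    }

private:
    std::string mName;
    const Node *mRoot;
    bool mSearchChildren;
};
```

In the example document, cc is still skipped rather than rejected, which is
why its children aaa and bbb are still handed to the filter before being
rejected.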

I suppose another option would be to do the check on name in my
getElementList method, so that nextSibling can always (until the end of the
document) find a qualifying node quickly.  But then I have to wonder,
"What's the point of the filter?"
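
For reference, that second option would look roughly like this.  It is a
sketch under assumptions, not tested against Xerces: Node and
collectDirectChildren are hypothetical names, and the loop over the
children vector stands in for repeated nextSibling calls with a filter
that accepts every element.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in (assumption for illustration) for a DOM element and its children.
struct Node {
    bool isElement;
    std::string name;
    std::vector<const Node *> children;
};

// Collects the immediate element children of 'parent' whose name matches.
// The name test lives in the loop itself, so no filter ever has to skip a
// node, and grandchildren are never visited.
void collectDirectChildren(const Node &parent, const std::string &name,
                           std::vector<const Node *> &out) {
    for (const Node *child : parent.children) {
        assert(child != nullptr);
        if (child->isElement && child->name == name)
            out.push_back(child);
    }
}
```

The cost is one name comparison per immediate child, regardless of how deep
the subtrees under those children are.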


Marc Robertson
Staff Consultant
AWD Development
DST Systems, Inc.

