You are viewing a plain text version of this content. The canonical link for it is here.
Posted to j-dev@xerces.apache.org by "Stephen White (JIRA)" <xe...@xml.apache.org> on 2013/12/11 17:39:06 UTC

[jira] [Created] (XERCESJ-1622) CoreDocumentImpl.freeNodeListCache can be called multiple times and results in free list corruption

Stephen White created XERCESJ-1622:
--------------------------------------

             Summary: CoreDocumentImpl.freeNodeListCache can be called multiple times and results in free list corruption
                 Key: XERCESJ-1622
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1622
             Project: Xerces2-J
          Issue Type: Bug
          Components: DOM (Level 3 Core)
    Affects Versions: 2.11.0
         Environment: Tested on Linux using Java 1.7.0_25 and 1.7.0_45 (x86_64), but I don't think this issue is constrained to a particular environment
            Reporter: Stephen White


Iterating through the children of a ParentNode using indexes (as in the code below) rather than the nextSibling pointers would be very slow (O(n^2) in the number of nodes) if the list is long.  Xerces speeds this up by using a cache, fNodeListCache.

NodeList l = doc.getDocumentElement().getChildNodes();
for (int i = 0; i < l.getLength(); i++) {
    l.item(i);
}

ParentNode.nodeListItem() contains some code that allows a NodeListCache to be freed when iteration is complete.  
Freeing a NodeListCache is done inside CoreDocumentImpl.freeNodeListCache which adds the NodeListCache to a list of free caches ready for re-use by a different ParentNode, avoiding object allocations.  A NodeListCache can still be used after it is freed, as long as it hasn't been claimed by anything else yet.

If you iterate through the same child list twice then the NodeListCache will be freed multiple times (during the second iteration firstAccess will be false and, according to the logic in nodeListItem(), it'll get freed again).

Unfortunately CoreDocumentImpl.freeNodeListCache adds NodeListCaches to the free list unconditionally, so if it is called multiple times on the same NodeListCache then the cache is added to the list multiple times.  This creates a cycle in the linked list.  When this happens subsequent calls to getNodeListCache always return the same cache.  This means that in a depth-first traversal of the DOM the same cache gets used for children as for their parents, and as the children use it the state needed for the parent is lost.  The result is that things run as slowly as if the cache wasn't present at all.

We've got real code that triggers this situation when trying to build a Saxon TinyTree from a Xerces DOM, as the Saxon DOMSender iterates using item indexes rather than nextSibling pointers.

Example code to follow.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org