You are viewing a plain text version of this content. The canonical link for it is here.
Posted to j-dev@xerces.apache.org by "Stephen White (JIRA)" <xe...@xml.apache.org> on 2013/12/11 17:49:06 UTC

[jira] [Updated] (XERCESJ-1622) CoreDocumentImpl.freeNodeListCache can be called multiple times and results in free list corruption

     [ https://issues.apache.org/jira/browse/XERCESJ-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen White updated XERCESJ-1622:
-----------------------------------

    Attachment: Bug.java

Demonstraton of the bug.  The output relates to walking the tree in a depth-first manner, but prior to this the code iterates through a NodeList and then accesses it again in a manner that results in the same NodeListCache being added to the free list twice.

$ java -cp xerces-2_11_0/xercesImpl.jar:xerces-2_11_0/xml-apis.jar:. Bug
[root: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](1) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](2) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c
[root: null](3) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](1) using org.apache.xerces.dom.NodeListCache@1a0a87c
[p: null](2) using org.apache.xerces.dom.NodeListCache@1a0a87c

As you can see the calls to item(2) and item(3) on the root don't use a cache.  Each time a call to item(...) on the root Node claims the cache it looses it again straight away, as the same cache is given to the iteration over the <p> element children.

The expected output is:

$ java -cp fixedXercesImpl.jar:xerces-2_11_0/xml-apis.jar:. Bug
[root: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](1) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](2) using org.apache.xerces.dom.NodeListCache@e44d85
[root: null](1) using null
[p: null](0) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](1) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](2) using org.apache.xerces.dom.NodeListCache@9d93be
[root: null](2) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](1) using org.apache.xerces.dom.NodeListCache@9d93be
[p: null](2) using org.apache.xerces.dom.NodeListCache@9d93be
[root: null](3) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](0) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](1) using org.apache.xerces.dom.NodeListCache@e44d85
[p: null](2) using org.apache.xerces.dom.NodeListCache@e44d85

Here the NodeListCache used for iterating over the children of <root> is lost between root.item(0) and root.item(1), exactly the same as in the first case.  This is because it has been added to the free list, as Iteration is expected to have been completed.  The difference here is that once root.item(1) has reclaimed it, it stays claimed.  Calls to item(...) on the <p> elements get given a new cache.

> CoreDocumentImpl.freeNodeListCache can be called multiple times and results in free list corruption
> ---------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1622
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1622
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 3 Core)
>    Affects Versions: 2.11.0
>         Environment: Tested on Linux using Java 1.7.0_25 and 1.7.0_45 (x86_64), but I don't think this issue is constrained to a particular environment
>            Reporter: Stephen White
>         Attachments: Bug.java
>
>
> Iterating through the children of a ParentNode using indexes (as in the code below) rather than the nextSibling pointers would be very slow (O(n^2) in the number of nodes) if the list is long.  Xerces speeds this up by using a cache, fNodeListCache.
> NodeList l = doc.getDocumentElement().getChildNodes();
> for (int i = 0; i < l.getLength(); i++) {
>     l.item(i);
> }
> ParentNode.nodeListItem() contains some code that allows a NodeListCache to be freed when iteration is complete.  
> Freeing a NodeListCache is done inside CoreDocumentImpl.freeNodeListCache which adds the NodeListCache to a list of free caches ready for re-use by a different ParentNode, avoiding object allocations.  A NodeListCache can still be used after it is freed, as long as it hasn't been claimed by anything else yet.
> If you iterate through the same child list twice then the NodeListCache will be freed multiple times (during the second iteration firstAccess will be false and, according to the logic in nodeListItem(), it'll get freed again).
> Unfortunately CoreDocumentImpl.freeNodeListCache adds NodeListCaches to the free list unconditionally, so if it is called multiple times on the same NodeListCache then the cache is added to the list multiple times.  This creates a cycle in the linked list.  When this happens subsequent calls to getNodeListCache always return the same cache.  This means that in a depth-first traversal of the DOM the same cache gets used for children as for their parents, and as the children use it the state needed for the parent is lost.  The result is that things run as slowly as if the cache wasn't present at all.
> We've got real code that triggers this situation when trying to build a Saxon TinyTree from a Xerces DOM, as the Saxon DOMSender iterates using item indexes rather than nextSibling pointers.
> Example code to follow.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org