Posted to j-users@xerces.apache.org by Michael Glavassevich <mr...@ca.ibm.com> on 2010/06/01 00:24:39 UTC
Re: IndexOutOfBoundsException within adoptNode
Hi Chad,
I believe you've found a bug. There have been problems in the past with
transferring deferred nodes from one document to another through
adoptNode(). Perhaps we missed this scenario.
Can you create a new JIRA issue [1] and a test case which demonstrates the
problem?
Thanks.
[1] https://issues.apache.org/jira/browse/XERCESJ
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Chad La Joie <la...@itumi.biz> wrote on 05/31/2010 05:44:56 PM:
> Just a bit more data. If I create a completely new document and adopt
> all the elements in to the new document all the elements are adopted
> without a problem.
>
> On 5/31/10 4:50 PM, Chad La Joie wrote:
> > I have some code that takes a number of relatively large (~10MB) XML
> > documents, extracts various elements and then adopts them in to one of
> > the documents (i.e. if documents A, B, and C were read in and worked on,
> > all the elements might get adopted into A).
> >
> > This works fine for some number of adoptions but eventually I always get
> > the following error:
> > java.lang.ArrayIndexOutOfBoundsException: 32
> > at org.apache.xerces.dom.DeferredDocumentImpl.clearChunkValue(Unknown Source)
> > at org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
> > at org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
> > at org.apache.xerces.dom.DeferredTextImpl.synchronizeData(Unknown Source)
> > at org.apache.xerces.dom.NodeImpl.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ParentNode.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ElementImpl.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ParentNode.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ElementImpl.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ParentNode.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.ElementImpl.setOwnerDocument(Unknown Source)
> > at org.apache.xerces.dom.CoreDocumentImpl.adoptNode(Unknown Source)
> >
> > I can't see anything special or unique about the element that is being
> > adopted when the error occurs. It pretty much looks like every other
> > element that gets adopted. Given the same order of elements to be
> > adopted the error always occurs on the same element. Changes in the
> > ordering cause more or fewer elements to be adopted. This makes me
> > wonder if the issue is an internal buffer somewhere that isn't growing
> > appropriately.
> >
> > Has anyone run in to this before?
> >
> > Thanks.
> >
>
> --
> Chad La Joie
> http://itumi.biz
> trusted identities, delivered
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
> For additional commands, e-mail: j-users-help@xerces.apache.org
Re: IndexOutOfBoundsException within adoptNode
Posted by Chad La Joie <la...@itumi.biz>.
Excellent, thanks for the tips and the bug fix.
On 6/2/10 2:52 PM, Michael Glavassevich wrote:
> Try setting the feature to false on the DocumentBuilder.
--
Chad La Joie
http://itumi.biz
trusted identities, delivered
Re: IndexOutOfBoundsException within adoptNode
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Chad,
Chad La Joie <la...@itumi.biz> wrote on 06/02/2010 01:59:11 PM:
> Thanks Michael,
>
> Somewhat related to this, I note that the feature 'defer-node-expansion'
> can not be set to false when the default document factory is used[1].
That's the Xerces-J 1.x documentation. The behaviour of the feature in
Xerces-J 2.x is documented here [2]. Its default is true but you can set
its value to false if you want to.
> I understand why a person might want to defer node expansion but in all of
> my cases, I have to traverse every node. I am wondering, does having a
> deferred node decrease performance, especially over larger documents,
> when you're going to be hitting every node?
If you traverse every node then I would expect it to cost more than if they
were constructed eagerly by the parser. More details on the performance
implications of this feature are in an article [3] I co-wrote some time
ago.
> If so, might it be possible to allow adjusting the defer-node-expansion
> feature even when the default factory is used?
Try setting the feature to false on the DocumentBuilder.
> [1] http://xerces.apache.org/xerces-j/features.html#defer-node-expansion
Thanks.
[2] http://xerces.apache.org/xerces2-j/features.html#dom.defer-node-expansion
[3] http://www.ibm.com/developerworks/library/x-perfap2.html/#N100CB
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
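
A minimal sketch of the suggestion above, assuming a JAXP setup backed by a
Xerces-based parser. The standard javax.xml.parsers.DocumentBuilder class has
no setFeature method, so this sketch sets the feature on the
DocumentBuilderFactory before the builder is created. The feature URI is the
one documented on the Xerces-J 2.x features page [2]; the class and method
names (EagerDomExample, parseEagerly) are hypothetical and not from the thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EagerDomExample {
    // Xerces-J 2.x feature identifier for deferred node expansion.
    static final String DEFER_NODE_EXPANSION =
            "http://apache.org/xml/features/dom/defer-node-expansion";

    // Parses the given XML text with deferred node expansion switched off,
    // so the parser builds all nodes eagerly.
    static Document parseEagerly(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Assumption: the underlying parser is Xerces-based and recognizes
            // this feature; other parsers may reject it with an exception.
            factory.setFeature(DEFER_NODE_EXPANSION, false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document eagerly", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseEagerly("<root><child>text</child></root>");
        // Prints the implementation class, which shows whether a deferred
        // document implementation was used.
        System.out.println(doc.getClass().getName());
    }
}
```

Whether eager construction is actually faster for a given workload is worth
measuring; the thread only says it is expected to cost less when every node
is traversed.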
Re: IndexOutOfBoundsException within adoptNode
Posted by Chad La Joie <la...@itumi.biz>.
Thanks Michael,
Somewhat related to this, I note that the feature 'defer-node-expansion'
can not be set to false when the default document factory is used[1]. I
understand why a person might want to defer node expansion but in all of
my cases, I have to traverse every node. I am wondering, does having a
deferred node decrease performance, especially over larger documents,
when you're going to be hitting every node? If so, might it be possible
to allow adjusting the defer-node-expansion feature even when the
default factory is used?
[1] http://xerces.apache.org/xerces-j/features.html#defer-node-expansion
On 6/2/10 11:56 AM, Michael Glavassevich wrote:
> Thanks Chad. The test was very helpful and seems to be passing now after
> the fix I just committed. We have to force a full expansion of the node
> being transferred while it still has a reference to its original
> document and weren't doing that when both DOMs (source and target) were
> initially built by a parser.
--
Chad La Joie
http://itumi.biz
trusted identities, delivered
Re: IndexOutOfBoundsException within adoptNode
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Thanks Chad. The test was very helpful and seems to be passing now after
the fix I just committed. We have to force a full expansion of the node
being transferred while it still has a reference to its original document
and weren't doing that when both DOMs (source and target) were initially
built by a parser.
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
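
For readers on a Xerces-J build without this fix, Chad's earlier observation
(adopting into a completely new document worked without a problem) suggests
one possible workaround. The sketch below is an illustration under that
assumption, not the committed fix: it creates an empty target document with
DocumentBuilder.newDocument() and adopts the child elements of each parsed
source document into it. The class and method names (AdoptIntoFreshDocument,
mergeInto) and the "merged" root element are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AdoptIntoFreshDocument {
    static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("Could not create DocumentBuilder", e);
        }
    }

    static Document parse(DocumentBuilder builder, String xml) {
        try {
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Moves every child element of each source root into a newly created
    // document, rather than into one of the parsed documents.
    static Document mergeInto(String rootName, Document... sources) {
        Document target = newBuilder().newDocument();
        Element root = target.createElement(rootName);
        target.appendChild(root);
        for (Document source : sources) {
            Node child = source.getDocumentElement().getFirstChild();
            while (child != null) {
                // Capture the sibling first: adoptNode detaches the node.
                Node next = child.getNextSibling();
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    Node adopted = target.adoptNode(child);
                    if (adopted == null) {
                        // adoptNode may return null when the implementation
                        // cannot adopt the node; fall back to a deep copy.
                        adopted = target.importNode(child, true);
                    }
                    root.appendChild(adopted);
                }
                child = next;
            }
        }
        return target;
    }

    public static void main(String[] args) {
        DocumentBuilder builder = newBuilder();
        Document a = parse(builder, "<a><x>1</x></a>");
        Document b = parse(builder, "<b><y>2</y><z>3</z></b>");
        Document merged = mergeInto("merged", a, b);
        System.out.println(merged.getDocumentElement().getChildNodes().getLength());
    }
}
```

The trade-off is that none of the original documents is reused as the target,
so any content already in document A would also have to be adopted into the
new document.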
Chad La Joie <la...@itumi.biz> wrote on 06/02/2010 08:26:32 AM:
> Okay, bug with test case submitted:
>
> https://issues.apache.org/jira/browse/XERCESJ-1450
Re: IndexOutOfBoundsException within adoptNode
Posted by Chad La Joie <la...@itumi.biz>.
Okay, bug with test case submitted:
https://issues.apache.org/jira/browse/XERCESJ-1450
On 5/31/10 6:24 PM, Michael Glavassevich wrote:
> Hi Chad,
>
> I believe you've found a bug. There have been problems in the past with
> transferring deferred nodes from one document to another through
> adoptNode(). Perhaps we missed this scenario.
>
> Can you create a new JIRA issue [1] and a test case which demonstrates
> the problem?
--
Chad La Joie
http://itumi.biz
trusted identities, delivered
Re: IndexOutOfBoundsException within adoptNode
Posted by Chad La Joie <la...@itumi.biz>.
Hey Michael,
Yeah, I'll try to cook up a test case in the next day or two and get a
bug in for it.
On 5/31/10 6:24 PM, Michael Glavassevich wrote:
> Hi Chad,
>
> I believe you've found a bug. There have been problems in the past with
> transferring deferred nodes from one document to another through
> adoptNode(). Perhaps we missed this scenario.
>
> Can you create a new JIRA issue [1] and a test case which demonstrates
> the problem?
--
Chad La Joie
http://itumi.biz
trusted identities, delivered