Posted to j-users@xerces.apache.org by Herry <he...@yahoo.com> on 2002/05/09 13:23:43 UTC

Concurrent Access of DOM tree

Hi,
I have an application that requires simultaneous
access to a DOM tree.
Having created a DOM tree, I wrote a test case that
simulates 1100 threads accessing the tree, each
thread searching it 15 times at 3 leaf
nodes.
The DOM tree was pre-loaded into memory before the
test was run.
I got the following error (I am actually using
WebLogic's flavour of Xerces):

java.lang.NullPointerException
        at weblogic.apache.xerces.dom.DeferredElementNSImpl.synchronizeData(DeferredElementNSImpl.java:147)
        at weblogic.apache.xerces.dom.ElementImpl.getTagName(ElementImpl.java:270)

Does anybody have any idea how to get TRUE concurrent
access to leaf nodes in the same DOM tree?

Cheers,
Herry


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Concurrent Access of DOM tree

Posted by Andy Clark <an...@apache.org>.
Herry wrote:
> so if I do require thread safety for my application,
> what are the possible solutions (at least as far as
> xerces is concerned). Will traversing the entire tree
> during the initialisation stage solve the problem?

You can synchronize on the document, but this would
"lock" the entire tree no matter where it's being
accessed by the various threads. So finer-grained locking
would be preferred.

Here's a simple solution for better synchronized
access to the document: always synchronize on the *node*
that you're making the call on.

When you parse an XML file and are returned the document,
you cannot be guaranteed that all of the sub-nodes (or
data) have been properly instantiated and linked together.
But if all of your code that calls the Document
methods (children, etc.) is synchronized on the Document
object, then there won't be any problem. For example:

  Document doc = /* ... */;

  Element root = null;
  synchronized (doc) {
    root = doc.getDocumentElement();
  }

Accessing other nodes in the document would be done the
same way:

  Node child = null;
  synchronized (root) {
    child = root.getFirstChild();
  }

How much you synchronize, whether whole blocks of
code or just the access to each node, is up to you. It's
more work on your part but it's the only way to make
sure that you don't have problems accessing the document
from multiple threads.
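
The per-node pattern above could be wrapped in a small helper along
the following lines. This is only a sketch: the class and method names
(SyncDomWalk, collectText, textOf) are made up for illustration, and
whether per-node locking is actually sufficient depends on the DOM
implementation in use. Locking on the Document for the whole traversal
is the more conservative choice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces or JAXP.
class SyncDomWalk {

    // Appends the value of every text node under `node` to `out`,
    // holding the lock of a node only while calling methods on that node.
    static void collectText(Node node, StringBuilder out) {
        Node child;
        synchronized (node) {
            if (node.getNodeType() == Node.TEXT_NODE) {
                out.append(node.getNodeValue());
            }
            child = node.getFirstChild();
        }
        while (child != null) {
            collectText(child, out);
            Node next;
            synchronized (child) {
                next = child.getNextSibling();
            }
            child = next;
        }
    }

    // Parses `xml` with the default JAXP parser and returns its text content.
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        collectText(doc, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf("<a><b>x</b><c>y</c></a>"));
    }
}
```

Each thread would call collectText with its own StringBuilder; only the
DOM nodes are shared.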

-- 
Andy Clark * andyc@apache.org



Re: Concurrent Access of DOM tree

Posted by Herry <he...@yahoo.com>.
Hi,
so if I do require thread safety for my application,
what are the possible solutions (at least as far as
xerces is concerned). Will traversing the entire tree
during the initialisation stage solve the problem?

Thanks.

Cheers,
Herry.

 --- Andy Clark <an...@apache.org> wrote:
> Herry Koh wrote:
> > thanks for the reply. So if I want to have concurrent access to the
> > DOM, I guess one way to do it will be to disable the Deferred
> > feature. The way to do it is clearly described in the Xerces docs,
>
> You're missing the entire point. Please re-read Elena's
> post because she hits the nail right on the head. In short,
> you can *never* assume that any implementation of the DOM
> interface *guarantees* thread-safety. There's nothing in
> the spec that mandates it and it's just not smart
> programming.
>
> This is not a deficiency or a bug in the Xerces DOM impl.
> And even in the non-deferred case, you should not assume
> that even read-only access is thread-safe. There may be
> (and probably are!) cases where the non-deferred DOM impl
> only creates objects (like node-lists, etc) when they are
> requested. This is simply done for performance and is
> completely allowed by the interface.
>
> --
> Andy Clark * andyc@apache.org


Re: Concurrent Access of DOM tree

Posted by Andy Clark <an...@apache.org>.
Herry Koh wrote:
> thanks for the reply. So if I want to have concurrent access to the
> DOM, I guess one way to do it will be to disable the Deferred
> feature. The way to do it is clearly described in the Xerces docs,

You're missing the entire point. Please re-read Elena's
post because she hits the nail right on the head. In short,
you can *never* assume that any implementation of the DOM
interface *guarantees* thread-safety. There's nothing in
the spec that mandates it and it's just not smart
programming.

This is not a deficiency or a bug in the Xerces DOM impl.
And even in the non-deferred case, you should not assume
that even read-only access is thread-safe. There may be
(and probably are!) cases where the non-deferred DOM impl
only creates objects (like node-lists, etc) when they are
requested. This is simply done for performance and is
completely allowed by the interface.

-- 
Andy Clark * andyc@apache.org



Re: Concurrent Access of DOM tree

Posted by Herry Koh <he...@yahoo.com>.
Hi,
thanks for the reply. So if I want to have concurrent access to the
DOM, I guess one way to do it will be to disable the Deferred
feature. The way to do it is clearly described in the Xerces docs,
but they advise developers to use Xerces's DOMParser class. Is it
not possible to use DocumentBuilderFactory.setAttribute to
make it more implementation-independent?
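
For reference, the JAXP route asked about here might look roughly like
the sketch below. It assumes the underlying parser is Xerces and that
it accepts the Xerces feature id
"http://apache.org/xml/features/dom/defer-node-expansion" as a factory
attribute; other parsers may reject it, which is why the call is
guarded. The class name NonDeferredParse is made up for illustration.
Note that, as pointed out elsewhere in this thread, turning off
deferred expansion does not by itself make the DOM thread-safe.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of Xerces or JAXP.
class NonDeferredParse {

    // Xerces feature id controlling deferred node expansion.
    static final String DEFER_FEATURE =
            "http://apache.org/xml/features/dom/defer-node-expansion";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            // Assumption: the parser behind the factory is Xerces-based
            // and maps a Boolean attribute onto the parser feature.
            factory.setAttribute(DEFER_FEATURE, Boolean.FALSE);
        } catch (IllegalArgumentException e) {
            // The implementation does not recognize the attribute;
            // fall back to its default behaviour.
        }
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b>x</b></a>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```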

Cheers,
Herry

On 10 May 2002, at 10:13, Elena Litani wrote:

> Herry Koh wrote:
> > apologies if I didn't make myself clear in the last mail.
> > By 'access' I mean just to *read* the leaf nodes of the DOM tree. In
> > this scenario, I believe there is no need for thread safety. 
> 
> Your assumption is incorrect. DOM is just a set of interfaces and
> interfaces do not define how the DOM should be implemented. For example,
> Xerces Deferred DOM is a compact structure that defers creating nodes
> until you access (read) them. When you access a node, the
> appropriate DOM nodes are created and the deferred structure is cleared
> (which explains why you are getting "null").
> 
> Since you have no way of knowing the actual implementation, you must
> always synchronize access (read or write) to the DOM tree.
> 
> -- 
> Elena Litani / IBM Toronto


Re: Concurrent Access of DOM tree

Posted by Elena Litani <el...@ca.ibm.com>.
Herry Koh wrote:
> apologies if I didn't make myself clear in the last mail.
> By 'access' I mean just to *read* the leaf nodes of the DOM tree. In
> this scenario, I believe there is no need for thread safety. 

Your assumption is incorrect. DOM is just a set of interfaces and
interfaces do not define how the DOM should be implemented. For example,
Xerces Deferred DOM is a compact structure that defers creating nodes
until you access (read) them. When you access a node, the
appropriate DOM nodes are created and the deferred structure is cleared
(which explains why you are getting "null").

Since you have no way of knowing the actual implementation, you must
always synchronize access (read or write) to the DOM tree.
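
To illustrate why a pure "read" can fail, here is a toy model of a
deferred node. It is NOT the Xerces source; the class ToyDeferredNode
and its fields are invented for this example. The point is only that
the first read mutates the object, so two unsynchronized readers can
interfere with each other.

```java
// Toy model of lazy node expansion; hypothetical, not Xerces code.
class ToyDeferredNode {

    private String[] deferred;  // compact form, cleared after first read
    private String tagName;     // materialized on first read

    ToyDeferredNode(String tag) {
        this.deferred = new String[] { tag };
    }

    // Unsynchronized read. If two threads both see tagName == null,
    // one may clear `deferred` before the other dereferences it,
    // and the second thread gets a NullPointerException.
    String getTagName() {
        if (tagName == null) {
            tagName = deferred[0];
            deferred = null;
        }
        return tagName;
    }

    // Same read with the check and the expansion done under one lock.
    // This only helps if every reader goes through a synchronized path.
    String getTagNameSynchronized() {
        synchronized (this) {
            return getTagName();
        }
    }

    // True once the compact form has been thrown away.
    boolean isExpanded() {
        return deferred == null;
    }

    public static void main(String[] args) {
        ToyDeferredNode node = new ToyDeferredNode("leaf");
        System.out.println(node.isExpanded());   // state before the read
        System.out.println(node.getTagName());   // the "read"
        System.out.println(node.isExpanded());   // state after the read
    }
}
```

Even in a single thread, the state of the node before and after the
read differs, which is why a caller cannot treat the read as harmless
without knowing the implementation.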

-- 
Elena Litani / IBM Toronto



Re: Concurrent Access of DOM tree

Posted by Herry Koh <he...@yahoo.com>.
Hi
apologies if I didn't make myself clear in the last mail.
By 'access' I mean just to *read* the leaf nodes of the DOM tree. In
this scenario, I believe there is no need for thread safety. The error I
reported occurred while I was doing thousands of pure 'read' operations
concurrently, after the DOM tree had been loaded into memory.
Thread safety is only required if I am doing 'write' with one thread
and 'read' with the other at the same time, which I am not doing in
this scenario.

I suspect this has something to do with the DeferredDOM impl.
Perhaps, while one thread is accessing the leaf node for the first
time, another thread comes in, thinks that the leaf node has
already been accessed and attempts to retrieve the values, resulting
in a NULL pointer. Is that possible? Just a guess.

Please forgive me if I don't have enough understanding of the
xerces DOM impl, but logically speaking, a thread-safety problem
should not arise from a pure 'read' action.

Cheers,
Herry


On 9 May 2002, at 23:44, Andy Clark wrote:

> Herry wrote:
> > Having created a DOM tree, I created a test case that
> > simulates 1100 threads access to a DOM tree, each
>
> For performance reasons, the DOM implementation is *not*
> designed to be thread-safe. There is no reason to penalize
> *everyone* with the overhead of synchronization for the sake
> of those who want to access the DOM from multiple threads.
> Therefore, you are required to add the appropriate locks
> to your application code.
> 
> -- 
> Andy Clark * andyc@apache.org


Re: Concurrent Access of DOM tree

Posted by Andy Clark <an...@apache.org>.
Herry wrote:
> Having created a DOM tree, I created a test case that
> simulates 1100 threads access to a DOM tree, each

For performance reasons, the DOM implementation is *not*
designed to be thread-safe. There is no reason to penalize
*everyone* with the overhead of synchronization for the sake
of those who want to access the DOM from multiple threads.
Therefore, you are required to add the appropriate locks
to your application code.
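
As a rough illustration of what "appropriate locks" could mean in the
simplest case, the sketch below has several worker threads read from
one shared Document, with every DOM call made while holding the
Document's monitor. The class and method names (LockedDomReaders,
readRootNames, parse) are made up for this example, and the coarse
single lock is just one possible policy; it serializes all readers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of Xerces or JAXP.
class LockedDomReaders {

    // Parses `xml` with the default JAXP parser.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Starts `threads` workers that each read the root tag name.
    // All DOM access happens inside synchronized (doc).
    static List<String> readRootNames(Document doc, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<String> task = () -> {
                synchronized (doc) {
                    return doc.getDocumentElement().getTagName();
                }
            };
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(task));
            }
            List<String> names = new ArrayList<>();
            for (Future<String> future : futures) {
                names.add(future.get());
            }
            return names;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b>x</b></a>");
        System.out.println(readRootNames(doc, 4));
    }
}
```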

-- 
Andy Clark * andyc@apache.org
