Posted to general@xerces.apache.org by John Franey <jf...@ssi-corp.com> on 2000/02/01 22:44:28 UTC

Element.getAttribute returns two different values on same node

Element.getAttribute(String attributeName) returns different values 
when called by two different threads for the same attribute that 
exists in the DOM.  One thread receives "", the other receives "the 
Attribute's Value".

Element.getAttribute(String name) is a read-only operation.  Hence, I 
wouldn't expect to have to synchronize calls to it.  The DOM spec 
doesn't guarantee synchronization on modify operations.  But isn't it 
reasonable to expect that two threads invoking the same read-only 
function would get the same result?

According to the source (xml4j_2_0_15), ElementImpl maintains attributes 
in a cached nodelist.  A boolean field in ElementImpl, 'syncData', is 
checked to see whether the nodelist has yet to be populated.  The 
'syncData' flag and the cache are not properly synchronized against 
corruption:

Let's assume the attribute "name" is requested by two threads at the 
same time.  The first thread reads syncData as false, sets it to true, 
and begins to load the cache.  Next, the second thread evaluates 
syncData as true and reads from the cache, but the first thread has 
not fully loaded the cache yet.  Indeed, the requested attribute, 
"name", is not yet loaded.  So getAttribute in the second thread 
returns "".  The first thread finishes loading the attribute cache, 
looks for "name", finds it, and returns the value specified in the 
document, which is NOT "".  So two threads, performing a read 
operation, get two different values.
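
The interleaving described above can be reproduced deterministically 
with a small sketch.  This is not the Xerces source: LazyElement and 
RaceDemo are hypothetical stand-ins, the cache is a plain HashMap, and 
the two latches exist only to force the unlucky ordering that a real 
race would hit by chance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for ElementImpl; NOT the real Xerces class.
final class LazyElement {
    private final Map<String, String> cache = new HashMap<>();
    private boolean syncData = false; // false = cache not yet populated, as in the post

    // Latches used only to force the unlucky interleaving deterministically.
    final CountDownLatch flagSet = new CountDownLatch(1);
    final CountDownLatch readerDone = new CountDownLatch(1);

    String getAttribute(String name) {
        if (!syncData) {
            syncData = true;            // flag flipped BEFORE the cache is filled
            flagSet.countDown();        // a second thread may now skip the load
            awaitQuietly(readerDone);   // simulate a slow load
            cache.put("name", "the Attribute's Value");
        }
        String value = cache.get(name);
        return value == null ? "" : value; // "" when the attribute is not found
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

final class RaceDemo {
    // Returns { value seen by the first thread, value seen by the second }.
    static String[] run() {
        LazyElement element = new LazyElement();
        String[] results = new String[2];
        Thread first = new Thread(() -> results[0] = element.getAttribute("name"));
        first.start();
        LazyElement.awaitQuietly(element.flagSet);  // first thread has set the flag
        results[1] = element.getAttribute("name");  // second reader sees an empty cache
        element.readerDone.countDown();             // let the first thread finish loading
        try {
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("first=\"" + r[0] + "\" second=\"" + r[1] + "\"");
    }
}
```

In this sketch the first reader gets the attribute value and the 
second gets "", even though neither thread modifies the document.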

A possible fix is to apply a test, get lock, test again pattern, which 
I think was first suggested by Doug Schmidt.  Test for syncData==false, 
then synchronize, locking on the ElementImpl, then test for 
syncData==false again, then populate the hash.  In this way, the lock 
is only obtained when syncData is false (that is, when the cache is not 
loaded), but population of the cache is still performed under a lock.  
Hence in a single-threaded environment, only one synchronize call would 
be made per ElementImpl, instead of one for every call to getAttribute.
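
The pattern above is commonly known as double-checked locking.  A 
minimal sketch follows, under loudly stated assumptions: it relies on 
the Java 5 or later memory model, where declaring the flag volatile 
makes the pattern sound (earlier JVMs gave no such guarantee); the 
flag is set only after the cache is complete; and the cache is never 
modified after loading.  GuardedElement and its members are 
hypothetical names, not Xerces code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class illustrating test / get lock / test again.
final class GuardedElement {
    private final Map<String, String> cache = new HashMap<>();
    private volatile boolean loaded = false;  // volatile is essential here
    private final AtomicInteger loads = new AtomicInteger();

    String getAttribute(String name) {
        if (!loaded) {                    // first test, no lock taken
            synchronized (this) {         // get the lock
                if (!loaded) {            // test again under the lock
                    populate();
                    loaded = true;        // publish only after the cache is complete
                }
            }
        }
        String value = cache.get(name);   // safe: cache is read-only once loaded
        return value == null ? "" : value;
    }

    private void populate() {
        loads.incrementAndGet();
        cache.put("name", "the Attribute's Value");
    }

    // How many times the cache was populated; should stay at 1.
    int loadCount() {
        return loads.get();
    }

    // Helper: read "name" from several threads and collect what each saw.
    static List<String> readConcurrently(GuardedElement element, int threads) {
        List<String> results = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> results.add(element.getAttribute("name")));
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

The key difference from the failing sequence is ordering: readers skip 
the lock only after the flag says the cache is fully loaded.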

I tried this and ran into all sorts of problems.

Do the developers of Xerces view this as a bug to fix, or must the 
application using Xerces provide synchronization?  By the way, for 
my application to provide synchronization I would need to sync on every 
get* call on the DOM nodes, because all DOM node implementations behave 
this way.  Please don't say I have to do this.


Thanks,
John                 




Re: Element.getAttribute returns two different values on same node

Posted by Ralf Pfeiffer <rp...@apache.org>.
John,
The points you have raised are completely valid. However, the Xerces DOM is
not synchronized for multithreaded access. It is not thread-safe. The decision
made was that the performance cost of synchronizing the DOM is too high.
Also, the need for thread safety is not the common case, yet everyone would
have to pay for it. Currently, those who need multithreaded access must do
their own synchronization. I realize this is not such good news for you.

Also, the variable syncData is slightly misnamed. It exists so that the
cached values are loaded from the internal "pools" only once. It has
nothing to do with threading and synchronization per se, but is affected in
the way you described.

Regards,
-rip



Re: Element.getAttribute returns two different values on same node

Posted by Andy Clark <an...@apache.org>.
John Franey wrote:
> Do the developers of Xerces view this as a bug to fix, or must the
> application using Xerces provide synchronization?  By the way, for
> my application to provide synchronization I would need to sync on every
> get* call on the DOM nodes, because all DOM node implementations behave
> this way.  Please don't say I have to do this.

There is no easy way around it. If you are accessing the same 
document reference from multiple threads, you must provide the 
synchronization. The Xerces DOM implementation provides no
locking and does not claim to be thread safe.
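
One way an application might provide that synchronization is to route 
every access to a shared document through a single lock.  The sketch 
below is an illustration only: SharedDocument is a hypothetical 
wrapper, it uses the standard org.w3c.dom and javax.xml.parsers 
interfaces rather than any Xerces-specific class, and it protects 
nothing unless all threads go through the wrapper.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical wrapper: all reads of the shared document take the same lock.
final class SharedDocument {
    private final Document doc;

    SharedDocument(Document doc) {
        this.doc = doc;
    }

    // Reads an attribute of the root element while holding the document lock.
    String rootAttribute(String name) {
        synchronized (doc) {
            return doc.getDocumentElement().getAttribute(name);
        }
    }

    // Convenience factory for the example; parses XML held in a string.
    static SharedDocument parse(String xml) {
        try {
            Document parsed = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return new SharedDocument(parsed);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse example document", e);
        }
    }
}
```

Locking on the document rather than on each node keeps the locking 
rule simple, at the cost of serializing all readers.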

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org