Posted to general@xerces.apache.org by John Franey <jf...@ssi-corp.com> on 2000/02/01 22:44:28 UTC
Element.getAttribute returns two different values on same node
Element.getAttribute(String attributeName) returns different values
when called by two different threads for the same attribute that
exists in the DOM. One thread receives "", the other receives "the
Attribute's Value".
Element.getAttribute(String name) is a read-only operation. Hence, I
wouldn't expect to have to synchronize invocations of it. The DOM spec
doesn't guarantee synchronization on modify operations. But isn't it
reasonable to expect that two threads invoking the same read-only
function would get the same result?
According to the (xml4j_2_0_15) src, ElementImpl maintains attributes
in a cached nodelist. A boolean field in ElementImpl, 'syncData' is
checked to see if the nodelist has yet to be populated. The
'syncData' and the cache are not properly synchronized against corruption:
Let's assume the attribute "name" is requested by two threads at the
same time. The first thread reads syncData as false, sets it to true
and begins to load the cache. Next, the second thread evaluates
syncData as true and reads from the cache, but the first thread has
not fully loaded the cache yet. Indeed, the requested attribute,
"name", is not yet loaded. So getAttribute in the second thread
returns "". The first thread finishes loading the attribute cache,
looks for "name", finds it and returns the value specified in the
document, which is NOT "". So two threads, performing a read
operation, get two different values.
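The interleaving above can be replayed deterministically in a single
thread. The sketch below is not the xml4j source: RacyElement,
midLoadHook and RaceReplay are hypothetical names, and the hook only
stands in for the point at which a second thread might be scheduled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a lazily populated attribute cache.
// This is NOT the xml4j ElementImpl source, only an illustration.
class RacyElement {
    private final Map<String, String> source;          // stands in for the parsed document data
    private final Map<String, String> cache = new HashMap<>();
    private boolean syncData = false;                   // false: cache not yet populated
    Runnable midLoadHook = () -> {};                    // assumption: hook to replay a second caller

    RacyElement(Map<String, String> source) {
        this.source = source;
    }

    String getAttribute(String name) {
        if (!syncData) {
            syncData = true;        // flag is flipped BEFORE the cache is filled
            midLoadHook.run();      // a second thread could run exactly here
            cache.putAll(source);   // cache is only complete after this line
        }
        String value = cache.get(name);
        return value == null ? "" : value;   // DOM getAttribute returns "" when absent
    }
}

class RaceReplay {
    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("name", "the Attribute's Value");
        RacyElement element = new RacyElement(attrs);

        // Replay the second caller at the unlucky moment, while the
        // first caller is still inside the cache-loading branch.
        String[] seenBySecond = new String[1];
        element.midLoadHook = () -> seenBySecond[0] = element.getAttribute("name");

        String seenByFirst = element.getAttribute("name");
        System.out.println("first:  \"" + seenByFirst + "\"");
        System.out.println("second: \"" + seenBySecond[0] + "\"");
    }
}
```

Run as written, the first caller sees the attribute value while the
replayed second caller sees "".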
A possible fix is to apply a test, get lock, test again pattern, which
I think was first suggested by Doug Schmidt. Test for syncData==false,
then synchronize, locking on the ElementImpl, then test for
syncData==false again, then populate the cache. In this way, the lock
is only obtained when syncData is false (that is, when the cache is not
loaded), but population of the cache is still performed under a lock.
Hence in a single-threaded environment, only one synchronize call would
be made per ElementImpl, instead of one for every call to getAttribute.
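As a rough sketch of the test, lock, test again pattern described
above, written against current Java rather than xml4j: LazyAttributes,
loaded and loadCount are hypothetical names. The sketch assumes the
flag is declared volatile and is set only after the cache is complete.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class illustrating test / lock / test-again.
// Not Xerces code; names and structure are assumptions.
class LazyAttributes {
    private final Map<String, String> source;
    private Map<String, String> cache;         // assigned only while holding the lock
    private volatile boolean loaded = false;   // volatile so the unlocked test is safe
    private int loadCount = 0;                 // counts how often the cache was populated

    LazyAttributes(Map<String, String> source) {
        this.source = new HashMap<>(source);
    }

    String getAttribute(String name) {
        if (!loaded) {                          // first test, without the lock
            synchronized (this) {
                if (!loaded) {                  // second test, under the lock
                    cache = new HashMap<>(source);
                    loadCount++;
                    loaded = true;              // publish only AFTER the cache is complete
                }
            }
        }
        String value = cache.get(name);
        return value == null ? "" : value;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}

class DoubleCheckedDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("name", "the Attribute's Value");
        LazyAttributes element = new LazyAttributes(attrs);

        String[] results = new String[8];
        Thread[] workers = new Thread[results.length];
        for (int i = 0; i < workers.length; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> results[slot] = element.getAttribute("name"));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        for (String r : results) {
            System.out.println(r);
        }
        System.out.println("cache loads: " + element.loadCount());
    }
}
```

Two caveats: the unlocked first test is only safe if the flag is
volatile and the JVM implements the Java 5 or later memory model, and
the pattern protects lazy population only. It does nothing for a thread
that modifies the element while another thread reads it.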
I tried this and ran into all sorts of problems.
Do the developers of Xerces view this as a bug to fix, or must the
application using Xerces provide the synchronization? By the way, for
my application to provide synchronization I need to sync on every
get* call in the DOM nodes, because all DOM node implementations behave
in this way. Please don't say I have to do this.
Thanks,
John
Re: Element.getAttribute returns two different values on same node
Posted by Ralf Pfeiffer <rp...@apache.org>.
John,
The points you have raised are completely valid. However, the Xerces DOM is
not synchronized against multi-threading. It is not thread-safe. The decision
made was that the cost of synchronizing the DOM in terms of performance
degradation is too high. Also, the need for thread-safety is not the
common case, so everyone would have to pay for it. Currently, those who
need multithreaded access must do their own synchronization. I realize this
is not such good news for you.
Also, the variable syncData is slightly misnamed. It exists to allow the
loading of the cached values from the internal "pools" only once. It has
nothing to do with threading and synchronization, per se, but is affected in
the way you described.
Regards,
-rip
Re: Element.getAttribute returns two different values on same node
Posted by Andy Clark <an...@apache.org>.
John Franey wrote:
> Do the developers of Xerces view this as a bug to fix, or must the
> application using Xerces provide the synchronization? By the way, for
> my application to provide synchronization I need to sync on every
> get* call in the DOM nodes, because all DOM node implementations behave
> in this way. Please don't say I have to do this.
There is no easy way around it. If you are accessing the same
document reference from multiple threads, you must provide the
synchronization. The Xerces DOM implementation provides no
locking and does not claim to be thread safe.
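For an application that must share one document between threads, one
coarse approach is a single lock per document wrapped around every DOM
call. The following is a minimal sketch using the standard org.w3c.dom
and javax.xml.parsers APIs; GuardedDocument, withDocument and
GuardedDemo are hypothetical names, not part of Xerces.

```java
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical application-side wrapper: one lock guards all access
// to one document. This is an assumption, not a Xerces facility.
final class GuardedDocument {
    private final Object lock = new Object();
    private final Document document;

    GuardedDocument(Document document) {
        this.document = document;
    }

    // Runs the given action while holding the document's lock.
    <T> T withDocument(Function<Document, T> action) {
        synchronized (lock) {
            return action.apply(document);
        }
    }
}

class GuardedDemo {
    // Builds a one-element document with a "name" attribute.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("item");
            root.setAttribute("name", "the Attribute's Value");
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedDocument guarded = new GuardedDocument(sampleDocument());
        String[] results = new String[4];
        Thread[] workers = new Thread[results.length];
        for (int i = 0; i < workers.length; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> results[slot] = guarded.withDocument(
                    d -> d.getDocumentElement().getAttribute("name")));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        for (String r : results) {
            System.out.println(r);
        }
    }
}
```

This only helps if every access goes through the wrapper and no Node
obtained inside the action is used after the lock is released.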
--
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org