Posted to j-dev@xerces.apache.org by bu...@apache.org on 2003/05/22 18:06:52 UTC

DO NOT REPLY [Bug 6848] - Document Synchronization problem

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6848>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6848

Document Synchronization problem

vnicolici@zapp.ro changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |



------- Additional Comments From vnicolici@zapp.ro  2003-05-22 16:06 -------
This is crazy. If the DOM is not thread safe even for "reading", this means 
that writes occur in the Xerces DOM code when an API user is reading data from 
the DOM.

This means that this DOM implementation uses some form of lazy initialization 
for its internal structures, and that the initialization is not properly 
synchronized.

This is probably done for performance reasons. From my point of view, this is a 
feature designed to trick benchmarking software, because after the parse method 
is called, the document is not fully parsed. Of course, this could greatly 
improve performance if the application needs only a small subset of the data 
available in the XML document, and if the document is parsed multiple times 
during the lifetime of the virtual machine.

However, in my applications, the XML data is parsed once for the entire VM 
lifetime, and then the references to the DOMs obtained are stored in static 
variables and used concurrently by multiple threads. If you force me to 
synchronize every time I use a DOM instance, this means I need to add 
hundreds, possibly thousands of synchronized blocks to my code, one for each 
place where I use an unsafe DOM/Xerces method.

If I take another approach and synchronize access to the entire code sequence 
that accesses the DOM in order to reduce the number of synchronized blocks, 
this means that any other thread-safe calls inside such a synchronized block 
will also be serialized. If those thread-safe calls take a long time, this 
change will have severe performance implications.
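
One way to limit both costs is to keep the lock in a single helper and hold it
only while a plain value is copied out of the DOM, so that slow thread-safe
calls run outside the lock. The following is a minimal sketch, not part of
Xerces: GuardedDocument, read and parse are hypothetical names, and it assumes
callers return plain values such as String rather than Node references, since
a Node that escapes the lock is unprotected again.

```java
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: the only place that locks a shared Document. */
final class GuardedDocument {
    private final Document doc;

    GuardedDocument(Document doc) {
        this.doc = doc;
    }

    /**
     * Runs a short read while holding the document's monitor.
     * The reader should return a plain value (String, number, copied list),
     * not a Node, so that nothing from the DOM is used outside the lock.
     */
    <T> T read(Function<Document, T> reader) {
        synchronized (doc) {
            return reader.apply(doc);
        }
    }

    /** Convenience for the example: parse an XML string once at startup. */
    static GuardedDocument parse(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return new GuardedDocument(d);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

A caller would write something like
config.read(d -> d.getDocumentElement().getAttribute("timeout")) and then do
the long-running work with the returned String after the lock is released.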

Anyway, you should at least provide a parameter to the parse calls to allow the 
user to request "read thread safe" DOMs from the parser.
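
For reference, Xerces does document a parser feature that controls lazy node
creation, http://apache.org/xml/features/dom/defer-node-expansion. The sketch
below turns it off through JAXP. It is an illustration under assumptions: the
class and method names are hypothetical, the feature is only recognized by
Xerces-based parsers, and disabling it is not documented as a guarantee that
concurrent reads become safe.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper that asks the parser to build the full tree eagerly. */
final class EagerDomParser {
    // Xerces-specific feature; when true, nodes are expanded lazily on access.
    static final String DEFER_NODE_EXPANSION =
            "http://apache.org/xml/features/dom/defer-node-expansion";

    static DocumentBuilderFactory newFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            factory.setFeature(DEFER_NODE_EXPANSION, false);
        } catch (ParserConfigurationException e) {
            // The underlying parser does not recognize the feature
            // (for example, it is not Xerces-based); keep its defaults.
        }
        return factory;
    }

    static Document parse(String xml) {
        try {
            return newFactory()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Whether this removes every write that happens during reads would still need to
be checked against the implementation, since other internal caches may exist.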

Another way to solve this problem is to modify my code in this way: Immediately 
after the DOM is obtained from the parser, access all the data I might use in 
the application (elements, element attributes etc.) to force full 
parsing/initialization for the data I will need later in the multi-threaded 
code.
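
The workaround described above could be sketched as a single-threaded walk
over the whole tree right after parsing, before the Document is published to
other threads. This is only an illustration: DomWarmUp, touch and
parseAndWarmUp are hypothetical names, and it is an assumption (not something
the Xerces documentation promises) that touching every node and attribute
completes all lazy initialization.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper that reads every node once, on one thread. */
final class DomWarmUp {

    /**
     * Visits the node, its attributes, and all of its descendants.
     * Returns the number of nodes and attributes visited.
     */
    static int touch(Node node) {
        int count = 1;
        node.getNodeName();
        node.getNodeValue();

        NamedNodeMap attrs = node.getAttributes(); // null unless an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                attr.getNodeName();
                attr.getNodeValue();
                count++;
            }
        }

        // Walk children through sibling links rather than a NodeList.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            count += touch(child);
        }
        return count;
    }

    /** Parses the XML and walks it fully before handing it out. */
    static Document parseAndWarmUp(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            touch(doc);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

The returned Document would then be stored in the static variable only after
parseAndWarmUp has finished, so no other thread sees it during the walk.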

It is obvious that this feature is needed by many developers, and there are 
many developers who are not even aware of this bug, because it occurs mostly 
on production systems under heavy load.

If you still do not want to fix this bug, does anybody know where I can find 
an XML DOM implementation for Java that is thread safe for reading?

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org