Posted to j-dev@xerces.apache.org by nikolayo <ni...@travelstoremaker.com> on 2008/10/07 14:48:21 UTC

Re: Making Xerces DOM thread-safe for read

Sorry for resurrecting this, but it is probably a very important issue for a
large number of people who, e.g., use DOM in a JEE environment. My questions are:

1/ Is there any new development over the last year or so since the
    discussion occurred, or is the status unchanged?

2/ Is it at least a thread-safe approach to use cloned copies of Documents? If I
    provide multiple threads with copies generated from a single Document by
    multiple calls to cloneNode(), can those threads safely use their personal
    cloned copies? For read only? For read/write? I think in this scenario threads
    should be granted full R/W ownership of their cloned copies, but I do not
    know whether this is the case in Xerces (and I can imagine deferred-execution
    implementations of cloneNode() which would not provide this type of thread
    safety).

Regards
Nikolay



Mark Goodhand-2 wrote:
> 
> All,
> 
> I understand that DOM provides no thread safety guarantees, and that  
> the current Xerces implementation is not thread safe.  I've seen the  
> issue come up on the lists a few times over the past couple of  
> years.  As future hardware improvements are likely to come in the  
> form of more cores rather than higher clock speeds, and with XML  
> documents getting larger, I expect user requests for thread-safety  
> (at least for unchanging documents) will only increase.
> 
> As I understand it, it is performance-improving caches that prevent  
> Xerces DOM from being thread-safe (for read).  If true, this is  
> ironic, considering the massive performance benefits that could come  
> from parallel processing of documents.
> 
> Is the current stance a matter of principle (you don't believe the  
> Xerces DOM should ever be made thread-safe for read) or a practical  
> constraint due to limited resources (you'd like the Xerces DOM to be  
> thread-safe, but other issues/enhancements have higher priority)?
> 
> Regards,
> 
> Mark.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
> For additional commands, e-mail: j-dev-help@xerces.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Making-Xerces-DOM-thread-safe-for-read-tp14230584p19857414.html
Sent from the Xerces - J - Dev mailing list archive at Nabble.com.




RE: Making Xerces DOM thread-safe for read

Posted by Ludger Buenger <lu...@realobjects.com>.
Michael Glavassevich wrote on 10/14/2008 06:02:54 AM:

 

> In Xerces I'm pretty sure a cloned Document is completely disconnected from the original with possibly
> the exception of user data your application may have copied over. It should generally be safe to use it in
> a different thread than the original Document.

With one exception:

In Xerces, event listener management relies on a partially static data structure. Registering or unregistering event listeners concurrently, even on otherwise completely disconnected documents, is therefore subject to race conditions that can corrupt event listener entries, as I tried to discuss three weeks ago [1].

If anyone is interested, I have a fix for this which is already working fine in our local application.

But before I can file a report in JIRA, I need feedback on the related issue I raised two weeks ago, because it will influence how the fix should be done:

Are event listener entries supposed to be serializable, or is it sufficient to keep them transient?

Because of the above-mentioned static LCount data structure, event listener entries are already not properly serializable in Xerces and need a fix anyhow.

Best Regards,

Ludger Bünger

[1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200809.mbox/%3C96458C389BF1724995FFD7FAD88926371140EA@ex.nc-sb.de%3E


Re: Making Xerces DOM thread-safe for read

Posted by nikolayo <ni...@travelstoremaker.com>.
Thank you, Michael. 

Regards
Nikolay


Michael Glavassevich-3 wrote:
> 
> 
> Hi Nikolay,
> 
> nikolayo <ni...@travelstoremaker.com> wrote on 10/13/2008
> 04:15:54 AM:
> 
>> Thank you, Michael. I realize that this is more of a DOM problem rather than
>> Xerces problem (and BTW - thank you and everybody out there for Xerces) but Xerces
>> is an influential product and Xerces developers must be some of the best DOM
>> experts that there are. So if you do not mind the question : what is your
>> take on using DOM documents in multi-threaded environment?
> 
> In general you need to be careful and synchronize your code such that no
> more than one thread accesses a DOM instance at a time. You need to do this
> because the DOM specification says nothing about thread-safety. In order to
> ensure your application will work with any DOM implementation you must
> assume the worst case. If your application doesn't need to work with
> arbitrary DOM implementations (and you don't mind tying it to a specific
> implementation) you could select one which guarantees thread-safety for
> read, write or some set of methods (I don't know of one which has such
> guarantees) and synchronize more optimistically or not at all if the DOM
> implementation you're using allows you to do that.
> 
>> If general DOM discussion is off topic for you, I can also narrow this down to :
>> what is the safe bet for cloning of documents  in Xerces so that it is safe to use
>> clones in different threads?
> 
> In Xerces I'm pretty sure a cloned Document is completely disconnected from
> the original with possibly the exception of user data [1] your application
> may have copied over. It should generally be safe to use it in a different
> thread than the original Document.
> 
>> Regards
>> Nikolay
> 
> Thanks.
> 
> [1]
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#UserDataHandler
> 
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
> 



Re: Making Xerces DOM thread-safe for read

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Nikolay,

nikolayo <ni...@travelstoremaker.com> wrote on 10/13/2008
04:15:54 AM:

> Thank you, Michael. I realize that this is more of a DOM problem rather than
> Xerces problem (and BTW - thank you and everybody out there for Xerces) but Xerces
> is an influential product and Xerces developers must be some of the best DOM
> experts that there are. So if you do not mind the question : what is your
> take on using DOM documents in multi-threaded environment?

In general you need to be careful and synchronize your code such that no
more than one thread accesses a DOM instance at a time. You need to do this
because the DOM specification says nothing about thread-safety. In order to
ensure your application will work with any DOM implementation you must
assume the worst case. If your application doesn't need to work with
arbitrary DOM implementations (and you don't mind tying it to a specific
implementation) you could select one which guarantees thread-safety for
read, write or some set of methods (I don't know of one which has such
guarantees) and synchronize more optimistically or not at all if the DOM
implementation you're using allows you to do that.
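
As a rough illustration of the "no more than one thread at a time" rule above, here is a minimal sketch that hides a shared Document behind a single lock. The class name `GuardedDocument` and its methods are hypothetical, made up for this example; only standard JAXP and `org.w3c.dom` calls are used, and nothing here is specific to Xerces. Reads are locked as well as writes, on the assumption that the DOM implementation guarantees nothing about concurrent reads.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical wrapper: every read and write of the shared DOM holds one lock. */
public class GuardedDocument {
    private final Document doc;               // never exposed to callers
    private final Object lock = new Object(); // guards all access to doc

    public GuardedDocument(String rootName) throws Exception {
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement(rootName));
    }

    /** Write access: appends an item element with the given text under the root. */
    public void addItem(String text) {
        synchronized (lock) {
            Element item = doc.createElement("item");
            item.setTextContent(text);
            doc.getDocumentElement().appendChild(item);
        }
    }

    /** Read access is locked too, since reads are not assumed to be thread-safe. */
    public int itemCount() {
        synchronized (lock) {
            return doc.getDocumentElement().getElementsByTagName("item").getLength();
        }
    }

    /** Starts the given number of threads, each adding perThread items, and waits for them. */
    public void fillConcurrently(int threads, int perThread) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    addItem("x");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    public static void main(String[] args) throws Exception {
        GuardedDocument shared = new GuardedDocument("root");
        shared.fillConcurrently(4, 50);
        System.out.println(shared.itemCount());
    }
}
```

The wrapper never hands out the Document or any of its nodes, because a node that escapes the lock could be touched by another thread without synchronization.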

> If general DOM discussion is off topic for you, I can also narrow this down to :
> what is the safe bet for cloning of documents  in Xerces so that it is safe to use
> clones in different threads?

In Xerces I'm pretty sure a cloned Document is completely disconnected from
the original with possibly the exception of user data [1] your application
may have copied over. It should generally be safe to use it in a different
thread than the original Document.
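
The clone-per-thread approach discussed here could be sketched roughly as follows. The class and method names (`ClonePerThread`, `runWorkers`) are hypothetical. The sketch makes all clones on one thread before any worker starts, because cloning reads the original and concurrent reads are not assumed to be safe. Whether a cloned Document is fully independent remains implementation dependent, as noted elsewhere in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ClonePerThread {

    /** Parses a small XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /**
     * Makes one deep clone per worker on the calling thread, then lets each
     * worker modify and read only its own clone. Returns what each worker read back.
     */
    public static List<String> runWorkers(Document original, int workers) throws Exception {
        // Clone sequentially, before any worker starts.
        List<Document> clones = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            clones.add((Document) original.cloneNode(true));
        }

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                final Document mine = clones.get(i);
                final String id = String.valueOf(i);
                futures.add(pool.submit(() -> {
                    // Each worker has read/write ownership of its own clone only.
                    mine.getDocumentElement().setAttribute("worker", id);
                    return mine.getDocumentElement().getAttribute("worker");
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Document original = parse("<root><item>a</item></root>");
        System.out.println(runWorkers(original, 4));
        System.out.println(original.getDocumentElement().hasAttribute("worker"));
    }
}
```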

> Regards
> Nikolay

Thanks.

[1]
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#UserDataHandler

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

Re: Making Xerces DOM thread-safe for read

Posted by nikolayo <ni...@travelstoremaker.com>.
Thank you, Michael. I realize that this is more of a DOM problem than a Xerces
problem (and BTW - thank you and everybody out there for Xerces), but Xerces
is an influential product and Xerces developers must be some of the best DOM
experts that there are. So if you do not mind the question: what is your take on
using DOM documents in a multi-threaded environment? If general DOM discussion
is off topic for you, I can also narrow this down to: what is the safe bet for
cloning documents in Xerces so that it is safe to use the clones in
different threads?

Regards
Nikolay


Michael Glavassevich-3 wrote:
> 
> Hi Nikolay,
> 
> nikolayo <ni...@travelstoremaker.com> wrote on 10/07/2008
> 08:48:21 AM:
> 
>> Sorry for resurrecting this but it is probably a very important issue for a big
>> number of people who e.g. use DOM in JEE environment. My questions are :
>>
>> 1/ Is there any new development over the last year or so since the
>> discussion occurred or is status unchanged?
> 
> No development and unlikely to happen for the reasons Joe and I gave last
> year when this thread was started.
> 
>> 2/ Is it at least thread safe approach to use cloned copies of Documents. If I
>>     provide multiple threads with copies generated from a single Document by
>>     multiple calls to cloneNode() then can those threads safely use their personal
>>     cloned copies? For read only? For read/write? I think in this scenario threads
>>     should be granted full  R/W ownership of their cloned copies but I do not
>>     know whether this is the case in Xerces (and I can imagine deferred
>>     execution implementations of cloneNode which would not provide this type of
>>     thread safety).
> 
> Should probably work with Xerces but no guarantee that it will work in
> general. Aside from having no general guarantee of thread-safety, the
> behaviour of cloneNode() [1] for the Document node is implementation
> dependent.
> 
>> Regards
>> Nikolay
> 
> Thanks.
> 
> [1]
> http://xerces.apache.org/xerces2-j/javadocs/api/org/w3c/dom/Node.html#cloneNode(boolean)
> 
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
> 



Re: Making Xerces DOM thread-safe for read

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Nikolay,

nikolayo <ni...@travelstoremaker.com> wrote on 10/07/2008
08:48:21 AM:

> Sorry for resurrecting this but it is probably a very important issue for a big
> number of people who e.g. use DOM in JEE environment. My questions are :
>
> 1/ Is there any new development over the last year or so since the
> discussion occurred or is status unchanged?

No development and unlikely to happen for the reasons Joe and I gave last
year when this thread was started.

> 2/ Is it at least thread safe approach to use cloned copies of Documents. If I
>     provide multiple threads with copies generated from a single Document by
>     multiple calls to cloneNode() then can those threads safely use their personal
>     cloned copies? For read only? For read/write? I think in this scenario threads
>     should be granted full  R/W ownership of their cloned copies but I do not
>     know whether this is the case in Xerces (and I can imagine deferred
>     execution implementations of cloneNode which would not provide this type of
>     thread safety).

Should probably work with Xerces but no guarantee that it will work in
general. Aside from having no general guarantee of thread-safety, the
behaviour of cloneNode() [1] for the Document node is implementation
dependent.
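
Since cloneNode() on a Document node is implementation dependent, one alternative that stays within the standard DOM API is to create a fresh Document and copy the root element into it with importNode(). This is not something proposed in this thread; it is a minimal sketch only, and the class name `ImportCopy` and its methods are hypothetical. It copies just the document element subtree, so the doctype and any comments or processing instructions outside the root are not carried over.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class ImportCopy {

    /** Parses a small XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Builds a new Document whose root is a deep copy of the source's root element. */
    public static Document copyOf(Document source) throws Exception {
        Document target = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        // importNode creates a copy owned by target; the source tree is left unchanged.
        Node root = target.importNode(source.getDocumentElement(), true);
        target.appendChild(root);
        return target;
    }

    public static void main(String[] args) throws Exception {
        Document original = parse("<root><item>a</item></root>");
        Document copy = copyOf(original);
        copy.getDocumentElement().setAttribute("touched", "yes");
        System.out.println(original.getDocumentElement().hasAttribute("touched"));
    }
}
```

As with cloning, the copy should be made while no other thread is using the source Document.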

> Regards
> Nikolay

Thanks.

[1]
http://xerces.apache.org/xerces2-j/javadocs/api/org/w3c/dom/Node.html#cloneNode(boolean)

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
