You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by Martijn Hendriks <ma...@gx.nl> on 2007/09/14 16:24:51 UTC

Synchronized methods in ItemManager

Hi all,

The ItemManager has a small number of synchronized methods such as
getItem(ItemId id), which is heavily used. I cannot figure out why these
calls must be synchronized...can somebody give me an idea? Thanks!

Best wishes,

Martijn


--

Martijn Hendriks
<GX> creative online development B.V.
 
t: 024 - 3888 261
f: 024 - 3888 621
e: martijnh@gx.nl
 
Wijchenseweg 111
6538 SW Nijmegen
http://www.gx.nl/ 

RE: Synchronized methods in ItemManager

Posted by Martijn Hendriks <ma...@gx.nl>.
Hi,

> > we have the feeling that the synchronization overhead 
> begins to count on a multiprocessor (8 CPUs) platform.
> 
> Do you have some test case where this can be verified?

No, not yet but we're working on it. As soon as we have some results
I'll get back on this.

Best wishes,

Martijn

Re: Synchronized methods in ItemManager

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

> we have the feeling that the synchronization overhead begins to count on a multiprocessor (8 CPUs) platform.

Do you have some test case where this can be verified?

> on a single session instance I personally prefer a more
> coarse grained synchronization

I agree. In my view, an application should not access the same session
within multiple threads. But if the application does, synchronization
is required within Jackrabbit to avoid corruption. Where synchronize
on the session is required, this should be done on a high level to
improve performance.

Regards,
Thomas

Re: Synchronized methods in ItemManager

Posted by Padraic Hannon <pi...@wasabicowboy.com>.
Reading this post and going through some of the work I have done re. 
syncronization in the persistence manager I wonder if the real issue is 
not threads sharing a session, which it is clear they shouldn't, but 
that each session is tied to a persistence manager and an item manager 
which are both heavily synchronized. I understand the need to keep 
caches in sync etc, but it seems that we could be more performant (and I 
know I still owe the group numbers for performance :-) if we could use 
another pattern such as reader/writer locks, etc. to ensure consistency 
but allow for multiple simultaneous reads and writes.

-paddy

Re: Synchronized methods in ItemManager

Posted by Marcel Reutegger <ma...@gmx.net>.
Julian Reschke wrote:
> Marcel Reutegger wrote:
>> Julian Reschke wrote:
>>> Marcel Reutegger wrote:
>>>> our official statement still is you may use multiple threads on a 
>>>> session that just read but a single thread on a session that writes. in 
>>>
>>> - Could you please provide a pointer to that statement?
>>
>> well, maybe 'official' is the wrong term. whenever these kind of 
>> questions arise on the mailing list, we tell people that reading from 
>> multiple thread is probably safe, while writing is not. searching the 
>> mail archive should give you some of those statements.
> 
> I see. Note that you now said "is probably safe" :-)

yeah, I looked up the relevant messages in the archive and found that we only 
say 'probably' ;)

>>> Finally, how does that translate to JCR2SPI and the SPI interfaces? 
>>> It seems we need to clarify the thread-safety of 
>>> spi.RepositoryService and spi.SessionInfo...
>>
>> I agree, we should definitively do that. Again, I think jcr2spi should 
>> be entirely thread safe for any kind of operation, while the SPI level 
>> is IMO debatable, because it is at a lower level.
> 
> FWIW, we should make up our minds what we want to agree, and clearly 
> document that. I don't care a lot about what we say, but I'm not 
> convinced that guaranteeing more than JSR-170 says would be good for 
> interoperability of clients.

well, the main reason behind my call for a thread-safe implementation is for 
clients that use a session from multiple sessions *by mistake* and not on 
purpose. the repository should not break in any case.

regards
  marcel

Re: Synchronized methods in ItemManager

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

> but I'm not convinced that guaranteeing more than JSR-170
> says would be good for interoperability of clients.

Maybe trying to detect concurrent access and throwing an exception
would be an option? From HashMap javadocs: "Fail-fast iterators throw
ConcurrentModificationException on a best-effort basis... the
fail-fast behavior of iterators should be used only to detect bugs..."

That way we don't have to synchronize, but also reduce the risk of
corrupted internal structures on concurrent access within the same
session.

Thomas

Re: Synchronized methods in ItemManager

Posted by Julian Reschke <ju...@gmx.de>.
Marcel Reutegger wrote:
> Julian Reschke wrote:
>> Marcel Reutegger wrote:
>>> our official statement still is you may use multiple threads on a 
>>> session that just read but a single thread on a session that writes. in 
>>
>> - Could you please provide a pointer to that statement?
> 
> well, maybe 'official' is the wrong term. whenever these kind of 
> questions arise on the mailing list, we tell people that reading from 
> multiple thread is probably safe, while writing is not. searching the 
> mail archive should give you some of those statements.

I see. Note that you now said "is probably safe" :-)

>> Finally, how does that translate to JCR2SPI and the SPI interfaces? It 
>> seems we need to clarify the thread-safety of spi.RepositoryService 
>> and spi.SessionInfo...
> 
> I agree, we should definitively do that. Again, I think jcr2spi should 
> be entirely thread safe for any kind of operation, while the SPI level 
> is IMO debatable, because it is at a lower level.

FWIW, we should make up our minds what we want to agree, and clearly 
document that. I don't care a lot about what we say, but I'm not 
convinced that guaranteeing more than JSR-170 says would be good for 
interoperability of clients.

Best regards, Julian


Re: Synchronized methods in ItemManager

Posted by Marcel Reutegger <ma...@gmx.net>.
Julian Reschke wrote:
> Marcel Reutegger wrote:
>> our official statement still is you may use multiple threads on a 
>> session that just read but a single thread on a session that writes. in 
> 
> - Could you please provide a pointer to that statement?

well, maybe 'official' is the wrong term. whenever these kind of questions arise 
on the mailing list, we tell people that reading from multiple thread is 
probably safe, while writing is not. searching the mail archive should give you 
some of those statements.

> Finally, how does that translate to JCR2SPI and the SPI interfaces? It 
> seems we need to clarify the thread-safety of spi.RepositoryService and 
> spi.SessionInfo...

I agree, we should definitively do that. Again, I think jcr2spi should be 
entirely thread safe for any kind of operation, while the SPI level is IMO 
debatable, because it is at a lower level.


regards
  marcel

Re: Synchronized methods in ItemManager

Posted by Julian Reschke <ju...@gmx.de>.
Marcel Reutegger wrote:
> our official statement still is you may use multiple threads on a 
> session that just read but a single thread on a session that writes. in 

- Could you please provide a pointer to that statement?

- Also, we need to keep in mind that even if Jackrabbit allows that, JCR 
   in general doesn't. So an application taking advantage of that may 
break on different JCR implementations.

Finally, how does that translate to JCR2SPI and the SPI interfaces? It 
seems we need to clarify the thread-safety of spi.RepositoryService and 
spi.SessionInfo...

> ...

Best regards, Julian

Re: Synchronized methods in ItemManager

Posted by Marcel Reutegger <ma...@gmx.net>.
> I should have explained my question in more detail. The Javadoc of the
> ItemManager states that there's one ItemManager per Session: it is
> created in the constructor of SessionImpl. Sessions are not thread-safe
> by specification. Because some methods in the ItemManager are
> synchronized, an ItemManager instance of a Sesssion can also be accessed
> by other threads than the one that is using the Session that created
> that ItemManager. I am just wondering when that could happen.

our official statement still is you may use multiple threads on a session that 
just read but a single thread on a session that writes. in addition there's also 
the scenario where one thread saves changes on session and those changes get 
propagated to other sessions that are accessed by other threads.

as for multiple threads on a single session instance I personally prefer a more 
coarse grained synchronization to avoid monitor contention. see: JCR-890.

regards
  marcel

RE: Synchronized methods in ItemManager

Posted by Martijn Hendriks <ma...@gx.nl>.
Hi,

I should have explained my question in more detail. The Javadoc of the
ItemManager states that there's one ItemManager per Session: it is
created in the constructor of SessionImpl. Sessions are not thread-safe
by specification. Because some methods in the ItemManager are
synchronized, an ItemManager instance of a Sesssion can also be accessed
by other threads than the one that is using the Session that created
that ItemManager. I am just wondering when that could happen.

The reason I am asking this is because we have the feeling that the
synchronization overhead begins to count on a multiprocessor (8 CPUs)
platform.

Best regards,

Martijn

--

Martijn Hendriks
<GX> creative online development B.V.
 
t: 024 - 3888 261
f: 024 - 3888 621
e: martijnh@gx.nl
 
Wijchenseweg 111
6538 SW Nijmegen
http://www.gx.nl/  

> -----Original Message-----
> From: Thomas Mueller [mailto:thomas.tom.mueller@gmail.com] 
> Sent: Friday, September 14, 2007 6:12 PM
> To: dev@jackrabbit.apache.org
> Subject: Re: Synchronized methods in ItemManager
> 
> Hi,
> 
> > where the second thread comes from
> 
> The application can use multiple threads.
> Jackrabbit needs to protect itself from that.
> 
> Thomas
> 

Re: Synchronized methods in ItemManager

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

> where the second thread comes from

The application can use multiple threads.
Jackrabbit needs to protect itself from that.

Thomas

RE: Synchronized methods in ItemManager

Posted by Martijn Hendriks <ma...@gx.nl>.
Hi Stefan,

Thanks for your reply, but I still don't understand. Without synchronization, cache entries can only be overwritten when multiple threads have concurrent access, isn't it? It is not clear to me where the second thread comes from.

Best wishes,

Martijn


-----Oorspronkelijk bericht-----
Van: Stefan Guggisberg [mailto:stefan.guggisberg@gmail.com]
Verzonden: vr 14-9-2007 17:47
Aan: dev@jackrabbit.apache.org
Onderwerp: Re: Synchronized methods in ItemManager
 
hi martijn,

On 9/14/07, Martijn Hendriks <ma...@gx.nl> wrote:
> Hi all,
>
> The ItemManager has a small number of synchronized methods such as
> getItem(ItemId id), which is heavily used. I cannot figure out why these
> calls must be synchronized...can somebody give me an idea? Thanks!

the ItemManager maintains a cache of  NodeImpl/PropertyImpl instances.
#getItem(itemId) e.g. is synchronized in order to guarantee the cache
integrity. the following code fragment needs to be synchronized, otherwise
cache entries might get accidentally overwritten:

        // check cache
        ItemImpl item = retrieveItem(id);
        if (item == null) {
            ....
            // create instance of item
            item = createItemInstance(id);
        }

cheers
stefan

>
> Best wishes,
>
> Martijn
>
>
> --
>
> Martijn Hendriks
> <GX> creative online development B.V.
>
> t: 024 - 3888 261
> f: 024 - 3888 621
> e: martijnh@gx.nl
>
> Wijchenseweg 111
> 6538 SW Nijmegen
> http://www.gx.nl/
>


Re: Synchronized methods in ItemManager

Posted by Stefan Guggisberg <st...@gmail.com>.
hi martijn,

On 9/14/07, Martijn Hendriks <ma...@gx.nl> wrote:
> Hi all,
>
> The ItemManager has a small number of synchronized methods such as
> getItem(ItemId id), which is heavily used. I cannot figure out why these
> calls must be synchronized...can somebody give me an idea? Thanks!

the ItemManager maintains a cache of  NodeImpl/PropertyImpl instances.
#getItem(itemId) e.g. is synchronized in order to guarantee the cache
integrity. the following code fragment needs to be synchronized, otherwise
cache entries might get accidentally overwritten:

        // check cache
        ItemImpl item = retrieveItem(id);
        if (item == null) {
            ....
            // create instance of item
            item = createItemInstance(id);
        }

cheers
stefan

>
> Best wishes,
>
> Martijn
>
>
> --
>
> Martijn Hendriks
> <GX> creative online development B.V.
>
> t: 024 - 3888 261
> f: 024 - 3888 621
> e: martijnh@gx.nl
>
> Wijchenseweg 111
> 6538 SW Nijmegen
> http://www.gx.nl/
>