Posted to users@jackrabbit.apache.org by Narendra Sharma <na...@gmail.com> on 2010/05/17 03:37:50 UTC

Persistence Manager - Query on Concurrency

I am learning about Apache Jackrabbit and trying to understand the Bundle
Persistence Manager. I observed that the methods (load and exists) in the
AbstractBundlePersistenceManager are synchronized. Is there any specific
reason why they are synchronized?
The only reason I can think of is that it has been kept so to keep both
cache and persistence store in sync. Is this reason "the reason"?

Also, if I understand correctly, there will be a single instance of the
Persistence Manager. If this is correct, then synchronization of methods like
load, exists and store would mean that only one operation can happen at a
time. This is a serious limitation, especially when the repository is large
and cannot fit in the cache.

What is the best way to handle multiple concurrent writes through the
Persistence Manager? Should I implement my own Persistence Manager and not
extend AbstractBundlePersistenceManager? Do you see any design or
implementation issues that would prevent me from doing this? What are the
other alternatives?

Thanks,
Naren

Re: Persistence Manager - Query on Concurrency

Posted by Alexander Klimetschek <ak...@day.com>.
On Tue, May 18, 2010 at 18:03, Narendra Sharma
<na...@gmail.com> wrote:
> The reason I asked this query is because I noticed a lot of threads (70-75%)
> BLOCKED in ISMLocking (tried with both DefaultISMLocking and
> FineGrainedISMLocking). What I understand is that when a session is saved
> the data gets written to the persistence manager and the events are sent to
> listeners. Some of these listeners are synchronous listeners like
> SearchManager, which in turn updates the Lucene index.
>
> All of these threads (70-75%) are blocked in either acquireReadLock or
> acquireWriteLock. The number of threads in my test is large (between 400 and
> 800). The question is why are there so many blocks?

Could you share your test case (and the jackrabbit version used)? If
all those threads each have sessions that are writing to the
repository, some blocking might be "normal" due to the high level of
contention. Just a guess, though.

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
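
One way to put numbers on the blocked threads before sharing a test case is
to group a thread dump by the frame each stalled thread sits in. The sketch
below uses only the JDK's ThreadMXBean. The class name BlockedThreadSummary
and its methods are hypothetical, and matching on the string "ISMLocking"
assumes the locking classes show up in the stack traces under that name, as
reported in this thread. It is a starting point under those assumptions, not
a Jackrabbit tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: counts stalled threads per matching stack frame.
final class BlockedThreadSummary {

    // States in which a thread is waiting on something else to proceed.
    private static final Set<Thread.State> STALLED = EnumSet.of(
            Thread.State.BLOCKED, Thread.State.WAITING, Thread.State.TIMED_WAITING);

    // For every stalled thread, find the innermost frame whose class name
    // contains the marker and count the thread under "STATE in Class.method".
    static Map<String, Integer> summarize(ThreadInfo[] infos, String marker) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : infos) {
            if (info == null || !STALLED.contains(info.getThreadState())) {
                continue; // thread ended meanwhile, or is not stalled
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().contains(marker)) {
                    String key = info.getThreadState() + " in "
                            + frame.getClassName() + "." + frame.getMethodName();
                    counts.merge(key, 1, Integer::sum);
                    break; // count each thread once
                }
            }
        }
        return counts;
    }

    // Takes a dump of all live threads in this JVM and summarizes it.
    static Map<String, Integer> summarizeCurrentJvm(String marker) {
        ThreadInfo[] infos =
                ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        return summarize(infos, marker);
    }

    public static void main(String[] args) {
        // "ISMLocking" is an assumed marker taken from the report above.
        summarizeCurrentJvm("ISMLocking")
                .forEach((where, n) -> System.out.println(n + "  " + where));
    }
}
```

Calling summarizeCurrentJvm from inside the test process a few times during a
run shows whether the stalled threads pile up on the read lock or the write
lock, which is the kind of detail that helps when posting the test case.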

Re: Persistence Manager - Query on Concurrency

Posted by Narendra Sharma <na...@gmail.com>.
Okay I believe you :)

The reason I asked this query is because I noticed a lot of threads (70-75%)
BLOCKED in ISMLocking (tried with both DefaultISMLocking and
FineGrainedISMLocking). What I understand is that when a session is saved
the data gets written to the persistence manager and the events are sent to
listeners. Some of these listeners are synchronous listeners like
SearchManager, which in turn updates the Lucene index.

All of these threads (70-75%) are blocked in either acquireReadLock or
acquireWriteLock. The number of threads in my test is large (between 400 and
800). The question is why are there so many blocks? My guess is that the
threads holding the locks are busy doing some operation either at the
persistence manager layer or in Lucene indexing. I am trying to debug that.
Meanwhile, if anyone has any suggestions on debugging this (threads blocked)
issue, please share them. Please share tips specific to Jackrabbit and
not generic debugging tips for multithreaded apps.

Thanks,
Naren

On Tue, May 18, 2010 at 3:15 AM, Alexander Klimetschek <ak...@day.com> wrote:

> On Mon, May 17, 2010 at 20:46, Narendra Sharma
> <na...@gmail.com> wrote:
> > I am not sure if I understand it correctly. How would many sessions help
> > if there is a single instance of the Persistence Manager and the methods
> > in it are synchronized? All threads will be blocked and only one will go
> > through.
> >
> > This is a very basic issue. So either I am missing something or I am not
> > using it the right way.
>
> Just believe us and all the people using Jackrabbit for years already ;-)
>
> A large part of Jackrabbit core is the management of transient and
> persistent item states (nodes and properties) within so-called item
> state managers. The relevant item state manager ensures
> synchronized/serialized calls to the persistence manager used.
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
>

Re: Persistence Manager - Query on Concurrency

Posted by Alexander Klimetschek <ak...@day.com>.
On Mon, May 17, 2010 at 20:46, Narendra Sharma
<na...@gmail.com> wrote:
> I am not sure if I understand it correctly. How would many sessions help if
> there is a single instance of the Persistence Manager and the methods in it
> are synchronized? All threads will be blocked and only one will go through.
>
> This is a very basic issue. So either I am missing something or I am not
> using it the right way.

Just believe us and all the people using Jackrabbit for years already ;-)

A large part of Jackrabbit core is the management of transient and
persistent item states (nodes and properties) within so-called item
state managers. The relevant item state manager ensures
synchronized/serialized calls to the persistence manager used.

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
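
The layering described above can be pictured with a toy model. Everything in
the sketch below is hypothetical: ToyStore stands in for a persistence
manager with synchronized methods, and ToyStateManager stands in for an item
state manager that keeps a cache and guards access with a read/write lock.
It is not Jackrabbit's code and makes no claim about how the real classes are
written. It only illustrates why a synchronized store does not by itself
mean that every read in the repository is serialized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a persistence manager: one call at a time.
final class ToyStore {
    private final Map<String, String> rows = new HashMap<>();

    synchronized String load(String id) { return rows.get(id); }
    synchronized boolean exists(String id) { return rows.containsKey(id); }
    synchronized void store(String id, String value) { rows.put(id, value); }
}

// Hypothetical stand-in for the layer above the store.
final class ToyStateManager {
    private final ToyStore store;
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final AtomicInteger storeLoads = new AtomicInteger();

    ToyStateManager(ToyStore store) { this.store = store; }

    // Many readers may hold the read lock at once. Only a cache miss
    // reaches the synchronized store.
    String read(String id) {
        lock.readLock().lock();
        try {
            String cached = cache.get(id);
            if (cached != null) {
                return cached;
            }
            storeLoads.incrementAndGet();
            String loaded = store.load(id);
            if (loaded != null) {
                cache.put(id, loaded);
            }
            return loaded;
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer excludes all readers and other writers while it updates
    // the store and the cache together.
    void write(String id, String value) {
        lock.writeLock().lock();
        try {
            store.store(id, value);
            cache.put(id, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Number of reads that had to go down to the store.
    int storeLoads() { return storeLoads.get(); }
}
```

In this model, contention appears where the questions in this thread point:
writers hold the exclusive lock for the whole store-and-cache update, and
reads that miss the cache queue up on the store's monitor.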

Re: Persistence Manager - Query on Concurrency

Posted by Narendra Sharma <na...@gmail.com>.
I am not sure if I understand it correctly. How would many sessions help if
there is a single instance of the Persistence Manager and the methods in it
are synchronized? All threads will be blocked and only one will go through.

This is a very basic issue. So either I am missing something or I am not
using it the right way.

Thanks,
Naren

On Mon, May 17, 2010 at 3:56 AM, Alexander Klimetschek <ak...@day.com> wrote:

> On Mon, May 17, 2010 at 05:10, Narendra Sharma
> <na...@gmail.com> wrote:
> > This doesn't answer the question about synchronization in
> > AbstractBundlePersistenceManager.
> >
> > I am aware of using Session. However, all sessions will eventually access
> > the Persistence Manager to fetch data.
>
> Access is synchronized inside Jackrabbit, above the persistence
> manager. As Rakesh already mentioned: "You create as many session
> instances as you need (essentially one Session per thread of
> execution)."
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
>

Re: Persistence Manager - Query on Concurrency

Posted by Alexander Klimetschek <ak...@day.com>.
On Mon, May 17, 2010 at 05:10, Narendra Sharma
<na...@gmail.com> wrote:
> This doesn't answer the question about synchronization in
> AbstractBundlePersistenceManager.
>
> I am aware of using Session. However, all sessions will eventually access
> the Persistence Manager to fetch data.

Access is synchronized inside Jackrabbit, above the persistence
manager. As Rakesh already mentioned: "You create as many session
instances as you need (essentially one Session per thread of
execution)."

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

Re: Persistence Manager - Query on Concurrency

Posted by Narendra Sharma <na...@gmail.com>.
This doesn't answer the question about synchronization in
AbstractBundlePersistenceManager.

I am aware of using Session. However, all sessions will eventually access
the Persistence Manager to fetch data.

Thanks,
Naren

On Sun, May 16, 2010 at 7:29 PM, Rakesh Vidyadharan <ra...@sptci.com> wrote:

>
> On 16 May 2010, at 20:37, Narendra Sharma wrote:
>
> > I am learning about Apache Jackrabbit and trying to understand the Bundle
> > Persistence Manager. I observed that the methods (load and exists) in the
> > AbstractBundlePersistenceManager are synchronized. Is there any specific
> > reason why they are synchronized?
> > The only reason I can think of is that it has been kept so to keep both
> > cache and persistence store in sync. Is this reason "the reason"?
> >
> > Also, if I understand correctly, there will be a single instance of the
> > Persistence Manager. If this is correct, then synchronization of methods
> > like load, exists and store would mean that only one operation can happen
> > at a time. This is a serious limitation, especially when the repository is
> > large and cannot fit in the cache.
> >
> > What is the best way to handle multiple concurrent writes through the
> > Persistence Manager? Should I implement my own Persistence Manager and not
> > extend AbstractBundlePersistenceManager? Do you see any design or
> > implementation issues that would prevent me from doing this? What are the
> > other alternatives?
> >
> > Thanks,
> > Naren
>
> You should work with the JCR API and not the Jackrabbit API (other than for
> the initial configuration of the Repository). The JCR API gives you the
> Session interface to work with. You create as many session instances as you
> need (essentially one Session per thread of execution).
>
> Rakesh

Re: Persistence Manager - Query on Concurrency

Posted by Rakesh Vidyadharan <ra...@sptci.com>.
On 16 May 2010, at 20:37, Narendra Sharma wrote:

> I am learning about Apache Jackrabbit and trying to understand the Bundle
> Persistence Manager. I observed that the methods (load and exists) in the
> AbstractBundlePersistenceManager are synchronized. Is there any specific
> reason why they are synchronized?
> The only reason I can think of is that it has been kept so to keep both
> cache and persistence store in sync. Is this reason "the reason"?
>
> Also, if I understand correctly, there will be a single instance of the
> Persistence Manager. If this is correct, then synchronization of methods like
> load, exists and store would mean that only one operation can happen at a
> time. This is a serious limitation, especially when the repository is large
> and cannot fit in the cache.
>
> What is the best way to handle multiple concurrent writes through the
> Persistence Manager? Should I implement my own Persistence Manager and not
> extend AbstractBundlePersistenceManager? Do you see any design or
> implementation issues that would prevent me from doing this? What are the
> other alternatives?
> 
> Thanks,
> Naren

You should work with the JCR API and not the Jackrabbit API (other than for
the initial configuration of the Repository). The JCR API gives you the
Session interface to work with. You create as many session instances as you
need (essentially one Session per thread of execution).

Rakesh
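
The "one Session per thread of execution" advice can be sketched as a worker
pool in which every unit of work opens, uses and closes its own session. To
keep the example runnable with the JDK alone, StubSession below is a
hypothetical stand-in for javax.jcr.Session, and SessionPerThreadDemo is an
illustrative name. In real code the session would come from
Repository.login(...) rather than a constructor.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for javax.jcr.Session. It remembers the thread
// that created it and rejects use from any other thread.
final class StubSession {
    private final Thread owner = Thread.currentThread();
    private boolean live = true;

    void save() {
        if (!live) {
            throw new IllegalStateException("session already logged out");
        }
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("session shared across threads");
        }
    }

    void logout() { live = false; }
}

final class SessionPerThreadDemo {

    // Runs `tasks` units of work on `workers` threads. Each unit gets its
    // own session. Returns how many distinct sessions were used.
    static int run(int workers, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        Set<StubSession> used = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                // Real code would call repository.login(credentials) here.
                StubSession session = new StubSession();
                try {
                    session.save();
                    used.add(session);
                } finally {
                    session.logout(); // always release the session
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct sessions: " + run(4, 20));
    }
}
```

The point of the pattern is that no session object is ever visible to two
threads, so application code needs no locking of its own around session use.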