Posted to users@jackrabbit.apache.org by Ian Boston <ie...@tfd.co.uk> on 2010/10/02 03:15:10 UTC

Concurrent Use of System Session in ACLProvider Question.

Hi,
I have been trying to work out why Sling is showing single-threaded behaviour when authenticated users access areas of the JCR that are not read public.

I think that this is because there are concurrent threads using the ACLProvider via the DefaultSecurityManager, all sharing a single system session that is synchronised inside ItemManager.getNode and related methods.

This does not happen where the JCR is public read since resolution of the ACLs is bypassed and so no single threading is seen.

So the question:
Am I diagnosing the problem correctly?
What would be the impact of customising this area of the code base to use one systemSession per thread and so avoid the single-threaded behaviour, which is having a big impact on the performance of our deployments?

TIA
Ian

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 4 Oct 2010, at 09:36, Jukka Zitting wrote:

> Hi,
> 
> On Mon, Oct 4, 2010 at 10:05 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> Do you think your work on JCR-2699 will make reads truly concurrent with
>> no blocking or will there always be a part of the code base that is
>> essentially single threaded?
> 
> The one big remaining synchronization block is that in the persistence
> managers where they control access to the underlying persistence
> store. This was a hard requirement for our older database persistence
> managers that had to synchronize access to the potentially
> thread-unsafe single JDBC connection they were using. Thanks to the
> connection pooling support contributed in JCR-1456 we should now be
> able to avoid also that problem. I'll be looking at that shortly.
> 
> Of course, the next concurrency bottleneck will then be the underlying
> persistence store and ultimately the disk where the content gets
> stored, but there are existing solutions (clustering, RAID, etc.) for
> that.

Have you thought of replacing the LRUMap instances with an LRU map implementation based on ConcurrentHashMap that evicts when full? I wrote a deliberately simple Map implementation and used it to replace the LRUMaps in AbstractPrincipalProvider and DefaultPrincipalProvider, and I think I saw significantly fewer blocked sections in the thread traces in YourKit.

Ian

> 
> BR,
> 
> Jukka Zitting
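
For illustration, the kind of bounded map suggested in the message above might look roughly like this. It is a minimal sketch and not the code that was actually tested: the class name BoundedConcurrentCache and its methods are hypothetical, it is not part of Jackrabbit, and it evicts the oldest inserted entry (FIFO) rather than the least recently used one, so it only approximates LRUMap behaviour.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration, not a Jackrabbit class.
// A size-bounded cache whose reads never block on a shared lock.
final class BoundedConcurrentCache<K, V> {
    private final int maxEntries;
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    // Records insertion order so that the oldest key can be evicted first.
    private final ConcurrentLinkedQueue<K> insertionOrder = new ConcurrentLinkedQueue<>();

    BoundedConcurrentCache(int maxEntries) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    // Lock-free read; returns null on a cache miss.
    V get(K key) {
        return map.get(key);
    }

    // Keys and values must be non-null (a ConcurrentHashMap restriction).
    void put(K key, V value) {
        if (map.put(key, value) == null) {
            // First time this key is seen: remember it for later eviction.
            insertionOrder.add(key);
        }
        // Evict oldest entries until the bound holds again. Under concurrent
        // writers the size may briefly overshoot; that is acceptable for a cache.
        while (map.size() > maxEntries) {
            K oldest = insertionOrder.poll();
            if (oldest == null) {
                break;
            }
            map.remove(oldest);
        }
    }

    int size() {
        return map.size();
    }
}
```

The trade-off in this sketch is precision for concurrency: recently read entries can be evicted, but no reader ever waits on another thread.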

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 4 Oct 2010, at 09:36, Jukka Zitting wrote:

> Hi,
> 
> On Mon, Oct 4, 2010 at 10:05 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> Do you think your work on JCR-2699 will make reads truly concurrent with
>> no blocking or will there always be a part of the code base that is
>> essentially single threaded?
> 
> The one big remaining synchronization block is that in the persistence
> managers where they control access to the underlying persistence
> store. This was a hard requirement for our older database persistence
> managers that had to synchronize access to the potentially
> thread-unsafe single JDBC connection they were using. Thanks to the
> connection pooling support contributed in JCR-1456 we should now be
> able to avoid also that problem. I'll be looking at that shortly.
> 
> Of course, the next concurrency bottleneck will then be the underlying
> persistence store and ultimately the disk where the content gets
> stored, but there are existing solutions (clustering, RAID, etc.) for
> that.


IIUC, that means that once an item has been read and is in the SharedItemManager, read access to it will be concurrent?


I am less worried about first-time reads or concurrent writes, although ultimately they too become important.
Thanks
Ian

> 
> BR,
> 
> Jukka Zitting

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Mon, Oct 4, 2010 at 10:05 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> Do you think your work on JCR-2699 will make reads truly concurrent with
> no blocking or will there always be a part of the code base that is
> essentially single threaded?

The one big remaining synchronization block is that in the persistence
managers where they control access to the underlying persistence
store. This was a hard requirement for our older database persistence
managers that had to synchronize access to the potentially
thread-unsafe single JDBC connection they were using. Thanks to the
connection pooling support contributed in JCR-1456 we should now be
able to avoid also that problem. I'll be looking at that shortly.

Of course, the next concurrency bottleneck will then be the underlying
persistence store and ultimately the disk where the content gets
stored, but there are existing solutions (clustering, RAID, etc.) for
that.

BR,

Jukka Zitting

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 3 Oct 2010, at 13:16, Jukka Zitting wrote:

> Hi,
> 
> On Sun, Oct 3, 2010 at 12:51 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>>  Just in case it helps, I have commented and attached patches for the partial
>> solution that I am working on which eliminate blocking by simply not sharing
>> the SystemSession in the AccessControlProvider.
> 
> Excellent, thanks!
> 
> BR,
> 
> Jukka Zitting

For posterity:
One other fix mentioned on the JIRA issue was to remove synchronisation from the cache surrounding the LRUMap in the AbstractPrincipalProvider, and to protect the operation from failure with try {} catch {} (LRUMap throws an NPE on concurrent access).

This has now freed our code base up to the extent that read concurrency is limited by the SharedItemManager. However, we are still limited to around 1000 requests per second regardless of the number of cores, which indicates that, although faster, it is still single-threaded. What I don't know is whether being fast but single-threaded is good enough to support the number of users we have to support per JVM.

Jukka,
Do you think your work on JCR-2699 will make reads truly concurrent with no blocking, or will there always be a part of the code base that is essentially single-threaded?
I ask because this might become a deal breaker for us, and I would rather not raise expectations, or ask for too much from the Jackrabbit community, if our usage of Jackrabbit is not going to match the aims of Jackrabbit.

Thanks
Ian

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Sun, Oct 3, 2010 at 12:51 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> Just in case it helps, I have commented and attached patches for the partial
> solution that I am working on which eliminate blocking by simply not sharing
> the SystemSession in the AccessControlProvider.

Excellent, thanks!

BR,

Jukka Zitting

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 2 Oct 2010, at 19:02, Jukka Zitting wrote:

> Hi,
> 
> On Sat, Oct 2, 2010 at 7:37 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>> making the SystemSession a non singleton and binding AccessControl
>> providers to threads as well as workspaces eliminates the blocking caused
>> by shared use of the SystemSession, however the server is still behaving
>> in a single threaded way with only minimal throughput increase between
>> 1,2,3,4 concurrent requests. Progress, but still investigating.
> 
> See JCR-2699 [1] for some related work I've recently been doing on
> this front. There are a number of concurrent use bottlenecks in
> Jackrabbit, and I've been trying to get rid of them one by one in
> preparation for the 2.2 release.
> 
> [1] https://issues.apache.org/jira/browse/JCR-2699

I see from the commits that you have been focusing on the lower levels which will have broader impact.
Just in case it helps, I have commented on the issue and attached patches for the partial solution that I am working on, which eliminates blocking by simply not sharing the SystemSession in the AccessControlProvider. I am sure that this has undesirable side effects, but it does appear to work.

Thanks for the pointer, looking forward to upgrading to 2.2 with your fixes in place.
Ian

> 
> BR,
> 
> Jukka Zitting

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Sat, Oct 2, 2010 at 7:37 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> making the SystemSession a non singleton and binding AccessControl
> providers to threads as well as workspaces eliminates the blocking caused
> by shared use of the SystemSession, however the server is still behaving
> in a single threaded way with only minimal throughput increase between
> 1,2,3,4 concurrent requests. Progress, but still investigating.

See JCR-2699 [1] for some related work I've recently been doing on
this front. There are a number of concurrent use bottlenecks in
Jackrabbit, and I've been trying to get rid of them one by one in
preparation for the 2.2 release.

[1] https://issues.apache.org/jira/browse/JCR-2699

BR,

Jukka Zitting

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 2 Oct 2010, at 10:20, Ian Boston wrote:

> 
> On 2 Oct 2010, at 02:15, Ian Boston wrote:
> 
>> Hi,
>> I have been trying to work out why Sling is showing Single Threaded behaviour with authenticated users to areas of the JCR that are not read public.
>> 
>> I think that this is because there are concurrent threads using the ACLProvider via the DefaultSecurityManager sharing single system session that is synchronised inside the ItemManager.getNode et al. method.
>> 
>> This does not happen where the JCR is public read since resolution of the ACLs is bypassed and so no single threading is seen.
>> 
>> So the question:
>> Am I diagnosing the problem correctly ?
>> What would be the impact of customising this area of the code base to use one systemSession per thread and avoid the single threadedness, which is having a big impact on the performance of our deployments ?
>> 
>> TIA
>> Ian
> 
> 
> I have had a look at fixing this and its a major modification to all Access Control Providers as the single systemSession per workspace is saved in the abstract base class. It happens in all Jackrabbit instances since around 1.6 when item tree is not 100% read granted.
> 
> Any pointers on this would be really appreciated as its becoming a deal breaker for our use of Sling and Jackrabbit. We cant use Jackrabbit if it is a single threaded server for read access where some read is denied.
> 
> Ian

Quick update:
Making the SystemSession a non-singleton and binding AccessControl providers to threads as well as workspaces eliminates the blocking caused by shared use of the SystemSession. However, the server is still behaving in a single-threaded way, with only a minimal throughput increase between 1, 2, 3 and 4 concurrent requests. Progress, but still investigating.

Is it safe to use a System session per AccessControlProvider (i.e. one per thread for the life of the thread)?

Ian
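
The per-thread binding described in the update above can be sketched in general terms with a ThreadLocal. This is only an assumption about the shape of such a change, not the attached patch: PerThreadHolder and its factory argument are hypothetical names, and the sketch does not address whether doing this with a real SystemSession is safe, which is the open question here.

```java
import java.util.function.Supplier;

// Hypothetical illustration, not a Jackrabbit class.
// Gives every thread its own lazily created instance of S, so that threads
// never contend on one shared, internally synchronised object.
final class PerThreadHolder<S> {
    private final ThreadLocal<S> local;

    // The factory is called at most once per thread, on that thread's first get().
    PerThreadHolder(Supplier<S> factory) {
        this.local = ThreadLocal.withInitial(factory);
    }

    // Returns the calling thread's own instance.
    S get() {
        return local.get();
    }

    // Drops the calling thread's instance; any cleanup of the instance itself
    // (for example closing a session) is left to the caller.
    void release() {
        local.remove();
    }
}
```

One cost to keep in mind with this pattern is that the number of live instances grows with the number of threads, and each one stays alive until its thread ends or release() is called.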

Re: Concurrent Use of System Session in ACLProvider Question.

Posted by Ian Boston <ie...@tfd.co.uk>.

On 2 Oct 2010, at 02:15, Ian Boston wrote:

> Hi,
> I have been trying to work out why Sling is showing Single Threaded behaviour with authenticated users to areas of the JCR that are not read public.
> 
> I think that this is because there are concurrent threads using the ACLProvider via the DefaultSecurityManager sharing single system session that is synchronised inside the ItemManager.getNode et al. method.
> 
> This does not happen where the JCR is public read since resolution of the ACLs is bypassed and so no single threading is seen.
> 
> So the question:
> Am I diagnosing the problem correctly ?
> What would be the impact of customising this area of the code base to use one systemSession per thread and avoid the single threadedness, which is having a big impact on the performance of our deployments ?
> 
> TIA
> Ian

I have had a look at fixing this, and it's a major modification to all Access Control Providers, as the single systemSession per workspace is saved in the abstract base class. It happens in all Jackrabbit instances since around 1.6 when the item tree is not 100% read granted.

Any pointers on this would be really appreciated, as it's becoming a deal breaker for our use of Sling and Jackrabbit. We can't use Jackrabbit if it is a single-threaded server for read access where some read is denied.

Ian