Posted to dev@jackrabbit.apache.org by "Alan D. Cabrera" <li...@toolazydogs.com> on 2007/09/15 00:53:12 UTC

System slowdown

I'm using Jackrabbit v1.3.1 with the ObjectPersistenceManager.  I
noticed that after a large number of binary inserts, about one
million, inserting takes over three times as long. When I
restart the server it becomes much more lively.  I also noticed that,
in the info output from the CacheManager during its resizeAll, the
size was at 4k.  Not sure if that's at all relevant.  BTW, the machine
has a ton of memory and nothing else is running on it.

Any ideas?  I'm happy to provide more information.


Regards,
Alan


Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Sep 22, 2007, at 9:44 AM, Alan D. Cabrera wrote:

>
> On Sep 22, 2007, at 3:32 AM, Jukka Zitting wrote:
>
>> Hi,
>>
>> On 9/22/07, Alan D. Cabrera <li...@toolazydogs.com> wrote:
>>> On Sep 20, 2007, at 12:34 AM, Thomas Mueller wrote:
>>>> Yes, using RMI does make a difference. Is RMI required in your  
>>>> case?
>>>> Because not using it would be another speed up.
>>>
>>> I have multiple websites using the same JCR server.  Is there  
>>> another
>>> protocol adapter that I can use?
>>
>> Do you run all the websites within the same servlet container?
>
> Nope.  I misspoke, I should have said I have a cluster of servers  
> serving up the same web site.
>
>> If no, then you may want to look at the clustering feature.
>
> Some questions:
>
> Why would clustering be better than RMI?  It's not immediately  
> clear to me why it's faster since it seems that we are just pushing  
> network chatter down to the journal and persistence manager.
>
> Are file based blob stores inherently non-transactional or does one  
> just need to be written?

Can someone take a crack at my questions?


Regards,
Alan


Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
Yes.

Regards,
Thomas

Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Oct 19, 2007, at 11:33 PM, Thomas Mueller wrote:

> Hi,
>
>>>> Are file based blob stores inherently non-transactional or does one
>>>> just need to be written?
>>> Just needs to be written.
>> I need one and am happy to try to write one.  Do you have any
>> pointers before I embark on this endeavor?
>
> Sorry, I don't have any pointers, but instead of improving the blob store,
> I think you should have a look at the data store. Unlike with the
> blob store, transactions are not a problem in the data store. The plan
> is to enable it for Jackrabbit 1.4.

So are you saying that if I start using that newfangled data store, a
transactional blob store will be moot?


Regards,
Alan


Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

> >> Are file based blob stores inherently non-transactional or does one
> >> just need to be written?
> > Just needs to be written.
> I need one and am happy to try to write one.  Do you have any
> pointers before I embark on this endeavor?

Sorry, I don't have any pointers, but instead of improving the blob store,
I think you should have a look at the data store. Unlike with the
blob store, transactions are not a problem in the data store. The plan
is to enable it for Jackrabbit 1.4. See:

http://issues.apache.org/jira/browse/JCR-926

I think we have too much choice in Jackrabbit. There are so many
persistence managers, and different ways to store blobs...

Thomas

Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Oct 2, 2007, at 4:41 PM, Jukka Zitting wrote:

>> Are file based blob stores inherently non-transactional or does one
>> just need to be written?
>
> Just needs to be written.

I need one and am happy to try to write one.  Do you have any  
pointers before I embark on this endeavor?


Regards,
Alan


Re: System slowdown

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 9/22/07, Alan D. Cabrera <li...@toolazydogs.com> wrote:
> Why would clustering be better than RMI?  It's not immediately clear
> to me why it's faster since it seems that we are just pushing network
> chatter down to the journal and persistence manager.

In principle you are right, but in practice the current JCR-RMI
implementation ends up generating far more network traffic than the
current clustering solution. JCR-RMI basically does a remote method
invocation for each JCR API call that you make, whereas the clustering
solution supports extensive caching and bundling of data to avoid many
of the network roundtrips.

> Are file based blob stores inherently non-transactional or does one
> just need to be written?

Just needs to be written.

BR,

Jukka Zitting
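
The roundtrip argument above can be illustrated with a toy model. This is
only a sketch: CountingChannel and RoundTripDemo are hypothetical names that
do not exist in Jackrabbit or JCR-RMI, and the "channel" does nothing except
count how many times it is used. It contrasts one remote invocation per call
with operations bundled into batches.

```java
import java.util.Collections;
import java.util.List;

/** Toy channel (hypothetical) that only counts network roundtrips. */
final class CountingChannel {
    private int roundTrips;

    /** One roundtrip, carrying any non-empty group of operations. */
    void send(List<String> operations) {
        if (!operations.isEmpty()) {
            roundTrips++;
        }
    }

    int roundTrips() {
        return roundTrips;
    }
}

final class RoundTripDemo {
    /** Per-call style: every operation is sent as its own roundtrip. */
    static int perCall(List<String> ops) {
        CountingChannel channel = new CountingChannel();
        for (String op : ops) {
            channel.send(Collections.singletonList(op));
        }
        return channel.roundTrips();
    }

    /** Bundled style: operations are grouped into batches of batchSize. */
    static int bundled(List<String> ops, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        CountingChannel channel = new CountingChannel();
        for (int i = 0; i < ops.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ops.size());
            channel.send(ops.subList(i, end));
        }
        return channel.roundTrips();
    }
}
```

With ten operations, perCall makes ten roundtrips while bundled with a batch
size of four makes three. Real savings depend on latency, payload size and
caching, none of which this model captures.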

Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Sep 22, 2007, at 3:32 AM, Jukka Zitting wrote:

> Hi,
>
> On 9/22/07, Alan D. Cabrera <li...@toolazydogs.com> wrote:
>> On Sep 20, 2007, at 12:34 AM, Thomas Mueller wrote:
>>> Yes, using RMI does make a difference. Is RMI required in your case?
>>> Because not using it would be another speed up.
>>
>> I have multiple websites using the same JCR server.  Is there another
>> protocol adapter that I can use?
>
> Do you run all the websites within the same servlet container?

Nope.  I misspoke, I should have said I have a cluster of servers  
serving up the same web site.

> If no, then you may want to look at the clustering feature.

Some questions:

Why would clustering be better than RMI?  It's not immediately clear  
to me why it's faster since it seems that we are just pushing network  
chatter down to the journal and persistence manager.

Are file based blob stores inherently non-transactional or does one  
just need to be written?


Regards,
Alan



Re: System slowdown

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 9/22/07, Alan D. Cabrera <li...@toolazydogs.com> wrote:
> On Sep 20, 2007, at 12:34 AM, Thomas Mueller wrote:
> > Yes, using RMI does make a difference. Is RMI required in your case?
> > Because not using it would be another speed up.
>
> I have multiple websites using the same JCR server.  Is there another
> protocol adapter that I can use?

Do you run all the websites within the same servlet container?

If yes, then you should use a model 2 deployment with the repository
bound to JNDI.

If no, then you may want to look at the clustering feature.

BR,

Jukka Zitting

Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
> I have multiple websites using the same JCR server.  Is there another
> protocol adapter that I can use?

What about 'mixed' access: the main application in embedded mode, and
the others using RMI?

Thomas

Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Sep 20, 2007, at 12:34 AM, Thomas Mueller wrote:

> Hi,
>
>> No particular reason.  I didn't know that the others were better.
>> What are the differences between the three?
>
> I have updated the PersistenceManagerFAQ in the Wiki:
> http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ
>
> The information there is still incomplete, but the bundle persistence
> managers were not documented at all before. The
> ObjectPersistenceManager has 3 disadvantages in my view:
>
> - if the jvm process is killed the repository might turn inconsistent
> - non transactional
> - slow


Cool, I'll switch.

>>>  How long is three times as long?
>> I'll benchmark this for you
>
> This is not required at this time
>
>> I'm using RMI, if that's relevant.  I'll try to make a simple
>> reproducible example, if that will help
>
> Yes, using RMI does make a difference. Is RMI required in your case?
> Because not using it would be another speed up.

I have multiple websites using the same JCR server.  Is there another  
protocol adapter that I can use?


Regards,
Alan


Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

Maybe 'non transactional' is the wrong expression. Such persistence
managers (for example the ObjectPersistenceManager) usually work OK,
and you can use transactions in Jackrabbit and so on. The exception is
atomicity in a crash. When the process is stopped while a transaction
is being persisted (power failure, process killed, Runtime.halt() called,
VM crash), some data of the transaction may be committed and some not.
Theoretically, some nodes may even be corrupt (depending on how and when
the system crashed). But the algorithms used minimize this risk:
for example, the parent node is written last, so in 99% of the cases
there is no problem even after a crash.

As far as I know, the database persistence managers (bundle and
simple) are safe; the others are not fully safe. Of course it depends
on how safe the database is (after a power failure, even databases
sometimes get corrupted, depending on the file system / hard drive).

I will update the wiki.

Thomas




On 9/20/07, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On 9/20/07, KÖLL Claus <C....@tirol.gv.at> wrote:
> > so my opinion is it should be regardless which persistence manager you use
> > jackrabbit should be always transactional if an operation runs inside a transaction
> > context. is this true or not ?
>
> You need to have a persistence manager that can store the transaction
> changelog as a single atomic operation, which as of now means the
> database persistence managers. An underlying database transaction is
> still needed for that atomic change, but it's not a part of the
> externally managed transaction context.
>
> BR,
>
> Jukka Zitting
>

Re: System slowdown

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 9/20/07, KÖLL Claus <C....@tirol.gv.at> wrote:
> so my opinion is it should be regardless which persistence manager you use
> jackrabbit should be always transactional if an operation runs inside a transaction
> context. is this true or not ?

You need to have a persistence manager that can store the transaction
changelog as a single atomic operation, which as of now means the
database persistence managers. An underlying database transaction is
still needed for that atomic change, but it's not a part of the
externally managed transaction context.

BR,

Jukka Zitting
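
For a file-based store, one common way to make a single write appear atomic
is to write to a temporary file and then rename it into place. The sketch
below shows that general technique only; it is not how any Jackrabbit
persistence manager is implemented, and AtomicFileWriter is a hypothetical
name. It assumes the target and the temporary file are on the same file
system, and that the file system supports atomic moves.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: write-to-temp-then-rename, not Jackrabbit code. */
final class AtomicFileWriter {
    /**
     * Writes data so that readers see either no target file or the complete
     * new content, never a partially written file.
     */
    static void writeAtomically(Path target, byte[] data) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // The temp file lives in the same directory so the move stays on one file system.
        Path tmp = Files.createTempFile(dir, "changelog", ".tmp");
        try {
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.wrap(data);
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                // Flush content and metadata to the device before publishing the file.
                channel.force(true);
            }
            // Throws AtomicMoveNotSupportedException if the move cannot be atomic.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Whether an existing target is replaced by an atomic move is implementation
specific, so this sketch is safest when each change log gets a fresh file
name. A change that spans several files still needs something more, such as
a journal, to be atomic as a whole.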

Re: System slowdown

Posted by KÖLL Claus <C....@TIROL.GV.AT>.
hi thomas,

just for clarification ..

you said that one disadvantage of the ObjectPersistenceManager
is the non-transactional behaviour

there was a lot of discussion about transactions in jackrabbit and
the statement that i get from this discussion is that jackrabbit "itself" (the session) is the
XAResource in a transaction context

so my opinion is that it should not matter which persistence manager you use:
jackrabbit should always be transactional if an operation runs inside a transaction context.
is this true or not ?


BR,
claus

-----Original Message-----
From: Thomas Mueller [mailto:thomas.tom.mueller@gmail.com]
Sent: Thursday, 20 September 2007 09:35
To: dev@jackrabbit.apache.org
Subject: Re: System slowdown

Hi,

> No particular reason.  I didn't know that the others were better.
> What are the differences between the three?

I have updated the PersistenceManagerFAQ in the Wiki:
http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ

The information there is still incomplete, but the bundle persistence
managers were not documented at all before. The
ObjectPersistenceManager has 3 disadvantages in my view:

- if the jvm process is killed the repository might turn inconsistent
- non transactional
- slow

>>  How long is three times as long?
> I'll benchmark this for you

This is not required at this time

> I'm using RMI, if that's relevant.  I'll try to make a simple
> reproducible example, if that will help

Yes, using RMI does make a difference. Is RMI required in your case?
Because not using it would be another speed up.

Thomas

Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

> No particular reason.  I didn't know that the others were better.
> What are the differences between the three?

I have updated the PersistenceManagerFAQ in the Wiki:
http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ

The information there is still incomplete, but the bundle persistence
managers were not documented at all before. The
ObjectPersistenceManager has 3 disadvantages in my view:

- if the jvm process is killed the repository might turn inconsistent
- non transactional
- slow

>>  How long is three times as long?
> I'll benchmark this for you

This is not required at this time

> I'm using RMI, if that's relevant.  I'll try to make a simple
> reproducible example, if that will help

Yes, using RMI does make a difference. Is RMI required in your case?
Because not using it would be another speed up.

Thomas

Re: System slowdown

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.
On Sep 17, 2007, at 7:33 AM, Thomas Mueller wrote:

> Hi,
>
> I think the ObjectPersistenceManager should not be used. Maybe it is
> time to deprecate it... Is there some specific reason why you can't
> use a more modern persistence manager, for example
> BundleFsPersistenceManager or BundleDbPersistenceManager? As far as I
> know, both can access binary data more efficiently.

No particular reason.  I didn't know that the others were better.   
What are the differences between the three?

>> about one million, that inserting takes over three times as long.
>
> Generally large repositories are slower than small ones. This is also
> the case for other storage systems. How long is three times as long?

I'll benchmark this for you but only if I don't switch to  
BundleFsPersistenceManager.

>> I notice that the info from the CacheManager during its resizeAll  
>> the size was at 4k.
>
> This sounds like you have opened sessions but did not close them.
> Could you please verify that your application closes all sessions?

I'm pretty sure that I'm closing them all in Filters/Interceptors.
I'm using RMI, if that's relevant.  I'll try to make a simple
reproducible example, if that will help.


Regards,
Alan



Re: System slowdown

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

I think the ObjectPersistenceManager should not be used. Maybe it is
time to deprecate it... Is there some specific reason why you can't
use a more modern persistence manager, for example
BundleFsPersistenceManager or BundleDbPersistenceManager? As far as I
know, both can access binary data more efficiently.

> about one million, that inserting takes over three times as long.

Generally large repositories are slower than small ones. This is also
the case for other storage systems. How long is three times as long?

> I notice that the info from the CacheManager during its resizeAll the size was at 4k.

This sounds like you have opened sessions but did not close them.
Could you please verify that your application closes all sessions?

Regards,
Thomas
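
One way to check for leaked sessions is to route every login through a
helper that always logs out in a finally block and keeps a count of sessions
still open. The sketch below is self-contained, so it uses a stand-in
interface, SessionLike, in place of javax.jcr.Session (whose logout() method
it mirrors); SessionTracker and withSession are hypothetical names, not part
of the JCR or Jackrabbit API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Stand-in for javax.jcr.Session; only logout() is modelled here. */
interface SessionLike {
    void logout();
}

/** Hypothetical helper that counts sessions which have not been logged out. */
final class SessionTracker {
    private final AtomicInteger open = new AtomicInteger();

    /** Opens a tracked session; its logout() decrements the counter once. */
    SessionLike login() {
        open.incrementAndGet();
        return new SessionLike() {
            private boolean closed;

            @Override
            public synchronized void logout() {
                if (!closed) {
                    closed = true;
                    open.decrementAndGet();
                }
            }
        };
    }

    /** Number of sessions opened through this tracker and not yet logged out. */
    int openCount() {
        return open.get();
    }

    /** Runs work against a fresh session and always logs out, even on failure. */
    void withSession(Consumer<SessionLike> work) {
        SessionLike session = login();
        try {
            work.accept(session);
        } finally {
            session.logout();
        }
    }
}
```

If openCount() keeps growing under load, some code path is obtaining
sessions without logging out. In a real application the same try/finally
shape would wrap the actual repository login and Session.logout() calls.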