Posted to users@jackrabbit.apache.org by MARTINEZ Antonio <An...@alcatel-lucent.com> on 2009/02/06 02:07:52 UTC

Jackrabbit standalone through RMI

Hello,

After getting the standalone server working, I have verified that
queries are about 10 times slower than the same queries in my cluster
setup.
I have seen warnings in this mailing list and the code stating that the
RMI is not optimized for performance.

Could I get more information on the technical reasons why there is such
a big performance gap?
I would like to understand what it would take to improve that.

Thanks in advance,
Antonio

-----Original Message-----
From: MARTINEZ Antonio [mailto:Antonio.Martinez@alcatel-lucent.com] 
Sent: Tuesday, January 20, 2009 4:05 PM
To: users@jackrabbit.apache.org
Subject: RE: Bootstrap Jackrabbit without container

I found my answer in the release notes of Jackrabbit 1.5.2.
I was not aware of the standalone server process.

Regards,
Antonio

-----Original Message-----
From: MARTINEZ Antonio [mailto:Antonio.Martinez@alcatel-lucent.com]
Sent: Monday, January 19, 2009 7:44 PM
To: users@jackrabbit.apache.org
Subject: Bootstrap Jackrabbit without container

Hello,

I'm currently using Jackrabbit 1.4.4 in a cluster configuration (3 App
servers running JBoss).
The persistence manager I'm using is backed by MySQL, which is running
on a DB server (actually 2 DB servers in a redundant configuration).

Everything is working fine, except that now I'm facing the problem that
I cannot perform an online backup (since the local index of one of
the App servers is required). On top of that, in the near future I need
to support geo-redundancy, which seems nearly impossible with the
current configuration.


To be able to support online backup and geo-redundancy, the data (DB +
local index) should actually be on the DB server, to simplify the
scripts in charge of those features.

I'm thinking of changing the architecture to run a single Jackrabbit
instance on the DB server and use RMI to access the repository from the
App servers.

Now, on the DB server we are not running any container, just MySQL, so
my question is whether there would be any issue bootstrapping Jackrabbit
without a container.


Any other experience running this configuration would be really
appreciated: any architecture issues with this approach, performance,
etc.


Thanks,
Antonio

Re: Jackrabbit standalone through RMI

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Feb 6, 2009 at 2:07 AM, MARTINEZ Antonio
<An...@alcatel-lucent.com> wrote:
> Could I get more information on the technical reasons why there is such
> a big performance gap ?

The main reason for the performance problem is the design of the RMI
layer. The layer optimizes for API coverage and correctness instead of
performance by mapping almost all JCR API calls to remote method
calls. And since JCR is a very fine-grained API (e.g. each property
access is a separate method call), this results in a large number of
network roundtrips even for simple operations.
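
To make the roundtrip cost concrete, here is a minimal sketch in plain
Java. It does not use the Jackrabbit RMI classes: RemoteNodeStub,
RemotePropertyStub and readAll are hypothetical names, and a simple
counter stands in for the network. Each stub method models one remote
call, on the assumption that getProperty() returns a remote handle and
reading its value is a second call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: counts simulated network roundtrips; not Jackrabbit code. */
class FineGrainedRoundTrips {

    /** Hypothetical stand-in for a node living on the server. */
    static final class RemoteNodeStub {
        private final Map<String, String> properties = new LinkedHashMap<>();
        int roundTrips = 0;

        /** Test-data setup; not counted as a remote call. */
        RemoteNodeStub with(String name, String value) {
            properties.put(name, value);
            return this;
        }

        /** Models Node.getProperty(name): one roundtrip, returns a remote handle. */
        RemotePropertyStub getProperty(String name) {
            roundTrips++;
            return new RemotePropertyStub(this, name);
        }
    }

    /** Hypothetical stand-in for a remote property handle. */
    static final class RemotePropertyStub {
        private final RemoteNodeStub owner;
        private final String name;

        RemotePropertyStub(RemoteNodeStub owner, String name) {
            this.owner = owner;
            this.name = name;
        }

        /** Models Property.getString(): a second roundtrip to fetch the value. */
        String getString() {
            owner.roundTrips++;
            return owner.properties.get(name);
        }
    }

    /** Reads each named property and returns the total simulated roundtrips. */
    static int readAll(RemoteNodeStub node, String... names) {
        for (String name : names) {
            node.getProperty(name).getString();
        }
        return node.roundTrips;
    }

    public static void main(String[] args) {
        RemoteNodeStub node = new RemoteNodeStub()
                .with("title", "Hello")
                .with("author", "Antonio")
                .with("date", "2009-02-06");
        // Three properties, two simulated calls each: prints 6
        System.out.println(readAll(node, "title", "author", "date"));
    }
}
```

In this model the cost grows linearly with the number of properties
read, which is why even a simple query result listing can turn into
hundreds of network calls.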

> I would like to understand what would take to improve that.

By far the biggest performance improvement could be achieved by making
the granularity of the network calls coarser and caching the retrieved
information on the client side. For example, a getProperty() call
could also retrieve the (non-binary) value(s) of the property so that
a getValue() call on the retrieved property can be executed locally.
Or more notably, the return value of a getNode() call could contain
all the properties and child node references of a node to boost calls
like getProperties() and getNodes(). Of course the price of such
changes is increased complexity and potential cache coherence issues.
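
A rough sketch of the coarser-grained idea, in plain Java with
hypothetical names (ServerStub, NodeSnapshot, fetchNode) rather than
any existing Jackrabbit API: a single simulated remote call returns a
copy of all property values, and later reads are served from that
client-side copy. The last lines illustrate the cache coherence
problem mentioned above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: one coarse fetch plus a client-side copy; not Jackrabbit code. */
class CoarseGrainedFetch {

    /** Hypothetical stand-in for the server holding a single node. */
    static final class ServerStub {
        private final Map<String, String> properties = new LinkedHashMap<>();
        int roundTrips = 0;

        /** Server-side change; not visible to snapshots already handed out. */
        void setProperty(String name, String value) {
            properties.put(name, value);
        }

        /** Models a coarse getNode(): one roundtrip carrying all property values. */
        NodeSnapshot fetchNode() {
            roundTrips++;
            return new NodeSnapshot(new LinkedHashMap<>(properties));
        }
    }

    /** Client-side copy of a node; every read here is local. */
    static final class NodeSnapshot {
        private final Map<String, String> properties;

        NodeSnapshot(Map<String, String> properties) {
            this.properties = Collections.unmodifiableMap(properties);
        }

        String getString(String name) {
            return properties.get(name);
        }

        Map<String, String> getProperties() {
            return properties;
        }
    }

    public static void main(String[] args) {
        ServerStub server = new ServerStub();
        server.setProperty("title", "Hello");
        server.setProperty("author", "Antonio");

        NodeSnapshot node = server.fetchNode();   // the only simulated roundtrip
        node.getString("title");                  // served locally
        node.getString("author");                 // served locally
        System.out.println("round trips: " + server.roundTrips);   // prints 1

        // The price: the snapshot does not see later server-side changes.
        server.setProperty("title", "Changed");
        System.out.println("cached title: " + node.getString("title")); // prints Hello
    }
}
```

A real implementation would also need some invalidation or refresh
strategy for such snapshots, which is where most of the added
complexity would come from.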

The most active front for implementing faster JCR remoting is the SPI
over WebDAV layer that's currently being developed in the Jackrabbit
sandbox. Please join the dev@ list if you're interested in
participating in the development effort.

BR,

Jukka Zitting