Posted to dev@jackrabbit.apache.org by Andy Depue <an...@marathon-man.com> on 2005/07/14 17:51:43 UTC

Considering Jackrabbit

We've been using Slide for a short time here, and have decided to move away 
from it (we knew we'd have to eventually).  Anyway, there aren't many options 
out there when it comes to open source document repositories in Java, and we 
are looking at Jackrabbit.  I've narrowed down two areas of need for us where 
Jackrabbit seems weak:

1. Using a relational database for storage, which would allow us to leverage 
all the work we have already put into making sure our relational database is 
scalable, fault tolerant, etc...
2. Client/server.  We will need "model 3":   
http://incubator.apache.org/jackrabbit/arch/deploy/howto-model3.html - our 
document repository will be an integral part of our application, which will 
be running in a cluster.

Could anyone share their experience in these two areas?  Jackrabbit appears to 
be weak in both areas at the moment, from what I can tell.  There doesn't 
appear to be any JDBC persistence manager, and I'm not sure how mature the 
ORM persistence manager is?  For model 3, the RMI implementation looks too 
weak (not optimized - slow over WAN), and it's hard to tell how mature the 
WebDAV implementation is at a glance.

  - Andy

Re: Considering Jackrabbit

Posted by Stefan Guggisberg <st...@gmail.com>.
On 7/18/05, Serge Huber <sh...@jahia.com> wrote:
> 
> Hi David, andy,
> 
> David Nuescheler wrote:
> 
> >to me it seems like we have a number of people on this list
> >asking for a jdbc persistence manager, but the one that we
> >have (the "orm pm") does not receive enough development
> >support by all those people that are interested in it?
> >
> >if that really is the case then i would suggest that some
> >of the people that are using the orm pm are pitching in to
> >catch up with the rest of jackrabbit and make it
> >"production ready" ( whatever that means for an
> >unreleased project ;) )
> >
> >
> Thanks David, I'm glad someone other than me is saying that :) (for
> those not in-the-know, I've been working on the initial implementation
> of the ORM-PM).
> 
> >i think that having another jdbc approach without scrapping
> >the orm pm will only spread the seemingly small *developer*
> >base interested in supporting rdbms backed repositories even
> >thinner.
> >
> >
> I wonder btw whatever happened to the iBatis implementation that was
> done a while ago ? Any news ?
> 
> Anyway, my take on DB-based persistence managers is that they need :
> - a caching system for performance

caching is taken care of in another layer. caching on the persistence 
layer is redundant and could adversely affect the overall performance
by consuming too much memory which could be used more efficiently 
by jackrabbit.

> - to be able to handle transactions
> - to be able to handle commit/rollback

agreed. you get that for free with jdbc.

> - use mapping files to be able to change the table and column names as
> they might conflict

this depends on which approach you're taking. i favour the keep-it-simple
approach (2 column tables as discussed in other threads, i.e. no 
mapping files required). 

> - be as high-performance as possible (although we will always have the
> performance cost of network traffic)

that's always a good idea :)


cheers
stefan

> - allow for clustering
> 
> In those criteria, an ORM-based implementation seems reasonable,
> especially the Hibernate one. It is perfectly possible to do all this
> without an ORM, but it would mean re-developing part of an ORM anyway.
> 
> And as always, in terms of speed : memory > filesystem > db :)
> 
> Regards,
>   Serge Huber.
>

Re: Considering Jackrabbit

Posted by Serge Huber <sh...@jahia.com>.
David Nuescheler wrote:

>>- to be able to handle transactions
>>- to be able to handle commit/rollback
>>    
>>
>true... what is the difference between the two points?
>  
>
Well if you think in terms of pure JDBC then there is no difference. But 
for me transactions also cover memory state, so any objects handled by 
the PM (that are not also handled by the upper layers) should also be 
reverted in case a transaction fails. This of course is only necessary 
if your PM implementation has internal caches, or uses copies of 
Jackrabbit objects.
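
A minimal sketch of that idea, assuming a persistence manager that keeps its own in-memory copies. The class and method names are hypothetical and are not Jackrabbit's API. Changes made inside a transaction go to a pending map that is merged on commit and dropped on rollback, so the memory state follows the outcome of the JDBC transaction:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Jackrabbit: an in-memory cache whose
// contents are reverted when the surrounding transaction fails.
class TransactionalCache {
    // Committed view of the items held in memory.
    private final Map<String, String> committed = new HashMap<>();
    // Changes made inside the open transaction; null when none is open.
    private Map<String, String> pending;

    void begin() {
        if (pending != null) {
            throw new IllegalStateException("transaction already open");
        }
        pending = new HashMap<>();
    }

    void put(String id, String value) {
        if (pending == null) {
            throw new IllegalStateException("no open transaction");
        }
        pending.put(id, value);
    }

    // Reads see uncommitted changes first, then the committed state.
    String get(String id) {
        if (pending != null && pending.containsKey(id)) {
            return pending.get(id);
        }
        return committed.get(id);
    }

    // Meant to be called only after the database commit has succeeded.
    void commit() {
        if (pending == null) {
            throw new IllegalStateException("no open transaction");
        }
        committed.putAll(pending);
        pending = null;
    }

    // Meant to be called when the database transaction is rolled back:
    // the in-memory changes are discarded as well.
    void rollback() {
        pending = null;
    }
}
```

If the PM holds no state of its own, none of this is needed and the JDBC rollback alone is enough.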

>>- use mapping files to be able to change the table and column names as
>>they might conflict
>>    
>>
>not sure. of course there are a number of different ways on how 
>someone could map a content repository to a relational database.
>however, i think there are a couple of very simple "hard coded" 
>implementations that completely avoid conflicts without 
>any configuration beyond standard jdbc.
>  
>
Well from experience I know that table and column naming is something 
you will want to change at some point (I had the case where I had to 
integrate with a DB that required very short names). It doesn't cost 
much for example to make the statements loadable from a database (I 
believe Edgar's implementation does this). In the case of the iBatis 
implementation this was also the case. There is no performance cost to 
do this (as it is done during PM initialization), and allows for some 
flexibility.
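
To illustrate the point about initialization-time loading, here is a rough sketch. It assumes the statements live in a properties-style text; the class name, keys, and table names are made up for the example and do not come from the ORM-PM or the iBatis code. The statements are parsed once in the constructor and afterwards only looked up by key:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: keeps the SQL text outside the code so that table
// and column names can be changed without recompiling.
class StatementCatalog {
    private final Properties statements = new Properties();

    // Parse the statement definitions once, at PM initialization.
    StatementCatalog(String propertiesText) {
        try {
            statements.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plain lookup at runtime; no parsing happens here.
    String sql(String key) {
        String sql = statements.getProperty(key);
        if (sql == null) {
            throw new IllegalArgumentException("no statement for key: " + key);
        }
        return sql;
    }
}
```

A deployment that needs very short names would then only change the text, for example `node.select=SELECT D FROM N WHERE I = ?`, and pass the result of `sql("node.select")` to `Connection.prepareStatement`.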

>>- allow for clustering
>>    
>>
>that is unfortunately not something that the pm can take care of
>see the jira bug and all the discussions around that.
>  
>
I guess a better wording might be "- doesn't introduce any problems for 
clustering" then :) I have been following all the discussion of 
clustering, and I do agree that there is much more to it than just the PMs.

>see above. i think the differentiators that orm brings to the table
>are not that significant to a persistence manager, while i think that 
>the additional complexity is substantial.
>  
>
Indeed the complexity worries me too. I'll have to have a go at it and 
see if I can simplify.

>>And as always, in terms of speed : memory > filesystem > db :)
>>    
>>
>well, not in general ;)
>  
>
Uh... I'm not sure you read this right. I meant to say that memory 
access is always faster than filesystem access which itself is faster 
than DB access. I would think this would be the general case. Of course, 
there can be exceptions (especially when the access design is not 
efficient).

Regards,
  Serge...

Re: Considering Jackrabbit

Posted by David Nuescheler <da...@gmail.com>.
hi serge,

thanks for the comments.

> >if that really is the case then i would suggest that some
> >of the people that are using the orm pm are pitching in to
> >catch up with the rest of jackrabbit and make it
> >"production ready" ( whatever that means for an
> >unreleased project ;) )
> Thanks David, I'm glad someone other than me is saying that :) 
no problem ;)

> Anyway, my take on DB-based persistence managers is that they need :
> - a caching system for performance
i would largely disagree with that. since there is caching on the layer 
above the pm, i think that a pm-level cache does not really add much
other than memory consumption.
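
a small sketch of why that is, under the assumption that the upper layer caches by item id. the names are hypothetical stand-ins, not jackrabbit classes. once an item has been read, later reads are answered by the upper layer and never reach the pm, so a second cache inside the pm would mostly hold copies that nobody asks for:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the caching layer that sits above the
// persistence manager.
class ItemStateCache {
    private final Map<String, String> cache = new HashMap<>();
    // Stand-in for the persistence manager: loads an item by id.
    private final Function<String, String> persistenceManager;

    ItemStateCache(Function<String, String> persistenceManager) {
        this.persistenceManager = persistenceManager;
    }

    // The persistence manager is consulted only on a cache miss.
    String get(String id) {
        return cache.computeIfAbsent(id, persistenceManager);
    }
}
```

reading the same id twice through this layer calls the pm once; only the first read of each id gets through.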

> - to be able to handle transactions
> - to be able to handle commit/rollback
true... what is the difference between the two points?

> - use mapping files to be able to change the table and column names as
> they might conflict
not sure. of course there are a number of different ways on how 
someone could map a content repository to a relational database.
however, i think there are a couple of very simple "hard coded" 
implementations that completely avoid conflicts without 
any configuration beyond standard jdbc.

> - be as high-performance as possible (although we will always have the
> performance cost of network traffic)
well, i would assume that "to be as fast as possible" really is
a non-functional requirement of any piece of software, especially
of software that tends to be a bottleneck. so to me this is
somewhat implicit.

> - allow for clustering
that is unfortunately not something that the pm can take care of
see the jira bug and all the discussions around that.

> In those criteria, an ORM-based implementation seems reasonable,
> especially the Hibernate one. It is perfectly possible to do all this
> without an ORM, but it would mean re-developing part of an ORM anyway.
see above. i think the differentiators that orm brings to the table
are not that significant to a persistence manager, while i think that 
the additional complexity is substantial.

> And as always, in terms of speed : memory > filesystem > db :)
well, not in general ;)

regards,
david

Re: Considering Jackrabbit

Posted by Serge Huber <sh...@jahia.com>.
Hi David, andy,

David Nuescheler wrote:

>to me it seems like we have a number of people on this list 
>asking for a jdbc persistence manager, but the one that we 
>have (the "orm pm") does not receive enough development 
>support by all those people that are interested in it?
>
>if that really is the case then i would suggest that some
>of the people that are using the orm pm are pitching in to 
>catch up with the rest of jackrabbit and make it 
>"production ready" ( whatever that means for an 
>unreleased project ;) )
>  
>
Thanks David, I'm glad someone other than me is saying that :) (for 
those not in-the-know, I've been working on the initial implementation 
of the ORM-PM).

>i think that having another jdbc approach without scrapping 
>the orm pm will only spread the seemingly small *developer*
>base interested in supporting rdbms backed repositories even 
>thinner.
>  
>
I wonder btw whatever happened to the iBatis implementation that was 
done a while ago ? Any news ?

Anyway, my take on DB-based persistence managers is that they need :
- a caching system for performance
- to be able to handle transactions
- to be able to handle commit/rollback
- use mapping files to be able to change the table and column names as 
they might conflict
- be as high-performance as possible (although we will always have the 
performance cost of network traffic)
- allow for clustering

In those criteria, an ORM-based implementation seems reasonable, 
especially the Hibernate one. It is perfectly possible to do all this 
without an ORM, but it would mean re-developing part of an ORM anyway.

And as always, in terms of speed : memory > filesystem > db :)

Regards,
  Serge Huber.

Re: Considering Jackrabbit

Posted by David Nuescheler <da...@gmail.com>.
> > if that really is the case then i would suggest that some
> > of the people that are using the orm pm are pitching in to
> > catch up with the rest of jackrabbit and make it
> > "production ready" ( whatever that means for an
> > unreleased project ;) )
> I am just working on it. But since I am no committer, I develop offline.
excellent. we all work "offline".
contributors submit patches to the committers.
that's how you become a committer, so please keep 
sending those patches ;)

regards,
david

Re: Considering Jackrabbit

Posted by Walter Raboch <wr...@ingen.at>.
Hi David,

> if that really is the case then i would suggest that some
> of the people that are using the orm pm are pitching in to 
> catch up with the rest of jackrabbit and make it 
> "production ready" ( whatever that means for an 
> unreleased project ;) )

I am just working on it. But since I am no committer, I develop offline.

cheers,

Walter

Re: Considering Jackrabbit

Posted by David Nuescheler <da...@gmail.com>.
hi andy,

thanks for the comments, i think you are touching on a couple of issues.

> 1. Using a relational database for storage, which would allow us to leverage
> all the work we have already put into making sure our relational database is
> scalable, fault tolerant, etc...
hmm... first: keeping your relational database fault tolerant and 
consistent and keeping your repository fault tolerant and consistent
are unfortunately two separate issues even if you persist your items
in an rdbms.

to me it seems like we have a number of people on this list 
asking for a jdbc persistence manager, but the one that we 
have (the "orm pm") does not receive enough development 
support by all those people that are interested in it?

if that really is the case then i would suggest that some
of the people that are using the orm pm are pitching in to 
catch up with the rest of jackrabbit and make it 
"production ready" ( whatever that means for an 
unreleased project ;) )

i think that having another jdbc approach without scrapping 
the orm pm will only spread the seemingly small *developer*
base interested in supporting rdbms backed repositories even 
thinner.

does anybody have an idea or a comment? is there a problem with
the orm implementation? why are there not enough efforts put
into it by the parties interested in it? would people favour a more
direct jdbc implementation?

> 2. Client/server.  We will need "model 3":
> http://incubator.apache.org/jackrabbit/arch/deploy/howto-model3.html - our
> document repository will be an integral part of our application, which will
> be running in a cluster.
> For model 3, the RMI implementation looks too
> weak (not optimized - slow over WAN), and it's hard to tell how mature the
> WebDAV implementation is at a glance.
jackrabbit features two flavours of webdav:

(a) complete remoting server for all the jsr-170 calls using 
delta-v, dasl, order, subscribe, ...

(b) a server that exposes the repository as a filesystem to
work easily with dav_mount, windows webfolders and the
like.

in my experience (b) works quite nicely for "document 
repositories" since those are operating most of the time on 
a "blob+'meta'-properties" basis, and people frequently are very 
happy with the fact that they can just use the "content repository" 
as a fileserver and tie some magic to the "put" on the repository
backend.

regards,
david

Re: Considering Jackrabbit

Posted by Serge Huber <sh...@jahia.com>.
Andy Depue wrote:

>Could anyone share their experience in these two areas?  Jackrabbit appears to 
>be weak in both areas at the moment, from what I can tell.  There doesn't 
>appear to be any JDBC persistence manager, and I'm not sure how mature the 
>ORM persistence manager is?  For model 3, the RMI implementation looks too 
>weak (not optimized - slow over WAN), and it's hard to tell how mature the 
>WebDAV implementation is at a glance.
>  
>
The ORM persistence manager lags behind Jackrabbit development as it is 
an "optional" module, and my time to maintain it has been limited. I'm 
looking for people to help out though :) But I would definitely not say 
that it is ready for production use.

cheers,
  Serge...