Posted to dev@jackrabbit.apache.org by Jukka Zitting <ju...@gmail.com> on 2007/06/29 16:18:04 UTC

Less flexibility

Hi,

I think this will be a bit controversial, but I would like to explore
options for making Jackrabbit *less* flexible and configurable.

Currently we have quite a few internal interfaces like
PersistenceManager, QueryHandler, FileSystem, and Journal that allow
you to fully configure various parts of the system. Many of these
interfaces were fixed in relatively early stages of development and
are now having a major effect on how the product is seen and used. We
have actually encountered a number of cases where new components or
alternatives need to go through extra hoops to comply with an
*internal* interface that might no longer be seen as the optimal
solution.

Some specific examples:

* Bundle persistence is in almost all cases more efficient than the
previous item persistence where each node and property is stored
separately. But the bundle persistence manager still needs to
explicitly simulate item persistence to comply with the
PersistenceManager interface.

* The fixed SearchIndex interface and configuration model cause us to
implement workarounds for configuring things like the synonym matching
or the new indexing rules (see
http://wiki.apache.org/jackrabbit/IndexingConfiguration). See also the
latest comments on JCR-989, especially in the light that the Lucene
SearchIndex implementation is the only real QueryHandler
implementation we have.

* FileSystem instances are being created and passed around even if
many components either just ignore them (see SearchIndex) or rather
use custom alternatives (see database persistence).
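
As a rough illustration of the first point above, here is a toy sketch that counts record writes under the two storage models. The class names are made up for this note and are not Jackrabbit classes, and the counts only show the shape of the difference; they are not measurements.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Item persistence (hypothetical toy model): the node and each of its
// properties are written as separate records.
final class ItemPersistenceSketch {
    final Map<String, String> records = new HashMap<>();
    int writes = 0;

    void storeNode(String nodeId, Map<String, String> properties) {
        records.put(nodeId, "node");
        writes++;
        for (Map.Entry<String, String> property : properties.entrySet()) {
            records.put(nodeId + "/" + property.getKey(), property.getValue());
            writes++;
        }
    }
}

// Bundle persistence (hypothetical toy model): the node and all of its
// properties are written together as a single record.
final class BundlePersistenceSketch {
    final Map<String, Map<String, String>> records = new HashMap<>();
    int writes = 0;

    void storeNode(String nodeId, Map<String, String> properties) {
        records.put(nodeId, new LinkedHashMap<>(properties));
        writes++;
    }
}
```

Storing a node with three properties costs four writes in the item model and one in the bundle model.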

And these specific issues are just the tip of the iceberg. The real
problem is that we seem to be so accustomed to these interfaces and
the boundaries they create that we have trouble imagining what we
could do if they didn't exist or at least were more flexible.

I'm not sure what (if anything) we should do about this, especially
since there are backwards-compatibility issues to consider, but I find
it interesting to consider all the possibilities we would have
available if the only Jackrabbit configuration option that was
guaranteed to be backwards compatible was the repository home
directory. :-)

I guess even if we do nothing else about this it would still be good
to keep in mind that the internal interfaces we have now are nothing
more than internal design decisions that may or may not be valid
anymore.

BR,

Jukka Zitting

Re: Less flexibility

Posted by Christoph Kiehl <ch...@sulu3000.de>.
Thomas Mueller wrote:

> I am also for making it less configurable. For my current project
> (GlobalDataStore), dead code and deprecated interfaces are quite a
> pain. I would also remove unused persistence managers, and code that
> we know is not tested well.

+1

From time to time, users on the list try to use XmlPersistenceManager, for
example, because it sounds appealing to them, but they would do much better
with one of the much faster and actively maintained bundle persistence managers.
I would definitely like to see these old classes deprecated.

Cheers,
Christoph


Re: Less flexibility

Posted by Thomas Mueller <th...@gmail.com>.
Hi,

I am also for making it less configurable. For my current project
(GlobalDataStore), dead code and deprecated interfaces are quite a
pain. I would also remove unused persistence managers, and code that
we know is not tested well.

Thomas


Re: Less flexibility

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 6/29/07, Felix Meschberger <Fe...@day.com> wrote:
> Yet, another problem you are actually bringing up is the configuration of
> Jackrabbit at large: I think the current way of configuring Jackrabbit is
> not flexible enough and needs a rework, too (hasn't this been said before
> :-) ). Maybe your concerns (apart from the PersistenceManager problems) are
> mainly an issue of how configuration is taking place?

Yeah, kind of... I think of the whole problem of fixed configuration
structure as another symptom of the current set of mostly fixed internal
interfaces. Since the main structure of Jackrabbit is defined by these
interfaces, the configuration model only needs to provide extension
points for those interfaces and nothing else.

Alternatively one could think of the fixed configuration model as the
cause of fixed interfaces. Since changing the configuration model is
so hard, it is also hard to change the interfaces that are being
configured.

I think you are right in raising the configuration model as a key question.

BR,

Jukka Zitting

Re: Less flexibility

Posted by Felix Meschberger <Fe...@day.com>.
Hi Jukka,

I agree that these internal interfaces - PersistenceManager in particular
comes to mind - were initially created to solve problems at a very early
stage of Jackrabbit development. Later they gained momentum and took on a
life of their own in a way they were never intended to be used. As such, I am
particularly concerned that these interfaces constitute some kind
of de-facto API (as opposed to the real API in the o.a.j.api package or
javax.jcr as a whole) but have never been treated as API. Fortunately
though, they remained relatively fixed.

I agree that we should seriously reconsider these internal interfaces and
look at their value nowadays. Yet, if it is decided to walk away from them,
some kind of backwards compatibility can certainly be built using bridges -
performance is probably not the primary goal in backwards compatibility
scenarios.

Yet another problem you are actually bringing up is the configuration of
Jackrabbit at large: I think the current way of configuring Jackrabbit is
not flexible enough and needs a rework, too (hasn't this been said before
:-) ). Maybe your concerns (apart from the PersistenceManager problems) are
mainly an issue of how configuration is taking place?

Just my €.02

Regards
Felix


Re: Less flexibility

Posted by Christoph Kiehl <ch...@sulu3000.de>.
Jukka Zitting wrote:

> * Bundle persistence is in almost all cases more efficient than the
> previous item persistence where each node and property is stored
> separately. But the bundle persistence manager still needs to
> explicitly simulate item persistence to comply with the
> PersistenceManager interface.

Hm, what do you suggest instead? AFAIK those two work quite differently and there
needs to be some interface between persistence and ISMs. So you need to have
some kind of layer for one of them. Maybe it would have been better to introduce
a new interface which uses the bundle pm "natively" and a wrapper from that
interface to the old pm interface?
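
A minimal sketch of that idea, assuming a bundle-level interface plus an adapter that exposes the old item-level view on top of it. All names here (NativeBundleStore, LegacyPropertyStore, LegacyAdapter, InMemoryBundleStore) are hypothetical and far simpler than the real PersistenceManager contract.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical "native" contract: whole bundles in, whole bundles out.
interface NativeBundleStore {
    Map<String, String> loadBundle(String nodeId); // null if the node is unknown
    void storeBundle(String nodeId, Map<String, String> properties);
}

// Hypothetical stand-in for the old item-level contract.
interface LegacyPropertyStore {
    String loadProperty(String nodeId, String name);
    void storeProperty(String nodeId, String name, String value);
}

// Trivial in-memory bundle store so the sketch runs on its own.
final class InMemoryBundleStore implements NativeBundleStore {
    private final Map<String, Map<String, String>> bundles = new HashMap<>();

    @Override
    public Map<String, String> loadBundle(String nodeId) {
        Map<String, String> stored = bundles.get(nodeId);
        return stored == null ? null : new LinkedHashMap<>(stored);
    }

    @Override
    public void storeBundle(String nodeId, Map<String, String> properties) {
        bundles.put(nodeId, new LinkedHashMap<>(properties));
    }
}

// The wrapper: old callers keep their item-level view, while new callers
// can talk to the bundle store directly.
final class LegacyAdapter implements LegacyPropertyStore {
    private final NativeBundleStore delegate;

    LegacyAdapter(NativeBundleStore delegate) {
        this.delegate = delegate;
    }

    @Override
    public String loadProperty(String nodeId, String name) {
        Map<String, String> bundle = delegate.loadBundle(nodeId);
        return bundle == null ? null : bundle.get(name);
    }

    @Override
    public void storeProperty(String nodeId, String name, String value) {
        // Read-modify-write of the whole bundle; this cost is paid only
        // by callers that go through the adapter.
        Map<String, String> bundle = delegate.loadBundle(nodeId);
        if (bundle == null) {
            bundle = new LinkedHashMap<>();
        }
        bundle.put(name, value);
        delegate.storeBundle(nodeId, bundle);
    }
}
```

With this direction of wrapping, the simulation overhead sits in the compatibility layer rather than in the bundle implementation itself.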

> * The fixed SearchIndex interface and configuration model cause us to
> implement workarounds for configuring things like the synonym matching
> or the new indexing rules (see
> http://wiki.apache.org/jackrabbit/IndexingConfiguration). See also the
> latest comments on JCR-989, especially in the light that the Lucene
> SearchIndex implementation is the only real QueryHandler
> implementation we have.

I agree that it is quite cumbersome to try to be as generic as possible here,
assuming that there will be no other implementation of SearchIndex anytime soon,
if at all. But I think it's good to have all the Lucene stuff separate from the
rest of the query subsystem. Actually I think a lot of Jackrabbit is too tightly
coupled to write unit tests efficiently.

> * FileSystem instances are being created and passed around even if
> many components either just ignore them (see SearchIndex) or rather
> use custom alternatives (see database persistence).

I like the FileSystem interface because it enables us to use a memory-based file
system for tests, for example. What I don't like is the fact that it is not
used consistently throughout Jackrabbit, because this makes the interface useless.
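
To make the testing argument concrete, here is a small sketch with a made-up, much-reduced file system abstraction. SimpleFileSystem, InMemoryFileSystem and RevisionFile are hypothetical names for this note and do not mirror the actual FileSystem interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, much-reduced file system abstraction.
interface SimpleFileSystem {
    boolean exists(String path);
    void write(String path, byte[] data);
    byte[] read(String path); // null if the file does not exist
}

// Memory-based implementation: no disk access, so tests stay fast and isolated.
final class InMemoryFileSystem implements SimpleFileSystem {
    private final Map<String, byte[]> files = new HashMap<>();

    @Override
    public boolean exists(String path) {
        return files.containsKey(path);
    }

    @Override
    public void write(String path, byte[] data) {
        files.put(path, data.clone());
    }

    @Override
    public byte[] read(String path) {
        byte[] data = files.get(path);
        return data == null ? null : data.clone();
    }
}

// Hypothetical component that only talks to the abstraction, so a test can
// hand it the in-memory implementation instead of a real directory.
final class RevisionFile {
    private final SimpleFileSystem fs;
    private final String path;

    RevisionFile(SimpleFileSystem fs, String path) {
        this.fs = fs;
        this.path = path;
    }

    long get() {
        byte[] data = fs.read(path);
        return data == null ? 0L : Long.parseLong(new String(data, StandardCharsets.UTF_8));
    }

    void set(long revision) {
        fs.write(path, Long.toString(revision).getBytes(StandardCharsets.UTF_8));
    }
}
```

The benefit only holds if every component goes through the abstraction; a component that opens files directly cannot be redirected this way.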

> And these specific issues are just the tip of the iceberg, the real
> problem is that we seem to be so accustomed to these interfaces and
> the boundaries they create that we have trouble imagining what we
> could do if they didn't exist or at least were more flexible.
> 
> I'm not sure what (if anything) we should do about this, especially
> since there are backwards-compatibility issues to consider, but I find
> it interesting to consider all the possibilities we would have
> available if the only Jackrabbit configuration option that was
> guaranteed to be backwards compatible was the repository home
> directory. :-)

Well, in most cases we could probably come up with some backwards-compatible
wrappers. Backwards compatibility shouldn't be the limiting factor. Maybe we
should deprecate some of the old things like XmlPersistenceManager etc. to make
it easier to provide backwards compatibility.

I think all the points you mentioned were good decisions from an architectural
point of view. Do you have any specific change proposals regarding the
configuration? What would an ideal configuration look like?

Cheers,
Christoph


Re: Less flexibility

Posted by Stefan Guggisberg <st...@gmail.com>.
On 7/2/07, Miro Walker <mi...@gmail.com> wrote:
> Jukka,
>
> I like the idea of making things less configurable (although only if
> it's still possible to configure things the way we need them! :-)).
>
> One area that I've always felt JR is too flexible is around use of
> different configurations for different workspaces. Maybe I'm missing
> something, but I've never been able to understand the value of having
> the versioning backing store on a database and the workspaces that are
> being versioned on a filesystem. Is anyone actually doing this?
>
> We're constantly seeing problems with this approach (corrupted
> repositories, etc.) because there's no true atomicity in versioning
> operations - perhaps if they all used the same backing store (i.e.
> same database connection) then it would be much easier to have them
> committed as part of the same transaction.
>
> How about other configurations? Is anyone out there actually making
> use of the fact that each workspace.xml is based on the
> repository.xml, but can in theory be changed? We never change them

i am ;) it allows me, e.g., to migrate data from one pm implementation to another
by cloning the workspace.

cheers
stefan

> after creating them, but still have to dance through all sorts of
> hoops to, for example, change the database connection settings for the
> whole repository.
>
> I guess all these questions are hard to answer without knowing more
> about what's deployed in the field. Perhaps a good way to approach
> this would be to canvass users for what configurations are really
> useful, and then consider retiring those that no longer have a good
> case for them?
>
> Cheers,
>
> miro
>
>

Re: Less flexibility

Posted by Miro Walker <mi...@gmail.com>.
Jukka,

I like the idea of making things less configurable (although only if
it's still possible to configure things the way we need them! :-)).

One area that I've always felt JR is too flexible is around use of
different configurations for different workspaces. Maybe I'm missing
something, but I've never been able to understand the value of having
the versioning backing store on a database and the workspaces that are
being versioned on a filesystem. Is anyone actually doing this?

We're constantly seeing problems with this approach (corrupted
repositories, etc.) because there's no true atomicity in versioning
operations - perhaps if they all used the same backing store (i.e.
same database connection) then it would be much easier to have them
committed as part of the same transaction.
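
The atomicity concern can be sketched with a toy model. BackingStore and CheckinSketch are invented for this note, the paths are placeholders, and the simulated commit failure stands in for whatever goes wrong in practice; this is not how Jackrabbit versioning is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a backing store with its own transaction.
final class BackingStore {
    private final Map<String, String> committed = new HashMap<>();
    private final Map<String, String> pending = new HashMap<>();
    private final boolean failOnCommit;

    BackingStore(boolean failOnCommit) {
        this.failOnCommit = failOnCommit;
    }

    void put(String key, String value) {
        pending.put(key, value);
    }

    // Applies all pending changes at once, or none of them if the commit fails.
    void commit() {
        if (failOnCommit) {
            pending.clear();
            throw new IllegalStateException("simulated commit failure");
        }
        committed.putAll(pending);
        pending.clear();
    }

    String get(String key) {
        return committed.get(key);
    }
}

final class CheckinSketch {
    // Workspace and version storage on different stores: two independent commits.
    static boolean checkinAcrossStores(BackingStore workspace, BackingStore versions) {
        workspace.put("/doc/baseVersion", "1.1");
        versions.put("/versions/doc/1.1", "frozen content");
        try {
            workspace.commit(); // succeeds and cannot be undone below
            versions.commit();  // may still fail afterwards
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Both changes on one store: a single commit covers them.
    static boolean checkinOnSharedStore(BackingStore shared) {
        shared.put("/doc/baseVersion", "1.1");
        shared.put("/versions/doc/1.1", "frozen content");
        try {
            shared.commit();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

If the second commit fails in the two-store case, the workspace already points at a version that was never written; with a shared store, the same failure leaves both changes unapplied.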

How about other configurations? Is anyone out there actually making
use of the fact that each workspace.xml is based on the
repository.xml, but can in theory be changed? We never change them
after creating them, but still have to dance through all sorts of
hoops to, for example, change the database connection settings for the
whole repository.

I guess all these questions are hard to answer without knowing more
about what's deployed in the field. Perhaps a good way to approach
this would be to canvass users for what configurations are really
useful, and then consider retiring those that no longer have a good
case for them?

Cheers,

miro

