Posted to dev@jackrabbit.apache.org by Berry van Halderen <b....@1hippo.com> on 2010/08/26 11:47:51 UTC

Improving the reregistering of node types

Dear all,

In Jackrabbit, only "trivial" nodetype changes are supported (see
o.a.j.c.nodetype.NodeTypeDefDiff) when reregistering node types.  In order to
change nodetypes we're currently using a module that can change basically any
nodetype structure.  However, this is based on pure JCR interaction and
therefore requires relatively expensive copy actions.

Better performance could be achieved if a broader set of changes could be
supported by reregistering node types, such as adding mandatory fields or
renaming a field.  By supplying a visitor or default pattern a user
application could control how the changes would be carried out.  Even though
some visiting might be necessary in these cases, it could be executed much
faster because it would not require JCR interaction.
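
A rough sketch of what such a visitor could look like is given below. It is
only an illustration: the interface and class names are hypothetical, a node
is modelled as a plain map of property names to values, and nothing here is
existing Jackrabbit or JCR API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical callback, invoked once per existing node of the changed type.
// Nothing here is Jackrabbit or JCR API; a node is modelled as a plain map.
interface NodeTypeChangeVisitor {
    void visit(Map<String, String> properties);
}

// Handles a renamed field by moving the stored value to the new name.
class RenameField implements NodeTypeChangeVisitor {
    private final String oldName;
    private final String newName;

    RenameField(String oldName, String newName) {
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override
    public void visit(Map<String, String> properties) {
        String value = properties.remove(oldName);
        if (value != null) {
            properties.put(newName, value);
        }
    }
}

// Handles a newly mandatory field by filling in a default where it is missing.
class AddMandatoryField implements NodeTypeChangeVisitor {
    private final String name;
    private final String defaultValue;

    AddMandatoryField(String name, String defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }

    @Override
    public void visit(Map<String, String> properties) {
        properties.putIfAbsent(name, defaultValue);
    }
}

class VisitorSketchDemo {
    public static void main(String[] args) {
        Map<String, String> node = new HashMap<>();
        node.put("title", "hello");
        new RenameField("title", "heading").visit(node);
        new AddMandatoryField("lang", "en").visit(node);
        System.out.println(node);
    }
}
```

The application supplies one visitor per change, and the repository would
decide when and on which nodes to call it.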

Alternatively, better support for Node.setPrimaryNodeType would also solve
this.  But that also cannot handle renames, and blindly drops subnodes and
properties.  Especially for a structure of nodes where both parent and child
nodes require a setPrimaryNodeType, I can't see this working at the moment.

What we would like to probe on the list is how the inside crowd would respond
to extending the reregistering or setPrimaryNodeType functionality such that a
broader set of operations can be supported, with better performance than the
pure JCR module we have now.

Naturally you would all be worried about the amount of work involved, but if
we could contribute most of this, would such an addition be seen as valuable
and accepted?  Or is this just a direction you would not like to see Jackrabbit
move in?  I can see quite a few obstacles on the way to realizing some of the
changes required, but what do you think would be the most problematic to overcome?

With kind regards,
Berry van Halderen
Hippo

Re: Improving the reregistering of node types

Posted by Stefan Guggisberg <st...@day.com>.
hi berry,

On Thu, Aug 26, 2010 at 11:47 AM, Berry van Halderen
<b....@1hippo.com> wrote:
> Dear all,
>
> In Jackrabbit, only "trivial" nodetype changes are supported (see

well, put differently, 'all but major node type changes are supported' ;)

> o.a.j.c.nodetype.NodeTypeDefDiff) when reregistering node types.  In order to
> change nodetypes we're currently using a module that can change basically any
> nodetype structure.  However, this is based on pure JCR interaction and
> therefore requires relatively expensive copy actions.
>
> Better performance could be achieved if a broader set of changes could be
> supported by reregistering node types, such as adding mandatory fields or

the problem with e.g. adding mandatory fields is that it potentially leaves
inconsistent state, i.e. existing nodes lacking the mandatory field.

> renaming a field.  By supplying a visitor or default pattern a user
> application could control how the changes would be carried out.  Even though
> some visiting might be necessary in these cases, it could be executed much
> faster because it would not require JCR interaction.

what would be visited? content changes triggered by type modifications?

>
> Alternatively, better support for Node.setPrimaryNodeType would also solve

what would such an improved Node.setPrimaryNodeType look like?

> this.  But that also cannot handle renames, and blindly drops subnodes and
> properties.  Especially for a structure of nodes where both parent and child
> nodes require a setPrimaryNodeType, I can't see this working at the moment.
>
> What we would like to probe on the list is how the inside crowd would respond
> to extending the reregistering or setPrimaryNodeType functionality such that a
> broader set of operations can be supported, with better performance than the
> pure JCR module we have now.

i assume you are aware of https://issues.apache.org/jira/browse/JCR-322.

while i guess that we all agree that supporting non-trivial node type changes
would be desirable, so far we've been unable to reach consensus on how to
implement it.

e.g. if we were able to efficiently and reliably determine whether a given
node type is currently not being referenced, we could safely allow all types
of modifications. however, this operation would IMO require a repository-wide
lock, and that's the problematic part. same applies for node type changes that
would trigger content modifications (removed properties, added mandatory
properties etc).
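
To make the locking concern concrete, here is a minimal in-memory sketch. It
is not Jackrabbit's NodeTypeRegistry: the class and method names are
hypothetical, and the single object monitor merely stands in for the
repository-wide lock. The point is that the "is this type in use?" check and
the definition update have to happen as one atomic step, otherwise a node of
that type could be added in between.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not Jackrabbit's NodeTypeRegistry.
class TypeRegistrySketch {
    // Type name -> definition text (a stand-in for a real node type definition).
    private final Map<String, String> definitions = new HashMap<>();
    // One entry per stored node, holding that node's primary type name.
    private final List<String> nodePrimaryTypes = new ArrayList<>();

    synchronized void register(String type, String definition) {
        definitions.put(type, definition);
    }

    synchronized void addNode(String primaryType) {
        if (!definitions.containsKey(primaryType)) {
            throw new IllegalArgumentException("unknown node type: " + primaryType);
        }
        nodePrimaryTypes.add(primaryType);
    }

    // The monitor plays the role of the repository-wide lock: no node can be
    // added between the usage check and the update of the definition.
    synchronized boolean reregisterIfUnused(String type, String newDefinition) {
        if (nodePrimaryTypes.contains(type)) {
            return false; // in use, so an arbitrary change is not safe
        }
        definitions.put(type, newDefinition);
        return true;
    }

    synchronized String definitionOf(String type) {
        return definitions.get(type);
    }

    public static void main(String[] args) {
        TypeRegistrySketch registry = new TypeRegistrySketch();
        registry.register("my:doc", "v1");
        System.out.println(registry.reregisterIfUnused("my:doc", "v2"));
        registry.addNode("my:doc");
        System.out.println(registry.reregisterIfUnused("my:doc", "v3"));
    }
}
```

In a real repository the usage check would have to scan persisted content,
which is what makes holding such a lock expensive.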

>
> Naturally you would all be worried about the amount of work involved, but if

i am not so much worried about the amount of work. i just haven't found
the right approach...

cheers
stefan

> we could contribute most of this, would such an addition be seen as valuable
> and accepted?  Or is this just a direction you would not like to see Jackrabbit
> move in?  I can see quite a few obstacles on the way to realizing some of the
> changes required, but what do you think would be the most problematic to overcome?
>
> With kind regards,
> Berry van Halderen
> Hippo
>

Re: Improving the reregistering of node types

Posted by Charles Brooking <pu...@charlie.brooking.id.au>.
On 27/08/10 02:25, Alexander Klimetschek wrote:
> And you only make node types for those things where you are sure they
> are more or less fixed. For other things you keep going with
> nt:unstructured. Just as mandatory properties change in your
> case, you will have the opposite, i.e. things that were mandatory
> become unnecessary, so generally a more relaxed approach is good for
> the long term. That adds some more complexity to the application logic
> accessing the content (i.e. it no longer expects total integrity to be
> upheld by the underlying storage), but this also makes it more resilient.
>    

Yes, those points are straightforward.

The example I gave was an application providing access through 
Jackrabbit's WebDAV module. That case is different from a typical webapp 
because there's no clear means of adding "application logic" (or, at 
least, there wasn't when I tried in 2009). Using a WebDAV client, users 
can create/modify arbitrary nodes and properties, which makes node types 
quite useful.

(I personally don't see the problem with changing constraints between 
software releases. My experience with writing Rails migrations, for 
example, was that I could apply maximum constraints and it wasn't a pain 
at all changing them. Some people are happy with the implications of 
nt:unstructured, but I don't see why applying node types should be seen
as so laborious.)

Later
Charlie

Re: Improving the reregistering of node types

Posted by Alexander Klimetschek <ak...@day.com>.
On Thu, Aug 26, 2010 at 15:57, Charles Brooking
<pu...@charlie.brooking.id.au> wrote:
> That was just my use case, but it's interesting to hear of other people
> interested in node types.

In my experience you try to avoid node type changes after a product
has gone "live" and lots of content using those node types is present.
Hence you only change node types during development, where starting
with fresh content is usually no issue.

And you only make node types for those things where you are sure they
are more or less fixed. For other things you keep going with
nt:unstructured. Just as mandatory properties change in your
case, you will have the opposite, i.e. things that were mandatory
become unnecessary, so generally a more relaxed approach is good for
the long term. That adds some more complexity to the application logic
accessing the content (i.e. it no longer expects total integrity to be
upheld by the underlying storage), but this also makes it more resilient.

See also this paper for some more discussion around data integrity in
the storage or application layer:
http://dev.day.com/content/ddc/blog/2009/01/jcrrdbmsreport.html

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

Re: Improving the reregistering of node types

Posted by Charles Brooking <pu...@charlie.brooking.id.au>.
On 26/08/10 19:47, Berry van Halderen wrote:
> In Jackrabbit, only "trivial" nodetype changes are supported (see
> o.a.j.c.nodetype.NodeTypeDefDiff) when reregistering node types.  In order to
> change nodetypes we're currently using a module that can change basically any
> nodetype structure.  However, this is based on pure JCR interaction and
> therefore requires relatively expensive copy actions.
>    

See http://markmail.org/message/hiqvukxc7lftfspm for a previous posting 
(Re: Re-register Custom Node Types Without Destroying Repository?, Sep 
15, 2009) where I describe some steps that I used for changing node 
types. For example: "to add a mandatory property type, first add it as 
optional then create properties in nodes having the relevant node type 
before replacing the property type with its mandatory form." The code 
that I ended up with was similar to migrations in Rails. It required a 
patch that removed the if (diff.isTrivial()) condition from code in 
NodeTypeRegistry.
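
The ordering of those steps can be sketched with a small in-memory model.
This is an illustration only, not the code I used and not the JCR API: the
class, its fields and the property name "my:lang" are hypothetical, and a
node is just a map of property names to values. It shows why the mandatory
form can only be applied after every existing node has been given a value.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of one node type and its nodes; not the JCR API.
class MandatoryMigrationSketch {
    final Set<String> optional = new HashSet<>();
    final Set<String> mandatory = new HashSet<>();
    final List<Map<String, String>> nodes = new ArrayList<>();

    // Step 1: adding an optional property never invalidates existing nodes.
    void addOptional(String name) {
        optional.add(name);
    }

    // Step 2: give every existing node a value for the property.
    void backfill(String name, String value) {
        for (Map<String, String> node : nodes) {
            node.putIfAbsent(name, value);
        }
    }

    // Step 3: tighten the definition, refusing if any node would violate it.
    void makeMandatory(String name) {
        for (Map<String, String> node : nodes) {
            if (!node.containsKey(name)) {
                throw new IllegalStateException("node lacks mandatory property " + name);
            }
        }
        optional.remove(name);
        mandatory.add(name);
    }

    public static void main(String[] args) {
        MandatoryMigrationSketch type = new MandatoryMigrationSketch();
        type.nodes.add(new java.util.HashMap<>());
        type.addOptional("my:lang");
        type.backfill("my:lang", "en");
        type.makeMandatory("my:lang");
        System.out.println(type.mandatory);
    }
}
```

Calling makeMandatory before backfill fails in this model, which mirrors the
inconsistent state a repository would otherwise be left in.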

> Alternatively, better support for Node.setPrimaryNodeType would also solve
> this.  But that also cannot handle renames, and blindly drops subnodes and
> properties.  Especially for a structure of nodes where both parent and child
> nodes require a setPrimaryNodeType, I can't see this working at the moment.
>    

I submitted https://issues.apache.org/jira/browse/JCR-2011, "Replacing 
mixin type doesn't preserve properties", last year and the only response 
was that I should use nt:unstructured because "unstructured-ness is what 
JCR is optimized for." However, node types were important for the 
application I developed then because I allowed users write-access via 
the WebDAV module (in addition to access through a conventional webapp). 
Users can modify properties etc through WebDAV, so I relied on node 
types to preserve data integrity.

That was just my use case, but it's interesting to hear of other people 
interested in node types.

Later
Charlie