You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-dev@jackrabbit.apache.org by Angela Schreiber <an...@adobe.com> on 2013/02/05 17:20:40 UTC

Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

hi jukka

i just learned that this discussion wasn't solely about the Tree's
behavior upon move/rename but also affects the JCR Nodes.

as stated in https://issues.apache.org/jira/browse/OAK-606
and https://issues.apache.org/jira/browse/OAK-607
the current behavior is IMO inconsistent between new and existing
nodes and pretty strange from a JCR API consumer point of view.

since i missed the fact that this discussion was about Nodes
as well i would like to express my concerns and disagreement
with what may have looked like lazy consensus.

kind regards
angela

ps: i would appreciate if subjects could be kept up to date when
extending the focus of the discussion.


On 10/23/12 4:31 PM, Jukka Zitting wrote:
> Hi,
>
> On Fri, Oct 19, 2012 at 11:40 PM, Jukka Zitting<ju...@gmail.com>  wrote:
>> The trouble is that a parent doesn't have a direct reference to a
>> child and can thus not tell it that it has become invalid. The only
>> way I can think of to solve this issue is for the child to ask the
>> parent about the status whenever accessed, a bit like what
>> MemoryNodeBuilder now does.
>
> In the past few MemoryNodeBuilder commits I came up with a mechanism
> that covers this case even better.
>
> Basically the idea is that each builder is permanently tied to the
> *path* from which it was acquired. If content is copied or moved
> around, the builders used to access or modify the source tree still
> refer to the old location even if it no longer exists (in which case
> they'd throw IllegalStateExceptions when accessed). If other content
> is added to replace previously removed content, then such builders
> would automatically start reflecting the new content. In short, a
> builder accessed at a given path will always behave as if it was a
> fresh new builder acquired using that same path (with an ISE thrown if
> the path doesn't exist).
>
> To make this behavior apply also to Trees in oak-core and Nodes in
> oak-jcr, we'd need to change the semantics of TreeImpl and NodeImpl
> similarly so that they'll always refer to content at the path for
> which they were originally created. Operations like Root.move() or
> Session.move() will not change the paths or parents of existing Tree
> or Node *instances*, though they may well cause those instances to
> become invalid (resulting in InvalidItemStateExceptions being thrown
> when trying to access such nodes).
>
> Such a change will cause a few existing test cases to fail, but I
> think in the big picture such change in behavior should be OK as
> AFAICT there aren't that many clients that even use move() or, if they
> do, depend on a specific behavior of existing Node instances across
> the move.
>
> BR,
>
> Jukka Zitting

Re: Inconsistent behavior upon moving nodes

Posted by Angela Schreiber <an...@adobe.com>.
in addition we still have an open TODO that the identifier
of a non-referenceable node should be as stable as possible
which means that it should include the identifier of the
parent... in other words: if one of the parent was referenceable
the identifier should be somehow combine uuid + relative path.

see https://issues.apache.org/jira/browse/OAK-101

kind regards
angela

On 2/7/13 1:05 PM, Michael Dürig wrote:
>
> Hi Dominik,
>
> This behaviour has been chosen as a trade off. The JCR specification
> explicitly allows for this. That is, the path might be the most stable
> identifier so this is what is returned for non referenceable nodes.
> Apart from that, the identifier is opaque and the fact that it resembles
> a path must not lead the client to use it as such.
>
> For clients who need "more" stable identifiers, there is always the
> option of making nodes referenceable. This puts clients into control of
> trading off the additional overhead involved for the additional
> identifier stability.
>
> Michael
>
>
> On 7.2.13 8:04, Dominik Süß wrote:
>> Sorry for jumping in without having the overall picture.
>> Is it really a good idea to use the path as identifier? For Referencable
>> Identifiers (25.1) it is pretty clear that an ID is structureindepenent, so
>> why should a nonreferencable be bound to a path. Since a node can return
>> its path I do not see a reason why the identifier should be the same. I'd
>> assume that someone implementing on that base then would now that ID is
>> something reliable when having the "move" case, while path is really
>> something structurespecific. By a combination of both you can be sure to
>> handle the same entity.
>>
>> Best regards
>> Dominik
>>
>>
>>
>> On Wed, Feb 6, 2013 at 9:43 PM, Jukka Zitting<ju...@gmail.com>wrote:
>>
>>> Hi,
>>>
>>> On Wed, Feb 6, 2013 at 8:04 PM, Marcel Reutegger<mr...@adobe.com>
>>> wrote:
>>>> They are easy to fix but very difficult to detect and diagnose. No
>>> exception is thrown,
>>>> just the behavior is unexpected. I think this is quite dangerous, since
>>> the spec is
>>>> IMO quite clear about this
>>>>
>>> http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State
>>>
>>> That's not as straightforward as it sounds. For example, consider the
>>> following sequence:
>>>
>>>       Node a = session.getNode("/foo");
>>>       String id = a.getIdentifier();
>>>
>>>       session.move("/foo", "/bar");
>>>       session.getRootNode().addNode("foo");
>>>
>>>       Node b = session.getNodeByIdentifier(id);
>>>       assert a.isSame(b); // ???
>>>
>>> Should the last assertion pass or not? Interestingly it passes with
>>> both Jackrabbit 2.x and Oak, even though they return different values
>>> for b.getPath(). They're both correct!
>>>
>>> According to sections 10.11.7 and 10.11.4, item identity in such cases
>>> is determined by the item identifier, which means that a.isSame(b)
>>> should always be true. Thus, for implementations like Oak, that use
>>> the item path (up to root or the first referenceable ancestor) as the
>>> identifier of non-referenceable nodes, the effect of the above move()
>>> call should therefore be as if a.remove() had been called as otherwise
>>> the identity of a would change. As a consequence, b.getPath() should
>>> return "/foo". In Jackrabbit 2.x, where each node has a unique
>>> non-path identifier, the move() call would change the path of a and
>>> result in b.getPath() returning "/bar".
>>>
>>> The above rationale would imply that in Oak is to make sure that all
>>> session refreshes and transient moves should trigger re-evaluation of
>>> the the paths of referenceable nodes and those with a referenceable
>>> ancestor. Other nodes should keep behaving as they currently do.
>>>
>>> BR,
>>>
>>> Jukka Zitting
>>>
>>

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.
Hi Dominik,

This behaviour has been chosen as a trade off. The JCR specification 
explicitly allows for this. That is, the path might be the most stable 
identifier so this is what is returned for non referenceable nodes. 
Apart from that, the identifier is opaque and the fact that it resembles 
a path must not lead the client to use it as such.

For clients who need "more" stable identifiers, there is always the 
option of making nodes referenceable. This puts clients into control of 
trading off the additional overhead involved for the additional 
identifier stability.

Michael


On 7.2.13 8:04, Dominik Süß wrote:
> Sorry for jumping in without having the overall picture.
> Is it really a good idea to use the path as identifier? For Referencable
> Identifiers (25.1) it is pretty clear that an ID is structureindepenent, so
> why should a nonreferencable be bound to a path. Since a node can return
> its path I do not see a reason why the identifier should be the same. I'd
> assume that someone implementing on that base then would now that ID is
> something reliable when having the "move" case, while path is really
> something structurespecific. By a combination of both you can be sure to
> handle the same entity.
>
> Best regards
> Dominik
>
>
>
> On Wed, Feb 6, 2013 at 9:43 PM, Jukka Zitting <ju...@gmail.com>wrote:
>
>> Hi,
>>
>> On Wed, Feb 6, 2013 at 8:04 PM, Marcel Reutegger <mr...@adobe.com>
>> wrote:
>>> They are easy to fix but very difficult to detect and diagnose. No
>> exception is thrown,
>>> just the behavior is unexpected. I think this is quite dangerous, since
>> the spec is
>>> IMO quite clear about this
>>>
>> http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State
>>
>> That's not as straightforward as it sounds. For example, consider the
>> following sequence:
>>
>>      Node a = session.getNode("/foo");
>>      String id = a.getIdentifier();
>>
>>      session.move("/foo", "/bar");
>>      session.getRootNode().addNode("foo");
>>
>>      Node b = session.getNodeByIdentifier(id);
>>      assert a.isSame(b); // ???
>>
>> Should the last assertion pass or not? Interestingly it passes with
>> both Jackrabbit 2.x and Oak, even though they return different values
>> for b.getPath(). They're both correct!
>>
>> According to sections 10.11.7 and 10.11.4, item identity in such cases
>> is determined by the item identifier, which means that a.isSame(b)
>> should always be true. Thus, for implementations like Oak, that use
>> the item path (up to root or the first referenceable ancestor) as the
>> identifier of non-referenceable nodes, the effect of the above move()
>> call should therefore be as if a.remove() had been called as otherwise
>> the identity of a would change. As a consequence, b.getPath() should
>> return "/foo". In Jackrabbit 2.x, where each node has a unique
>> non-path identifier, the move() call would change the path of a and
>> result in b.getPath() returning "/bar".
>>
>> The above rationale would imply that in Oak is to make sure that all
>> session refreshes and transient moves should trigger re-evaluation of
>> the the paths of referenceable nodes and those with a referenceable
>> ancestor. Other nodes should keep behaving as they currently do.
>>
>> BR,
>>
>> Jukka Zitting
>>
>

Re: Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

Posted by Dominik Süß <do...@gmail.com>.
Sorry for jumping in without having the overall picture.
Is it really a good idea to use the path as identifier? For Referencable
Identifiers (25.1) it is pretty clear that an ID is structureindepenent, so
why should a nonreferencable be bound to a path. Since a node can return
its path I do not see a reason why the identifier should be the same. I'd
assume that someone implementing on that base then would now that ID is
something reliable when having the "move" case, while path is really
something structurespecific. By a combination of both you can be sure to
handle the same entity.

Best regards
Dominik



On Wed, Feb 6, 2013 at 9:43 PM, Jukka Zitting <ju...@gmail.com>wrote:

> Hi,
>
> On Wed, Feb 6, 2013 at 8:04 PM, Marcel Reutegger <mr...@adobe.com>
> wrote:
> > They are easy to fix but very difficult to detect and diagnose. No
> exception is thrown,
> > just the behavior is unexpected. I think this is quite dangerous, since
> the spec is
> > IMO quite clear about this
> >
> http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State
>
> That's not as straightforward as it sounds. For example, consider the
> following sequence:
>
>     Node a = session.getNode("/foo");
>     String id = a.getIdentifier();
>
>     session.move("/foo", "/bar");
>     session.getRootNode().addNode("foo");
>
>     Node b = session.getNodeByIdentifier(id);
>     assert a.isSame(b); // ???
>
> Should the last assertion pass or not? Interestingly it passes with
> both Jackrabbit 2.x and Oak, even though they return different values
> for b.getPath(). They're both correct!
>
> According to sections 10.11.7 and 10.11.4, item identity in such cases
> is determined by the item identifier, which means that a.isSame(b)
> should always be true. Thus, for implementations like Oak, that use
> the item path (up to root or the first referenceable ancestor) as the
> identifier of non-referenceable nodes, the effect of the above move()
> call should therefore be as if a.remove() had been called as otherwise
> the identity of a would change. As a consequence, b.getPath() should
> return "/foo". In Jackrabbit 2.x, where each node has a unique
> non-path identifier, the move() call would change the path of a and
> result in b.getPath() returning "/bar".
>
> The above rationale would imply that in Oak is to make sure that all
> session refreshes and transient moves should trigger re-evaluation of
> the the paths of referenceable nodes and those with a referenceable
> ancestor. Other nodes should keep behaving as they currently do.
>
> BR,
>
> Jukka Zitting
>

Re: Inconsistent behavior upon moving nodes

Posted by Angela Schreiber <an...@adobe.com>.
sound good... the rest was probably a misunderstanding :-)
gruss
angela

On 2/8/13 10:39 AM, Michael Dürig wrote:
>
>
> On 8.2.13 9:10, Angela Schreiber wrote:
>> hi michael
>>
>> ok... but the subject of this thread is the behavior of nodes
>> upon move and as a simple test shows the behavior is the same
>> for both referenceable and non-referenceable nodes.
>>
>> while i agree that the behavior of "same nodes" may change
>> due to the way we define the identifier, i would still claim
>> that the way we currently implement the move is not correct
>> and the inconsistency between new and existing nodes will be
>> troublesome for the reasons marcel explained.
>
> Yes I agreed... and actually never denied that ;-)
>
> There are now various parts to this story:
>
> 1) The intended behaviour I had in mind is questionable. See Jukka's
> argument,
> 2) the intended behaviour I had in mind is not correctly implemented,
> 3) support for identifier is not there yet as you mentioned (OAK-101).
>
> I'll first fix the obvious bugs (OAK-614 was one). Afterwards we should
> agree on the "right" behaviour and implement that along side with the
> open todos in OAK-101.
>
> Michael
>
>>
>> kind regards
>> angela
>>
>>
>> On 2/7/13 2:43 PM, Michael Dürig wrote:
>>>
>>>
>>> On 7.2.13 13:40, Jukka Zitting wrote:
>>>> Hi,
>>>>
>>>> On Thu, Feb 7, 2013 at 2:28 PM, Angela Schreiber<an...@adobe.com>
>>>> wrote:
>>>>> as far as i remember we never decided to use the path as
>>>>> identifier. we said that we want to keep it as stable as
>>>>> possible... for a referenceable node Node#getIdentifier
>>>>> returns the UUID for a non-referenceable node it should
>>>>> include the parent identifier and a relative path thing.
>>>>
>>>> I think Michael was referring just to nodes that are non-referenceable
>>>> and have no referenceable ancestors. The relevant discussions are
>>>> summarized in OAK-101.
>>>
>>> Yes indeed. Thanks for clarifying.
>>>
>>> Michael
>>>
>>>>
>>>> BR,
>>>>
>>>> Jukka Zitting
>>>>

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.

On 8.2.13 9:10, Angela Schreiber wrote:
> hi michael
>
> ok... but the subject of this thread is the behavior of nodes
> upon move and as a simple test shows the behavior is the same
> for both referenceable and non-referenceable nodes.
>
> while i agree that the behavior of "same nodes" may change
> due to the way we define the identifier, i would still claim
> that the way we currently implement the move is not correct
> and the inconsistency between new and existing nodes will be
> troublesome for the reasons marcel explained.

Yes I agreed... and actually never denied that ;-)

There are now various parts to this story:

1) The intended behaviour I had in mind is questionable. See Jukka's 
argument,
2) the intended behaviour I had in mind is not correctly implemented,
3) support for identifier is not there yet as you mentioned (OAK-101).

I'll first fix the obvious bugs (OAK-614 was one). Afterwards we should 
agree on the "right" behaviour and implement that along side with the 
open todos in OAK-101.

Michael

>
> kind regards
> angela
>
>
> On 2/7/13 2:43 PM, Michael Dürig wrote:
>>
>>
>> On 7.2.13 13:40, Jukka Zitting wrote:
>>> Hi,
>>>
>>> On Thu, Feb 7, 2013 at 2:28 PM, Angela Schreiber<an...@adobe.com>
>>> wrote:
>>>> as far as i remember we never decided to use the path as
>>>> identifier. we said that we want to keep it as stable as
>>>> possible... for a referenceable node Node#getIdentifier
>>>> returns the UUID for a non-referenceable node it should
>>>> include the parent identifier and a relative path thing.
>>>
>>> I think Michael was referring just to nodes that are non-referenceable
>>> and have no referenceable ancestors. The relevant discussions are
>>> summarized in OAK-101.
>>
>> Yes indeed. Thanks for clarifying.
>>
>> Michael
>>
>>>
>>> BR,
>>>
>>> Jukka Zitting
>>>

Re: Inconsistent behavior upon moving nodes

Posted by Angela Schreiber <an...@adobe.com>.
hi michael

ok... but the subject of this thread is the behavior of nodes
upon move and as a simple test shows the behavior is the same
for both referenceable and non-referenceable nodes.

while i agree that the behavior of "same nodes" may change
due to the way we define the identifier, i would still claim
that the way we currently implement the move is not correct
and the inconsistency between new and existing nodes will be
troublesome for the reasons marcel explained.

kind regards
angela


On 2/7/13 2:43 PM, Michael Dürig wrote:
>
>
> On 7.2.13 13:40, Jukka Zitting wrote:
>> Hi,
>>
>> On Thu, Feb 7, 2013 at 2:28 PM, Angela Schreiber<an...@adobe.com>  wrote:
>>> as far as i remember we never decided to use the path as
>>> identifier. we said that we want to keep it as stable as
>>> possible... for a referenceable node Node#getIdentifier
>>> returns the UUID for a non-referenceable node it should
>>> include the parent identifier and a relative path thing.
>>
>> I think Michael was referring just to nodes that are non-referenceable
>> and have no referenceable ancestors. The relevant discussions are
>> summarized in OAK-101.
>
> Yes indeed. Thanks for clarifying.
>
> Michael
>
>>
>> BR,
>>
>> Jukka Zitting
>>

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.

On 7.2.13 13:40, Jukka Zitting wrote:
> Hi,
>
> On Thu, Feb 7, 2013 at 2:28 PM, Angela Schreiber <an...@adobe.com> wrote:
>> as far as i remember we never decided to use the path as
>> identifier. we said that we want to keep it as stable as
>> possible... for a referenceable node Node#getIdentifier
>> returns the UUID for a non-referenceable node it should
>> include the parent identifier and a relative path thing.
>
> I think Michael was referring just to nodes that are non-referenceable
> and have no referenceable ancestors. The relevant discussions are
> summarized in OAK-101.

Yes indeed. Thanks for clarifying.

Michael

>
> BR,
>
> Jukka Zitting
>

Re: Inconsistent behavior upon moving nodes

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Feb 7, 2013 at 2:28 PM, Angela Schreiber <an...@adobe.com> wrote:
> as far as i remember we never decided to use the path as
> identifier. we said that we want to keep it as stable as
> possible... for a referenceable node Node#getIdentifier
> returns the UUID for a non-referenceable node it should
> include the parent identifier and a relative path thing.

I think Michael was referring just to nodes that are non-referenceable
and have no referenceable ancestors. The relevant discussions are
summarized in OAK-101.

BR,

Jukka Zitting

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.

On 7.2.13 13:28, Angela Schreiber wrote:
> as far as i remember we never decided to use the path as
> identifier. we said that we want to keep it as stable as
> possible... for a referenceable node Node#getIdentifier
> returns the UUID for a non-referenceable node it should
> include the parent identifier and a relative path thing.

Which is the path as long as non of the parents is referenceable.

Michael

>
> kind regards
> angela
>
> On 2/7/13 2:09 PM, Michael Dürig wrote:
>>
>>
>> On 6.2.13 20:43, Jukka Zitting wrote:
>>> The above rationale would imply that in Oak is to make sure that all
>>> session refreshes and transient moves should trigger re-evaluation of
>>> the the paths of referenceable nodes and those with a referenceable
>>> ancestor. Other nodes should keep behaving as they currently do.
>>
>> I agree. This is a direct consequence of using paths as identifiers: the
>> target node after a move *is not* the same (wrt. to isSame()) than the
>> source node under this model. In contrast to the model where each node
>> has an UUID. Here the target node after a move *is* the same than the
>> source node.
>>
>> So the correct behaviour for non referenceable nodes thus is:
>>
>>      Node foo = session.getNode("/foo");
>>      session.move("/foo", "/bar");
>>
>>      foo.getPath(); // throws IllegalItemStateException
>>
>>      session.getRootNode().addNode("foo");
>>
>>      foo.getPath(); // Evaluates to "/foo"
>>
>>
>> Michael

Re: Inconsistent behavior upon moving nodes

Posted by Angela Schreiber <an...@adobe.com>.
as far as i remember we never decided to use the path as
identifier. we said that we want to keep it as stable as
possible... for a referenceable node Node#getIdentifier
returns the UUID for a non-referenceable node it should
include the parent identifier and a relative path thing.

kind regards
angela

On 2/7/13 2:09 PM, Michael Dürig wrote:
>
>
> On 6.2.13 20:43, Jukka Zitting wrote:
>> The above rationale would imply that in Oak is to make sure that all
>> session refreshes and transient moves should trigger re-evaluation of
>> the the paths of referenceable nodes and those with a referenceable
>> ancestor. Other nodes should keep behaving as they currently do.
>
> I agree. This is a direct consequence of using paths as identifiers: the
> target node after a move *is not* the same (wrt. to isSame()) than the
> source node under this model. In contrast to the model where each node
> has an UUID. Here the target node after a move *is* the same than the
> source node.
>
> So the correct behaviour for non referenceable nodes thus is:
>
>      Node foo = session.getNode("/foo");
>      session.move("/foo", "/bar");
>
>      foo.getPath(); // throws IllegalItemStateException
>
>      session.getRootNode().addNode("foo");
>
>      foo.getPath(); // Evaluates to "/foo"
>
>
> Michael

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.

On 6.2.13 20:43, Jukka Zitting wrote:
> The above rationale would imply that in Oak is to make sure that all
> session refreshes and transient moves should trigger re-evaluation of
> the the paths of referenceable nodes and those with a referenceable
> ancestor. Other nodes should keep behaving as they currently do.

I agree. This is a direct consequence of using paths as identifiers: the 
target node after a move *is not* the same (wrt. to isSame()) than the 
source node under this model. In contrast to the model where each node 
has an UUID. Here the target node after a move *is* the same than the 
source node.

So the correct behaviour for non referenceable nodes thus is:

    Node foo = session.getNode("/foo");
    session.move("/foo", "/bar");

    foo.getPath(); // throws IllegalItemStateException

    session.getRootNode().addNode("foo");

    foo.getPath(); // Evaluates to "/foo"


Michael

Re: Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Wed, Feb 6, 2013 at 8:04 PM, Marcel Reutegger <mr...@adobe.com> wrote:
> They are easy to fix but very difficult to detect and diagnose. No exception is thrown,
> just the behavior is unexpected. I think this is quite dangerous, since the spec is
> IMO quite clear about this
> http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State

That's not as straightforward as it sounds. For example, consider the
following sequence:

    Node a = session.getNode("/foo");
    String id = a.getIdentifier();

    session.move("/foo", "/bar");
    session.getRootNode().addNode("foo");

    Node b = session.getNodeByIdentifier(id);
    assert a.isSame(b); // ???

Should the last assertion pass or not? Interestingly it passes with
both Jackrabbit 2.x and Oak, even though they return different values
for b.getPath(). They're both correct!

According to sections 10.11.7 and 10.11.4, item identity in such cases
is determined by the item identifier, which means that a.isSame(b)
should always be true. Thus, for implementations like Oak, that use
the item path (up to root or the first referenceable ancestor) as the
identifier of non-referenceable nodes, the effect of the above move()
call should therefore be as if a.remove() had been called as otherwise
the identity of a would change. As a consequence, b.getPath() should
return "/foo". In Jackrabbit 2.x, where each node has a unique
non-path identifier, the move() call would change the path of a and
result in b.getPath() returning "/bar".

The above rationale would imply that in Oak is to make sure that all
session refreshes and transient moves should trigger re-evaluation of
the the paths of referenceable nodes and those with a referenceable
ancestor. Other nodes should keep behaving as they currently do.

BR,

Jukka Zitting

Re: Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

On 05.02.2013, at 11:47, "Jukka Zitting" <ju...@gmail.com>> wrote:

Do you have examples of where the new behavior would be troublesome
for existing code (not just new test cases)? The assumption from the
earlier discussion was that such cases should be pretty rare and easy
to fix if needed. If that assumption is incorrect, then we obviously
need to rethink the solution.

They are easy to fix but very difficult to detect and diagnose. No exception is thrown, just the behavior is unexpected. I think this is quite dangerous, since the spec is IMO quite clear about this http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State

Regards
 Marcel


Re: Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Feb 5, 2013 at 5:20 PM, Angela Schreiber <an...@adobe.com> wrote:
> as stated in https://issues.apache.org/jira/browse/OAK-606
> and https://issues.apache.org/jira/browse/OAK-607
> the current behavior is IMO inconsistent between new and existing
> nodes and pretty strange from a JCR API consumer point of view.
>
> since i missed the fact that this discussion was about Nodes
> as well i would like to express my concerns and disagreement
> with what may have looked like lazy consensus.

Fair enough, we can revisit the discussion.

Do you have examples of where the new behavior would be troublesome
for existing code (not just new test cases)? The assumption from the
earlier discussion was that such cases should be pretty rare and easy
to fix if needed. If that assumption is incorrect, then we obviously
need to rethink the solution.

To rehash the earlier discussion, the rationale for the new behavior
is to avoid having to use weak references for keeping track of live
node instances just so that their paths can be updated in the rare
case when they get moved around. Even though this now mostly works in
Jackrabbit 2.x, the relevant code is pretty complex, took years to
debug and probably still hides a few potential deadlocks. So if
possible, I'd really like to avoid having to do this in Oak.

BR,

Jukka Zitting

Re: Inconsistent behavior upon moving nodes

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Feb 5, 2013 at 7:27 PM, Angela Schreiber <an...@adobe.com> wrote:
> from my experience move operations are rather rare (as already
> noticed by jukka). in combination with the fact that our main use
> case in general involves very shorted-lived sessions i would assume that
> 3b should be feasible without hampering our overall goals. i would
> treat this similar to namespace remappings: the default should be
> a no-op but it might get expensive if there are a lot of transient
> move operations pending...

+1 assuming there's an easy way to implement this. As mentioned in my
other mail, I'd really like to avoid having to introduce a lot of
extra complexity or other overhead to handle this case.

BR,

Jukka Zitting

Re: Inconsistent behavior upon moving nodes

Posted by Angela Schreiber <an...@adobe.com>.
hi michael

thanks for the information. i agree that 3) currently seems the
way to go if we want to address the move issue.

from my experience move operations are rather rare (as already
noticed by jukka). in combination with the fact that our main use
case in general involves very shorted-lived sessions i would assume that
3b should be feasible without hampering our overall goals. i would
treat this similar to namespace remappings: the default should be
a no-op but it might get expensive if there are a lot of transient
move operations pending... in general i would prefer such a limitation
over the compatibility break in particular since new and existing
nodes behave differently and the issues as described in OAK-606 and
OAK-607 are likely to cause wired behavior that will be hard to track.

regards
angela


On 2/5/13 6:10 PM, Michael Dürig wrote:
>
> Hi,
>
> On 5.2.13 16:39, Michael Dürig wrote:
>> AFAICT we have three possibilities here:
>>
>> 1) drop support for tracking nodes across moves (AKA OAK-391),
>> 2) track nodes through weak reference,
>> 3) come up with another ingenious approach of tracking nodes across moves.
>
> Some more perspective on 3):
>
> Every approach will either (a) need to notify node instances about moves
> or (b) each node instance would need to check whether a move occurred
> after the respective instance has been acquired.
>
> (a) The first approach requires some kind of tracking of all node
> instances so it will be basically option 2) from above.
>
> (b) The second approach requires tracking of all move in a session and
> some information to determine whether a move has happened before a node
> was acquired or after and thus whether that move affected that node or not.
>
> Michael
>
>
>

Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.
Hi,

On 5.2.13 16:39, Michael Dürig wrote:
> AFAICT we have three possibilities here:
>
> 1) drop support for tracking nodes across moves (AKA OAK-391),
> 2) track nodes through weak reference,
> 3) come up with another ingenious approach of tracking nodes across moves.

Some more perspective on 3):

Every approach will either (a) need to notify node instances about moves 
or (b) each node instance would need to check whether a move occurred 
after the respective instance has been acquired.

(a) The first approach requires some kind of tracking of all node 
instances so it will be basically option 2) from above.

(b) The second approach requires tracking of all move in a session and 
some information to determine whether a move has happened before a node 
was acquired or after and thus whether that move affected that node or not.

Michael




Re: Inconsistent behavior upon moving nodes

Posted by Michael Dürig <md...@apache.org>.
Some background on this:

Tracking nodes across moves is troublesome. This is something which came 
already up in my internal prototype as early as late 2011. I mentioned 
this at our meeting in February 2012 and also that one way to handle 
this are weak references.

With the node builder approach in OAK-391 Jukka explored ways to get rid 
of weak references but came to the same conclusion: this won't work with 
moves.

Since we considered weak references to be worse than not being able to 
track nodes across moves, we dropped the support for the latter.


AFAICT we have three possibilities here:

1) drop support for tracking nodes across moves (AKA OAK-391),
2) track nodes through weak reference,
3) come up with another ingenious approach of tracking nodes across moves.

I'd be for 3) if someone can come up with something and 1) otherwise.

Michael


On 5.2.13 16:20, Angela Schreiber wrote:
> hi jukka
>
> i just learned that this discussion wasn't solely about the Tree's
> behavior upon move/rename but also affects the JCR Nodes.
>
> as stated in https://issues.apache.org/jira/browse/OAK-606
> and https://issues.apache.org/jira/browse/OAK-607
> the current behavior is IMO inconsistent between new and existing
> nodes and pretty strange from a JCR API consumer point of view.
>
> since i missed the fact that this discussion was about Nodes
> as well i would like to express my concerns and disagreement
> with what may have looked like lazy consensus.
>
> kind regards
> angela
>
> ps: i would appreciate if subjects could be kept up to date when
> extending the focus of the discussion.
>
>
> On 10/23/12 4:31 PM, Jukka Zitting wrote:
>> Hi,
>>
>> On Fri, Oct 19, 2012 at 11:40 PM, Jukka
>> Zitting<ju...@gmail.com>  wrote:
>>> The trouble is that a parent doesn't have a direct reference to a
>>> child and can thus not tell it that it has become invalid. The only
>>> way I can think of to solve this issue is for the child to ask the
>>> parent about the status whenever accessed, a bit like what
>>> MemoryNodeBuilder now does.
>>
>> In the past few MemoryNodeBuilder commits I came up with a mechanism
>> that covers this case even better.
>>
>> Basically the idea is that each builder is permanently tied to the
>> *path* from which it was acquired. If content is copied or moved
>> around, the builders used to access or modify the source tree still
>> refer to the old location even if it no longer exists (in which case
>> they'd throw IllegalStateExceptions when accessed). If other content
>> is added to replace previously removed content, then such builders
>> would automatically start reflecting the new content. In short, a
>> builder accessed at a given path will always behave as if it was a
>> fresh new builder acquired using that same path (with an ISE thrown if
>> the path doesn't exist).
>>
>> To make this behavior apply also to Trees in oak-core and Nodes in
>> oak-jcr, we'd need to change the semantics of TreeImpl and NodeImpl
>> similarly so that they'll always refer to content at the path for
>> which they were originally created. Operations like Root.move() or
>> Session.move() will not change the paths or parents of existing Tree
>> or Node *instances*, though they may well cause those instances to
>> become invalid (resulting in InvalidItemStateExceptions being thrown
>> when trying to access such nodes).
>>
>> Such a change will cause a few existing test cases to fail, but I
>> think in the big picture such change in behavior should be OK as
>> AFAICT there aren't that many clients that even use move() or, if they
>> do, depend on a specific behavior of existing Node instances across
>> the move.
>>
>> BR,
>>
>> Jukka Zitting