Posted to oak-dev@jackrabbit.apache.org by Stefan Egli <st...@apache.org> on 2017/01/31 17:07:57 UTC

incomplete diffManyChildren during a persisted branch merge

Hi,

I'm following up on a failure case in Oak 1.2.14 where, as part of a
persisted branch merge, commit hooks do not propagate through all affected
changes, resulting in an inconsistent state. It's unclear how realistic this
scenario is and whether it's relevant, but I was able to reproduce it in a
test case.

Interestingly, it is quite easily reproducible in 1.2.14, whereas later in
the 1.2 branch it takes longer for the test (which loops until it fails) to
fail. Also, it doesn't fail in trunk even after e.g. 500 iterations.

Does this ring a bell with anyone - diffManyChildren / wrong _modified
calculation / branch - perhaps this was fixed in trunk a while ago and not
backported?

Cheers,
Stefan
--
https://issues.apache.org/jira/browse/OAK-5557



Re: incomplete diffManyChildren during a persisted branch merge

Posted by Stefan Egli <st...@apache.org>.
On 31/01/17 18:07, "Stefan Egli" <st...@apache.org> wrote:

>I'm following up on failure case in oak 1.2.14 where as part of a
>persisted
>branch merge commit hooks do not propagate through all affected changes,
>resulting in an inconsistent state.

>https://issues.apache.org/jira/browse/OAK-5557

I believe the problem is indeed related to a rebase that happens before
merging a persisted branch. diffManyChildren subsequently takes the
timestamp of the rebased revision as the minValue, instead of taking the
branch's previous purges into account. This seems to occur only when another
session does a merge between the last purge and the actual merge.
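
To illustrate what I mean, here is a minimal, self-contained sketch. It is
not the Oak code: the class ModifiedIndexSketch, its methods and all
timestamps are made up for illustration, and the _modified index is modelled
as a plain map from child name to timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model, not Oak code: shows how a lower bound taken from the
// rebased revision can miss children written by earlier branch purges.
class ModifiedIndexSketch {

    // Stand-in for a "_modified >= minValue" query over the children.
    static List<String> childrenModifiedSince(Map<String, Long> modified, long minValue) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : modified.entrySet()) {
            if (e.getValue() >= minValue) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Children written by two purges of the same branch (made-up timestamps).
    static Map<String, Long> sampleChildren() {
        Map<String, Long> modified = new TreeMap<>();
        modified.put("a", 100L); // first purge of the branch
        modified.put("b", 110L); // last purge of the branch
        return modified;
    }

    public static void main(String[] args) {
        Map<String, Long> modified = sampleChildren();
        // Another session merges at t=120, so the branch is rebased at t=125.
        long rebasedTimestamp = 125L;
        long firstPurgeTimestamp = 100L;
        // Lower bound from the rebased revision: both children are missed.
        System.out.println(childrenModifiedSince(modified, rebasedTimestamp));    // []
        // Lower bound that accounts for the purges: both children are found.
        System.out.println(childrenModifiedSince(modified, firstPurgeTimestamp)); // [a, b]
    }
}
```

In this model the commit hooks would only see the children returned by the
first query, which matches the incomplete diff seen in the test.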

One possible fix I see is to detect such a situation (in diffManyChildren),
i.e. check whether one of the revisions is a branch revision, and fall back
to not using the _modified index in that case. This will definitely find all
potential child nodes, but it has the downside of being slow and not scaling
well with a very large list of child nodes.
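
A rough sketch of that fallback, again not the actual diffManyChildren
implementation: the Rev type, the candidateChildren method and the
timestamps are hypothetical, and the _modified index is modelled as a map
from child name to timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Oak code: skip the _modified lower bound
// whenever one of the two revisions is a branch revision.
class DiffFallbackSketch {

    // Stand-in for a revision: a timestamp plus a branch flag.
    static final class Rev {
        final long timestamp;
        final boolean branch;

        Rev(long timestamp, boolean branch) {
            this.timestamp = timestamp;
            this.branch = branch;
        }
    }

    static List<String> candidateChildren(Map<String, Long> modified, Rev from, Rev to) {
        // Only trust the index when neither revision is a branch revision.
        boolean useIndex = !from.branch && !to.branch;
        long minValue = Math.min(from.timestamp, to.timestamp);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : modified.entrySet()) {
            // Fallback path: every child is a candidate (complete, but O(n)).
            if (!useIndex || e.getValue() >= minValue) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> modified = new TreeMap<>();
        modified.put("a", 100L);
        modified.put("b", 110L);
        modified.put("c", 126L);
        Rev base = new Rev(125L, false);
        System.out.println(candidateChildren(modified, base, new Rev(130L, false))); // [c]
        System.out.println(candidateChildren(modified, base, new Rev(130L, true)));  // [a, b, c]
    }
}
```

The branch case returns every child, which is where the scaling concern
comes from: the cost grows with the total number of children rather than
with the number of recently modified ones.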

Other ideas?

Cheers,
Stefan



Re: incomplete diffManyChildren during a persisted branch merge

Posted by Stefan Egli <st...@apache.org>.
On 01/02/17 09:16, "Marcel Reutegger" <mr...@adobe.com> wrote:

>I think in trunk the code path is also a bit different because of
>OAK-4528. It may be possible that the issue still exists in trunk, but
>does not call diffManyChildren() anymore.
>
>What happens when you disable the journal diff mechanism in trunk with
>-Doak.disableJournalDiff=true ?

Good idea, however that alone doesn't make the test fail yet, as both the
local diff cache and the node children cache prevent diffManyChildren from
being used.

But if I use brute force and bypass those two caches explicitly, and at the
same time increase the test size by increasing the number of nodes, then the
test fails on trunk too.

So indeed trunk seems to avoid this problem, as it doesn't go into
diffManyChildren for the cases triggered by the test.

Cheers,
Stefan



Re: incomplete diffManyChildren during a persisted branch merge

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

On 31/01/17 18:07, Stefan Egli wrote:
> Hi,
> Interesting thing is that it's quite easily reproducible in 1.2.14 while as
> later in the 1.2 branch it takes longer for the test (which loops until it
> fails) to fail. Also, it doesn't fail in trunk even after eg 500 iterations.
>
> Does this ring a bell with anyone - diffManyChildren / wrong _modified
> calculation / branch - perhaps this was fixed in trunk a while ago and not
> backported?

I'm aware of https://issues.apache.org/jira/browse/OAK-3608, but this 
one was backported and released with 1.2.8.

I think in trunk the code path is also a bit different because of 
OAK-4528. It may be possible that the issue still exists in trunk, but 
does not call diffManyChildren() anymore.

What happens when you disable the journal diff mechanism in trunk with 
-Doak.disableJournalDiff=true ?

Regards
  Marcel