Posted to oak-dev@jackrabbit.apache.org by Thomas Mueller <mu...@adobe.com> on 2013/03/06 16:33:44 UTC

SegmentNodeStore merge operations

Hi,

I wonder what is the state of the implementation of merge operations for segments and journals, and how are merges scheduled?

Regards,
Thomas


Re: SegmentNodeStore merge operations

Posted by Michael Dürig <md...@apache.org>.

On 7.3.13 11:08, Marcel Reutegger wrote:
> Hi,
>
>> I was referring to problem a, which is about validators and other
>> commit hooks not being a part of the underlying MK-level merge
>> operation and thus for example not always catching things like
>> duplicate UUIDs being introduced or hard references being broken (i.e.
>> repository invariants that span more than one node). This issue also
>> affects the MongoMK implementation, though it's yet unclear how
>> important addressing it is in practice. For some deployments it may
>> well be a hard requirement.
>
> this is what I tried to describe in http://jackrabbit.markmail.org/thread/5vh6bk7ei32jgtbg
>
> the MicroKernel only provides snapshot isolation, while a NodeStore
> implementation could also provide serializable snapshot isolation.
>
> as mentioned before, I think snapshot isolation is just fine because
> in most cases it is sufficient and allows for increased concurrency. for
> the cases where more consistency guarantees are needed, like
> unique UUIDs and hard references we should implement the
> validators accordingly and e.g. use a technique like materializing the conflict.

I think a source of misunderstanding here is the interpretation of 
serializable snapshot isolation within our immutable tree hierarchy: 
strictly speaking every two commits conflict in such a setting since the 
root node always changes. (We even experienced this with the index 
update, which had exactly this effect). However I think your idea here 
is to *not* count such conflicts and just have them merged, similar to 
what I brought up earlier: http://markmail.org/message/5vh6bk7ei32jgtbg

So IIUC, with such an approach branches wouldn't need to rebase in the 
first place but just commit/merge. If there happens to be a conflict, 
branches would rebase then and retry the commit/merge. Finally those who 
need the stronger guarantee of serializable snapshot isolation could 
e.g. use the technique of materialising the conflict.
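The commit/merge flow described above (merge optimistically, rebase and retry only on an actual conflict) can be sketched roughly as follows. This is plain illustrative Java with made-up names, not the actual Oak/SegmentMK API:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch: a branch tries to merge directly against the current
// head; only if another commit moved the head in the meantime does it
// rebase (re-apply its change on the new head) and retry.
class OptimisticJournal {
    private final AtomicReference<String> head = new AtomicReference<>("root-v0");

    /** Applies a change on top of the current head, retrying on conflict. */
    String commit(UnaryOperator<String> change) {
        while (true) {
            String base = head.get();
            String proposed = change.apply(base);      // apply change on current head
            if (head.compareAndSet(base, proposed)) {  // merge succeeds iff head unchanged
                return proposed;
            }
            // conflict: another commit won the race; rebase and retry
        }
    }
}
```

The compare-and-set stands in for the MK-level conflict check; a real implementation would compare content trees rather than references.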

Michael



>
> regards
>   marcel
>

RE: SegmentNodeStore merge operations

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

> But it doesn't address the other case I brought up, of adding a write
> restriction on the root node. It's a very rare commit that does
> something like that, but all commits need to be aware of potential
> changes in that area. Thus, if we want to maintain the illusion of
> serialization of such changes, you'd need at least something like
> read-write locks over such problem areas.

you mean an access control restriction on the root node? yes, in this
case it could happen that someone is able to change content after
the restriction on the root node denied this modification. I guess
we'd have to assess if this is really a strict requirement. if a session
previously had access to an area it shouldn't have, something is
wrong anyway.

regards
 marcel

Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 7, 2013 at 2:24 PM, Marcel Reutegger <mr...@adobe.com> wrote:
> assuming the uniqueness constraint is enforced with the p2
> index implementation in oak, each node with a distinct UUID
> will create a distinct index node. creating those
> nodes will not result in a conflict as defined by Michael a while
> ago.

Right, that should work for this case.

But it doesn't address the other case I brought up, of adding a write
restriction on the root node. It's a very rare commit that does
something like that, but all commits need to be aware of potential
changes in that area. Thus, if we want to maintain the illusion of
serialization of such changes, you'd need at least something like
read-write locks over such problem areas.

BR,

Jukka Zitting

RE: SegmentNodeStore merge operations

Posted by Marcel Reutegger <mr...@adobe.com>.
> On Thu, Mar 7, 2013 at 1:57 PM, Marcel Reutegger <mr...@adobe.com>
> wrote:
> > not necessarily. e.g. if we take the unique UUID as an example,
> > two sessions can proceed concurrently when they create nodes
> > with different UUIDs, which is usually the case. the conflict only
> > materializes when you create two nodes with the same UUID.
> 
> But you can't know that in advance, so for example all XML imports
> with at least one UUID in them would need synchronization. That's
> probably a rare enough case not to worry about, but consider something
> like adding a new write restriction on the root node. That of course
> happens very rarely, but for conflict materialization to work properly
> for such cases *all* writes would need to hit that materialization
> point.

I forgot to illustrate how the UUID constraint is enforced in this case.

assuming the uniqueness constraint is enforced with the p2
index implementation in oak, each node with a distinct UUID
will create a distinct index node. creating those
nodes will not result in a conflict as defined by Michael a while
ago. only when two nodes with the same UUID are created will
the index nodes result in an 'addExistingNode' conflict. the
second commit would therefore fail. not because a commit
hook detected the violation, but because the MicroKernel
implementation detected a conflict!
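The mechanism above boils down to: each commit adds an index node keyed by its UUID, so uniqueness is enforced by ordinary conflict detection rather than by a commit hook. A minimal sketch (illustrative names, not the actual p2 index code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of enforcing UUID uniqueness through conflict handling: adding an
// index node that already exists is an 'addExistingNode' conflict, so the
// second commit with the same UUID fails at merge time.
class UuidIndex {
    private final Map<String, String> indexNodes = new HashMap<>();

    /** Returns true if the commit merges cleanly, false on an addExistingNode conflict. */
    boolean commitNode(String uuid, String nodePath) {
        if (indexNodes.containsKey(uuid)) {
            return false; // index node already exists: addExistingNode conflict
        }
        indexNodes.put(uuid, nodePath); // distinct UUID: distinct index node, no conflict
        return true;
    }
}
```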

thus the minimum requirement for the MicroKernel implementation
is to ensure the conflict handling as currently defined (equivalent
to what I still call snapshot isolation. though, I agree with Michael
that it might not be 100% accurate because of our definition of 
NodeState.equals(). On the other hand, Tree doesn't have it...).
the important point here is that I'd like us to consider updates
to separate subtrees as non-conflicting.

regards
 marcel

Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 7, 2013 at 1:57 PM, Marcel Reutegger <mr...@adobe.com> wrote:
> not necessarily. e.g. if we take the unique UUID as an example,
> two sessions can proceed concurrently when they create nodes
> with different UUIDs, which is usually the case. the conflict only
> materializes when you create two nodes with the same UUID.

But you can't know that in advance, so for example all XML imports
with at least one UUID in them would need synchronization. That's
probably a rare enough case not to worry about, but consider something
like adding a new write restriction on the root node. That of course
happens very rarely, but for conflict materialization to work properly
for such cases *all* writes would need to hit that materialization
point.

> doesn't this require knowledge about the internals of the global
> invariant. how does the SegmentMK know what exactly it can undo?

My idea is to "undo" (or move to a separate conflict queue) the entire
commit that fails to validate during a merge. Commits for which this
could be a problem should be made against the root journal.
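A hypothetical sketch of that idea: during a merge, each pending commit is re-validated against the merged state, and a commit that fails validation is moved to a conflict queue instead of being silicently applied or silently dropped (names here are made up for illustration):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

// Sketch of the "undo to a conflict queue" approach: the entire failing
// commit is set aside for later (possibly manual) processing.
class MergingJournal {
    final List<String> merged = new ArrayList<>();
    final Queue<String> conflictQueue = new ArrayDeque<>();

    void merge(List<String> pendingCommits, Predicate<String> validator) {
        for (String commit : pendingCommits) {
            if (validator.test(commit)) {
                merged.add(commit);          // passes validation: keep it
            } else {
                conflictQueue.add(commit);   // "undo": queue instead of committing
            }
        }
    }
}
```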

BR,

Jukka Zitting

Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 7, 2013 at 2:05 PM, Michael Dürig <md...@apache.org> wrote:
> Yes I'm also a bit worried about this. Couldn't this lead to cascading
> undos? And further to violation of other application constraints when data
> gets "uncommitted"? I think this is a very slippery slope...

The way I see it, the normal case for commits would be to go against
the root journal and thus never encounter this issue.

Unlike the new MongoMK, which can automatically execute
non-conflicting changes in parallel, the SegmentMK would require
explicit configuration when improved concurrency is required. In other
words, I don't think that for example the default configuration of a
SegmentMK cluster should be for each cluster node to have their own
journals. They might, but only if explicitly configured that way based
on an analysis of the application level access patterns.

The reasoning behind this design is that I think the optimistic
locking mechanism should already buy quite a bit of extra performance
for normal deployments, and the remaining cases where even more write
throughput is needed typically have some very specific and tightly
scoped access patterns for which the extra throughput is needed. And
as mentioned in another email, in my experience such access patterns
are seldom conflicting (or come with strict application-level
constraints) and generally resilient against the potential loss of
some small fraction of changes.

Thus in practice I don't see such cascading undos or constraint
violations becoming much of a problem. But that of course depends on
the amount of care put in deciding which operations should be executed
against lower-level journals. That's probably an area where good
documentation/training and more real-world experience will be needed.

BR,

Jukka Zitting

Re: SegmentNodeStore merge operations

Posted by Michael Dürig <md...@apache.org>.

On 7.3.13 11:57, Marcel Reutegger wrote:
>> On Thu, Mar 7, 2013 at 1:08 PM, Marcel Reutegger <mr...@adobe.com>
>> wrote:
>>> as mentioned before, I think snapshot isolation is just fine because
>>> in most cases it is sufficient and allows for increased concurrency. for
>>> the cases where more consistency guarantees are needed, like
>>> unique UUIDs and hard references we should implement the
>>> validators accordingly and e.g. use a technique like materializing the
>> conflict.
>>
>> That technique essentially serializes all commits that could possibly
>> violate a global invariant.
>
> not necessarily. e.g. if we take the unique UUID as an example,
> two sessions can proceed concurrently when they create nodes
> with different UUIDs, which is usually the case. the conflict only
> materializes when you create two nodes with the same UUID.

In other words: it makes the commits serializable without serializing 
them ;-)

>
>> The SegmentMK can avoid such
>> synchronization points entirely at the cost of potentially "undoing"
>> some commits that become invalid later on.
>
> doesn't this require knowledge about the internals of the global
> invariant? how does the SegmentMK know what exactly it can undo?

Yes I'm also a bit worried about this. Couldn't this lead to cascading 
undos? And further to violation of other application constraints when 
data gets "uncommitted"? I think this is a very slippery slope...

Michael

>
> regards
>   marcel
>

RE: SegmentNodeStore merge operations

Posted by Marcel Reutegger <mr...@adobe.com>.
> On Thu, Mar 7, 2013 at 1:08 PM, Marcel Reutegger <mr...@adobe.com>
> wrote:
> > as mentioned before, I think snapshot isolation is just fine because
> > in most cases it is sufficient and allows for increased concurrency. for
> > the cases where more consistency guarantees are needed, like
> > unique UUIDs and hard references we should implement the
> > validators accordingly and e.g. use a technique like materializing the
> conflict.
> 
> That technique essentially serializes all commits that could possibly
> violate a global invariant.

not necessarily. e.g. if we take the unique UUID as an example,
two sessions can proceed concurrently when they create nodes
with different UUIDs, which is usually the case. the conflict only
materializes when you create two nodes with the same UUID.

> The SegmentMK can avoid such
> synchronization points entirely at the cost of potentially "undoing"
> some commits that become invalid later on.

doesn't this require knowledge about the internals of the global
invariant? how does the SegmentMK know what exactly it can undo?

regards
 marcel

Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 7, 2013 at 1:08 PM, Marcel Reutegger <mr...@adobe.com> wrote:
> as mentioned before, I think snapshot isolation is just fine because
> in most cases it is sufficient and allows for increased concurrency. for
> the cases where more consistency guarantees are needed, like
> unique UUIDs and hard references we should implement the
> validators accordingly and e.g. use a technique like materializing the conflict.

That technique essentially serializes all commits that could possibly
violate a global invariant. The SegmentMK can avoid such
synchronization points entirely at the cost of potentially "undoing"
some commits that become invalid later on.

BR,

Jukka Zitting

RE: SegmentNodeStore merge operations

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

> I was referring to problem a, which is about validators and other
> commit hooks not being a part of the underlying MK-level merge
> operation and thus for example not always catching things like
> duplicate UUIDs being introduced or hard references being broken (i.e.
> repository invariants that span more than one node). This issue also
> affects the MongoMK implementation, though it's yet unclear how
> important addressing it is in practice. For some deployments it may
> well be a hard requirement.

this is what I tried to describe in http://jackrabbit.markmail.org/thread/5vh6bk7ei32jgtbg

the MicroKernel only provides snapshot isolation, while a NodeStore
implementation could also provide serializable snapshot isolation.

as mentioned before, I think snapshot isolation is just fine because
in most cases it is sufficient and allows for increased concurrency. for
the cases where more consistency guarantees are needed, like
unique UUIDs and hard references we should implement the
validators accordingly and e.g. use a technique like materializing the conflict.

regards
 marcel

Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 7, 2013 at 11:16 AM, Thomas Mueller <mu...@adobe.com> wrote:
>>So, apart from problem a (which also affects the new MongoMK), the
>>current mechanism works fine (i.e. fully parallel writes) as long as
>>the changes are non-conflicting, but runs into trouble when there are
>>conflicts.
>
> Sorry I don't understand, how does SegmentNodeStore merge affect the new
> MongoMK?

I was referring to problem a, which is about validators and other
commit hooks not being a part of the underlying MK-level merge
operation and thus for example not always catching things like
duplicate UUIDs being introduced or hard references being broken (i.e.
repository invariants that span more than one node). This issue also
affects the MongoMK implementation, though it's yet unclear how
important addressing it is in practice. For some deployments it may
well be a hard requirement.

>>* Use a more aggressive merge algorithm that automatically resolves
>>all conflicts by throwing away (or storing somewhere else) "less
>>important" changes when needed. Addresses problems b and c, problem a
>>still an issue.
>
> I'm worried our customers won't like this. It's very different from the
> behaviour of regular databases (be it relational databases, or NoSQL
> databases such as MongoDB). If it's configurable for a certain subtree,
> for improved performance, then it's acceptable in my view, but even then
> I'm worried about the added complexity on the user/customer/developer
> side. And I'm worried that if we need to enable it to get a scalable
> solution, then it would turn people away.

Yes, that's a valid and open question. In my experience, the kinds of
high-volume scenarios where the write bottleneck couldn't already be
addressed with the optimistic locking approach used by the SegmentMK
are cases that track user actions, log other events, or keep track of
comments, likes, etc. In all such
scenarios each individual change is pretty small, seldom conflicting,
and not too important (i.e. the harm of losing one is minimal), so I'm
not too concerned about this being a problem as long as the system can
provide harder guarantees for "more important" updates.  But we'll
need some better benchmark scenarios and real-world experience to tell
how well that works in practice.

BR,

Jukka Zitting

Re: SegmentNodeStore merge operations

Posted by Thomas Mueller <mu...@adobe.com>.
Hi,

>However, as noted in OAK-633, there are a few conceptual problems with
>this approach to processing merges:
>
>a) Since validators and other commit hooks are not run during the
>merge, the result can be an internally inconsistent content tree
>(dangling references, incorrect permission store, etc.)
>
>b) The presence of conflict markers will prevent further changes to
>affected nodes until the conflict gets resolved
>
>c) There's no good way to handle more than one set of conflicts per node
>
>So, apart from problem a (which also affects the new MongoMK), the
>current mechanism works fine (i.e. fully parallel writes) as long as
>the changes are non-conflicting, but runs into trouble when there are
>conflicts.

Sorry I don't understand, how does SegmentNodeStore merge affect the new
MongoMK? Please note I was talking about SegmentNodeStore merge
operations, not MicroKernel.merge. The MongoMK doesn't merge segments and
journals, instead, conflicts are detected when committing on a node level
(relying on MongoDB features).

>* Use a more aggressive merge algorithm that automatically resolves
>all conflicts by throwing away (or storing somewhere else) "less
>important" changes when needed. Addresses problems b and c, problem a
>still an issue.

I'm worried our customers won't like this. It's very different from the
behaviour of regular databases (be it relational databases, or NoSQL
databases such as MongoDB). If it's configurable for a certain subtree,
for improved performance, then it's acceptable in my view, but even then
I'm worried about the added complexity on the user/customer/developer
side. And I'm worried that if we need to enable it to get a scalable
solution, then it would turn people away.

In my view, SegmentNodeStore merging is somewhat similar to database
synchronization (as when synchronizing the smartphone calendar with the
desktop and so on). A long time ago, I was working on such a database
synchronization solution, called PointBase UniSync and MicroSync. A
hub-and-spoke model was used, and multiple types of conflicts were
supported (insert/insert, update/update, update/delete, delete/update;
delete/delete was not treated as a conflict, for example). Multiple conflict resolution
algorithms were supported (spoke wins, hub wins, user defined using a
resolver callback). Interestingly, the documentation is still available at
http://www.ipd.uni-karlsruhe.de/~modbprak/03-MobileDB_mit_Java/pb44/docs/unisync/GettingStarted/
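The resolver idea from that design can be sketched in a few lines. These names are purely illustrative, not the PointBase API: an update/update conflict is handed to a pluggable resolver that picks the hub value, the spoke value, or something user-defined.

```java
import java.util.function.BinaryOperator;

// Sketch of pluggable conflict resolution for an update/update conflict:
// "hub wins", "spoke wins", or a user-defined resolver callback.
class SyncResolver {
    static final BinaryOperator<String> HUB_WINS = (hub, spoke) -> hub;
    static final BinaryOperator<String> SPOKE_WINS = (hub, spoke) -> spoke;

    static String resolveUpdateUpdate(String hubValue, String spokeValue,
                                      BinaryOperator<String> resolver) {
        return resolver.apply(hubValue, spokeValue);
    }
}
```

A user-defined resolver is just another `BinaryOperator`, e.g. one that concatenates both values or consults application state.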

As far as I know, NoSQL databases either try to avoid
merging/synchronization (MongoDB: writes always happen on the primary), or
do it in a very simple way. For example in Cassandra, if concurrent writes
are enabled, the latest change always wins:
http://www.datastax.com/docs/1.1/dml/about_writes "The latest timestamp
always wins".
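The Cassandra-style "latest timestamp wins" merge is simple enough to sketch directly; each write carries a timestamp and the merge just keeps the newest value:

```java
// Sketch of a last-write-wins register: writes with an older timestamp
// are silently discarded, which is exactly why concurrent writers can
// lose changes under this model.
class LwwRegister {
    private String value;
    private long timestamp = Long.MIN_VALUE;

    void write(String newValue, long newTimestamp) {
        if (newTimestamp > timestamp) { // latest timestamp always wins
            value = newValue;
            timestamp = newTimestamp;
        }
    }

    String read() { return value; }
}
```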

Regards,
Thomas


Re: SegmentNodeStore merge operations

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Wed, Mar 6, 2013 at 5:33 PM, Thomas Mueller <mu...@adobe.com> wrote:
> I wonder what is the state of the implementation of merge operations for
> segments and journals, and how are merges scheduled?

Basic merging functionality for the in-memory segment store is there;
see the o.a.j.oak.plugins.segment.JournalTest for some examples of how
that works. Currently the merge operation needs to be explicitly
triggered.

Before implementing the same mechanism for the MongoDB store, I've
been experimenting with a few alternatives on how to schedule and
process merges. The most straightforward approach is to use periodic
merges with a configurable interval (defaulting to something like one
second), with the option to explicitly trigger extra merges when
needed, and to process merges using our existing rebase logic that
leaves conflict markers in the tree when unresolvable conflicts are
detected.
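The scheduling part of that approach could look something like the following. This is a sketch with illustrative names, not the actual SegmentMK code; it only shows the periodic trigger with an explicit override, not the merge logic itself:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of periodic merges with a configurable interval, plus the
// option to trigger an extra merge explicitly when needed.
class MergeScheduler implements AutoCloseable {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    final AtomicInteger mergeCount = new AtomicInteger();

    MergeScheduler(long intervalMillis) {
        executor.scheduleAtFixedRate(this::merge, intervalMillis,
                intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Also callable directly to trigger an extra merge. */
    void merge() { mergeCount.incrementAndGet(); }

    @Override public void close() { executor.shutdownNow(); }
}
```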

However, as noted in OAK-633, there are a few conceptual problems with
this approach to processing merges:

a) Since validators and other commit hooks are not run during the
merge, the result can be an internally inconsistent content tree
(dangling references, incorrect permission store, etc.)

b) The presence of conflict markers will prevent further changes to
affected nodes until the conflict gets resolved

c) There's no good way to handle more than one set of conflicts per node

So, apart from problem a (which also affects the new MongoMK), the
current mechanism works fine (i.e. fully parallel writes) as long as
the changes are non-conflicting, but runs into trouble when there are
conflicts.

So far I've come up with the following alternative designs to address
the above problems:

* Use a more aggressive merge algorithm that automatically resolves
all conflicts by throwing away (or storing somewhere else) "less
important" changes when needed. Addresses problems b and c, problem a
still an issue.

* Instead of merging the full set of changes from another journal, we
could keep track of what the Oak client saved vs. what actually got
committed (after hook processing) and then during a merge try to
replay just those client changes before re-running the commit hooks. A
change set that fails because of merge conflicts or validation issues
could be moved to a separate conflict queue for later (possibly
manual) processing. This approach would solve all the above problems,
but could cause some surprises as the repository might occasionally
"undo" commits that previously passed without problems.
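The replay idea can be sketched as follows (hypothetical names; real change sets would be content diffs, not string functions): the journal records what the client saved, and a merge re-applies just those client changes on the new base before re-running the commit hooks on each result.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Sketch of merge-by-replay: re-apply recorded client changes on the new
// base and re-run the commit hooks, instead of merging the hook output.
class ReplayMerge {
    /** One recorded client change set (what the client saved, pre-hooks). */
    record ChangeSet(UnaryOperator<String> clientChange) {}

    static String merge(String newBase, List<ChangeSet> recorded,
                        UnaryOperator<String> commitHook) {
        String state = newBase;
        for (ChangeSet cs : recorded) {
            String replayed = cs.clientChange().apply(state); // replay client change
            state = commitHook.apply(replayed);               // re-run hooks on top
        }
        return state;
    }
}
```

A change set whose replay or hook run fails would, per the proposal above, go to a conflict queue rather than abort the whole merge.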

Given these tradeoffs I'm thinking that a default SegmentMK deployment
should start with just the root journal, and other journals (and their
merge behavior) should be configured depending on the requirements and
characteristics of particular deployment scenarios. We need to come up
with some better distributed test cases to determine what such
scenarios would look like and what the best journal configurations for
them would be.

BR,

Jukka Zitting