Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2012/12/19 13:29:14 UTC

[jira] [Comment Edited] (OAK-534) Inefficient NodeState comparison with MongoMK

    [ https://issues.apache.org/jira/browse/OAK-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535906#comment-13535906 ] 

Thomas Mueller edited comment on OAK-534 at 12/19/12 12:27 PM:
---------------------------------------------------------------

> Where would that scalable solution be? In oak-core or MicroKernel?

It would need to be in both. Our plan was to have a scalable design, but it now seems that neither MongoMK nor oak-core is actually scalable.

> oak-core needs to know if a subtree changed efficiently

Yes, to update the cache and for observation. But it is not needed for nodes that are not in the cache and are never requested by a user. There is no need for oak-core to know that something changed in the repository at the path "/x/y/z" if there is no observation listener for that path and the user doesn't request that path. But it seems that oak-core currently wants to know about a change immediately, by requesting the content hash of the root node or the journal of the root node. If either of them has to be accurate up to the latest millisecond, then I believe it's not possible to have a scalable microkernel implementation.

> at least in today's design.

Yes, that's my point. It seems the current design is not scalable.
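
To illustrate the "/x/y/z" argument above, here is a minimal sketch of filtering changes by interest. It is not Oak code: the function name is_relevant, the idea of a set of observed root paths, and a cache keyed by exact path are all assumptions made for this example. It only shows that a change can be ignored when no listener covers the path and the path is not cached.

```python
def is_relevant(changed_path, observed_roots, cached_paths):
    """Return True if a change at changed_path matters to this instance.

    Hypothetical helper: observed_roots are paths with an observation
    listener, cached_paths are the exact paths currently held in the cache.
    """
    def within(path, root):
        # "/" (or "") as a root covers the whole repository.
        root = root.rstrip("/")
        return root == "" or path == root or path.startswith(root + "/")

    if changed_path in cached_paths:
        return True
    return any(within(changed_path, root) for root in observed_roots)


# A change at /x/y/z only matters if someone observes or caches it.
print(is_relevant("/x/y/z", {"/x"}, set()))
print(is_relevant("/x/y/z", {"/a"}, set()))
```

With such a filter, an instance without a listener on "/x" and without "/x/y/z" in its cache would not need up-to-the-millisecond information about that subtree.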

                

                  
> Inefficient NodeState comparison with MongoMK
> ---------------------------------------------
>
>                 Key: OAK-534
>                 URL: https://issues.apache.org/jira/browse/OAK-534
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, mongomk
>    Affects Versions: 0.5
>            Reporter: Marcel Reutegger
>
> Oak on MongoMK currently performs a complete tree traversal for any kind of modification. This happens because MongoMK does not support the optional :hash system property. Without it, KernelNodeState.compareAgainstBaseState() falls back to a generic implementation, which traverses the complete tree to find out whether a subtree was modified.
> The NodeState comparison is triggered in almost all commit hook and validator implementations to find out what changed with the given commit.
> I see a number of options to solve this:
> 1) Add support for :hash system property in MongoMK
> 2) Use MK.diff() to find out if something was modified in a subtree
> 3) Use MK.getJournal() to find out if something was modified in a subtree
> Some initial thoughts on the presented options to start the discussion (feel free to jump in and add more):
> Adding the :hash system property in MongoMK might not be that easy, because the implementation tries to avoid contention on the root node by not updating it with every commit. It only updates the nodes that actually changed. A straightforward implementation of :hash requires updating all ancestors of modified nodes.
> Option 2) seems to require additional work in MongoMK, because its diff() implementation uses the oak-mk DiffBuilder. The builder in turn calls SimpleMongoNodeStore.compare(), which seems to use the same generic comparison implementation as the fallback in KernelNodeState.compareAgainstBaseState().
> AFAICS 3) might be a viable option with the recent support for branches in getJournal() (OAK-501). But I don't know how efficiently this is implemented in MongoMK.
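
The trade-off described for option 1) can be sketched with a small Merkle-style example. This is an illustration only, not the KernelNodeState or MongoMK implementation: the Node class, the content_hash field (standing in for the :hash property), and the diff function are hypothetical names. With hashes, the comparison skips unchanged subtrees. Without them, it visits every node, which mirrors the generic fallback.

```python
import copy
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical node: properties, named children, optional hash."""
    props: Dict[str, str] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)
    content_hash: Optional[str] = None  # stands in for the :hash property


def compute_hash(node):
    """Hash the properties and the child hashes, bottom-up."""
    h = hashlib.sha256()
    for name in sorted(node.props):
        h.update(f"p:{name}={node.props[name]};".encode())
    for name in sorted(node.children):
        h.update(f"c:{name}={compute_hash(node.children[name])};".encode())
    node.content_hash = h.hexdigest()
    return node.content_hash


def diff(base, head, path, use_hash, stats):
    """Return paths whose properties differ or that were added/removed."""
    stats["visited"] += 1
    if use_hash and base.content_hash is not None \
            and base.content_hash == head.content_hash:
        return []  # identical subtree, no need to descend
    changed = []
    if base.props != head.props:
        changed.append(path or "/")
    for name in sorted(set(base.children) | set(head.children)):
        child_path = f"{path}/{name}"
        b, h = base.children.get(name), head.children.get(name)
        if b is None or h is None:
            changed.append(child_path)
        else:
            changed.extend(diff(b, h, child_path, use_hash, stats))
    return changed


def build_tree():
    """Root with children a, b, c, each holding three leaves (13 nodes)."""
    root = Node()
    for top in ("a", "b", "c"):
        child = Node(props={"name": top})
        for i in (1, 2, 3):
            child.children[f"{top}{i}"] = Node(props={"x": "1"})
        root.children[top] = child
    return root


base = build_tree()
head = copy.deepcopy(base)
head.children["a"].children["a1"].props["x"] = "2"  # one modified leaf
compute_hash(base)
compute_hash(head)

for use_hash in (False, True):
    stats = {"visited": 0}
    print(use_hash, diff(base, head, "", use_hash, stats), stats["visited"])
```

In this toy tree the single change at /a/a1 also changes the hashes of /a and of the root. That is the ancestor update, and the resulting write to the root on every commit, that the description says MongoMK tries to avoid.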
