Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2015/03/26 08:56:54 UTC

[jira] [Comment Edited] (OAK-2669) Use Consolidated diff for local changes with persistent cache to avoid calculating diff again

    [ https://issues.apache.org/jira/browse/OAK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381524#comment-14381524 ] 

Chetan Mehrotra edited comment on OAK-2669 at 3/26/15 7:56 AM:
---------------------------------------------------------------

Attaching the [initial patch|^OAK-2669-A.patch] to get the approach reviewed.

*Changes related to Observation*

* Introduces a new {{ContentChangeInfoProvider}} that supplies a {{ContentChangeInfo}}, which encapsulates the before state, the after state and the commit info
* Observation logic (i.e. {{Continuation}}) would ensure that the {{NodeStateDiff}} implementation passed to {{NodeState#compareAgainstBaseState}} implements the new provider interface and thus provides access to the root states and commit info details
* {{DocumentNodeStore}} compare logic would check whether the diff is of the new type and, if so, extract the root nodeState for the commit. Using the root nodeState revision and the path for which the diff needs to be performed, it looks up the diff in the {{LocalDiffCache}}
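
To make the flow above concrete, here is a rough sketch of how the provider contract and the compare-side check could fit together. Only the names {{ContentChangeInfoProvider}}, {{ContentChangeInfo}}, {{NodeStateDiff}} and {{LocalDiffCache}} come from the description; the method names, the simplified stand-in types and the {{lookupLocalDiff}} helper are assumptions for illustration and may not match the patch.

```java
import java.util.Map;

/** Hypothetical sketch; all signatures here are assumptions, not the patch's API. */
class ObservationSketch {

    /** Simplified stand-in for a root node state at a given revision. */
    static final class RootState {
        final String revision;
        RootState(String revision) { this.revision = revision; }
    }

    /** Encapsulates the before state, the after state and the commit info. */
    static final class ContentChangeInfo {
        final RootState before;
        final RootState after;
        final String commitInfo;
        ContentChangeInfo(RootState before, RootState after, String commitInfo) {
            this.before = before;
            this.after = after;
            this.commitInfo = commitInfo;
        }
    }

    /** Implemented by the diff that observation passes to the compare call. */
    interface ContentChangeInfoProvider {
        ContentChangeInfo getChangeInfo();
    }

    /** Stand-in for Oak's NodeStateDiff callback (callback methods omitted). */
    interface NodeStateDiff { }

    /**
     * Compare-side check: if the diff is of the new type, use the root
     * revision and the path to look up a cached diff. Returns null when the
     * diff is not a provider or nothing is cached, so that the caller can
     * fall back to the regular diff calculation.
     */
    static String lookupLocalDiff(NodeStateDiff diff, String path,
            Map<String, Map<String, String>> localDiffCache) {
        if (!(diff instanceof ContentChangeInfoProvider)) {
            return null; // not the new type
        }
        ContentChangeInfo info = ((ContentChangeInfoProvider) diff).getChangeInfo();
        Map<String, String> diffByPath = localDiffCache.get(info.after.revision);
        return diffByPath == null ? null : diffByPath.get(path);
    }
}
```

The {{instanceof}} check keeps existing {{NodeStateDiff}} implementations working unchanged: only diffs that opt in by implementing the provider interface take the cached path.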

*Changes related to diff handling*
* Introduces a {{LocalDiffCache}} which captures the local changes provided during commit, consolidates the diff across the various changed paths and caches it, keyed by the commit revision
* It also supports {{PersistentCache}}
* If required, this feature can be disabled. However, once enabled, there would be two diff-related caches
** diffCache - Used as it is today. However, if the local diff cache is enabled, local diffs would *not be pushed* to this cache. It would still be consulted during a diff calculation if there is a miss in localDiffCache or if the change is external
** localDiffCache - Cache dedicated solely to capturing the diff of local changes
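
The interplay of the two caches described above can be sketched roughly as follows. This is a simplified illustration, not the patch: the class name {{DiffCachesSketch}}, its methods and the plain string keys are assumptions, and both caches are modelled as in-memory maps rather than real cache implementations.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the lookup order between the two diff caches. */
class DiffCachesSketch {

    /** localDiffCache: commit revision -> (path -> diff) for local commits. */
    final Map<String, Map<String, String>> localDiffCache = new HashMap<>();

    /** diffCache: "revision@path" -> diff (key format is an assumption). */
    final Map<String, String> diffCache = new HashMap<>();

    private final boolean localDiffCacheEnabled;

    DiffCachesSketch(boolean localDiffCacheEnabled) {
        this.localDiffCacheEnabled = localDiffCacheEnabled;
    }

    /** Local commit: the consolidated diff goes to exactly one of the caches. */
    void localCommit(String revision, Map<String, String> diffByPath) {
        if (localDiffCacheEnabled) {
            // Local diffs are not pushed to diffCache in this mode.
            localDiffCache.put(revision, new HashMap<>(diffByPath));
        } else {
            for (Map.Entry<String, String> e : diffByPath.entrySet()) {
                diffCache.put(revision + "@" + e.getKey(), e.getValue());
            }
        }
    }

    /** External changes always go to diffCache. */
    void externalChange(String revision, String path, String diff) {
        diffCache.put(revision + "@" + path, diff);
    }

    /** Try localDiffCache first; on a miss fall back to diffCache (may return null). */
    String getChanges(String revision, String path) {
        if (localDiffCacheEnabled) {
            Map<String, String> diffByPath = localDiffCache.get(revision);
            if (diffByPath != null && diffByPath.containsKey(path)) {
                return diffByPath.get(path);
            }
        }
        return diffCache.get(revision + "@" + path);
    }
}
```

A null result from {{getChanges}} stands for a miss in both caches, at which point the diff would have to be calculated the usual way.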

*ToDo*
* Need to determine the best way to serialize the consolidated diff as a string. The diff strings are themselves JSON encoded, so it is not possible to serialize the consolidated diff as JSON. The current code uses crude encoding and decoding logic
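
As one possible direction for the serialization question, a length-prefixed encoding avoids any escaping of the embedded JSON diff strings. The sketch below is only an illustration of that idea: it is not the encoding used in the patch, and the class name {{ConsolidatedDiffCodec}} and the path-to-diff map representation are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical length-prefixed codec for a path -> diff map. */
class ConsolidatedDiffCodec {

    /** Writes each path and diff as "<length>:<text>", one after another. */
    static String encode(Map<String, String> diffByPath) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : diffByPath.entrySet()) {
            append(sb, e.getKey());
            append(sb, e.getValue());
        }
        return sb.toString();
    }

    private static void append(StringBuilder sb, String s) {
        sb.append(s.length()).append(':').append(s);
    }

    /** Reads the "<length>:<text>" pairs back in the order they were written. */
    static Map<String, String> decode(String encoded) {
        Map<String, String> result = new LinkedHashMap<>();
        int pos = 0;
        while (pos < encoded.length()) {
            // Path: the length prefix runs up to the first ':' after pos.
            int colon = encoded.indexOf(':', pos);
            int pathEnd = colon + 1 + Integer.parseInt(encoded.substring(pos, colon));
            String path = encoded.substring(colon + 1, pathEnd);
            // Diff: same layout, starting right after the path.
            colon = encoded.indexOf(':', pathEnd);
            int diffEnd = colon + 1 + Integer.parseInt(encoded.substring(pathEnd, colon));
            result.put(path, encoded.substring(colon + 1, diffEnd));
            pos = diffEnd;
        }
        return result;
    }
}
```

Because every field carries its own length, the diff text may contain quotes, colons or braces without any escaping. The sketch assumes well-formed input and does no validation.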

[~mreutegg] [~mduerig] Can you review the approach taken? In the meantime, I am working on adding more test cases.

[~tmueller] Any thoughts on the best way to serialize the {{ConsolidatedDiff}} for the persistent cache? Also, please have a look at the persistent cache integration.



> Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
> ---------------------------------------------------------------------------------------------
>
>                 Key: OAK-2669
>                 URL: https://issues.apache.org/jira/browse/OAK-2669
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: mongomk
>            Reporter: Chetan Mehrotra
>             Fix For: 1.3.0
>
>         Attachments: OAK-2669-A.patch
>
>
> Currently the diff logic in DocumentMK makes use of DiffCache, which has an in-memory implementation and a Mongo-based implementation. Given that we need fast observation support for local changes, it would be better to make use of the persistent cache. After discussing with [~mreutegg], the following changes need to be made to the current logic:
> # Have Commit#applyChanges push the commit diff to the persistent cache with the current commit revision as key
> # In compare, pull the diff out of the persistent cache and use it if present. Note that this diff covers the complete tree, whereas the JSOP diff currently used is only per node. So the way the diff is pushed back to NodeStateDiff needs to change
> The above change should avoid hitting Mongo altogether when determining the diff. The only extra work performed in the diff calculation would be determining the node state view for the base revision. Later we can think of also including the node state base revision as part of the diff, so as to avoid this extra work altogether and rely on the node state from the persistent cache for that work as well
> See also http://markmail.org/thread/bzmwcp7k4wmtw6od


