Posted to oak-issues@jackrabbit.apache.org by "Marcel Reutegger (JIRA)" <ji...@apache.org> on 2013/02/14 12:02:13 UTC

[jira] [Updated] (OAK-591) Improve KernelNodeStore cache efficiency

     [ https://issues.apache.org/jira/browse/OAK-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger updated OAK-591:
---------------------------------

    Attachment: mk-log-patched.txt
                mk-log-current.txt
                OAK-591.patch

The attached patch contains a modified version of the initial test and changes KernelNodeState to look up an already existing node in the cache with the same path and hash. The other two files contain the log output of the MK log wrapper for simpleTest().

The patch reduces the number of getNodes() calls significantly. Currently oak-core makes 107 getNodes() calls to read the test nodes after a property is modified. With the patch the number of getNodes() calls drops to 5.

It is also interesting to see how many calls are needed when a single property is updated: currently 73 calls, with the patch 24.
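
To illustrate why the lookup by path and hash helps, here is a minimal, self-contained sketch. It is not the code from OAK-591.patch, and none of the names in it (CacheKeySketch, CountingStore, simulate) exist in Oak; they are hypothetical. It assumes that the hash of an unchanged node stays the same across revisions and that the reader knows the hash without fetching the node itself. A reader reads the same test nodes once per revision while unrelated writes create new revisions, and the sketch counts how often the backing store is hit for each key scheme.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not Oak code: compares cache keys built from
// path + revision with cache keys built from path + content hash.
final class CacheKeySketch {

    /** Stand-in for the MicroKernel: counts how often a node is fetched. */
    static final class CountingStore {
        private int fetches;

        String fetch(String path, int revision) {
            fetches++;
            return path + " as of r" + revision;
        }

        int fetches() {
            return fetches;
        }
    }

    /**
     * Reads `nodes` test nodes once at revision 0 and once after each of
     * `writes` unrelated writes. Returns the number of store fetches.
     */
    static int simulate(boolean keyByHash, int nodes, int writes) {
        CountingStore store = new CountingStore();
        Map<String, String> cache = new HashMap<>();

        // Assumption: each node has a content hash that only changes
        // when the node itself changes.
        Map<String, String> hashes = new HashMap<>();
        List<String> testPaths = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            String path = "/test/n" + i;
            testPaths.add(path);
            hashes.put(path, "h0");
        }

        for (int rev = 0; rev <= writes; rev++) {
            // The write happens in a separate part of the repository,
            // so only the hash of /writer changes.
            hashes.put("/writer", "h" + rev);

            final int revision = rev;
            for (String path : testPaths) {
                String key = keyByHash
                        ? path + "#" + hashes.get(path)
                        : path + "@" + revision;
                // Only a cache miss reaches the store.
                cache.computeIfAbsent(key, k -> store.fetch(path, revision));
            }
        }
        return store.fetches();
    }

    public static void main(String[] args) {
        System.out.println("path+revision: " + simulate(false, 10, 5));
        System.out.println("path+hash:     " + simulate(true, 10, 5));
    }
}
```

In this toy model, with 10 test nodes and 5 unrelated writes, the path+revision keys miss on every pass (60 fetches) while the path+hash keys miss only on the first pass (10 fetches). These figures come from the sketch only and are not comparable to the getNodes() counts from the MK log above.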
                
> Improve KernelNodeStore cache efficiency
> ----------------------------------------
>
>                 Key: OAK-591
>                 URL: https://issues.apache.org/jira/browse/OAK-591
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 0.6
>            Reporter: Marcel Reutegger
>         Attachments: mk-log-current.txt, mk.log.gz, mk-log-patched.txt, OAK-591.patch, OAK-591.patch
>
>
> The cache in KernelNodeStore references entries with a path+revision combo. This mapping quickly becomes inefficient when there are writes on the repository. Whenever something is changed, the complete cache basically becomes invalid and oak-core needs to re-fetch nodes, even though they didn't change. The attached test shows this behaviour. The test initially creates 10 nodes and lets a thread read those nodes repeatedly. To make the test somewhat realistic, the reader acquires a new session in every run through the loop. This simulates e.g. a request which acquires a new session every time (Apache Sling does it that way). At the same time writes occur, but in a separate part of the repository. As can be seen in the logs, the nodes are read from the MicroKernel whenever something changes anywhere in the repository. Obviously this is not limited to the test nodes. The log also shows repeated reads of node type, user and index nodes. None of them change while the test runs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira