Posted to dev@sling.apache.org by "Miroslav Smiljanic (Jira)" <ji...@apache.org> on 2021/01/06 11:52:01 UTC

[jira] [Commented] (SLING-10011) Use javax.jcr.Item.getParent() when resolving parent JCR node in JcrResourceProvider#getParent

    [ https://issues.apache.org/jira/browse/SLING-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259643#comment-17259643 ] 

Miroslav Smiljanic commented on SLING-10011:
--------------------------------------------

I have created a [benchmark test|https://github.com/apache/jackrabbit-oak/commit/5dfd1145a0b71c4a1626b1fc6430b3ec40e9c4c0] in Oak to compare the performance of the Node API and the Session API when retrieving the parent node.

The test run below uses the in-memory cache (100MB by default).
{noformat}
> java -jar target/oak-benchmarks-*-SNAPSHOT.jar benchmark  GetParentNodeWithNodeAPI  GetParentNodeWithSessionAPI  Oak-Segment-Tar
Apache Jackrabbit Oak 1.37-SNAPSHOT
# GetParentNodeWithNodeAPI         C     min     10%     50%     90%     max     N       mean 
Oak-Segment-Tar                    1       2       2       2       3      5   25891       2
# GetParentNodeWithSessionAP       C     min     10%     50%     90%     max     N       mean 
Oak-Segment-Tar                    1      26      27      29      32     40    2069      29
{noformat}
At the 90th percentile, parent node retrieval is ~10 times slower when using the Session API (32 vs 3).

The second run disables the cache to simulate a high eviction rate of the in-memory cache, which can happen when a Sling application is under heavy load with a large number of concurrent requests.
{noformat}
> java -jar target/oak-benchmarks-*-SNAPSHOT.jar benchmark --cache 0  GetParentNodeWithNodeAPI  GetParentNodeWithSessionAPI  Oak-Segment-Tar
Apache Jackrabbit Oak 1.37-SNAPSHOT
# GetParentNodeWithNodeAPI         C     min     10%     50%     90%     max     N       mean 
Oak-Segment-Tar                    1       7       8       9      10     14    6812       9
# GetParentNodeWithSessionAP       C     min     10%     50%     90%     max     N       mean 
Oak-Segment-Tar                    1    2209    2210    2227    2249   2274      27    2230
{noformat}
In this case, parent node retrieval at the 90th percentile is ~200 times slower when using the Session API (2249 vs 10).
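
The two slowdown factors quoted above can be re-derived from the 90% columns of the benchmark output. The snippet below is only a worked-arithmetic sketch; the class and method names are made up for illustration and are not part of the Oak benchmark suite.

```java
public class SlowdownRatio {
    // Ratio of the Session API timing to the Node API timing for one column.
    public static double ratio(long sessionApi, long nodeApi) {
        return (double) sessionApi / nodeApi;
    }

    public static void main(String[] args) {
        // 90% column with the default cache: Session API 32, Node API 3
        System.out.println("cached:   " + ratio(32, 3));
        // 90% column with --cache 0: Session API 2249, Node API 10
        System.out.println("uncached: " + ratio(2249, 10));
    }
}
```

The first ratio is a little above 10 and the second a little above 220, which is where the rounded "~10 times" and "~200 times" figures come from.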

In this test, the tar segment store is used, with segments persisted on a local SSD. When a remote segment store implementation is used (Azure/AWS), the Session API test would be even slower because of the network roundtrips involved.
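
To picture why the two lookups differ, here is a toy model of the idea. It is not Oak or Sling code and does not use javax.jcr: ToyNode, ToyStore and parentPath are hypothetical names, and the counter merely stands in for whatever work a path-based resolution triggers in a real repository.

```java
import java.util.HashMap;
import java.util.Map;

public class ParentLookupSketch {
    /** Hypothetical stand-in for a repository node; not the javax.jcr.Node API. */
    static final class ToyNode {
        final String path;
        final ToyNode parent; // already-loaded reference, null for the root

        ToyNode(String path, ToyNode parent) {
            this.path = path;
            this.parent = parent;
        }
    }

    /** Hypothetical store that counts how often a path has to be resolved. */
    static final class ToyStore {
        private final Map<String, ToyNode> byPath = new HashMap<>();
        int pathLookups = 0;

        ToyNode add(String path, ToyNode parent) {
            ToyNode node = new ToyNode(path, parent);
            byPath.put(path, node);
            return node;
        }

        // Models the Session-style lookup: resolve an absolute path in the store.
        ToyNode getByPath(String path) {
            pathLookups++;
            return byPath.get(path);
        }
    }

    // Derives the parent path, e.g. "/a/b" -> "/a" and "/a" -> "/".
    static String parentPath(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        ToyNode root = store.add("/", null);
        ToyNode a = store.add("/a", root);
        ToyNode b = store.add("/a/b", a);

        ToyNode viaReference = b.parent;                       // no store access
        ToyNode viaPath = store.getByPath(parentPath(b.path)); // one store lookup

        System.out.println(viaReference == viaPath); // prints "true"
        System.out.println(store.pathLookups);       // prints "1"
    }
}
```

Both routes reach the same parent, but only the path-based one goes back to the store. In a real repository the cost of that step depends on caching and on the storage backend, which is what the two benchmark runs above vary.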

 

> Use javax.jcr.Item.getParent() when resolving parent JCR node in JcrResourceProvider#getParent
> ----------------------------------------------------------------------------------------------
>
>                 Key: SLING-10011
>                 URL: https://issues.apache.org/jira/browse/SLING-10011
>             Project: Sling
>          Issue Type: Improvement
>          Components: JCR
>    Affects Versions: JCR Resource 3.0.22
>            Reporter: Miroslav Smiljanic
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently [JcrResourceProvider.getParent|https://github.com/apache/sling-org-apache-sling-jcr-resource/blob/org.apache.sling.jcr.resource-3.0.22/src/main/java/org/apache/sling/jcr/resource/internal/helper/jcr/JcrResourceProvider.java#L361] uses JcrItemResourceFactory.getItemOrNull(String path), which eventually uses the JCR session to retrieve the parent node by absolute path.
> I propose using javax.jcr.Item.getParent() instead.
> The reasoning is to take advantage of potential improvements in the JCR implementation that would retrieve the whole subtree for a given node. That could be configured, for example, by node type or node path.
> {noformat}
>     root
>      |
>      a 
>    /   \
>   b     c    
> {noformat}
> If node 'a' in the picture above matches the desired configuration, then the code below would return the whole subtree.
> {code:java}
> // Session.getNode expects an absolute path
> Node a = jcrSession.getNode("/a");
> {code}
> That further means the retrieved subtree can be traversed in memory, without the need to communicate with the JCR repository storage.
> (!) That is particularly important when remote (cloud) storage is used for the repository in the JCR implementation, because tree traversal can then be done without additional network roundtrips.
> {code:java}
> //JCR tree traversal happens in memory
> Node b = a.getNode("b");
> Node c = a.getNode("c");
> {code}
> Going from child to parent is resolved in memory as well (the proposal relates to this fact).
> {code:java}
> // Parent resolution happens in memory; isSame() compares repository identity, not Java object identity
> assert b.getParent().isSame(c.getParent());
> {code}
> Jackrabbit Oak supports node bundling for configured node types in the document node store:
>  [http://jackrabbit.apache.org/oak/docs/nodestore/document/node-bundling.html]
> I am also currently experimenting with support for node bundling/aggregation for an arbitrary node store ([NodeDelegateFullyLoaded|https://github.com/smiroslav/jackrabbit-oak/blob/ppnextgen_newstore/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/delegate/NodeDelegateFullyLoaded.java], [FullyLoadedTree|https://github.com/smiroslav/jackrabbit-oak/blob/ppnextgen_newstore/oak-core/src/main/java/org/apache/jackrabbit/oak/core/FullyLoadedTree.java]).


