Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2014/03/12 11:27:44 UTC
[jira] [Comment Edited] (OAK-1465) performance degradation with growing index size on Oak-Mongo
[ https://issues.apache.org/jira/browse/OAK-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931613#comment-13931613 ]
Alex Parvulescu edited comment on OAK-1465 at 3/12/14 10:26 AM:
----------------------------------------------------------------
This is what the property index updates look like [0]. Each save operation triggers 2 index updates (_before_ and _after_ are the index keys):
- one node type update (oak:Unstructured)
- one property update (here the property has the same value as the node name)
My profiling session shows a lot of cache misses in DocumentNodeStore#getNode, and given the high frequency of small commits I don't see any code tweaks that I could make to speed up this test.
It would be interesting to print the cache stats after the tests. I wanted to at least see them, but I found it ridiculously hard to get a reference to that stats object.
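To illustrate the kind of output meant here, below is a minimal sketch of a cache that counts hits and misses and can report them after a test run. It is an illustration only: CountingCache, its LRU eviction and the report() format are hypothetical and are not Oak's actual cache or stats object.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a node cache; not Oak's actual cache class. */
class CountingCache<K, V> {
    private final Map<K, V> entries;
    private long hits;
    private long misses;

    CountingCache(final int maxEntries) {
        // Access-ordered map that evicts the least recently used entry
        // once the size limit is exceeded.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, calling the loader and counting a miss if absent. */
    V get(K key, Function<K, V> loader) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key);
        entries.put(key, value);
        return value;
    }

    long hits() { return hits; }

    long misses() { return misses; }

    /** Fraction of lookups served from the cache (1.0 if there were no lookups). */
    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 1.0 : (double) hits / total;
    }

    /** One-line summary suitable for printing at the end of a benchmark. */
    String report() {
        return "hits=" + hits + ", misses=" + misses + ", hitRate=" + hitRate();
    }
}
```

A benchmark could call report() once in its teardown step. With many small commits touching fresh paths, a low hit rate in such a report would match what the profiler shows.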
I'm un-assigning myself from this issue, but I'm still open to any ideas for improvement, so feel free to point to anything I might have missed in the indexing code.
A small thing I've noticed is that NodeBuilder#getChildNodeNames, in the case of the DocumentNodeState, uses the default AbstractNodeState implementation, which simply calls #getChildNodeEntries and then extracts the names. I did not see heavy usage of this method (I ran into it in the IndexUpdate#collectIndexEditors method), so I don't think it is very important to provide a more efficient implementation.
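The pattern described above can be sketched as follows. All types here (SketchNodeState, SketchDocumentNodeState, NamesOnlyNodeState) are hypothetical simplifications, not Oak's real classes. The childStatesLoaded counter stands in for whatever it costs to build a child entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in; not Oak's real AbstractNodeState. */
abstract class SketchNodeState {
    /** Materializes every child as a (name, state) pair. */
    abstract Map<String, SketchNodeState> getChildNodeEntries();

    /** Default behaviour: build all entries, then keep only the names. */
    List<String> getChildNodeNames() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, SketchNodeState> e : getChildNodeEntries().entrySet()) {
            names.add(e.getKey());
        }
        return names;
    }
}

/** Relies on the inherited default, so listing names loads every child state. */
class SketchDocumentNodeState extends SketchNodeState {
    final List<String> childNames;
    int childStatesLoaded = 0; // counts simulated child loads

    SketchDocumentNodeState(List<String> childNames) {
        this.childNames = childNames;
    }

    @Override
    Map<String, SketchNodeState> getChildNodeEntries() {
        Map<String, SketchNodeState> entries = new LinkedHashMap<>();
        for (String name : childNames) {
            childStatesLoaded++; // stands in for reading a child document
            entries.put(name, new SketchDocumentNodeState(new ArrayList<String>()));
        }
        return entries;
    }
}

/** Possible optimization: answer from the known names without loading children. */
class NamesOnlyNodeState extends SketchDocumentNodeState {
    NamesOnlyNodeState(List<String> childNames) {
        super(childNames);
    }

    @Override
    List<String> getChildNodeNames() {
        return new ArrayList<>(childNames);
    }
}
```

Whether such an override pays off depends on how often the method is called. With the light usage observed here, the default is probably good enough.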
[0]
{code}
update on /test19b6d919/testNode/level1_49/217f0ea5-190c-4c56-8b7d-c4b180c670a1
before []
after [217f0ea5-190c-4c56-8b7d-c4b180c670a1]
update on /test19b6d919/testNode/level1_49/217f0ea5-190c-4c56-8b7d-c4b180c670a1
before []
after [oak%3AUnstructured]
{code}
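As a rough illustration of the log above, here is a sketch that produces the same two updates for one newly added node. IndexUpdateSketch and its methods are hypothetical. Using java.net.URLEncoder for the key encoding is an assumption that happens to reproduce the "oak%3AUnstructured" key seen in the log; it is not necessarily what the Oak index code does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the two index updates triggered by one save. */
class IndexUpdateSketch {

    /** Percent-encodes a property value for use as an index key (assumed encoding). */
    static String encodeKey(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Builds the encoded key set for the given property values. */
    static Set<String> keys(String... values) {
        Set<String> result = new TreeSet<>();
        for (String v : values) {
            result.add(encodeKey(v));
        }
        return result;
    }

    /** Formats one update in the same shape as the log excerpt. */
    static String describe(String path, Set<String> before, Set<String> after) {
        return "update on " + path + "\nbefore " + before + "\nafter " + after;
    }

    /** A new node has no keys before the save, so both updates start from []. */
    static List<String> updatesForNewNode(String path, String propertyValue, String nodeType) {
        Set<String> none = Collections.emptySet();
        List<String> updates = new ArrayList<>();
        updates.add(describe(path, none, keys(propertyValue))); // property index
        updates.add(describe(path, none, keys(nodeType)));      // node type index
        return updates;
    }
}
```

Calling updatesForNewNode with the node path, the node name as property value and "oak:Unstructured" as node type yields the two entries shown in [0].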
was (Author: alex.parvulescu):
This is what the property index updates look like [0]: each save operation triggers 2 index updates, one node type and one property type.
My profiling session shows a lot of cache misses in DocumentNodeStore#getNode, and given the high frequency of small commits I don't see any code tweaks that I could make to speed up this test.
It would be interesting to print the cache stats after the tests. I wanted to at least see them, but I found it ridiculously hard to get a reference to that stats object.
I'm un-assigning myself from this issue, but I'm still open to any ideas for improvement, so feel free to point to anything I might have missed in the indexing code.
A small thing I've noticed is that NodeBuilder#getChildNodeNames, in the case of the DocumentNodeState, uses the default AbstractNodeState implementation, which simply calls #getChildNodeEntries and then extracts the names. I did not see heavy usage of this method (I ran into it in the IndexUpdate#collectIndexEditors method), so I don't think it is very important to provide a more efficient implementation.
[0]
{code}
update on /test19b6d919/testNode/level1_49/217f0ea5-190c-4c56-8b7d-c4b180c670a1
before []
after [217f0ea5-190c-4c56-8b7d-c4b180c670a1]
update on /test19b6d919/testNode/level1_49/217f0ea5-190c-4c56-8b7d-c4b180c670a1
before []
after [oak%3AUnstructured]
{code}
> performance degradation with growing index size on Oak-Mongo
> ------------------------------------------------------------
>
> Key: OAK-1465
> URL: https://issues.apache.org/jira/browse/OAK-1465
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: mongomk
> Affects Versions: 0.17.1
> Reporter: Stefan Egli
> Assignee: Alex Parvulescu
> Priority: Blocker
> Fix For: 0.19
>
> Attachments: CreateManyIndexedNodesTest.java
>
>
> Tested with an oak-snapshot of Monday Feb 24, 10AM EST.
> Noticed that as the number of indexed nodes grows (e.g. with respect to a particular property), adding nodes becomes slower and slower.
> Will attach an oak-run benchmark to underline this. Basically the scenario where this occurred was:
> * have a number of "level 1" nodes (eg 100)
> * under those "level 1" nodes, add a growing list of children, each with a property that is indexed (ie that index is actually growing and is probably causing the slowdown).