Posted to oak-issues@jackrabbit.apache.org by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/10/11 08:18:00 UTC

[jira] [Commented] (OAK-6806) RDBDocumentStore: version GC does not scale when there are many docs with long paths

    [ https://issues.apache.org/jira/browse/OAK-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199951#comment-16199951 ] 

Julian Reschke commented on OAK-6806:
-------------------------------------

Test results with modified RevisionGCTest on Oracle with scale=1000:

with 660 char path prefix: timeTakenToCollectAndDeleteSplitDocs=2.235 min
with short prefix: timeTakenToCollectAndDeleteSplitDocs=6.371 s


> RDBDocumentStore: version GC does not scale when there are many docs with long paths
> ------------------------------------------------------------------------------------
>
>                 Key: OAK-6806
>                 URL: https://issues.apache.org/jira/browse/OAK-6806
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: rdbmk
>            Reporter: Julian Reschke
>            Assignee: Julian Reschke
>         Attachments: OAK-6806.diff
>
>
> Due to the way the RDB RevisionGC looks for split documents, it doesn't perform well at all in the presence of many documents with long paths.
> The reason is that we currently do not have a column for SDTYPE, and thus use pattern matching on the document IDs instead. However, once a document has a long path in the document store, it appears to the GC as a candidate split document, and thus is always read upon GC.
>  
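
The over-selection described above can be illustrated with a small sketch. It is not the RDBDocumentStore code. It assumes (this message does not confirm it) that ordinary document IDs look like "<depth>:/<path>", that split documents carry a ":p" marker after the depth, and that long-path documents carry a ":h<hash>" marker instead. The class name, method names, and sample IDs are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: shows why an ID pattern that only excludes
// "ordinary" IDs also selects long-path documents as candidates.
final class SplitDocCandidateSketch {

    // Assumed shape of an ordinary document ID: "<depth>:/<path>".
    private static final Pattern REGULAR_ID = Pattern.compile("\\d+:/.*");

    // Assumed shape of a split document ID: "<depth>:p<path>/...".
    private static final Pattern SPLIT_ID = Pattern.compile("\\d+:p.*");

    // Pattern-based candidate test: everything that is not an ordinary ID.
    static boolean isCandidateByIdPattern(String id) {
        return !REGULAR_ID.matcher(id).matches();
    }

    // What the GC actually wants to know (an SDTYPE column would answer
    // this directly; here it is approximated via the assumed ":p" marker).
    static boolean isSplitDocument(String id) {
        return SPLIT_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        List<String> ids = List.of(
                "1:/content",                // ordinary document
                "4:p/content/r15f0a-0-1/0",  // split document (assumed format)
                "12:h0123abcd/leaf");        // long-path document (assumed format)
        for (String id : ids) {
            System.out.println(id
                    + " candidate=" + isCandidateByIdPattern(id)
                    + " split=" + isSplitDocument(id));
        }
    }
}
```

Under these assumptions the long-path ID is reported as a candidate although it is not a split document, so every such document would have to be read and then discarded on each GC run.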



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)