Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2017/03/08 14:05:38 UTC

[jira] [Comment Edited] (OAK-3878) Avoid caching of NodeDocument while iterating in BlobReferenceIterator

    [ https://issues.apache.org/jira/browse/OAK-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901297#comment-15901297 ] 

Chetan Mehrotra edited comment on OAK-3878 at 3/8/17 2:04 PM:
--------------------------------------------------------------

bq. This increases complexity and e.g. makes testing more difficult.

Ack. However, for best performance it would be better to expose the underlying Iterator instead of doing unnecessary pagination, so maybe we expose an Iterable. Especially in a batch operation like this, using the driver API directly would allow better utilization of the persistent store's capabilities.

In this case the implementation is very simple: a single query is invoked and its iterator is used. For RDB the result set can be wrapped directly, so the complexity is much lower. Apart from RevisionGC and LastRevRecovery, this is the only other batch operation being performed.
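
For illustration, a minimal Java sketch of the idea above: a single query whose iterator is handed to the caller as a closeable Iterable, rather than repeated paged calls. The names StreamingQuerySketch, CloseableIterable and wrap are hypothetical stand-ins, not Oak or driver APIs, and a plain list iterator plays the role of the cursor or result set.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these types are Oak or database driver APIs.
public class StreamingQuerySketch {

    /** An Iterable that owns a store-side resource (cursor, result set) and must be closed. */
    interface CloseableIterable<T> extends Iterable<T>, Closeable {
        @Override
        void close(); // narrowed: no checked exception in this sketch
    }

    /** Exposes one underlying iterator directly: no paging, no intermediate lists. */
    static <T> CloseableIterable<T> wrap(final Iterator<T> cursor, final Runnable onClose) {
        return new CloseableIterable<T>() {
            @Override
            public Iterator<T> iterator() {
                // Single pass only, like a database cursor.
                return cursor;
            }

            @Override
            public void close() {
                // Release the underlying resource (assumed to be supplied by the caller).
                onClose.run();
            }
        };
    }

    public static void main(String[] args) {
        // The list stands in for the result of one store query.
        List<String> docs = Arrays.asList("/a", "/b", "/c");
        final boolean[] closed = {false};
        try (CloseableIterable<String> it = wrap(docs.iterator(), () -> closed[0] = true)) {
            for (String id : it) {
                System.out.println(id);
            }
        }
        System.out.println("closed=" + closed[0]);
    }
}
```

The try-with-resources block is the part that matters for a real store: the caller consumes the results lazily and the cursor or result set is released even if iteration fails part way.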


> Avoid caching of NodeDocument while iterating in BlobReferenceIterator
> ----------------------------------------------------------------------
>
>                 Key: OAK-3878
>                 URL: https://issues.apache.org/jira/browse/OAK-3878
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: documentmk
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Minor
>             Fix For: 1.8
>
>
> {{BlobReferenceIterator}} in DocumentMK uses the {{DocumentStore}} API to query NodeDocuments. This causes all those NodeDocuments to be added to the cache in DocumentStore. As a result, while blob GC is running, the cache is less effective because of the associated churn.
> As these NodeDocuments are only required for the blob GC logic, and it is not expected that they will be read again soon, it would be better to skip caching them within DocumentStore.
> A similar requirement exists in the VersionGC logic, but there we use a direct store-based API which does not add such documents to the cache.
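
The cache churn described above can be shown with a toy model. This is only a sketch under stated assumptions: CacheBypassSketch, ToyStore, find, scanUncached and isCached are hypothetical names, and a small LRU map stands in for the DocumentStore cache. It is not how DocumentStore is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a toy store, not the Oak DocumentStore.
public class CacheBypassSketch {

    static class ToyStore {
        private final Map<String, String> backend = new LinkedHashMap<>();
        private final Map<String, String> cache;

        ToyStore(final int cacheSize) {
            // Access-ordered LinkedHashMap used as a tiny LRU cache.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > cacheSize;
                }
            };
        }

        void put(String id, String doc) {
            backend.put(id, doc);
        }

        /** Regular read path: a miss loads from the backend and populates the cache. */
        String find(String id) {
            String doc = cache.get(id);
            if (doc == null) {
                doc = backend.get(id);
                if (doc != null) {
                    cache.put(id, doc);
                }
            }
            return doc;
        }

        /** Batch read path: walks the backend without touching the cache. */
        Iterable<String> scanUncached() {
            return backend.values();
        }

        boolean isCached(String id) {
            return cache.containsKey(id);
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore(2);
        for (int i = 1; i <= 5; i++) {
            store.put("d" + i, "doc" + i);
        }
        store.find("d1"); // "hot" documents used by regular traffic
        store.find("d2");

        for (String doc : store.scanUncached()) {
            // GC-style batch work would go here.
        }
        System.out.println("d1 cached after uncached scan: " + store.isCached("d1"));

        store.find("d3"); // the same batch through the caching path evicts hot entries
        store.find("d4");
        store.find("d5");
        System.out.println("d1 cached after cached scan: " + store.isCached("d1"));
    }
}
```

With a cache of two entries, the uncached scan leaves the hot documents in place, while reading the same documents through find pushes them out.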


