Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2017/07/05 07:40:00 UTC
[jira] [Commented] (OAK-3349) Partial compaction
[ https://issues.apache.org/jira/browse/OAK-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074356#comment-16074356 ]
Michael Dürig commented on OAK-3349:
------------------------------------
A further optimisation we could do (possibly as an afterthought once the above is implemented) is to clean up tail-compacted generations once no {{w}} segments reference them any more. This is the case after a full compaction followed by at least {{n}} further tail compactions, where {{n}} is the number of retained generations. One open question is how best to detect that this criterion is fulfilled.
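One possible way to detect the condition, as a minimal sketch only: count tail compactions since the last full compaction and compare the count against {{n}}. The class {{TailCleanupTracker}} and all of its methods are hypothetical names invented for illustration and are not part of Oak. The sketch only tracks compaction events and assumes that the event count is a sufficient proxy; it does not inspect segment references.

```java
// Hypothetical sketch: none of these names exist in the Oak code base.
final class TailCleanupTracker {
    private final int retainedGenerations;   // n: number of retained generations
    private boolean fullCompactionSeen;      // true once a full compaction has completed
    private int tailCompactionsSinceFull;    // tail compactions after the last full one

    TailCleanupTracker(int retainedGenerations) {
        if (retainedGenerations < 1) {
            throw new IllegalArgumentException("retainedGenerations must be >= 1");
        }
        this.retainedGenerations = retainedGenerations;
    }

    // Call after a full compaction has completed successfully.
    void onFullCompaction() {
        fullCompactionSeen = true;
        tailCompactionsSinceFull = 0;
    }

    // Call after a tail compaction has completed successfully.
    void onTailCompaction() {
        if (fullCompactionSeen) {
            tailCompactionsSinceFull++;
        }
    }

    // True after a full compaction followed by at least n tail compactions.
    boolean canCleanOldTailGenerations() {
        return fullCompactionSeen && tailCompactionsSinceFull >= retainedGenerations;
    }

    public static void main(String[] args) {
        TailCleanupTracker tracker = new TailCleanupTracker(2);
        tracker.onTailCompaction();   // no full compaction yet, so not counted
        tracker.onFullCompaction();
        tracker.onTailCompaction();
        System.out.println(tracker.canCleanOldTailGenerations());   // prints "false"
        tracker.onTailCompaction();
        System.out.println(tracker.canCleanOldTailGenerations());   // prints "true"
    }
}
```

Whether such a counter would need to survive restarts, or could instead be derived from the generations present on disk, is left open here.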
> Partial compaction
> ------------------
>
> Key: OAK-3349
> URL: https://issues.apache.org/jira/browse/OAK-3349
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: segment-tar
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: compaction, gc, scalability
> Fix For: 1.8, 1.7.4
>
> Attachments: compaction-time.png, cycle-count.png, post-gc-size.png
>
>
> On big repositories compaction can take quite a while to run, as it needs to create a full deep copy of the current root node state. In such cases it could be beneficial to compact the repository partially, thus splitting full compaction over multiple cycles.
> Partial compaction would run compaction on a sub-tree just as we now run it on the full tree. Afterwards it would create a new root node state by referencing the previous root node state and replacing said sub-tree with the compacted one.
> Todo: Assess feasibility and impact, implement prototype.
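
The sub-tree replacement described in the issue can be illustrated on a toy immutable tree. This is a sketch under stated assumptions, not Oak's implementation: {{Node}} and {{PartialCompaction}} are hypothetical classes, a plain deep copy stands in for compaction, and a generation number on each node stands in for the segment generation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Node and PartialCompaction are illustrative, not Oak classes.
final class Node {
    final int generation;               // generation in which this node was written
    final Map<String, Node> children;   // immutable view of the child nodes

    Node(int generation, Map<String, Node> children) {
        this.generation = generation;
        this.children = Collections.unmodifiableMap(new LinkedHashMap<>(children));
    }
}

final class PartialCompaction {
    // Stand-in for compaction: deep copy of a sub-tree into a new generation.
    static Node compact(Node node, int newGeneration) {
        Map<String, Node> copied = new LinkedHashMap<>();
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            copied.put(e.getKey(), compact(e.getValue(), newGeneration));
        }
        return new Node(newGeneration, copied);
    }

    // Compacts the sub-tree at `path` and returns a new root. Only nodes on the
    // path are rewritten; every other sub-tree is shared with the previous root.
    static Node compactSubtree(Node root, List<String> path, int newGeneration) {
        if (path.isEmpty()) {
            return compact(root, newGeneration);
        }
        String name = path.get(0);
        Node child = root.children.get(name);
        if (child == null) {
            throw new IllegalArgumentException("No such child: " + name);
        }
        Map<String, Node> children = new LinkedHashMap<>(root.children);
        children.put(name, compactSubtree(child, path.subList(1, path.size()), newGeneration));
        return new Node(newGeneration, children);
    }
}
```

In this sketch the previous root stays valid and unchanged, and siblings of the compacted sub-tree are referenced from the new root rather than copied.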
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)