Posted to oak-issues@jackrabbit.apache.org by "Francesco Mari (JIRA)" <ji...@apache.org> on 2018/11/22 10:11:00 UTC

[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction

    [ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695748#comment-16695748 ] 

Francesco Mari commented on OAK-7914:
-------------------------------------

[~mduerig], what do you think about this?

> Cleanup updates the gc.log after a failed compaction
> ----------------------------------------------------
>
>                 Key: OAK-7914
>                 URL: https://issues.apache.org/jira/browse/OAK-7914
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Francesco Mari
>            Priority: Major
>             Fix For: 1.10
>
>
> The {{gc.log}} is always updated during the cleanup phase, regardless of the result of the compaction phase. This might cause a scenario similar to the following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of content added to the repository is negligible. For the sake of simplicity, let's say that the size of the repository hasn't changed. The following happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} reports that the latest known size of the repository is 100GB, so the repository appears not to have grown and there seems to be nothing worth removing.
> This is in fact a bug, because there are 40GB worth of garbage in the repository, but estimation is not able to see that anymore. The solution seems to be not to update the {{gc.log}} if compaction fails. In other words, {{gc.log}} should contain the size of the *compacted* repository over time, and no more.
> Thanks to [~rma61870@adobe.com] for reporting it.
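
To make the scenario in the quoted description easier to follow, here is a minimal sketch of it as a toy model. It is not Oak's code: the class GcLogSketch, its fields, the 10GB growth threshold, and the previous gc.log entry of 60GB are all assumptions made for illustration. The only things taken from the description are the 100GB repository, the 40GB of garbage, and the idea that estimation compares the current size with the latest size recorded in the gc.log.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the scenario above. Hypothetical, not Oak's actual classes. */
class GcLogSketch {

    // Assumption: estimation allows compaction only if the repository grew
    // by at least this many GB since the latest gc.log entry.
    static final long MIN_GROWTH_GB = 10;

    final List<Long> gcLog = new ArrayList<>(); // recorded repository sizes, in GB
    long repoSizeGb;                            // current repository size, in GB
    final boolean logOnlyAfterSuccess;          // true = proposed behaviour

    GcLogSketch(long repoSizeGb, long lastLoggedSizeGb, boolean logOnlyAfterSuccess) {
        this.repoSizeGb = repoSizeGb;
        this.gcLog.add(lastLoggedSizeGb);
        this.logOnlyAfterSuccess = logOnlyAfterSuccess;
    }

    /** Estimation: compare the current size with the latest gc.log entry. */
    boolean estimate() {
        long lastKnownSize = gcLog.get(gcLog.size() - 1);
        return repoSizeGb - lastKnownSize >= MIN_GROWTH_GB;
    }

    /** One garbage collection run: estimation, compaction, cleanup. */
    void runGc(boolean compactionSucceeds, long garbageGb) {
        if (!estimate()) {
            return; // estimation skipped compaction
        }
        if (compactionSucceeds) {
            repoSizeGb -= garbageGb; // garbage is reclaimed by cleanup
        }
        // On failure, cleanup removes only what compaction itself wrote,
        // so the repository size is unchanged and the garbage remains.
        if (compactionSucceeds || !logOnlyAfterSuccess) {
            gcLog.add(repoSizeGb);
        }
    }

    public static void main(String[] args) {
        GcLogSketch current = new GcLogSketch(100, 60, false);
        current.runGc(false, 40); // compaction fails, gc.log still gets 100
        System.out.println("current behaviour, next estimation: " + current.estimate());

        GcLogSketch proposed = new GcLogSketch(100, 60, true);
        proposed.runGc(false, 40); // compaction fails, gc.log is left alone
        System.out.println("proposed behaviour, next estimation: " + proposed.estimate());
    }
}
```

In this model, recording 100GB after the failed compaction makes the next estimation see no growth, while leaving the gc.log untouched keeps the 40GB difference visible, which is the behaviour the description asks for.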



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)