Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2016/12/22 14:39:58 UTC

[jira] [Updated] (OAK-5278) Improved compaction estimator

     [ https://issues.apache.org/jira/browse/OAK-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dürig updated OAK-5278:
-------------------------------
    Labels: gc operations  (was: gc)

> Improved compaction estimator
> -----------------------------
>
>                 Key: OAK-5278
>                 URL: https://issues.apache.org/jira/browse/OAK-5278
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>              Labels: gc, operations
>             Fix For: 1.8
>
>
> OAK-4293 introduced a new approach for estimating whether we actually want to run or skip a gc cycle. That approach is based purely on the absolute growth of the repository's on-disk footprint.
> I think this can be further refined: with the {{GCJournal}} we can extrapolate the amount of garbage at a given point in time from the history of previous gc cycles. E.g. let {{S_n}} be the size of the repository and {{G_n}} the percentage of garbage right before the {{n}}-th gc cycle. We can then linearly extrapolate the garbage {{G_n+1}} for the {{n+1}}-th gc cycle along the repository sizes:
> {code}
> G_n+1 = G_n * (S_n+1 - S_n)/(S_n - S_n-1)
> {code}
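
The extrapolation above can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not Oak code: the class and method names are hypothetical, the sizes are passed in as plain numbers rather than read from the {{GCJournal}}, and the index is assumed to be {{n}} throughout. The sketch also assumes the repository size changed between the two previous cycles, since the ratio is undefined otherwise.

```java
// Hypothetical helper, not part of Oak: linear extrapolation of the garbage
// percentage along the repository sizes, as described in the issue.
final class GarbageExtrapolation {

    private GarbageExtrapolation() { }

    /**
     * @param garbageN  G_n, percentage of garbage right before the n-th gc cycle
     * @param sizePrev  S_n-1, repository size before the (n-1)-th gc cycle
     * @param sizeN     S_n, repository size before the n-th gc cycle
     * @param sizeNext  S_n+1, repository size before the (n+1)-th gc cycle
     * @return          extrapolated G_n+1
     */
    static double extrapolateGarbage(double garbageN, long sizePrev, long sizeN, long sizeNext) {
        long previousGrowth = sizeN - sizePrev;
        if (previousGrowth == 0) {
            // Assumption: without growth between the previous cycles there is
            // no slope to extrapolate along, so the caller has to fall back.
            throw new IllegalArgumentException("S_n equals S_n-1, cannot extrapolate");
        }
        long currentGrowth = sizeNext - sizeN;
        return garbageN * ((double) currentGrowth / (double) previousGrowth);
    }

    public static void main(String[] args) {
        // 20% garbage before cycle n; the repository grew by 100 units before
        // cycle n and by 50 units since, so the estimate is halved.
        System.out.println(extrapolateGarbage(20.0, 100L, 200L, 250L)); // prints 10.0
    }
}
```

A gc cycle could then be skipped whenever the extrapolated value stays below some configured threshold; what that threshold should be is left open here.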



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)