Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2015/11/09 15:42:11 UTC
[jira] [Created] (OAK-3603) Evaluate skipping cleanup of a subset of tar files
Alex Parvulescu created OAK-3603:
------------------------------------
Summary: Evaluate skipping cleanup of a subset of tar files
Key: OAK-3603
URL: https://issues.apache.org/jira/browse/OAK-3603
Project: Jackrabbit Oak
Issue Type: Improvement
Components: segmentmk
Reporter: Alex Parvulescu
Assignee: Alex Parvulescu
Given that tar readers are immutable (we only create a new generation of a reader once it reaches a certain threshold of garbage), we could come up with a heuristic for skipping cleanup entirely on consecutive cleanup calls that are based on the same referenced id set (provided we can make this set more stable, see OAK-2849).
Example: for a specific input set, a cleanup call on a tar reader might decide that there is not enough garbage (this involves some IO, as it reads through all existing entries). If the following cleanup cycle has exactly the same input, it doesn't make sense to recheck the tar file: we already know cleanup can be skipped. Moreover, we can skip the older tar files too, as their input would not change either. The gains increase with the number of tar files.
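A rough sketch of the idea, to make the proposal concrete. Everything here is hypothetical and is not the existing segmentmk code: the class names, the garbage-ratio threshold, and the per-file "skippable" cache are assumptions made for illustration only. It remembers which tar files were found to hold too little garbage for a given referenced id set, and skips rescanning them while that set stays the same.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch of the skip heuristic; none of these names are Oak API. */
final class CleanupSkipSketch {

    /** Stand-in for an immutable tar reader. */
    static final class TarFile {
        final String name;
        final Set<UUID> entries;
        int scans = 0; // how many times we paid the IO cost of reading all entries

        TarFile(String name, Set<UUID> entries) {
            this.name = name;
            this.entries = Collections.unmodifiableSet(new HashSet<>(entries));
        }

        /** Reads through all entries and reports whether the garbage ratio reaches the threshold. */
        boolean hasEnoughGarbage(Set<UUID> referenced, double threshold) {
            scans++;
            long garbage = entries.stream().filter(id -> !referenced.contains(id)).count();
            return !entries.isEmpty() && (double) garbage / entries.size() >= threshold;
        }
    }

    private final double threshold;           // assumed garbage ratio that triggers a rewrite
    private Set<UUID> lastReferenced = null;  // input of the previous cleanup cycle
    private final Set<String> skippable = new HashSet<>(); // files known to need no cleanup

    CleanupSkipSketch(double threshold) {
        this.threshold = threshold;
    }

    /** Returns the names of the tar files that would get a new generation. */
    List<String> cleanup(List<TarFile> files, Set<UUID> referenced) {
        if (!referenced.equals(lastReferenced)) {
            // The input changed, so earlier "not enough garbage" decisions no longer hold.
            skippable.clear();
            lastReferenced = new HashSet<>(referenced);
        }
        List<String> toRewrite = new ArrayList<>();
        for (TarFile file : files) {
            if (skippable.contains(file.name)) {
                continue; // same input as last time, same answer, no IO needed
            }
            if (file.hasEnoughGarbage(referenced, threshold)) {
                toRewrite.add(file.name);
            } else {
                skippable.add(file.name);
            }
        }
        return toRewrite;
    }
}
```

In this sketch the cache is only useful if the referenced id set really is identical between cycles, which is why stabilizing that set (OAK-2849) matters. A real implementation would also have to decide how to compare large id sets cheaply, for example by comparing a digest instead of the full set; that part is left open here.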
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)