Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2015/03/31 18:20:55 UTC

[jira] [Commented] (OAK-2713) High memory usage of CompactionMap

    [ https://issues.apache.org/jira/browse/OAK-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388775#comment-14388775 ] 

Michael Dürig commented on OAK-2713:
------------------------------------

Looking at {{CompactionMap}} there is nothing much to squeeze out. Given an id {{a}} compacted to {{b}}, then to {{c}} and finally to {{d}}, this is currently stored as a series of mappings {{a -> b, b -> c, c -> d}}. Each id is asymptotically stored twice, so we could roughly halve memory consumption through value sharing of ids.
On the implementation side this would need a complete rewrite, possibly switching to a representative-based implementation of the equivalence relation (i.e. {{a -> a, b -> a, c -> a, d -> a}}), as the pointer arithmetic would be simpler here.
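
To make the representative-based idea concrete, here is a minimal sketch. It is not the existing {{CompactionMap}} code: the class name {{RepresentativeMap}}, its methods, and the use of plain strings as ids are all assumptions made for illustration, and it assumes ids are compacted strictly in sequence.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the Oak CompactionMap API. */
class RepresentativeMap {
    // Maps every known id to the representative of its equivalence class.
    private final Map<String, String> representative = new HashMap<>();

    /** Records that 'before' was compacted to 'after'. */
    void compacted(String before, String after) {
        // An id seen for the first time becomes its own representative.
        String rep = representative.getOrDefault(before, before);
        representative.put(before, rep);
        representative.put(after, rep);
    }

    /** Returns the representative of 'id', or null if 'id' is unknown. */
    String representativeOf(String id) {
        return representative.get(id);
    }

    /** True if both ids are known and belong to the same class. */
    boolean sameRecord(String x, String y) {
        String rx = representative.get(x);
        return rx != null && rx.equals(representative.get(y));
    }
}
```

Feeding in {{a -> b}}, {{b -> c}}, {{c -> d}} leaves the map holding {{a -> a, b -> a, c -> a, d -> a}}, so an equivalence check is two lookups instead of a walk along the chain.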

However, given that this at most halves memory consumption and that we have a linear demand for more memory on each compaction cycle, this would only push the problem out from the {{n}}-th to the {{2n}}-th cycle.
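
The arithmetic behind this can be sketched as follows, assuming every compaction cycle adds the same amount of memory to the map. The class and method names are hypothetical, and the figures in the note below are made up for illustration rather than measured.

```java
/** Hypothetical back-of-the-envelope estimate; not part of Oak. */
class CompactionMemoryEstimate {
    /**
     * Number of compaction cycles whose mappings fit into the budget,
     * assuming each cycle adds a constant number of bytes.
     */
    static long cyclesUntilExhausted(long budgetBytes, long bytesPerCycle) {
        if (bytesPerCycle <= 0) {
            throw new IllegalArgumentException("bytesPerCycle must be positive");
        }
        return budgetBytes / bytesPerCycle;
    }
}
```

With an assumed budget of 3200 units and 320 units per cycle the budget lasts 10 cycles; halving the per-cycle cost to 160 units stretches that to 20 cycles, which delays the problem without removing it.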

From this I conclude we have two options here: a) either forget mappings (i.e. make the compaction map more cache-like) or b) persist the compaction map.

AFAICS option a) would mostly be trading CPU for memory, with a certain risk of running into {{SegmentNotFoundException}}s when running 'really old diffs'. Option b) OTOH would be on the safe side here but would require some additional disk space that could only be reclaimed by offline compaction.
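
As a rough sketch of what option a) could look like, the following bounds the number of mappings and forgets the least recently used one once the bound is exceeded. The class name {{BoundedCompactionCache}}, the string ids and the LRU eviction policy are assumptions for illustration; nothing here is taken from the Oak code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache-like compaction map; not the Oak implementation. */
class BoundedCompactionCache {
    private final Map<String, String> mappings;

    BoundedCompactionCache(final int maxEntries) {
        // accessOrder = true turns the LinkedHashMap into an LRU structure.
        this.mappings = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Forget the least recently used mapping once over the bound.
                return size() > maxEntries;
            }
        };
    }

    /** Records that 'before' was compacted to 'after'. */
    void put(String before, String after) {
        mappings.put(before, after);
    }

    /** Returns the compacted id, or null if the mapping is unknown or was forgotten. */
    String get(String before) {
        return mappings.get(before);
    }

    int size() {
        return mappings.size();
    }
}
```

A {{null}} result no longer means "never compacted", only "not known any more", so callers would presumably have to fall back to a more expensive comparison in that case. That is the CPU-for-memory trade mentioned above.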

cc [~alexparvulescu], [~mmarth]


> High memory usage of CompactionMap
> ----------------------------------
>
>                 Key: OAK-2713
>                 URL: https://issues.apache.org/jira/browse/OAK-2713
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc
>             Fix For: 1.3.0
>
>
> In environments with a lot of volatile content the {{CompactionMap}} can end up eating a lot of memory. From {{CompactionStrategyMBean#getCompactionMapStats}}:
> {noformat}
> [Estimated Weight: 317,5 MB, Records: 39500094, Segments: 36698], 
> [Estimated Weight: 316,4 MB, Records: 39374593, Segments: 36660], 
> [Estimated Weight: 315,4 MB, Records: 39253205, Segments: 36620], 
> [Estimated Weight: 315,1 MB, Records: 39221882, Segments: 36614], 
> [Estimated Weight: 314,9 MB, Records: 39195490, Segments: 36604], 
> [Estimated Weight: 315,0 MB, Records: 39182753, Segments: 36602], 
> [Estimated Weight: 360 B, Records: 0, Segments: 0],
> {noformat}
> This causes compaction to be skipped:
> {noformat}
> 2015-03-30:30.03.2015 02:00:00.038 *INFO* [] [TarMK compaction thread [/foo/bar/crx-quickstart/repository/segmentstore], active since Mon Mar 30 02:00:00 CEST 2015, previous max duration 3854982ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore Not enough available memory 5,5 GB, needed 6,3 GB, last merge delta 1,3 GB, so skipping compaction for now
> {noformat}


