Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2017/10/31 13:18:00 UTC

[jira] [Comment Edited] (OAK-5655) TarMK: Analyse locality of reference

    [ https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226764#comment-16226764 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 1:17 PM:
--------------------------------------------------------------

In another analysis I ran offline compaction on a repository (17.5GB footprint compacting to 564MB, 4M nodes). The process took 20min to complete. Running offline compaction again on the compacted result took just 50sec. While this test is a bit artificial, as the repository consists of completely random content created by {{SegmentCompactionIT}}, it still indicates that the process is thrashing on reads caused by bad locality.
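
For reference, below is a minimal sketch of how such a timed offline compaction run can be set up against a segment store through the Oak API. The exact compaction entry point on {{FileStore}} ({{compactFull()}} vs. an older {{compact()}}) and the GC option names are assumptions here and may differ between versions:

{code:java}
import java.io.File;

import org.apache.jackrabbit.oak.segment.compaction.SegmentGCOptions;
import org.apache.jackrabbit.oak.segment.file.FileStore;
import org.apache.jackrabbit.oak.segment.file.FileStoreBuilder;

public class TimedOfflineCompaction {
    public static void main(String[] args) throws Exception {
        // Path to the segmentstore directory to compact
        File directory = new File(args[0]);
        long start = System.nanoTime();
        try (FileStore fileStore = FileStoreBuilder.fileStoreBuilder(directory)
                .withGCOptions(SegmentGCOptions.defaultGCOptions().setOffline())
                .build()) {
            // Assumed entry point: newer versions expose compactFull(),
            // older ones a plain compact().
            fileStore.compactFull();
        }
        System.out.println("compaction took "
                + (System.nanoTime() - start) / 1_000_000_000L + "s");
    }
}
{code}

Running the same snippet a second time against the now-compacted store corresponds to the second, ~50sec run described above.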

To better understand the connection between repository size and compaction time I ran offline compaction with memory mapped files switched on and off, graphing compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly and {{mmap=on}} is clearly superior to {{mmap=off}}. 
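
For clarity, a sketch of what the {{mmap}} toggle corresponds to on the API side (the surrounding setup is as in the snippet above; with memory mapping disabled, segment reads go through {{RandomAccessFile}} instead, which is what the flight recording below shows):

{code:java}
// Sketch: open the segment store with memory mapping enabled (mmap=on)
// or disabled (mmap=off).
static FileStore openFileStore(File directory, boolean memoryMapped) throws Exception {
    return FileStoreBuilder.fileStoreBuilder(directory)
            .withMemoryMapping(memoryMapped)
            .build();
}
{code}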

To validate the hypothesis that the process is (read) IO bound I took a JMC [flight recording|^offrc.jfr] of an offline compaction of the same repository with {{mmap=false}}. The flight recording shows that the process spends almost 99% of its time in {{java.io.RandomAccessFile.read()}}, with all of these calls originating from segment reads. Furthermore, the segment reads are spread more or less evenly across time and across all 50 tar files.
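
A recording like the attached [^offrc.jfr] can be captured by starting the compacting JVM with the flight recorder flags. The Oracle JDK 8 style options and the oak-run invocation below are just an illustration; the exact flags differ between JDK versions and the jar path is a placeholder:

{noformat}
java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder \
     -XX:StartFlightRecording=settings=profile,filename=offrc.jfr,dumponexit=true \
     -jar oak-run-*.jar compact /path/to/segmentstore
{noformat}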





> TarMK: Analyse locality of reference 
> -------------------------------------
>
>                 Key: OAK-5655
>                 URL: https://issues.apache.org/jira/browse/OAK-5655
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segment-tar
>            Reporter: Michael Dürig
>              Labels: scalability
>             Fix For: 1.8
>
>         Attachments: compaction-time-vs.reposize.png, offrc.jfr, segment-per-path-compacted-nocache.png, segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)