Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2014/09/25 17:21:34 UTC

[jira] [Created] (OAK-2140) Segment Compactor will not compact binaries > 16k

Alex Parvulescu created OAK-2140:
------------------------------------

             Summary: Segment Compactor will not compact binaries > 16k
                 Key: OAK-2140
                 URL: https://issues.apache.org/jira/browse/OAK-2140
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core, segmentmk
            Reporter: Alex Parvulescu


Compaction relies on the SegmentBlob#clone method when a binary is being processed, but it looks like the #clone contract is not fully enforced for streams qualified as 'long values' (>16k, if I read the code correctly).
What happens is that the stream is initially persisted as chunks in a ListRecord. When compaction calls #clone, it gets back the original list of record ids, which are then referenced from the compacted node state [0]. This makes compaction of large binaries ineffective: the bulk segments never move from the location where they were originally created, unless the referencing node is deleted.
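To make the effect concrete, here is a minimal self-contained sketch of the behaviour described above. The class and method names (SegmentBlobSketch, cloneTo, SMALL_LIMIT) are hypothetical and only model the reported logic; they are not the real Oak API:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the reported #clone behaviour: small values are
// re-persisted into the target segment, but "long values" (>16k) just hand
// back the original list of record ids, so the bulk segments stay put.
class SegmentBlobSketch {
    // assumed threshold matching the 16k figure from the report
    static final int SMALL_LIMIT = 16 * 1024;

    // a record id is modelled here as "segmentId:offset"
    final List<String> recordIds;
    final int length;

    SegmentBlobSketch(List<String> recordIds, int length) {
        this.recordIds = Collections.unmodifiableList(recordIds);
        this.length = length;
    }

    // sketch of #clone into a freshly written (compacted) segment
    SegmentBlobSketch cloneTo(String targetSegment) {
        if (length < SMALL_LIMIT) {
            // small value: bytes are rewritten into the target segment
            return new SegmentBlobSketch(
                    Arrays.asList(targetSegment + ":0"), length);
        }
        // long value: the same ListRecord ids are returned, so the
        // compacted node state keeps referencing the old bulk segments
        return new SegmentBlobSketch(recordIds, length);
    }
}
```

Cloning a 20k blob with this sketch returns record ids that still point at the original segment, which is exactly why the old bulk segments can never be reclaimed while the reference exists.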

I think the original design was intended to prevent large binaries from being copied around, but given the repository size problem we have now, it might be a good time to reconsider this approach.


[0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentBlob.java#L75




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)