Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2017/03/08 10:14:38 UTC
[jira] [Updated] (OAK-5910) Reduce copying of data when reading mmapped records

     [ https://issues.apache.org/jira/browse/OAK-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated OAK-5910:
---------------------------------
    Attachment: OAK-5910.patch

Initial patch attached. Javadocs are still missing on the Segment class, but it proves the idea.
FYI [~mduerig], [~frm].

The benchmarks are very flaky (more a reflection of the state of the benchmarks than of the patch itself):

ConcurrentReadWriteTest on trunk (3 runs):

# ConcurrentReadWriteTest          C     min     10%     50%     90%     max       N
Oak-Segment-Tar                    1      55      98     119     145     307     494
Oak-Segment-Tar                    1      75      98     118     137     246     504
Oak-Segment-Tar                    1      45      93     112     132     221     532

ConcurrentReadWriteTest with patch (3 runs):

# ConcurrentReadWriteTest          C     min     10%     50%     90%     max       N
Oak-Segment-Tar                    1      40      97     116     135     252     517
Oak-Segment-Tar                    1      44     100     121     142     242     493
Oak-Segment-Tar                    1      71      92     112     128     256     537

It seems that the patched version looks better over multiple test runs, but even the unpatched version spikes on my machine. The results are therefore borderline useless unless someone can propose a reliable way to run the benchmarks without the spikiness.


> Reduce copying of data when reading mmapped records
> ---------------------------------------------------
>
>                 Key: OAK-5910
>                 URL: https://issues.apache.org/jira/browse/OAK-5910
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>             Fix For: 1.8
>
>         Attachments: OAK-5910.patch
>
>
> The idea is to reduce the amount of extra byte buffers created when reading mmapped records, if possible pushing the ByteBuffer all the way to the consumer.
> For example, reading a String from a Segment right now means first reading the bytes of the record into a byte array, then creating a String with an encoding (which behind the scenes will copy the byte array again and run it through the decoder). An alternative is to call {{decode}} on the Charset and pass in the ByteBuffer, skipping the intermediate operations.
> There are a few cases of this I included in the patch, but there may be others (like the {{SegmentStream}} which needs a full rewrite).
> Interested in what others think of this!
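
To illustrate the two read paths described above, here is a minimal sketch. It is not taken from OAK-5910.patch: the class name RecordStringDecoding, the methods readViaByteArray and readViaDecode, and the offset/length parameters are hypothetical, and a direct ByteBuffer stands in for a memory-mapped segment. UTF-8 is assumed as the record encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RecordStringDecoding {

    // Copying path: the record bytes are first copied out of the buffer
    // into a byte[], and the String constructor then decodes that array.
    public static String readViaByteArray(ByteBuffer segment, int offset, int length) {
        byte[] bytes = new byte[length];
        ByteBuffer view = segment.duplicate(); // leave the caller's position untouched
        view.position(offset);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Direct path: narrow a duplicate of the buffer to the record's range
    // and hand it to Charset.decode, skipping the intermediate byte[].
    public static String readViaDecode(ByteBuffer segment, int offset, int length) {
        ByteBuffer view = segment.duplicate();
        view.limit(offset + length);
        view.position(offset);
        return StandardCharsets.UTF_8.decode(view).toString();
    }

    public static void main(String[] args) {
        // Hypothetical record: two bytes of unrelated data, then the string payload.
        byte[] payload = "h\u00e9llo, Oak".getBytes(StandardCharsets.UTF_8);
        ByteBuffer segment = ByteBuffer.allocateDirect(2 + payload.length);
        segment.put((byte) 0).put((byte) 0).put(payload);
        segment.flip();

        String copied = readViaByteArray(segment, 2, payload.length);
        String direct = readViaDecode(segment, 2, payload.length);
        System.out.println("same result: " + copied.equals(direct));
    }
}
```

Both methods work on a duplicate() of the buffer, so the position and limit of the shared segment buffer are not modified. Whether the direct path is actually faster depends on the decoder and buffer type, so it would still need to be measured.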


