Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2014/06/11 20:01:08 UTC

[jira] [Comment Edited] (ACCUMULO-2827) HeapIterator optimization

    [ https://issues.apache.org/jira/browse/ACCUMULO-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028114#comment-14028114 ] 

Josh Elser edited comment on ACCUMULO-2827 at 6/11/14 5:59 PM:
---------------------------------------------------------------

Results of Accumulo continuous ingest (against 1.5.1 on Hadoop 2.2.0). Tests were run on a single-node machine with 12 physical cores, 64GB of RAM, and 8 drives:

Test:
- Ingest roughly 1 billion entries (set NUM=1000000000)
- Pre-split into 8 tablets
- table.split.threshold=100G (Avoid splits so we can have more entries per tablet)
- table.compaction.major.ratio=4
- table.file.max=10
- tserver.compaction.major.concurrent.max=9 (enough to have all compactions running concurrently)
- tserver.compaction.major.thread.files.open.max=20 (all files open at once during majc)
- tserver.memory.maps.max=4G
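
For reference, the table setup above could be applied from the Accumulo shell roughly as follows. This is a sketch, not the exact commands used: the split points are illustrative values chosen only to yield 8 tablets, and the tserver.* properties may instead need to go in accumulo-site.xml and require a tablet server restart depending on the version.
{noformat}
createtable ci
addsplits -t ci 1000000000000000 2000000000000000 3000000000000000 4000000000000000 5000000000000000 6000000000000000 7000000000000000
config -t ci -s table.split.threshold=100G
config -t ci -s table.compaction.major.ratio=4
config -t ci -s table.file.max=10
config -s tserver.compaction.major.concurrent.max=9
config -s tserver.compaction.major.thread.files.open.max=20
config -s tserver.memory.maps.max=4G
{noformat}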

We used only one ingester instance (so a single BatchWriter thread).
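
For context, the continuous ingest client in 1.5 is normally driven by the scripts under test/system/continuous; a rough sketch of the setup, assuming the stock scripts and file names (other variables left at their defaults):
{noformat}
cd $ACCUMULO_HOME/test/system/continuous
cp continuous-env.sh.example continuous-env.sh
# in continuous-env.sh: TABLE=ci, NUM=1000000000; list a single host as the ingester
./start-ingest.sh
{noformat}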

Results:
After the ingest completed, we triggered a full major compaction (majc) and timed how long it took to complete:
{noformat}
time accumulo shell -u root -p <secret> -e 'compact -t ci -w'
{noformat}

1.5.1 old heap iterator
{noformat}
real    21m48.785s
user    0m6.014s
sys     0m0.475s
{noformat}

1.5.1 new heap iterator
{noformat}
real    20m45.002s
user    0m5.693s
sys     0m0.456s
{noformat}
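
In other words, the new heap iterator finished the full compaction roughly 64 seconds faster in wall-clock time (1245.0s vs. 1308.8s), about a 4.9% improvement on this workload.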


> HeapIterator optimization
> -------------------------
>
>                 Key: ACCUMULO-2827
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2827
>             Project: Accumulo
>          Issue Type: Improvement
>    Affects Versions: 1.5.1, 1.6.0
>            Reporter: Jonathan Park
>            Assignee: Jonathan Park
>            Priority: Minor
>             Fix For: 1.5.2, 1.6.1, 1.7.0
>
>         Attachments: ACCUMULO-2827.0.patch.txt, accumulo-2827.raw_data, new_heapiter.png, old_heapiter.png, together.png
>
>
> We've been running a few performance tests of our iterator stack and noticed a decent amount of time spent in the HeapIterator, specifically in adding and removing iterators from the heap.
> This may not be a general enough optimization, but we thought we'd see what people thought. Our assumption is that it's more probable than not that the current "top iterator" will supply the next value in the iteration. The current implementation takes the opposite assumption by always removing the minimum iterator and inserting it back into the heap. With the binary heap implementation we're using, this can get costly if our assumption is wrong, because we pay the logarithmic penalty of percolating the iterator up the heap on insertion and again of percolating down on removal.
> We believe our assumption is a fair one to hold given that major compactions create a log distribution of file sizes, so we are likely to see a long chain of consecutive entries coming from one iterator. Understandably, taking this assumption comes at an additional cost in the case that we're wrong. Therefore, we've run a few benchmarking tests to see how much of a cost we pay as well as what kind of benefit we see. I've attached a potential patch (which includes a test harness) plus an image that captures the results of our tests. The x-axis represents the number of repeated keys before switching to another iterator, and the y-axis represents iteration time. The sets of blue and red lines vary in the number of iterators present in the heap.
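
To make the description above concrete, here is a minimal sketch of the proposed merge strategy (illustrative only, not the attached patch; the Source interface and MergeSketch class are hypothetical stand-ins for Accumulo's SortedKeyValueIterator and HeapIterator): the source that supplied the last key is held out of the heap and is only reinserted, at the usual O(log n) cost, when another source's top key is smaller.
{noformat}
import java.util.Comparator;
import java.util.PriorityQueue;

/** Stand-in for a sorted stream of keys (e.g. one file's iterator). */
interface Source {
  boolean hasTop();
  String topKey();
  void next();
}

/** Merges N sorted sources, keeping the last-used source out of the heap. */
class MergeSketch {
  private final PriorityQueue<Source> heap =
      new PriorityQueue<>(Comparator.comparing(Source::topKey));
  private Source current; // source that supplied the most recent key

  void addSource(Source s) {
    if (s.hasTop())
      heap.add(s);
  }

  boolean hasTop() {
    return current != null || !heap.isEmpty();
  }

  // Callers are expected to check hasTop() before topKey()/next().
  String topKey() {
    if (current == null)
      current = heap.poll();
    return current.topKey();
  }

  void next() {
    if (current == null)
      current = heap.poll();
    current.next();
    Source runnerUp = heap.peek();
    if (!current.hasTop()) {
      current = heap.poll();  // current source exhausted: fall back to the heap
    } else if (runnerUp != null && runnerUp.topKey().compareTo(current.topKey()) < 0) {
      heap.add(current);      // only now pay the log-cost insert plus remove
      current = heap.poll();
    }
    // Otherwise 'current' still holds the smallest key, so no heap operations are
    // needed; this is the common case when one file contributes a long run of keys.
  }
}
{noformat}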



--
This message was sent by Atlassian JIRA
(v6.2#6252)