Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2011/08/05 00:01:28 UTC

[jira] [Commented] (HBASE-3855) Performance degradation of memstore because reseek is linear

    [ https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079633#comment-13079633 ] 

Andrew Purtell commented on HBASE-3855:
---------------------------------------

I have tracked down occasionally failing TestHRegion tests on the 0.90 branch to this commit:

{noformat}
commit 4f4edbaa043952715d4eb9a40605154c6e41d179
Author: Michael Stack <st...@apache.org>
Date:   Fri Jun 10 19:21:41 2011 +0000

    HBASE-3855 Performance degradation of memstore because reseek is linear
    
    git-svn-id: https://svn.apache.org/repos/asf/hbase/branches/0.90@1134419 13f
{noformat}

Perhaps 1 out of 25 test runs fails on my dev laptop.

The same problem may exist on trunk, but I have not yet checked for evidence of that.

Failures are like:

{quote}
Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.186 sec <<< FAILURE!
testWritesWhileGetting(org.apache.hadoop.hbase.regionserver.TestHRegion)  Time elapsed: 0.187 sec  <<< FAILURE!
junit.framework.AssertionFailedError: expected:<\x00\x00\x006> but was:<\x00\x00\x004>
        at org.apache.hadoop.hbase.HBaseTestCase.assertEquals(HBaseTestCase.java:685)
        at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:2711)
{quote}


> Performance degradation of memstore because reseek is linear
> ------------------------------------------------------------
>
>                 Key: HBASE-3855
>                 URL: https://issues.apache.org/jira/browse/HBASE-3855
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: memstoreReseek.txt, memstoreReseek2.txt
>
>
> The scanner uses reseek to find the next row (or next column) as part of a scan. The reseek code iterates over a Set to position itself at the right place. If many thousands of KVs need to be skipped, the time cost is very high. In this case, a seek would cost far less than a reseek.
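
To illustrate the cost difference described in the issue, here is a minimal sketch, not the actual memstore code. It assumes the Set is a sorted skip-list set (java.util.concurrent.ConcurrentSkipListSet is used as a stand-in) and uses Integer keys in place of KeyValues. The class ReseekSketch and the methods linearReseek and skipListSeek are hypothetical names made up for this example.

```java
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class ReseekSketch {

    // "Reseek" as described in the issue: walk forward one entry at a time
    // until reaching the first key >= target. Cost grows linearly with the
    // number of entries skipped.
    public static Integer linearReseek(Iterator<Integer> it, int target) {
        while (it.hasNext()) {
            Integer key = it.next();
            if (key >= target) {
                return key;
            }
        }
        return null; // ran off the end of the set
    }

    // "Seek": ask the sorted set directly for the first key >= target.
    // On a skip list this is expected O(log n), regardless of how many
    // entries lie between the current position and the target.
    public static Integer skipListSeek(NavigableSet<Integer> set, int target) {
        return set.ceiling(target);
    }

    public static void main(String[] args) {
        // Integer keys stand in for KeyValues (an assumption of this sketch).
        NavigableSet<Integer> kvs = new ConcurrentSkipListSet<>();
        for (int i = 0; i < 100_000; i++) {
            kvs.add(i);
        }
        int target = 99_990;

        long t0 = System.nanoTime();
        Integer viaReseek = linearReseek(kvs.iterator(), target);
        long t1 = System.nanoTime();
        Integer viaSeek = skipListSeek(kvs, target);
        long t2 = System.nanoTime();

        System.out.println("reseek found " + viaReseek + " in " + (t1 - t0) + " ns");
        System.out.println("seek   found " + viaSeek + " in " + (t2 - t1) + " ns");
    }
}
```

Both calls land on the same key; only the number of entries visited differs. When the target is only a few entries ahead, the linear walk can still be the cheaper of the two, so a fix along these lines would presumably need to decide when to fall back to a seek.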

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira