Posted to issues@hbase.apache.org by "nkeywal (JIRA)" <ji...@apache.org> on 2011/07/22 19:15:58 UTC

[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

    [ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069626#comment-13069626 ] 

nkeywal commented on HBASE-1938:
--------------------------------

I modified the unit test so that it works with trunk as it stands today (new file attached). It is worth reviewing: I set a magic value for setThreadReadPoint, and I don't know whether that is the right thing to do.

I also added a loop over the list size to make any exponential cost visible.
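
For readers without the attachment, the idea of looping over the list size can be sketched as follows. This is not the attached test: the class name, the String keys, and the use of a plain ConcurrentSkipListSet in place of the MemStore structures are assumptions for illustration only. If the cost per element climbs as the size grows, the scan cost is growing faster than linearly. There is no JIT warm-up here, so the numbers are only a rough indication.

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch, not the attached MemStoreScanPerformance test.
class ScanCostBySizeSketch {

    // Build a sorted set of `size` distinct keys (a stand-in for the in-memory data).
    static ConcurrentSkipListSet<String> fill(int size) {
        ConcurrentSkipListSet<String> set = new ConcurrentSkipListSet<>();
        for (int i = 0; i < size; i++) {
            set.add(String.format("row-%010d", i));
        }
        return set;
    }

    // Walk the whole set once and return how many elements were seen.
    static int scan(ConcurrentSkipListSet<String> set) {
        int count = 0;
        for (String ignored : set) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Repeat the same scan for growing sizes and report the cost per element.
        for (int size = 1_000; size <= 100_000; size *= 10) {
            ConcurrentSkipListSet<String> set = fill(size);
            long start = System.nanoTime();
            int seen = scan(set);
            long elapsed = System.nanoTime() - start;
            System.out.printf("size=%d scanned=%d ns/element=%.1f%n",
                    size, seen, (double) elapsed / seen);
        }
    }
}
```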

In the "next()" part of a scan, HBase currently compares the values of two internal iterators. In this test the second list is always empty, so the comparator cost is lower than in real life. I don't know whether that is a side effect of my modifications.

There is a trivial optimization in the "seek" function:
      KeyValue lowest = getLowest();
      return lowest != null;

Could be replaced by:
      return (kvsetNextRow != null || snapshotNextRow != null);
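
To show why the two forms should agree, here is a minimal self-contained sketch. It is not the MemStore code: the class name, the String stand-in for KeyValue, and the simplified getLowest() are assumptions for illustration. Only the names kvsetNextRow, snapshotNextRow and getLowest come from the snippet above. The sketch assumes getLowest() returns the smaller of the two heads, or the non-null one, or null when both are null.

```java
import java.util.Comparator;

// Hypothetical stand-in for the scanner state; not the actual HBase class.
class SeekCheckSketch {
    // Heads of the two internal iterators; null means that iterator is exhausted.
    final String kvsetNextRow;
    final String snapshotNextRow;
    final Comparator<String> comparator = Comparator.naturalOrder();

    SeekCheckSketch(String kvsetNextRow, String snapshotNextRow) {
        this.kvsetNextRow = kvsetNextRow;
        this.snapshotNextRow = snapshotNextRow;
    }

    // Assumed behaviour: lowest of the two heads, tolerating nulls on either side.
    String getLowest() {
        if (kvsetNextRow == null) return snapshotNextRow;
        if (snapshotNextRow == null) return kvsetNextRow;
        return comparator.compare(kvsetNextRow, snapshotNextRow) <= 0
                ? kvsetNextRow : snapshotNextRow;
    }

    // Current form: needs a comparison whenever both heads are present.
    boolean seekResultCurrent() {
        return getLowest() != null;
    }

    // Proposed form: two null checks, no comparator call.
    boolean seekResultProposed() {
        return kvsetNextRow != null || snapshotNextRow != null;
    }

    public static void main(String[] args) {
        String[] heads = {null, "row-a", "row-b"};
        for (String k : heads) {
            for (String s : heads) {
                SeekCheckSketch sc = new SeekCheckSketch(k, s);
                if (sc.seekResultCurrent() != sc.seekResultProposed()) {
                    throw new IllegalStateException("mismatch for " + k + ", " + s);
                }
            }
        }
        System.out.println("both checks agree on all head combinations");
    }
}
```

Under that assumption the only saving is the comparator call when both heads are non-null, which is consistent with it being a small gain.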

But I don't think it is worth a patch just for this (it should be included in a bigger patch, however). If you think differently, I can do it.



> Make in-memory table scanning faster
> ------------------------------------
>
>                 Key: HBASE-1938
>                 URL: https://issues.apache.org/jira/browse/HBASE-1938
>             Project: HBase
>          Issue Type: Improvement
>          Components: performance
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory.  Talking to some users, they are seeing about 1/4 million rows a second.  It should be able to go faster than this (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira