Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2014/08/23 09:09:11 UTC

[jira] [Comment Edited] (HBASE-11811) Use binary search for seeking into a block

    [ https://issues.apache.org/jira/browse/HBASE-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107902#comment-14107902 ] 

Lars Hofhansl edited comment on HBASE-11811 at 8/23/14 7:08 AM:
----------------------------------------------------------------

Here's a sample patch that reduces the time for 1m gets from 40s to 8.5s, and there is probably more room for optimization.
The data was simply generated by HBaseTestingUtility.loadTable(...) so the KVs are small.

Some points:
# the utility of this decreases as Cells get larger and only a few of them fit into a block
# the index is not persisted, so when blocks are evicted and later reloaded the index needs to be built again
# I'm not currently happy about the point where the index is built, as that needs to synchronize on the block (but only when the block actually had to be loaded)
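
To make the idea concrete, here is a minimal, self-contained sketch of a per-block offset index with a binary-search seek. It is not the attached patch and does not use HBase's real block classes: the class name IndexedBlockSketch, the cell layout ([int keyLen][key][int valueLen][value], sorted by key), and the seek semantics (return the offset of the first cell whose key is >= the requested key, or -1) are all assumptions made for illustration. The index is built lazily on first seek under a lock and lives only in memory, which mirrors the "not persisted" and "needs to synchronize" points above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a loaded block; not HBase's actual block class. */
final class IndexedBlockSketch {
    // Assumed layout: [int keyLen][key bytes][int valueLen][value bytes]..., sorted by key.
    private final ByteBuffer data;
    // Start offset of every cell; built lazily, kept in memory only (never persisted).
    private volatile int[] offsets;

    IndexedBlockSketch(ByteBuffer data) {
        this.data = data.asReadOnlyBuffer();
    }

    /** Serializes already-sorted {key, value} pairs into the assumed layout. */
    static IndexedBlockSketch of(String[][] sortedCells) {
        int size = 0;
        for (String[] kv : sortedCells) {
            size += 8 + bytes(kv[0]).length + bytes(kv[1]).length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String[] kv : sortedCells) {
            byte[] k = bytes(kv[0]);
            byte[] v = bytes(kv[1]);
            buf.putInt(k.length).put(k).putInt(v.length).put(v);
        }
        buf.flip();
        return new IndexedBlockSketch(buf);
    }

    /** One linear pass over the block, done once per load, guarded by a lock. */
    private int[] offsets() {
        int[] local = offsets;
        if (local == null) {
            synchronized (this) {
                local = offsets;
                if (local == null) {
                    local = new int[16];
                    int n = 0;
                    int pos = 0;
                    while (pos < data.limit()) {
                        if (n == local.length) {
                            local = Arrays.copyOf(local, n * 2);
                        }
                        local[n++] = pos;
                        int keyLen = data.getInt(pos);
                        int valueLen = data.getInt(pos + 4 + keyLen);
                        pos += 8 + keyLen + valueLen;
                    }
                    local = Arrays.copyOf(local, n);
                    offsets = local;
                }
            }
        }
        return local;
    }

    /** Unsigned lexicographic comparison of the key stored at offset against target. */
    private int compareKeyAt(int offset, byte[] target) {
        int keyLen = data.getInt(offset);
        int common = Math.min(keyLen, target.length);
        for (int i = 0; i < common; i++) {
            int a = data.get(offset + 4 + i) & 0xff;
            int b = target[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return keyLen - target.length;
    }

    /** Binary search: offset of the first cell with key >= the given key, or -1 if none. */
    int seek(String key) {
        byte[] target = bytes(key);
        int[] idx = offsets();
        int lo = 0;
        int hi = idx.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareKeyAt(idx[mid], target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo < idx.length ? idx[lo] : -1;
    }

    /** Reads the value of the cell starting at the given offset. */
    String valueAt(int offset) {
        int valueStart = offset + 4 + data.getInt(offset);
        byte[] v = new byte[data.getInt(valueStart)];
        for (int i = 0; i < v.length; i++) {
            v[i] = data.get(valueStart + 4 + i);
        }
        return new String(v, StandardCharsets.UTF_8);
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With small cells the index has many entries and the binary search skips most of the block; with only a handful of large cells per block the linear scan is already short, which is why the benefit shrinks as Cells get larger.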





> Use binary search for seeking into a block
> ------------------------------------------
>
>                 Key: HBASE-11811
>                 URL: https://issues.apache.org/jira/browse/HBASE-11811
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>         Attachments: block_index-v2.txt
>
>
> Currently upon every seek (including Gets) we need to linearly look through the block from the beginning until we find the Cell we are looking for.
> It should be possible to build a simple cache of offsets of Cells for each block as it is loaded and then use binary search to find the Cell in question.


