Posted to issues@hbase.apache.org by "Liang Xie (JIRA)" <ji...@apache.org> on 2013/02/19 07:21:13 UTC

[jira] [Commented] (HBASE-7845) optimize hfile index size like leveldb's ByteWiseComparatorImpl::FindShortestSeparator() style

    [ https://issues.apache.org/jira/browse/HBASE-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581048#comment-13581048 ] 

Liang Xie commented on HBASE-7845:
----------------------------------

After further investigation, here is what I found:
1) For leveldb, the index key is a "fake" key that is greater than or equal to the last key of the current data block and smaller than the first key of the next data block.
2) For HFileV2, we use the first key of the current data block as the index key. This was discussed earlier in HBASE-4443.

IMHO, a "fake" last key has more benefits than those mentioned in HBASE-4443. For example, there is a good illustration in leveldb's comments:
{quote}
  // We do not emit the index entry for a block until we have seen the
  // first key for the next data block.  This allows us to use shorter
  // keys in the index block.  For example, consider a block boundary
  // between the keys "the quick brown fox" and "the who".  We can use
  // "the r" as the key for the index block entry since it is >= all
  // entries in the first block and < all entries in subsequent
  // blocks.
{quote}
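
To make the example from the leveldb comment concrete, here is a minimal Java sketch of the separator computation. It is modeled on the behavior of leveldb's bytewise FindShortestSeparator(), and it is not the HBase patch: the class name ShortestSeparator and the method shortestSeparator are hypothetical, and the keys are assumed to be plain byte strings compared bytewise as unsigned bytes. HFile keys carry more structure than that, so a real patch would have to respect the HFile key comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only (not HBase or leveldb API).
class ShortestSeparator {

    // Returns a key k with start <= k < limit under unsigned bytewise
    // ordering, shortened where possible. Falls back to start itself when
    // no shorter separator is found by this simple rule.
    static byte[] shortestSeparator(byte[] start, byte[] limit) {
        int minLength = Math.min(start.length, limit.length);

        // Skip the common prefix of the two keys.
        int diff = 0;
        while (diff < minLength && start[diff] == limit[diff]) {
            diff++;
        }

        // One key is a prefix of the other: do not shorten.
        if (diff >= minLength) {
            return start;
        }

        int startByte = start[diff] & 0xff; // treat bytes as unsigned
        int limitByte = limit[diff] & 0xff;

        // Only shorten when bumping the differing byte still stays
        // strictly below the corresponding byte of limit.
        if (startByte < 0xff && startByte + 1 < limitByte) {
            byte[] separator = Arrays.copyOf(start, diff + 1);
            separator[diff] = (byte) (startByte + 1);
            return separator;
        }
        return start;
    }

    public static void main(String[] args) {
        byte[] lastKey = "the quick brown fox".getBytes(StandardCharsets.US_ASCII);
        byte[] nextFirstKey = "the who".getBytes(StandardCharsets.US_ASCII);
        byte[] separator = shortestSeparator(lastKey, nextFirstKey);
        // Prints "the r", matching the leveldb comment above.
        System.out.println(new String(separator, StandardCharsets.US_ASCII));
    }
}
```

In this sketch the 19-byte last key is replaced by a 5-byte index key. When the two keys differ only by adjacent byte values, or one is a prefix of the other, the original last key is returned unchanged, so the index entry is never larger than the last key.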

I'd like to try making a patch for this in the next few days. Any comments and advice would be highly appreciated :)
                
> optimize hfile index size like leveldb's ByteWiseComparatorImpl::FindShortestSeparator() style
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7845
>                 URL: https://issues.apache.org/jira/browse/HBASE-7845
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile
>    Affects Versions: 0.96.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Leveldb uses ByteWiseComparatorImpl::FindShortestSeparator() & FindShortSuccessor() to reduce index key size; this would be helpful under special conditions.
