Posted to issues@hbase.apache.org by "Dave Revell (Updated) (JIRA)" <ji...@apache.org> on 2011/10/10 22:14:29 UTC

[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

     [ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Revell updated HBASE-4489:
-------------------------------

    Attachment: HBASE-4489-branch0.90-v2.patch
                HBASE-4489-trunk-v2.patch

New patches ending in -v2. These have new tests for RegionSplitter.

Some weirdness: RegionSplitter.rollingSplit() seems to be broken, so it doesn't have any test cases in my code. I opened HBASE-4567 to focus on this. I also included a test case in TestRegionSplitter.java called reproduceDivByZeroFailure() that reproduces the problem. I think fixing this bug is outside the scope of this ticket.
                
> Better key splitting in RegionSplitter
> --------------------------------------
>
>                 Key: HBASE-4489
>                 URL: https://issues.apache.org/jira/browse/HBASE-4489
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Dave Revell
>            Assignee: Dave Revell
>         Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch
>
>
> The RegionSplitter utility allows users to create a pre-split table from the command line or do a rolling split on an existing table. It supports pluggable split algorithms that implement the SplitAlgorithm interface. The only SplitAlgorithm, which is also the default, assumes keys fall in the range from ASCII string "00000000" to ASCII string "7FFFFFFF". This is not a sane default, and seems useless to most users. Users are likely to be surprised by the fact that all the region splits occur in the byte range of ASCII characters.
> A better default split algorithm would be one that evenly divides the space of all bytes, which is what this patch does. Making a table with five regions would split at \x33\x33..., \x66\x66..., \x99\x99..., \xCC\xCC..., and \xFF\xFF.
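
For illustration only, the even division described above can be sketched as follows. This is not the code from the attached patches: the class name UniformSplitSketch, its methods, and the fixed key width are all hypothetical. The sketch treats a key as an unsigned big-endian integer of keyWidth bytes and returns only the interior boundaries between regions, taking \xFF\xFF... to be the upper end of the last region.

```java
import java.math.BigInteger;

// Hypothetical sketch, not the HBASE-4489 patch: evenly divide the space of
// all byte strings of a fixed width into numRegions ranges.
class UniformSplitSketch {

    // Returns the numRegions - 1 interior boundaries between regions.
    static byte[][] splitPoints(int numRegions, int keyWidth) {
        if (numRegions < 2 || keyWidth < 1) {
            throw new IllegalArgumentException("need at least 2 regions and 1 key byte");
        }
        // Largest key of this width, i.e. keyWidth bytes of 0xFF.
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyWidth).subtract(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(numRegions);
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // i-th boundary = floor(max * i / numRegions)
            BigInteger point = max.multiply(BigInteger.valueOf(i)).divide(n);
            splits[i - 1] = toFixedWidth(point, keyWidth);
        }
        return splits;
    }

    // Converts a non-negative value to exactly `width` big-endian bytes,
    // dropping BigInteger's sign byte or left-padding with zeros as needed.
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    // Renders a key in the \xNN notation used in the description above.
    static String toHex(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (byte[] split : splitPoints(5, 8)) {
            System.out.println(toHex(split));
        }
    }
}
```

With five regions the interior boundaries come out as repeated \x33, \x66, \x99 and \xCC bytes, matching the first four values listed in the description.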

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira