Posted to dev@hbase.apache.org by "Ben Maurer (JIRA)" <ji...@apache.org> on 2009/03/08 02:04:56 UTC

[jira] Created: (HBASE-1248) Fast splitting for last region

Fast splitting for last region
------------------------------

                 Key: HBASE-1248
                 URL: https://issues.apache.org/jira/browse/HBASE-1248
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: Ben Maurer
             Fix For: 0.20.0


I have an HBase table where items are keyed by a sequential row id. When doing bulk appends to this table, the last region of the table gets pretty large until enough compactions can be done to sort everything out.

In this case, it would be better if the region serving the largest keys split at its last key rather than at the midkey. One way to implement this: when the top region reaches MAX_REGION_SIZE/2, create a new region, with the lower half getting all the data and the top half starting empty. For bulk sequential inserts, this should avoid the need for any compactions.
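
A rough sketch of the split decision, to make the proposal concrete. This is illustration only, not HBase code: the Region class, choose_split_key, the integer row keys, and the 256 MB value for MAX_REGION_SIZE are all assumptions made for the example. The split key is taken to be the first key of the upper half, so "splitting on the last key" becomes "last key + 1".

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold in bytes; the real value would come from configuration.
MAX_REGION_SIZE = 256 * 1024 * 1024


@dataclass
class Region:
    """Toy stand-in for a region. Not an HBase class."""
    start_key: int
    end_key: Optional[int]  # None means no upper bound, i.e. the last region
    row_keys: List[int]     # sorted row keys currently stored
    size_bytes: int

    @property
    def is_last(self) -> bool:
        return self.end_key is None


def choose_split_key(region: Region, max_size: int = MAX_REGION_SIZE) -> Optional[int]:
    """Return the first key of the upper half, or None if no split is due."""
    if not region.row_keys:
        return None
    if region.is_last:
        # Proposed rule: split early, just past the largest stored key, so the
        # lower half keeps all the data and the upper half starts empty.
        if region.size_bytes >= max_size // 2:
            return region.row_keys[-1] + 1
        return None
    # Ordinary rule: split at the middle key once the region is full.
    if region.size_bytes >= max_size:
        return region.row_keys[len(region.row_keys) // 2]
    return None
```

With sequential appends, every new row lands at or above the returned key, so under these assumptions the lower half is never written to again after the split.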

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1248) Fast splitting for last region

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1248:
-------------------------

    Fix Version/s:     (was: 0.20.0)

Nice idea. If a patch comes in before 0.20.0 is cut, we will include it. In the meantime, moving this out of 0.20.0.

