Posted to issues@hbase.apache.org by "Pankaj Kumar (JIRA)" <ji...@apache.org> on 2018/02/13 09:01:00 UTC

[jira] [Commented] (HBASE-9081) Online split for an reserved empty region

    [ https://issues.apache.org/jira/browse/HBASE-9081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362016#comment-16362016 ] 

Pankaj Kumar commented on HBASE-9081:
-------------------------------------

Thanks [~jeason] for raising this Jira. 

Multiple split points (pre-split) can be defined only at table creation. After that, a region can split only into two daughter regions, either manually (using the HBaseAdmin APIs) or automatically (based on the split policy). Currently there is no way to split a region into more than two daughter regions in one step; the user needs to send multiple RPCs to retrieve the table's regions and send split requests.
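To illustrate the point above, here is a toy in-memory model of the two-way-only workflow. It is not the HBase client API: the class TwoWaySplitModel and its methods are hypothetical names used only for this sketch, and regions are represented just by their start keys. It shows that every extra split point costs one lookup of the containing region plus one split request.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Toy model only (hypothetical, not an HBase class): a table is a sorted set of region start keys.
class TwoWaySplitModel {
    // "" stands for the empty start key of the table's first region.
    private final TreeSet<String> startKeys = new TreeSet<>();
    private int splitRequests = 0;

    TwoWaySplitModel() {
        startKeys.add("");
    }

    // Models one two-way split: locate the region holding splitPoint, then split it there.
    void split(String splitPoint) {
        String regionStart = startKeys.floor(splitPoint); // start key of the containing region
        if (regionStart.equals(splitPoint)) {
            return; // already a region boundary, nothing to split
        }
        startKeys.add(splitPoint); // one region becomes two daughters
        splitRequests++;
    }

    int regionCount() {
        return startKeys.size();
    }

    int splitRequests() {
        return splitRequests;
    }

    public static void main(String[] args) {
        TwoWaySplitModel table = new TwoWaySplitModel();
        for (String point : Arrays.asList("b", "c", "d", "e")) {
            table.split(point);
        }
        // prints "5 regions after 4 split requests"
        System.out.println(table.regionCount() + " regions after "
                + table.splitRequests() + " split requests");
    }
}
```

In this model the number of split requests always equals the number of new split points, which is the overhead a single multi-split operation would avoid.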

Based on customer experience, there is a need to split a region into multiple regions in a single operation. We could call this "Region Multi Split" instead of "Online split for an reserved empty region".

There are several scenarios where multi split would be very useful:
1) At the beginning the user can't predict the behavior of the incoming data, so the table is created with the default single region (without pre-split). After some data is loaded into the table, the user can predict the data distribution and define the split points efficiently. But currently it is not easy to split the region into many regions (let's say 500) with the existing APIs. The user has to retrieve and split regions multiple times.

2) When the incoming data rate is very high, the current region split (2 daughter regions) will happen many times, consuming a lot of I/O and CPU resources until the table reaches its desired number of regions (let's say 500). With the new feature, a region can be split directly into the desired number of regions in a single operation.
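As a rough illustration of the cost in both scenarios, the arithmetic can be sketched as below. The class SplitCost and its method names are hypothetical. The sketch assumes the table starts from a single region, that every two-way split adds exactly one region, and (for the round count) the idealized case where all regions split in parallel in each round.

```java
// Hypothetical helper for back-of-the-envelope arithmetic; not part of HBase.
class SplitCost {
    // Each two-way split replaces one region with two, so it adds exactly one region.
    static int binarySplitOperations(int targetRegions) {
        return targetRegions - 1;
    }

    // Fewest sequential rounds when every existing region splits once per round.
    static int minSplitRounds(int targetRegions) {
        int rounds = 0;
        long regions = 1;
        while (regions < targetRegions) {
            regions *= 2;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        int target = 500; // the example figure used in this comment
        // prints "499 split operations, at least 9 rounds"
        System.out.println(binarySplitOperations(target) + " split operations, at least "
                + minSplitRounds(target) + " rounds");
    }
}
```

With a multi split, the same 500 regions would be requested in one operation from the client's point of view.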


Let me know your thoughts on this; I will attach the design doc soon.

> Online split for an reserved empty region
> -----------------------------------------
>
>                 Key: HBASE-9081
>                 URL: https://issues.apache.org/jira/browse/HBASE-9081
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>            Priority: Major
>
> We already have a region splitter tool, but it provides only limited functions:
> 1. Create a table with a specified number of regions without giving any splits.
> 2. Rolling split of an existing region.
> We have the following user scenario:
> The table was created with splits like below:
> a____b____c____d____e____f____g____o
> g~o is a reserved empty region. It will be used only after some days, so we don't know the rowkey distribution currently. We will split it only when it gets used.
> Say we want to split g~o into 10 new regions, like g, g1, g2, g3, g4, g5, ..., g9, o.
> I didn't find an existing function that does this. Please tell me if I am wrong.
> Hope to hear your ideas on this :)


