Posted to issues@hbase.apache.org by "Zheng Wang (Jira)" <ji...@apache.org> on 2020/07/01 04:15:00 UTC

[jira] [Comment Edited] (HBASE-24664) Some changing of split region by overall region size rather than only one store size

    [ https://issues.apache.org/jira/browse/HBASE-24664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149089#comment-17149089 ] 

Zheng Wang edited comment on HBASE-24664 at 7/1/20, 4:14 AM:
-------------------------------------------------------------

bq. Maybe we could just add an option for the RegionSplitPolicy to control whether it should count all store files or only the largest one?
bq. 
bq. On branch-2 we could make the default to count the largest one, to keep the old behavior and users can also manually set to count all store files, and on master, we change the default to count all store files.

I am not sure it is worth adding the option here, since there does not seem to be a scenario where splitting by store size is better. 
As for branch-2, I have no strong opinion on whether to keep the old behavior.
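
To make the two ways of measuring concrete, here is a minimal sketch, not the actual HBase code. The class name, method names, the `countAllStores` flag and the strict `>` comparison are all hypothetical, chosen only to illustrate the option discussed in the quote: check either the largest store alone or the sum of all stores against a size threshold.

```java
import java.util.List;

// Hypothetical illustration only; none of these names are HBase APIs.
public class SplitSizeSketch {

    /** Size of the largest single store in the region. */
    public static long largestStoreSize(List<Long> storeSizes) {
        long max = 0L;
        for (long size : storeSizes) {
            max = Math.max(max, size);
        }
        return max;
    }

    /** Sum of all store sizes, i.e. the overall region size. */
    public static long overallRegionSize(List<Long> storeSizes) {
        long sum = 0L;
        for (long size : storeSizes) {
            sum += size;
        }
        return sum;
    }

    /**
     * Decides whether to split. The countAllStores flag stands in for the
     * option proposed in the quote (an assumption, not an existing setting).
     */
    public static boolean shouldSplit(List<Long> storeSizes, long threshold,
                                      boolean countAllStores) {
        long size = countAllStores
            ? overallRegionSize(storeSizes)
            : largestStoreSize(storeSizes);
        return size > threshold;
    }

    public static void main(String[] args) {
        // Three stores of 6, 5 and 4 units against a threshold of 10 units.
        List<Long> stores = List.of(6L, 5L, 4L);
        long threshold = 10L;
        System.out.println(shouldSplit(stores, threshold, false)); // largest store only
        System.out.println(shouldSplit(stores, threshold, true));  // all stores
    }
}
```

With these made-up numbers the region holds 15 units in total but no single store exceeds 10, so only the all-stores check asks for a split.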



was (Author: filtertip):
bq. Maybe we could just add an option for the RegionSplitPolicy to control whether it should count all store files or only the largest one?
bq. 
bq. On branch-2 we could make the default to count the largest one, to keep the old behavior and users can also manually set to count all store files, and on master, we change the default to count all store files.

I am not sure it is worth adding the option here, since there does not seem to be a scenario where splitting by store size is better. 

> Some changing of split region by overall region size rather than only one store size
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-24664
>                 URL: https://issues.apache.org/jira/browse/HBASE-24664
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Zheng Wang
>            Assignee: Zheng Wang
>            Priority: Major
>
> As a distributed cluster, HBase distributes load in units of regions, so if a region grows too big,
>  it brings some negative effects, such as:
>  1. It is harder to even out disk usage (considering locality)
>  2. Region opening might take more time
>  3. After a split, the daughter regions might incur more I/O cost from compaction within a short time (if writes are evenly distributed)
> HBASE-24530 introduced a new SteppingAllStoresSizeSplitPolicy, and as discussed in its comments and the related [thread|https://lists.apache.org/thread.html/r08a8103e2532eb667a0fcb4efa8a4117b3f82e6251bc4bd0bc157c26%40%3Cdev.hbase.apache.org%3E], we should do the following follow-on tasks in this new issue:
>  1. Set SteppingAllStoresSizeSplitPolicy as the default
>  2. Mark SteppingSplitPolicy and IncreasingToUpperBoundRegionSplitPolicy as deprecated
>  3. Fix ConstantSizeRegionSplitPolicy to also split regions by overall region size


