Posted to user@hbase.apache.org by Bram Desoete <br...@ngdata.com> on 2016/03/24 11:50:16 UTC

Re: Unexpected region splits

Pedro Gandola <pe...@...> writes:

> 
> Hi Ted,
> 
> Thanks,
> I think I got the problem, I'm using *IncreasingToUpperBoundRegionSplitPolicy
> (default)* instead of *ConstantSizeRegionSplitPolicy*, which in my use case is
> what I want.
> 
> Cheers
> Pedro
> 
> On Mon, Feb 15, 2016 at 5:22 PM, Ted Yu <yu...@...> wrote:
> 
> > Can you pastebin region server log snippet around the time when the split
> > happened ?
> >
> > Was the split on data table or index table ?
> >
> > Thanks
> >
> > > On Feb 15, 2016, at 10:22 AM, Pedro Gandola <pe...@...> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I have a cluster using *HBase 1.1.2* with a table and a local index
> > > > (using *Apache Phoenix 4.6*). In total, both tables have *300 regions*
> > > > (approx. *18 regions per server*), my *hbase.hregion.max.filesize=30GB*,
> > > > and my region sizes are now *~4.5GB compressed (~7GB uncompressed)*.
> > > > However, each time I restart a RS, sometimes a region gets split. This is
> > > > unexpected because my key space is uniform (using MD5), and if the
> > > > problem were *region.size > hbase.hregion.max.filesize*, I would expect
> > > > all or almost all of the regions to split, but this only happens when I
> > > > restart a RS, and it happens only for 1 or 2 regions.
> > > >
> > > > What are the different scenarios where a region can split?
> > > >
> > > > What are the right steps to restart a region server in order to avoid
> > > > these unexpected splits?
> > >
> > > Thank you,
> > > Cheers
> > > Pedro
> >
> 

Thanks, Pedro, for sharing your solution.

I see the same issue during HBase restarts: unexpected region splits.
I believe it is because *IncreasingToUpperBoundRegionSplitPolicy* bases
its calculation on the number of ONLINE regions.
But while the RS is starting, only a couple of regions are online yet,
so the policy thinks it would be no problem to add another region
since 'there are only a few'.
(In fact there are already 330 for that RS for that Phoenix table.
Yes, I know I need to merge regions,
but this problem got out of hand unnoticed for some time here.)

Could HBase block region split decisions until it is fully up and running?

HBase 1.0.0 logs (check mainly the last line):

Mar 24, 11:06:41.494 AM  INFO  org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher
  Flushed, sequenceid=69436099, memsize=303.3 K, hasBloomFilter=true, into tmp file
  hdfs://ns/hbase/data/default/CUSTOMER/60af2857a7980ce4f1ac602dd83e05a6/.tmp/0fd4988f24f24d5d9887c542182efccc
Mar 24, 11:06:41.529 AM  INFO  org.apache.hadoop.hbase.regionserver.HStore
  Added hdfs://-ns/hbase/data/default/CUSTOMER/ff4ecd56e6b06f228404f05f171f8282/0/1d05cf9cac4c46008e47e3578e7a18d6,
  entries=235, sequenceid=22828972, filesize=5.5 K
Mar 24, 11:06:41.561 AM  INFO  org.apache.hadoop.hbase.regionserver.HStore
  Completed compaction of 3 (all) file(s) in s of
  CUSTOMER,\x0A0+\xF6\xD8,1457121856469.183f6134683e0213ccb15558a56f7c02.
  into 730489295b8c42afaec4a3b8bc38c915(size=1.4 M), total size for store is 1.4 M.
  This selection was in queue for 0sec, and took 0sec to execute.
Mar 24, 11:06:41.561 AM  INFO  org.apache.hadoop.hbase.regionserver.CompactSplitThread
  Completed compaction: Request = regionName=CUSTOMER,\x0A0+\xF6\xD8,1457121856469.183f6134683e0213ccb15558a56f7c02.,
  storeName=s, fileCount=3, fileSize=1.7 M, priority=7, time=1456532583179472; duration=0sec
Mar 24, 11:06:41.562 AM  DEBUG  org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy
  ShouldSplit because IB size=3269370636, sizeToCheck=2147483648, regionsWithCommonTable=2
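
The numbers in that last log entry can be checked by hand. What follows is a rough sketch of the arithmetic, not HBase source code. It assumes the policy computes its threshold as min(max file size, 2 x memstore flush size x n^3), where n is the number of regions of the table currently online on the server, and that it falls back to the max file size when n is 0 or above 100. The 128 MB flush size and 10 GB max file size are assumed defaults (neither value is stated in this thread), and the function names are illustrative only.

```python
# Illustrative arithmetic only; these functions are not HBase APIs.
MB = 1024 * 1024
GB = 1024 * MB


def size_to_check(regions_with_common_table,
                  flush_size=128 * MB,      # assumed default memstore flush size
                  max_file_size=10 * GB):   # assumed hbase.hregion.max.filesize
    """Split threshold given how many regions of the table are online here."""
    # Assumption: the initial size is twice the memstore flush size.
    initial_size = 2 * flush_size
    # Assumption: with no regions counted, or more than 100, use the maximum.
    if regions_with_common_table == 0 or regions_with_common_table > 100:
        return max_file_size
    return min(max_file_size, initial_size * regions_with_common_table ** 3)


def should_split(store_size, regions_with_common_table):
    """True when the store is larger than the current threshold."""
    return store_size > size_to_check(regions_with_common_table)


# Values taken from the DEBUG line: size=3269370636, regionsWithCommonTable=2.
print(size_to_check(2))                 # threshold while only 2 regions are online
print(should_split(3269370636, 2))      # during startup
print(should_split(3269370636, 330))    # once all 330 regions are counted
```

Under these assumptions, a count of 2 gives a threshold of 2147483648 bytes, which matches the logged sizeToCheck, so the ~3.0 GiB store qualifies for a split. With all 330 regions counted, the threshold would be the configured maximum and the same store would not split.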

I will also switch back to ConstantSizeRegionSplitPolicy.

Regards,

Re: Unexpected region splits

Posted by Ted Yu <yu...@gmail.com>.

Actually there may be a simpler solution:

http://pastebin.com/3KJ7Vxnc

We can check the ratio between online regions and the total number of regions
in IncreasingToUpperBoundRegionSplitPolicy#shouldSplit().

Splitting should start only once that ratio exceeds a certain threshold.
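
The pastebin contents are not reproduced in this thread, so the following is only a sketch of the idea described above, not the proposed patch: hold back split decisions until the share of online regions passes a threshold. The function name, its parameters, and the 0.9 default are all hypothetical.

```python
def should_split_gated(store_size, size_to_check,
                       online_regions, total_regions,
                       min_online_ratio=0.9):  # hypothetical threshold
    """Defer splits until enough of the expected regions are online."""
    if total_regions <= 0:
        # Total not known yet: be conservative and do not split.
        return False
    if online_regions / total_regions < min_online_ratio:
        # Still starting up: the online count would understate the real total.
        return False
    # Enough regions are online, so apply the ordinary size comparison.
    return store_size > size_to_check


# Startup case from the log: 2 of 330 regions online, so the split is deferred.
print(should_split_gated(3269370636, 2147483648, 2, 330))
```

A gate like this would leave the size comparison itself unchanged and only delay it, which is why it could be simpler than waiting for a signal that cluster initialization has finished.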

FYI

On Thu, Mar 24, 2016 at 12:39 PM, Ted Yu <yu...@gmail.com> wrote:

> Currently IncreasingToUpperBoundRegionSplitPolicy doesn't detect when the
> master initialization finishes.
>
> There is also some missing piece where region server notifies the
> completion of cluster initialization (by looking at RegionServerObserver).
>
> Cheers

Re: Unexpected region splits

Posted by Ted Yu <yu...@gmail.com>.

Currently IncreasingToUpperBoundRegionSplitPolicy doesn't detect when the
master initialization finishes.

There is also some missing piece where region server notifies the
completion of cluster initialization (by looking at RegionServerObserver).

Cheers
