Posted to user@hbase.apache.org by Geoff Hendrey <gh...@decarta.com> on 2011/09/04 01:32:36 UTC

prevent region splits?

Is there a way to prevent regions from splitting while we are running a
mapreduce job that does a lot of Puts? There seems to be a lot of HDFS
activity related to region splitting while my M/R job is doing the
Puts. Is it sensible to disable splitting during a job that does lots
of Puts? Would there be any danger in this (i.e. disabling splitting
during the job, and re-enabling it when the job completes)?

I see that hbase.regionserver.thread.splitcompactcheckfrequency could be
used to make splits happen less frequently, but what I'd really like is
for splitting to be disabled, then re-enabled later.

-Geoff


RE: prevent region splits?

Posted by Geoff Hendrey <gh...@decarta.com>.
St.Ack -

I will definitely save the meta (before and after) and the master log
the next time this problem occurs. At the moment, I am running with
splitting disabled, hoping the problem doesn't recur. Already, I see
that the master logs are "quiet", whereas before they were churning
endlessly with split-related info.

You are correct, we are running 0.90.1 (CDH3U0).

I will keep you posted on this issue. Thanks a lot for the help!

-geoff


Re: prevent region splits?

Posted by Stack <st...@duboce.net>.
On Sun, Sep 4, 2011 at 12:08 PM, Geoff Hendrey <gh...@decarta.com> wrote:
> great advice guys. appreciate it. Have made the changes to increase
> storefile size. I'd also like to prevent rebalancing while I am running
> my large M/R Put job. Any way to do that?
>

In the shell you can disable the balancer.  Be warned, it's an in-memory
setting in the master only, so it is lost if the master restarts (unless
you set config in hbase-site.xml to have the balancer run once an
eternity).
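
A minimal sketch of the hbase-site.xml route mentioned above, not taken
from the original mail. It assumes the balancer interval is controlled by
hbase.balancer.period (in milliseconds) and that the master reads it as a
32-bit integer, so the value below is the largest it could hold (about
24.8 days). Check the property name and type against the
hbase-default.xml shipped with your version before relying on it.

```xml
<!-- hbase-site.xml on the master; a master restart is typically needed
     for the change to take effect -->
<property>
  <name>hbase.balancer.period</name>
  <!-- illustrative value: Integer.MAX_VALUE ms, roughly 24.8 days -->
  <value>2147483647</value>
</property>
```

Unlike the shell switch, a value set this way is still in place after a
master restart.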

> At present, 50% of the time that I run my large M/R Put job, the table
> is corrupted (hole in .META.) and we have to run our repair program to
> fix the hole.

Mind doing a dump of the meta region content before the MR job, and then
sending the master log and a dump of meta from afterwards, with the
messed-up section of meta identified?

>  It's very labor intensive. I am hoping that by turning off
> splitting, and deferring balancing, that I can prevent whatever
> condition leads to the creation of the hole in .META.. My hope is that
> if we prevent splitting and rebalancing then there would be no action
> that could cause a hole to occur.
>

Probably.  As per my previous email, up your timeout monitor setting
(you are not running 0.90.4, I take it).

St.Ack

RE: prevent region splits?

Posted by Geoff Hendrey <gh...@decarta.com>.
Great advice, guys. Appreciate it. I have made the changes to increase
storefile size. I'd also like to prevent rebalancing while I am running
my large M/R Put job. Is there any way to do that?

At present, 50% of the time that I run my large M/R Put job, the table
is corrupted (hole in .META.) and we have to run our repair program to
fix the hole. It's very labor intensive. I am hoping that by turning off
splitting and deferring balancing, I can prevent whatever condition
leads to the creation of the hole in .META. My hope is that if we
prevent splitting and rebalancing, then there would be no action that
could cause a hole to occur.

-geoff



Re: prevent region splits?

Posted by Doug Meil <do...@explorysmedical.com>.
Along with what Jack said, see this...

http://hbase.apache.org/book.html#required_configuration

... and just double-check that you don't have scheduled major
compactions going off once a day (the default).
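
For reference, a hedged sketch of how time-based major compactions are
usually controlled; it is not part of the original reply. It assumes the
hbase.hregion.majorcompaction property (an interval in milliseconds,
where 86400000 is the once-a-day default referred to above), and that a
value of 0 turns the schedule off so major compactions run only when
triggered manually.

```xml
<!-- hbase-site.xml: disable time-based major compactions -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <!-- 0 = no scheduled major compactions; default is 86400000 (1 day) -->
  <value>0</value>
</property>
```

With the schedule off, major compactions have to be triggered by hand or
by a scheduled job at a quiet time, so this trades automatic housekeeping
for predictability during the load.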



On 9/3/11 7:54 PM, "Jack Levin" <ma...@gmail.com> wrote:

>Make hbase.hregion.max.filesize to be very large. Then your regions
>won't split.  We use this method when copying 'live' hbase to make a
>backup.
>
>-Jack


Re: prevent region splits?

Posted by Jack Levin <ma...@gmail.com>.
Set hbase.hregion.max.filesize to a very large value. Then your regions
won't split.  We use this method when copying a 'live' hbase to make a
backup.

-Jack
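
A sketch of what this might look like in hbase-site.xml. The value is
illustrative only (100 GB expressed in bytes), picked to be far above any
region size expected during the job; as far as I know the default in
0.90-era releases is 256 MB. Putting the previous value back once the job
completes re-enables normal splitting.

```xml
<!-- hbase-site.xml: raise the split threshold so regions effectively
     never reach it during the bulk Put job -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- illustrative value: 100 * 1024^3 bytes = 100 GB -->
  <value>107374182400</value>
</property>
```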
