Posted to user@hbase.apache.org by Jack Levin <ma...@gmail.com> on 2010/10/22 17:47:09 UTC

large store file split

I am trying to split a 20 GB region file and am getting timeouts; see below:

2010-10-22 08:41:44,851 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.
2010-10-22 08:44:06,065 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback of failed split of test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.; Timed out trying to locate root region
2010-10-22 08:44:06,066 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Successful rollback of failed split of test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.


Is there a way to adjust some parameters to have this finish?

-Jack
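
For reference, the gap between the "Starting split" entry and the "Running rollback" entry in the log above is a little over two minutes. A minimal Python sketch of that arithmetic follows; it only parses the two timestamps quoted in the log, and the format string and variable names are assumptions made for illustration, not anything from HBase itself.

```python
from datetime import datetime

# Timestamp layout as it appears in the log lines above
# (comma before the milliseconds).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

split_started = datetime.strptime("2010-10-22 08:41:44,851", LOG_TS_FORMAT)
rollback_started = datetime.strptime("2010-10-22 08:44:06,065", LOG_TS_FORMAT)

# Seconds between the start of the split and the start of the rollback.
elapsed = (rollback_started - split_started).total_seconds()
print(f"{elapsed:.3f}")  # prints 141.214
```

So the split attempt waited roughly 141 seconds before giving up on locating the root region. Which configuration setting governs that wait is not stated in this thread.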

RE: large store file split

Posted by Jonathan Gray <jg...@facebook.com>.
Hey Jack,

Seems like you're getting a lot of strange ZooKeeper behavior.

How many nodes are you running in your quorum?  Do you have any weird networking issues?

Check out the ZK server logs as well and see if there's anything suspicious going on in there.

Also, if you enable ZK debug logging on the HBase side, you'll see the session ids of all these clients that seem to be out of sync.  You can see which server each one connects to, match that up with that server's logs, and try to figure out whether the clients getting odd results out of ZK have anything in common.

JG
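
One way to do the matching suggested above is to pull the session ids out of the client-side logs and group them by the ZooKeeper server they connected to. The Python sketch below is only an illustration: it assumes the client logs contain lines of the form "Session establishment complete on server <host>/<ip>:<port>, sessionid = 0x..., negotiated timeout = ...", and the sample lines, host names, and function name are hypothetical. Adjust the pattern to whatever your logs actually print.

```python
import re
from collections import defaultdict

# Assumed shape of the client-side connection line; verify against real logs.
SESSION_RE = re.compile(
    r"Session establishment complete on server (?P<server>[^,\s]+), "
    r"sessionid = (?P<sid>0x[0-9a-fA-F]+)"
)


def sessions_by_server(lines):
    """Group ZooKeeper session ids by the server each client connected to."""
    grouped = defaultdict(list)
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            grouped[match.group("server")].append(match.group("sid"))
    return dict(grouped)


# Hypothetical sample lines, made up for this example.
sample_lines = [
    "INFO ClientCnxn: Session establishment complete on server "
    "zk1.example.com/10.0.0.1:2181, sessionid = 0x12bd0001, negotiated timeout = 40000",
    "INFO ClientCnxn: Session establishment complete on server "
    "zk2.example.com/10.0.0.2:2181, sessionid = 0x22bd0002, negotiated timeout = 40000",
    "INFO ClientCnxn: Session establishment complete on server "
    "zk1.example.com/10.0.0.1:2181, sessionid = 0x12bd0003, negotiated timeout = 40000",
    "INFO SplitTransaction: unrelated line",
]

result = sessions_by_server(sample_lines)
for server, session_ids in sorted(result.items()):
    print(server, session_ids)
```

With the sessions grouped per server, you can search that server's own log for each session id and compare what it recorded for the clients that are misbehaving.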

> -----Original Message-----
> From: Jack Levin [mailto:magnito@gmail.com]
> Sent: Friday, October 22, 2010 12:23 PM
> To: user@hbase.apache.org
> Cc: user@hbase.apache.org
> Subject: Re: large store file split
> 
> Yes exactly
> 
> -Jack

Re: large store file split

Posted by Jack Levin <ma...@gmail.com>.
Yes exactly

-Jack


On Oct 22, 2010, at 10:49 AM, Stack <st...@duboce.net> wrote:

> That's all that is in the log file?  You run at DEBUG level, right?
> Was that regionserver working fine otherwise?  Just failing the split
> because it couldn't "find" root?
> 
> St.Ack

Re: large store file split

Posted by Stack <st...@duboce.net>.
That's all that is in the log file?  You run at DEBUG level, right?
Was that regionserver working fine otherwise?  Just failing the split
because it couldn't "find" root?

St.Ack

On Fri, Oct 22, 2010 at 10:39 AM, Jack Levin <ma...@gmail.com> wrote:
> Everything else is humming along nicely... regions are loaded, and
> there are no issues.
>
> -Jack
>
> PS. I was finally able to split it by running split 'table' a couple of times.

Re: large store file split

Posted by Jack Levin <ma...@gmail.com>.
Everything else is humming along nicely... regions are loaded, and
there are no issues.

-Jack

PS. I was finally able to split it by running split 'table' a couple of times.

On Fri, Oct 22, 2010 at 10:26 AM, Stack <st...@duboce.net> wrote:
> The root region is not online according to the below.  Is that the case?
> St.Ack

Re: large store file split

Posted by Stack <st...@duboce.net>.
The root region is not online according to the below.  Is that the case?
St.Ack

On Fri, Oct 22, 2010 at 8:47 AM, Jack Levin <ma...@gmail.com> wrote:
> I am trying to split a 20 GB region file and am getting timeouts; see below:
>
> 2010-10-22 08:41:44,851 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.
> 2010-10-22 08:44:06,065 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback of failed split of test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.; Timed out trying to locate root region
> 2010-10-22 08:44:06,066 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Successful rollback of failed split of test_bulk7_tsv,ds18115092010.th.jpg,1287730617803.07eb62bf729e1f9cbb39e8fbefe2a1e0.
>
>
> Is there a way to adjust some parameters to have this finish?
>
> -Jack
>