Posted to user@hbase.apache.org by Iulia Zidaru <iu...@1and1.ro> on 2011/03/18 12:08:17 UTC

Table distribution


Hi all,
We are using ASF HBase 0.90 with the Cloudera distribution for HDFS (cdh3b3). We have a cluster of 6 machines, with 1135 regions on each machine.
We have many tables, each with regions on many nodes. We've created a new table and started to load it. The other tables are not used anymore. The problem is that all regions of the new table (about 150) are on the same machine, and that machine is heavily loaded! The cluster as a whole is still well distributed (the same number of regions on each machine), but it seems that only regions of the old tables are being redistributed. Also, the new table's data is distributed in HDFS across the entire cluster.
Do you have any idea what is wrong with it?
Thank you,
Iulia






-- 
Iulia Zidaru
Java Developer

1&1 Internet AG - Bucharest/Romania - Web Components Romania
18 Mircea Eliade St
Sect 1, Bucharest
RO Bucharest, 012015
iulia.zidaru@1and1.ro
0040 31 223 9153
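
[Editorial illustration] The situation described above, where every machine holds the same number of regions while one table sits entirely on a single machine, can be shown with a small sketch. The table names, server names, and counts below are hypothetical and much smaller than the cluster in the message. This is plain Python over made-up data, not a call into any HBase API.

```python
from collections import Counter, defaultdict

# Hypothetical (table, server) pairs, one per region. Names and counts are
# assumptions for illustration only.
assignment = (
    [("new_table", "rs1")] * 4
    + [("old_table", "rs2")] * 4
    + [("old_table", "rs3")] * 4
)

def regions_per_server(assignment):
    """Total region count per server: the cluster-wide view."""
    return Counter(server for _table, server in assignment)

def regions_per_table(assignment):
    """Per-table breakdown of region counts by server."""
    spread = defaultdict(Counter)
    for table, server in assignment:
        spread[table][server] += 1
    return spread

totals = regions_per_server(assignment)
by_table = regions_per_table(assignment)

print(dict(totals))
print({table: dict(counts) for table, counts in by_table.items()})
```

In this toy layout the per-server totals are identical, so a check that looks only at region counts sees a balanced cluster, while the per-table view shows that all load for new_table lands on rs1.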

  


Re: Table distribution

Posted by Iulia Zidaru <iu...@1and1.ro>.
  Thank you Erdem,
I've stopped the busy RegionServer and the regions were redistributed, 
but I still don't know why that happened.


On 03/18/2011 01:46 PM, Erdem Agaoglu wrote:
> Hi,
>
> I came across the exact same issue a while ago. I am no expert, but it
> seems the balancer is not reassigning newly split regions, in order to reduce
> client disconnect problems, and selects other regions on the RS instead. I
> have seen a JIRA about this behaviour but I can't remember the id. Anyway,
> this is not a real solution, but I worked around it by disabling unused tables.
> Still, I too am curious about an expert opinion.
>
> --
> erdem
>



  


Re: Table distribution

Posted by Iulia Zidaru <iu...@1and1.ro>.
  Thank you Mike, I'm looking forward to this fix. I hope the problem won't 
impact us very much, as it hasn't so far.


On 03/18/2011 03:52 PM, Michael Segel wrote:
> This is a known bug and is fixed in 0.90.2. Cloudera's CDH3B4 is 0.90.1.
> They are aware of this issue, and if it impacts you, you may want to ask them to raise its priority so that Todd L. backports the fix.
> (Now Todd is going to hate me for making him do more work :-) )
>
> -Mike



  


RE: Table distribution

Posted by Michael Segel <mi...@hotmail.com>.
This is a known bug and is fixed in 0.90.2. Cloudera's CDH3B4 is 0.90.1.
They are aware of this issue, and if it impacts you, you may want to ask them to raise its priority so that Todd L. backports the fix.
(Now Todd is going to hate me for making him do more work :-) )

-Mike


> Date: Fri, 18 Mar 2011 13:46:30 +0200
> Subject: Re: Table distribution
> From: erdem.agaoglu@gmail.com
> To: user@hbase.apache.org
> 
> Hi,
> 
> I came across the exact same issue a while ago. I am no expert, but it
> seems the balancer is not reassigning newly split regions, in order to reduce
> client disconnect problems, and selects other regions on the RS instead. I
> have seen a JIRA about this behaviour but I can't remember the id. Anyway,
> this is not a real solution, but I worked around it by disabling unused tables.
> Still, I too am curious about an expert opinion.
> 
> --
> erdem
> 

Re: Table distribution

Posted by Erdem Agaoglu <er...@gmail.com>.
Hi,

I came across the exact same issue a while ago. I am no expert, but it
seems the balancer is not reassigning newly split regions, in order to reduce
client disconnect problems, and selects other regions on the RS instead. I
have seen a JIRA about this behaviour but I can't remember the id. Anyway,
this is not a real solution, but I worked around it by disabling unused tables.
Still, I too am curious about an expert opinion.

--
erdem




-- 
erdem agaoglu
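
[Editorial illustration] The behaviour Erdem describes, a balancer that evens out region counts by moving regions other than the newly split ones, can be sketched with a toy model. This is an assumption-laden simplification and not the actual HBase balancer code: the function name, the "oldest first" rule, and all data below are hypothetical. New regions appear on rs1 (split daughters stay on the server that hosted the parent), and after each split the toy balancer moves the oldest regions away from the busiest server.

```python
def balance_oldest_first(servers):
    """Toy balancer, NOT HBase's real algorithm: even out region counts by
    moving the oldest region from the busiest server to the idlest one.
    `servers` maps a server name to a list of (table, created_at) tuples."""
    while True:
        busiest = max(servers, key=lambda s: len(servers[s]))
        idlest = min(servers, key=lambda s: len(servers[s]))
        if len(servers[busiest]) - len(servers[idlest]) <= 1:
            return
        servers[busiest].sort(key=lambda region: region[1])  # oldest first
        servers[idlest].append(servers[busiest].pop(0))

# Hypothetical starting layout: old regions everywhere, one new region on rs1.
servers = {
    "rs1": [("old", 0)] * 4 + [("new", 10)],
    "rs2": [("old", 0)] * 4,
    "rs3": [("old", 0)] * 4,
}

# Five splits of the new table, each adding a region on rs1, followed by
# a balancer run.
for created_at in range(11, 16):
    servers["rs1"].append(("new", created_at))
    balance_oldest_first(servers)

counts = {name: len(regions) for name, regions in servers.items()}
new_on = {
    name: sum(1 for table, _ in regions if table == "new")
    for name, regions in servers.items()
}
print(counts)
print(new_on)
```

Under these assumptions every server ends up with the same number of regions, yet all regions of the new table remain on rs1, because only old regions were ever selected for moving. Disabling the unused tables, as in the workaround above, removes those old regions from the pool the balancer can pick from.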