Posted to user@hbase.apache.org by Adrian Popescu <po...@yahoo.com> on 2010/03/10 00:54:11 UTC

non-uniform region splits for index tables

Hello,

I've observed that when creating secondary indexes, their corresponding tables are not necessarily partitioned uniformly into regions (e.g. one region of a table has 100MB, another one has 200MB); the workaround that I found is to force a split on the table. However, my belief was that HBase should balance the data automatically. Is there any particular reason that this balancing doesn't happen automatically on secondary index tables, or am I missing some setting?

Thank you,
Adrian


      

Re: non-uniform region splits for index tables

Posted by Adrian Popescu <po...@yahoo.com>.
Good point! I took the size of the entire store (storefileSizeMB from the regionserver's web interface), so it makes sense that the split didn't happen.

Thanks,
Adrian




________________________________
From: Stack <st...@duboce.net>
To: hbase-user@hadoop.apache.org
Sent: Wed, March 10, 2010 5:01:56 PM
Subject: Re: non-uniform region splits for index tables

Hey Adrian:

How'd you measure the region size?  Did you take the size of
an individual storefile or of the region as a whole?  A region will
split as soon as any individual file under the region dir gets bigger
than hbase.hregion.max.filesize.

St.Ack




      

Re: non-uniform region splits for index tables

Posted by Stack <st...@duboce.net>.
Hey Adrian:

How'd you measure the region size?  Did you take the size of
an individual storefile or of the region as a whole?  A region will
split as soon as any individual file under the region dir gets bigger
than hbase.hregion.max.filesize.

St.Ack
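
To make the distinction in this message concrete, here is a minimal Python sketch of the rule as stated: the check is made per store file, not on the summed size of the region. The function name should_split and the file sizes are made up for illustration, and this is only a model of the rule as described here, not HBase's actual code.

```python
MB = 1024 * 1024

# Threshold taken from the thread: hbase.hregion.max.filesize set to 128MB.
MAX_FILESIZE = 128 * MB


def should_split(storefile_sizes, max_filesize=MAX_FILESIZE):
    """Return True if any single store file is larger than the threshold.

    Models the rule as described in the thread; the name is hypothetical.
    """
    return any(size > max_filesize for size in storefile_sizes)


# Hypothetical region: three store files adding up to 200MB.
files = [80 * MB, 70 * MB, 50 * MB]

print(sum(files) > MAX_FILESIZE)  # summed size is over the threshold
print(should_split(files))        # but no single file is, so no split
```

Under this model a region whose files add up to 200MB does not split with a 128MB limit, as long as each individual file stays below that limit.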

On Wed, Mar 10, 2010 at 2:52 AM, Adrian Popescu
<po...@yahoo.com> wrote:
> Sorry, forgot to say that I've changed the hregion filesize to 128MB in hbase-site.xml as follows:
>
>    <name>hbase.hregion.max.filesize</name>
>    <value>134217728</value>
>
> Is there any other setting that I'm missing?
>
> thank you,
> Adrian

Re: non-uniform region splits for index tables

Posted by Adrian Popescu <po...@yahoo.com>.
Sorry, forgot to say that I've changed the hregion filesize to 128MB in hbase-site.xml as follows:

    <name>hbase.hregion.max.filesize</name>
    <value>134217728</value>

Is there any other setting that I'm missing?

thank you,
Adrian
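
As a quick check on the value above, 128MB expressed in bytes is 128 * 1024 * 1024 = 134217728. The small Python sketch below does that arithmetic and renders the name/value pair wrapped in a <property> element, which is the usual shape of an entry inside <configuration> in hbase-site.xml. The helper name max_filesize_property is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET


def max_filesize_property(megabytes):
    """Render a <property> element for hbase.hregion.max.filesize.

    The size is given in MB and written out in bytes. Hypothetical helper,
    used here only to show the arithmetic and the element layout.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hbase.hregion.max.filesize"
    ET.SubElement(prop, "value").text = str(megabytes * 1024 * 1024)
    return ET.tostring(prop, encoding="unicode")


print(max_filesize_property(128))
```

The printed element carries the value 134217728, matching the setting quoted above.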




________________________________
From: Jean-Daniel Cryans <jd...@apache.org>
To: hbase-user@hadoop.apache.org
Sent: Wed, March 10, 2010 12:57:15 AM
Subject: Re: non-uniform region splits for index tables

A region split happens when a family grows bigger than 256MB (by
default); if you have less than that in a whole table then it will be
hosted in a single region.

J-D




      

Re: non-uniform region splits for index tables

Posted by Jean-Daniel Cryans <jd...@apache.org>.
A region split happens when a family grows bigger than 256MB (by
default); if you have less than that in a whole table then it will be
hosted in a single region.

J-D
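
A toy model may help show why regions of 100MB and 200MB can sit side by side without anything being wrong. The Python sketch below assumes data arrives in fixed-size chunks at the end of the key space and that a region is cut into two equal halves once it grows past the limit. Both are simplifying assumptions: the function simulate_load and its parameters are invented for this example, and the model ignores store files, compactions and how HBase really picks the split point.

```python
def simulate_load(total_mb, chunk_mb=10, max_mb=256):
    """Toy model of size-based splitting (not HBase's real algorithm).

    Writes always land in the last region; once that region exceeds
    max_mb it is replaced by two halves. Returns the region sizes in MB.
    Assumes total_mb is a multiple of chunk_mb.
    """
    regions = [0]
    written = 0
    while written < total_mb:
        regions[-1] += chunk_mb
        written += chunk_mb
        if regions[-1] > max_mb:
            size = regions.pop()
            half = size // 2
            regions.extend([half, size - half])
    return regions


print(simulate_load(200))  # stays below the limit: one region
print(simulate_load(300))  # one split, then the last region keeps growing
```

In this model a 200MB table stays in a single region, and a 300MB table ends up with two regions of different sizes, both under the limit, so neither splits again until more data arrives.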

On Tue, Mar 9, 2010 at 3:54 PM, Adrian Popescu
<po...@yahoo.com> wrote:
> Hello,
>
> I've observed that when creating secondary indexes, their corresponding tables are not necessarily partitioned uniformly into regions (e.g. one region of a table has 100MB, another one has 200MB); the workaround that I found is to force a split on the table. However, my belief was that HBase should balance the data automatically. Is there any particular reason that this balancing doesn't happen automatically on secondary index tables, or am I missing some setting?
>
> Thank you,
> Adrian
>
>
>