Posted to user@hbase.apache.org by jackwang <wa...@gmail.com> on 2016/03/16 00:43:45 UTC

why Hbase only split regions in one RegionServer

I was writing 300 GiB of data to my HBase table user_info, which I created
with the defaults, so it had only one region. While the write was in
progress, I saw the one region become two, and later on eight. What confuses
me is that all 8 regions stayed on the same RegionServer.

Why didn't HBase move the split regions to different RegionServers? By the
way, I have 10 physical RegionServers in my HBase cluster, and the region
size I set is 20 GiB. Thanks!



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/why-Hbase-only-split-regions-in-one-RegionServer-tp4078497.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: why Hbase only split regions in one RegionServer

Posted by Dave Latham <la...@davelink.net>.
You're definitely better off today if you know what your data looks like
and are able to set up your table ahead of time accordingly.  But I think
this is an area where HBase can and should do better.  We should try to
lower the bar and make it simpler for people to get started.

HBase would likely eventually balance things out, but if it's unable to do
that while the single server is under load, then it's not serving this case
for newcomers very well.

On Wed, Mar 16, 2016 at 8:07 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> It's usually better to do a bit more work upstream to understand your table
> schema and keys than to fight problems later.
>
> HBase will handle the case as soon as the compactions are done. It's just
> not the recommended way to proceed. Better to prevent the problem than to
> try to fix it.
>
> JMS

Re: why Hbase only split regions in one RegionServer

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
It's usually better to do a bit more work upstream to understand your table
schema and keys than to fight problems later.

HBase will handle the case as soon as the compactions are done. It's just
not the recommended way to proceed. Better to prevent the problem than to
try to fix it.

JMS

2016-03-16 10:54 GMT-04:00 Dave Latham <la...@davelink.net>:

> What if someone doesn't know the distribution of their row keys?
> HBase should be able to handle this case.

Re: why Hbase only split regions in one RegionServer

Posted by Dave Latham <la...@davelink.net>.
What if someone doesn't know the distribution of their row keys?
HBase should be able to handle this case.

On Wed, Mar 16, 2016 at 7:18 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> The balancer does not move regions that are compacting, right? He is just
> pushing too much load onto a non-split table, which will keep splitting and
> compacting like crazy until the balancer gets a chance to act.
>
> Pre-split / Balance. Problem solved.
>
> Jack, once the data ingestion is done, does the table get balanced? Are
> there any other tables on the cluster? How many regions per region server?
>
> JMS

Re: why Hbase only split regions in one RegionServer

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
The balancer does not move regions that are compacting, right? He is just
pushing too much load onto a non-split table, which will keep splitting and
compacting like crazy until the balancer gets a chance to act.

Pre-split / Balance. Problem solved.

Jack, once the data ingestion is done, does the table get balanced? Are
there any other tables on the cluster? How many regions per region server?

JMS

2016-03-16 10:16 GMT-04:00 Ted Yu <yu...@gmail.com>:

> In Jack's case, even if the table was pre-split, after loading some data,
> if balancer didn't run, the regions would still be out of balance.
>
> We should help Jack find out the cause for imbalance of regions.

Re: why Hbase only split regions in one RegionServer

Posted by Ted Yu <yu...@gmail.com>.
In Jack's case, even if the table was pre-split, after loading some data,
if balancer didn't run, the regions would still be out of balance.

We should help Jack find out the cause for imbalance of regions.
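
To make the point about balancing concrete, here is a toy Python sketch of a
count-based rebalance. It is only an illustration: the greedy rule below is
not HBase's actual balancer algorithm, and the server names and the `balance`
helper are hypothetical. It shows that regions piled on one server stay there
until some balancing step explicitly moves them.

```python
def balance(counts):
    """Greedily move one region at a time from the busiest to the idlest server.

    `counts` maps a server name to its number of regions and is updated in
    place. Returns the list of (source, destination) moves that were made.
    This is a toy rule for illustration, not HBase's balancer.
    """
    moves = []
    while True:
        busiest = max(counts, key=counts.get)
        idlest = min(counts, key=counts.get)
        # Stop once no server holds more than one region above any other.
        if counts[busiest] - counts[idlest] <= 1:
            return moves
        counts[busiest] -= 1
        counts[idlest] += 1
        moves.append((busiest, idlest))


if __name__ == "__main__":
    # Jack's situation: 8 regions on one of 10 RegionServers.
    counts = {"rs{}".format(i): 0 for i in range(1, 11)}
    counts["rs1"] = 8
    moves = balance(counts)
    print(len(moves), "moves;", counts)
```

If no such step runs, the counts never change, which is why it matters to
find out whether the balancer was on and whether it actually ran.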

On Wed, Mar 16, 2016 at 4:17 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> +1 with what Heng said. I think we should just deprecate the ability to not
> pre-split a table ;) It's always good to pre-split it based on your key
> design...

Re: why Hbase only split regions in one RegionServer

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
+1 with what Heng said. I think we should just deprecate the ability to not
pre-split a table ;) It's always good to pre-split it based on your key
design...
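
As a rough illustration of pre-splitting based on key design, here is a
minimal Python sketch. It assumes row keys begin with a uniformly distributed
two-digit hex prefix (for example from hashing a user ID); that is an
assumption, not something stated in this thread, and the column family name
`cf` and both helper names are hypothetical. It computes evenly spaced split
points and formats them as an HBase shell `create` statement with `SPLITS`.

```python
def hex_split_points(num_regions, width=2):
    """Return num_regions - 1 evenly spaced hex prefixes of `width` digits."""
    if num_regions < 2:
        return []
    space = 16 ** width  # number of distinct prefixes of this width
    if num_regions > space:
        raise ValueError("more regions than distinct prefixes")
    return [format(i * space // num_regions, "0{}x".format(width))
            for i in range(1, num_regions)]


def shell_create_statement(table, family, split_points):
    """Format an HBase shell `create` statement using explicit split points."""
    quoted = ", ".join("'{}'".format(p) for p in split_points)
    return "create '{}', '{}', SPLITS => [{}]".format(table, family, quoted)


if __name__ == "__main__":
    # One region per RegionServer in a 10-server cluster (assumed target).
    points = hex_split_points(10)
    print(shell_create_statement("user_info", "cf", points))
```

If the keys are not uniformly distributed, evenly spaced split points will
not spread the load evenly; the split points then have to come from the
actual key distribution.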

2016-03-16 0:17 GMT-04:00 Heng Chen <he...@gmail.com>:

> bq. the table I created by default having only one region
>
> Why not pre-split the table into more regions when you create it?

Re: why Hbase only split regions in one RegionServer

Posted by Heng Chen <he...@gmail.com>.
bq. the table I created by default having only one region

Why not pre-split the table into more regions when you create it?

2016-03-16 11:38 GMT+08:00 Ted Yu <yu...@gmail.com>:

> When a region is split in two, both daughter regions are opened on the
> same server where the parent region was open.
>
> Can you provide a bit more information:
>
> - the release of HBase you are running
> - whether the balancer was turned on (you can inspect the master log to see
>   if the balancer was on)
>
> Consider pastebinning a portion of the master log.
>
> Thanks

Re: why Hbase only split regions in one RegionServer

Posted by Ted Yu <yu...@gmail.com>.
When a region is split in two, both daughter regions are opened on the
same server where the parent region was open.
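
A small Python sketch of that behaviour, as a toy model only: the region and
server names and both helpers are hypothetical, and nothing here calls HBase.
Starting from one region on one of 10 servers, three rounds of splitting give
8 regions that all remain on the original server.

```python
def split_region(assignment, region):
    """Split `region` into two daughters that stay on the parent's server."""
    server = assignment.pop(region)
    daughters = (region + "a", region + "b")
    for daughter in daughters:
        assignment[daughter] = server
    return daughters


def regions_per_server(assignment, servers):
    """Count how many regions each server currently hosts."""
    counts = {server: 0 for server in servers}
    for server in assignment.values():
        counts[server] += 1
    return counts


if __name__ == "__main__":
    servers = ["rs{}".format(i) for i in range(1, 11)]
    assignment = {"r": "rs1"}  # region name -> hosting server
    for _ in range(3):  # three rounds of splitting: 1 -> 2 -> 4 -> 8
        for region in list(assignment):
            split_region(assignment, region)
    print(regions_per_server(assignment, servers))
```

In this model the other nine servers stay empty because splitting never
reassigns a region; spreading them out is a separate step.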

Can you provide a bit more information:

- the release of HBase you are running
- whether the balancer was turned on (you can inspect the master log to see
  if the balancer was on)

Consider pastebinning a portion of the master log.

Thanks

On Tue, Mar 15, 2016 at 4:43 PM, jackwang <wa...@gmail.com> wrote:

> I was writing 300 GiB of data to my HBase table user_info, which I created
> with the defaults, so it had only one region. While the write was in
> progress, I saw the one region become two, and later on eight. What confuses
> me is that all 8 regions stayed on the same RegionServer.
>
> Why didn't HBase move the split regions to different RegionServers? By the
> way, I have 10 physical RegionServers in my HBase cluster, and the region
> size I set is 20 GiB. Thanks!