
Posted to user@hbase.apache.org by Dejan Menges <de...@gmail.com> on 2015/04/28 11:22:06 UTC

Regions balancing

Hi,

We have an HBase cluster with multiple tables in it. Today we hit an
interesting issue: one of our jobs started failing with
OutOfOrderScannerNextException.

As this job was basically reading data from only one specific table on
one specific cluster and doing something with it, I checked the table's
properties and saw that, to put it nicely, its regions are very
imbalanced. For example, one node hosted 40 regions while another hosted
only 8. After I took down the node with 40 regions and forced a rebalance
that way, the job passed fine.

However, when I brought it back, the same node got 30 regions this time,
which is still, in the best case, 40% more than the node with the
next-largest number of regions for the same table.

Is there any way to balance this?

Thanks a lot,
Dejan

Re: Regions balancing

Posted by Dejan Menges <de...@gmail.com>.

Cool, thanks a lot for the very useful information, I really appreciate
it.

I'll take a look in the Hortonworks knowledge base, as I have an account
there, and eventually check with them whether this is maybe going to be
part of one of the patch sets for 2.1. We installed the latest one, 2.1.10,
which really fixed a bunch of issues that we recently had.

Re: Regions balancing

Posted by Michael Segel <mi...@hotmail.com>.

No, sorry.  I cannot. 

I came across it at a client and since I’m not at that client, I don’t have access to my emails with Hortonworks.

I do remember having this issue and when I looked at the JIRA, it was solved in a later release. Andrew P. solved it. 
(It had to do with someone writing sloppy re-entrant / recursive code)

When speaking with Hortonworks support, they indicated that we should upgrade everything to 2.2, not just HBase.
The other issue: upgrades from HDP 2.1 to 2.2 or later aren’t straightforward, although they do promise that upgrades from 2.2 to 2.3 and beyond will be easier.

I would suggest figuring out an upgrade path to see if you will want to go to 2.2 or wait a couple of weeks for 2.3 when it comes out. 
Please contact Hortonworks support and they should be able to help you further. 

Outside of HBase, fixes to Ranger and to Ambari make things a bit nicer. Of course, having them support later releases of HUE would be nice too, but that’s a different story. ;-)


Re: Regions balancing

Posted by Dejan Menges <de...@gmail.com>.

Hi Michael,

Can you please point me to the exact bug?

Thanks a lot!

Re: Regions balancing

Posted by Michael Segel <mi...@hotmail.com>.

You need to upgrade. 
You hit a known bug. 


Re: Regions balancing

Posted by Dejan Menges <de...@gmail.com>.

Currently using 0.98.0 (or Hortonworks 2.1.10, with a bunch of patches that
fixed all my questions in the last two months here on the list :))

Thanks, going to take a look.


Re: Regions balancing

Posted by Ted Yu <yu...@gmail.com>.

Which HBase release are you using?

Please take a look at StochasticLoadBalancer#TableSkewCostFunction.
You can increase the weight for
"hbase.master.balancer.stochastic.tableSkewCost".

Cheers
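
A rough sketch of what the suggestion above might look like in practice,
not taken from the thread: the Python snippet below only renders an
hbase-site.xml style property entry for the tableSkewCost key. The weight
of 100 is an arbitrary placeholder, the helper name hbase_property is
hypothetical, and whether a given HBase release honors this key (or needs
a master restart to pick it up) is an assumption to verify against your
version.

```python
from xml.sax.saxutils import escape


def hbase_property(name, value):
    """Render one <property> entry in the hbase-site.xml layout."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(str(value))}</value>\n"
        "</property>"
    )


# Key quoted in the message above.
TABLE_SKEW_COST_KEY = "hbase.master.balancer.stochastic.tableSkewCost"

# Placeholder weight, chosen only for illustration; tune for your cluster.
EXAMPLE_WEIGHT = 100

snippet = hbase_property(TABLE_SKEW_COST_KEY, EXAMPLE_WEIGHT)
print(snippet)
```

The printed entry would go inside the <configuration> element of
hbase-site.xml on the master.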


Re: Regions balancing

Posted by Dejan Menges <de...@gmail.com>.

And one more follow-up: I know about hbase.regions.slop, which in our
case is the default, 0.2.

So in this specific scenario, with 15 region servers in total and one
table with 225 regions in total, how can we avoid some region servers
serving 10 of its regions and some 29 (at least it's not 40 anymore)?
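
To put numbers on the question above, here is a minimal Python sketch of
the range that a slop of 0.2 would allow around the per-table average. It
assumes the slop is applied symmetrically, as floor(average * (1 - slop))
and ceil(average * (1 + slop)), and that it is evaluated against this one
table's regions. Neither assumption is confirmed in this thread, and the
balancer may well look at cluster-wide region counts instead. The function
name slop_bounds is hypothetical.

```python
import math
from fractions import Fraction


def slop_bounds(total_regions, num_servers, slop):
    """Return (average, floor, ceiling) regions per server for a given slop.

    Assumption: bounds are average * (1 -/+ slop), rounded outward.
    Fractions are used so the rounding is not affected by float error.
    """
    average = Fraction(total_regions, num_servers)
    slop = Fraction(str(slop))
    low = math.floor(average * (1 - slop))
    high = math.ceil(average * (1 + slop))
    return average, low, high


# Figures from the message: 225 regions, 15 region servers, slop 0.2.
average, low, high = slop_bounds(225, 15, 0.2)
print(f"average={average}, allowed range={low}..{high}")

# Region counts reported for the least and most loaded servers.
for observed in (10, 29):
    status = "inside" if low <= observed <= high else "outside"
    print(f"{observed} regions is {status} the range")
```

Under these assumptions the average is 15 regions per server and the
allowed range is 12 to 18, so both reported counts (10 and 29) fall
outside it.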
