Posted to user@hbase.apache.org by divye sheth <di...@gmail.com> on 2014/03/04 14:09:38 UTC

DFS Balancer with Hbase

Hi,

We are seeing that our cluster has become imbalanced: some servers are at
~90% disk utilization while others are at ~40%.

We would like to run the DFS balancer, but some initial searching turned up
suggestions not to run it on an HBase cluster. Could anyone explain why we
should not run the DFS balancer on an HBase cluster? And if we cannot, is
there another way to balance the cluster?

HBase version: 0.94.2
Hadoop version: 0.20.2 + append patch r1056497

P.S. We are planning to upgrade, but until then please do help me with a
way to balance the cluster.

Thanks
Divye Sheth

Re: DFS Balancer with Hbase

Posted by lars hofhansl <la...@apache.org>.
That's a surprising change in behavior.
It seems like it was intentional in HBASE-6849, but it will catch folks by surprise.

-- Lars




Re: DFS Balancer with Hbase

Posted by Bharath Vissapragada <bh...@cloudera.com>.
Thanks for correcting me, Lars. Yes, it's enabled by default in 0.94.2 but
disabled in trunk after HBASE-6849.

Divye, you can still check the block locality index, do a major compaction,
and see how it goes.






-- 
Bharath Vissapragada
<http://www.cloudera.com>

Re: DFS Balancer with Hbase

Posted by lars hofhansl <la...@apache.org>.
Looks like it's on by default.




Re: DFS Balancer with Hbase

Posted by Bharath Vissapragada <bh...@cloudera.com>.
Yes, it's included in 0.94.2. Add this property to the master's
hbase-site.xml; it requires a master restart. Allow the balancer to run,
make sure the new assignment is in place, and then run a major compaction
during a maintenance window. Running a major compaction at least once a
week is suggested, since it clears out the data corresponding to deletes
(which in your case is useful) and also improves the block locality index
for region server local reads.
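
If it helps, the weekly compaction could be driven from cron. A
hypothetical crontab entry might look like the following (the table name
and the hbase install path are placeholders; `major_compact` is the
standard HBase shell command):

```
# m h dom mon dow  command -- run during a Sunday 02:00 maintenance window
0 2 * * 0  echo "major_compact 'my_table'" | /usr/local/hbase/bin/hbase shell
```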





-- 
Bharath Vissapragada
<http://www.cloudera.com>

Re: DFS Balancer with Hbase

Posted by divye sheth <di...@gmail.com>.
Thanks Bharath, I had a look at the JIRA and the fix version is 0.94.0.
Since I am using 0.94.2, I assume this is already part of HBase. Is that
assumption correct? If not, do I have to make these changes in hbase-site.xml?

Note: we have not run a major compaction on any of the tables so far. Will
doing so mitigate the issue to some extent?

Thanks
Divye Sheth



Re: DFS Balancer with Hbase

Posted by Bharath Vissapragada <bh...@cloudera.com>.
Did you check the per-table balancer (HBASE-3373)? Set
hbase.master.loadbalance.bytable=true.

The default load balancer balances on the overall region-count metric,
which can result in all the big regions from a single table landing on one
RS and overloading it. This might be one of the reasons, and you can
confirm it from the current region assignment.

Do a major compaction after enabling this setting and after the regions are
balanced, so that the newly written HFiles are uniformly distributed.
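
For reference, enabling it is a one-property change; in 0.94 this goes in
the master's hbase-site.xml and takes effect after a master restart:

```xml
<!-- hbase-site.xml on the HBase master -->
<property>
  <name>hbase.master.loadbalance.bytable</name>
  <value>true</value>
</property>
```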







-- 
Bharath Vissapragada
<http://www.cloudera.com>

Re: DFS Balancer with Hbase

Posted by divye sheth <di...@gmail.com>.
Thanks Jean-Marc, but why do only a couple of region servers get loaded
with data? Out of our 5 datanodes, 2 are at around 90% disk usage, whereas
the rest are at around 40%.

We have run the HBase balancer; on average we had around 500 regions per
region server across a total of 5 RSs. We have since disabled a number of
tables that are not required, and currently the count is around 120
regions per RS.

Another question that comes to mind: somewhere down the line the Hadoop
cluster tends to become imbalanced again and heads toward 100% disk
utilization, at which point the balancer has to be triggered. How do you
handle this problem in your HBase clusters?

Just a thought: could we execute the DFS balancer and, after the balancing
completes, trigger a major compaction for each table?

Thanks
Divye Sheth



Re: DFS Balancer with Hbase

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Divye,

The DFS balancer is the last thing you want to run on an HBase cluster. It
will break the data locality of all the compacted regions.

On compaction, a region server writes the new files to its local datanode
first; the two other replicas go to different datanodes. So on reads,
HBase can get the data from the local datanode instead of fetching it from
another datanode over the network.

Have you run the HBase balancer? How many regions do you have per region
server?

JM
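
To make the locality argument above concrete, here is a small,
self-contained Python sketch (not HBase code; the block placements and
datanode names are made up) of how a region server's locality index drops
when the DFS balancer moves block replicas away from the local datanode:

```python
def locality_index(region_blocks, local_datanode):
    """Fraction of a region's HDFS blocks that still have a replica on the
    datanode co-located with the region server hosting the region."""
    local = sum(1 for replicas in region_blocks if local_datanode in replicas)
    return local / len(region_blocks)

# After a major compaction: every block has its first replica on the
# local datanode (dn1), plus two replicas elsewhere -> full locality.
compacted = [{"dn1", "dn2", "dn3"}, {"dn1", "dn4", "dn5"}, {"dn1", "dn2", "dn4"}]
print(locality_index(compacted, "dn1"))  # prints 1.0

# After the DFS balancer moves the local replicas of two blocks to
# less-full datanodes, most reads must now cross the network.
balanced = [{"dn6", "dn2", "dn3"}, {"dn7", "dn4", "dn5"}, {"dn1", "dn2", "dn4"}]
print(locality_index(balanced, "dn1"))  # prints 0.3333333333333333
```

This is why the usual remedy in the thread is the reverse order: balance
regions first, then major-compact so the rewritten files become local again.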