Posted to user@hbase.apache.org by Jabir Ahmed <ja...@gmail.com> on 2012/12/09 12:27:19 UTC

Re-balancer on datanodes that run HBase region servers

Our cluster has around 12 datanodes:

9 nodes run datanodes + task trackers
3 nodes run datanodes + region servers
1 NameNode and JobTracker

In this kind of cluster setup, is it advisable to run the HDFS balancer,
given that running it affects the performance of HBase?

Thanks

Jabir

Re: Re-balancer on datanodes that run HBase region servers

Posted by Harsh J <ha...@cloudera.com>.

It is a bad idea only because it will temporarily disturb the data
locality of the regions hosted by each RegionServer. Locality is
restored only once the next major compaction of all regions completes.
Both the balancer run and the compaction cause small performance dips
and extra network use and I/O until they finish.
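
To make the locality effect concrete, here is a minimal sketch of a toy
model. It is not HBase or HDFS code: the block IDs, host names, and the
locality_index helper are all hypothetical, and locality is taken to mean
the fraction of a RegionServer's blocks that have a replica on its own host.

```python
def locality_index(block_replicas, rs_host):
    """Fraction of blocks with at least one replica on rs_host (toy model)."""
    if not block_replicas:
        return 1.0
    local = sum(1 for hosts in block_replicas.values() if rs_host in hosts)
    return local / len(block_replicas)


# Hypothetical replica placement for the store-file blocks of one RegionServer.
before = {
    "blk_1": {"rs1", "dn4", "dn7"},
    "blk_2": {"rs1", "dn5", "dn8"},
    "blk_3": {"rs1", "dn6", "dn9"},
    "blk_4": {"rs1", "dn4", "dn9"},
}

# Assume the balancer moves the rs1 replica of blk_2 to dn6.
after = dict(before)
after["blk_2"] = {"dn6", "dn5", "dn8"}

print(locality_index(before, "rs1"))  # 1.0
print(locality_index(after, "rs1"))   # 0.75
```

In this model, reads of blk_2 now go over the network until a major
compaction rewrites the data and places a replica on rs1 again.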

There's no way to escape the fact that if you write more HBase data,
the 3 RS nodes are bound to fill up faster than the others. What we
could do as an enhancement, to help rebalance the remaining nodes
without affecting the RS, is to provide an exclude-nodes feature in
the balancer. By asking the balancer to exclude the RS nodes, you
could rebalance the rest of the cluster without causing a performance
problem on the RS in the meantime.
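
As a rough sketch of how the exclude-nodes idea could be used: later
Hadoop releases ship an -exclude option on "hdfs balancer" that takes a
comma-separated host list, but it was only a proposal when this thread was
written, so check "hdfs balancer -help" on your version first. The host
names and the balancer_command helper below are hypothetical, and the
snippet only builds the command line without running it.

```python
def balancer_command(exclude_hosts, threshold_pct=10):
    """Build an 'hdfs balancer' argument list that skips the given hosts.

    Assumes a Hadoop version whose balancer supports -exclude.
    """
    args = ["hdfs", "balancer", "-threshold", str(threshold_pct)]
    if exclude_hosts:
        # -exclude accepts a comma-separated list of datanode hosts.
        args += ["-exclude", ",".join(exclude_hosts)]
    return args


# Hypothetical names for the 3 nodes that run DN + RS.
rs_hosts = ["rs1.example.com", "rs2.example.com", "rs3.example.com"]

cmd = balancer_command(rs_hosts)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run on a node with the HDFS
client configured, so the 9 DN + TaskTracker nodes get balanced while the
RS nodes are left alone.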

Most clusters run the RS+DN pair across all nodes, so this scenario of
an imbalance won't really occur there.

I filed https://issues.apache.org/jira/browse/HDFS-4509 with some
ideas you could use (see comments).

--
Harsh J