Posted to common-user@hadoop.apache.org by Jim Donofrio <do...@gmail.com> on 2012/05/18 14:24:34 UTC

how to rebalance individual data node?

Let's say that every node in your cluster has two same-sized disks, and one
is 50% full and the other is 100% full. According to my understanding of
the balancer documentation, every data node will then sit at the average
utilization of 75%, so no balancing will occur, yet one hard drive in each
node is struggling at capacity. Is there any way to run the balancer
on just one datanode to force each disk to be 75% full?

Thanks

Re: how to rebalance individual data node?

Posted by Harsh J <ha...@cloudera.com>.
Jim,

The HDFS balancer presently does not look at the individual disks of a DN. It
only views each DN as a whole (the sum of usage across all of its disks). The
improvement to balance the disks within a single DN is tracked at
https://issues.apache.org/jira/browse/HDFS-1312
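
To make the arithmetic concrete, here is a rough sketch of the node-level view, using the numbers from the question. It is an illustrative model only, not the balancer's actual code: the function names, the 1 TB disk size, and the three-node cluster are assumptions, and the 10% threshold is assumed to match the balancer's default.

```python
# Illustrative model only; not the balancer's actual implementation.
# All names here are hypothetical.

def node_utilization(disks):
    """Percent used across a list of (used_bytes, capacity_bytes) disks."""
    used = sum(u for u, _ in disks)
    capacity = sum(c for _, c in disks)
    return 100.0 * used / capacity

def is_balanced(node_pct, cluster_pct, threshold_pct=10.0):
    """A node counts as balanced if it is within the threshold of the cluster average."""
    return abs(node_pct - cluster_pct) <= threshold_pct

TB = 10**12  # assumed disk size
node = [(TB // 2, TB), (TB, TB)]  # one disk 50% full, the other 100% full
cluster = [node] * 3              # assumed: three identical nodes

node_pct = node_utilization(node)
cluster_pct = node_utilization([disk for n in cluster for disk in n])

print(node_pct, cluster_pct, is_balanced(node_pct, cluster_pct))
```

Each node comes out at 75%, the same as the cluster average, so nothing looks out of balance at the node level even though one disk on every node is full.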

You may balance your disks out manually, however:
http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F
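
As I understand that FAQ entry, the manual route is to stop the DataNode, move block files together with their .meta files from the fuller disk to the emptier one, and start the DataNode again. Below is a minimal sketch of how one might choose which blocks to move. It only plans the moves and touches no files; the function name, the block names, and the sizes are made up for illustration, and it assumes two equal-sized disks.

```python
# Hypothetical planning helper; it does not move any files.
# Sizes are in arbitrary units and should include each block's .meta file.

def plan_moves(blocks, used_full, used_empty):
    """Greedily pick blocks to move off the fuller of two equal-sized disks.

    blocks: dict mapping block name -> size, for blocks on the fuller disk.
    Returns the block names to move, never taking the fuller disk
    below the midpoint of the two disks' usage.
    """
    target = (used_full + used_empty) / 2
    moves = []
    # Consider the largest blocks first so fewer files need to be moved.
    for name, size in sorted(blocks.items(), key=lambda kv: kv[1], reverse=True):
        if used_full - size >= target:
            moves.append(name)
            used_full -= size
    return moves

# Made-up example: full disk holds 100 units, the other holds 50.
example = {"blk_1": 40, "blk_2": 30, "blk_3": 20, "blk_4": 10}
print(plan_moves(example, used_full=100, used_empty=50))
```

The greedy pass does not always reach an exact 75/75 split; in the made-up example it stops at 80/70 because moving any remaining block would overshoot the midpoint. The actual moves would be done with mv while the DataNode is down.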




-- 
Harsh J