Posted to common-user@hadoop.apache.org by "moonwatcher32329@yahoo.com" <mo...@yahoo.com> on 2008/09/09 17:05:01 UTC

block distribution with varying disk sizes

Does Hadoop distribute blocks according to how many blocks a node currently contains, or according to how much disk space the node currently has remaining?
Suppose I have many machines with identical CPUs but different disk sizes. If blocks are distributed according to remaining disk space, the nodes with larger disks would store more data. Would this cause performance problems during the map phase?
Thanks,
moonwatcher

Re: block distribution with varying disk sizes

Posted by 叶双明 <ye...@gmail.com>.
See the Rebalancer section of the HDFS User Guide:
http://hadoop.apache.org/core/docs/r0.18.0/hdfs_user_guide.html#Rebalancer
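
As I understand the balancer documentation, a cluster counts as balanced when each datanode's utilization (used space divided by capacity) differs from the overall cluster utilization by no more than a threshold percentage. The sketch below only illustrates that criterion with made-up numbers. It is not Hadoop code: the class name, method names, node names, and sizes are all hypothetical, and it says nothing about how blocks are placed when they are first written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the "balanced cluster" criterion; not part of Hadoop.
public class BalanceCheck {

    // Percentage of capacity in use, for a single node or for the whole cluster.
    static double utilization(long used, long capacity) {
        return 100.0 * used / capacity;
    }

    // Returns the nodes whose utilization differs from the cluster-wide
    // utilization by more than thresholdPercent. Both maps are keyed by node name.
    static List<String> unbalancedNodes(Map<String, Long> used,
                                        Map<String, Long> capacity,
                                        double thresholdPercent) {
        long totalUsed = 0;
        long totalCapacity = 0;
        for (String node : capacity.keySet()) {
            totalUsed += used.get(node);
            totalCapacity += capacity.get(node);
        }
        double clusterUtil = utilization(totalUsed, totalCapacity);

        List<String> result = new ArrayList<>();
        for (String node : capacity.keySet()) {
            double nodeUtil = utilization(used.get(node), capacity.get(node));
            if (Math.abs(nodeUtil - clusterUtil) > thresholdPercent) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sizes in GB: one small-disk node and one large-disk node.
        Map<String, Long> capacity = new LinkedHashMap<>();
        Map<String, Long> used = new LinkedHashMap<>();
        capacity.put("small", 100L);
        used.put("small", 80L);      // 80% full
        capacity.put("large", 400L);
        used.put("large", 160L);     // 40% full

        // Cluster utilization is 240 / 500 = 48%, so with a 10% threshold
        // only the small node is out of range.
        System.out.println(unbalancedNodes(used, capacity, 10.0)); // prints [small]
    }
}
```

Note that the criterion is expressed in percent of each node's own capacity, so a balanced cluster with mixed disk sizes still ends up with more data, in absolute terms, on the nodes with larger disks.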

2008/9/9 moonwatcher32329@yahoo.com <mo...@yahoo.com>

>
> Does Hadoop distribute blocks according to how many blocks a node currently
> contains or according to how much disk space the node has remaining
> currently ?
> Suppose that I have many machines with identical CPUs but different disk
> sizes. If the blocks get distributed according to the remaining disk space,
> then the larger disk nodes would be storing more data... would this cause
> performance problems during the mapping phase ?
> Thanks,
> moonwatcher




-- 
Sorry for my english!! 明