Posted to common-user@hadoop.apache.org by Mayuran Yogarajah <ma...@casalemedia.com> on 2010/06/25 20:01:15 UTC

Question about how a DataNode picks a partition if multiple are specified

When files are being loaded into HDFS and a node has multiple entries for
dfs.data.dir, how does Hadoop pick which directory to store files in? Does
it intelligently pick the partition with the most free space, or is the
choice round robin, or perhaps random?

We keep running into a problem where a DataNode runs out of space because
data is being written to the partition with less space available.
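
To make the question concrete, here is a minimal sketch of the three
candidate strategies. This is not Hadoop's code: the function names and the
partition sizes are made up for illustration, and the simulation simply
stops at the first write that does not fit, which a real DataNode may not do.

```python
import random

def pick_most_free(free_space):
    """Hypothetical policy: index of the partition with the most free space."""
    return max(range(len(free_space)), key=lambda i: free_space[i])

def make_round_robin(n):
    """Hypothetical policy: cycle through n partitions in order."""
    state = {"next": 0}
    def pick(free_space):
        i = state["next"]
        state["next"] = (i + 1) % n
        return i
    return pick

def pick_random(free_space, rng=random):
    """Hypothetical policy: uniformly random partition."""
    return rng.randrange(len(free_space))

def blocks_until_full(free_space, pick, block_size=1):
    """Write blocks until the chosen partition cannot hold one more.

    Simplifying assumption: the first write that does not fit counts as
    'out of space', with no fallback to another partition.
    Returns the number of blocks written before that point.
    """
    free = list(free_space)
    written = 0
    while True:
        i = pick(free)
        if free[i] < block_size:
            return written
        free[i] -= block_size
        written += 1

# Two partitions of unequal size (made-up units).
sizes = [10, 100]
print(blocks_until_full(sizes, make_round_robin(2)))  # 20
print(blocks_until_full(sizes, pick_most_free))       # 110
```

Under these assumptions, round robin fails once the smaller partition is
full even though the larger one is mostly empty, which matches the symptom
we are seeing.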

Here's some info about the cluster:
7 nodes, all identical hardware, running Hadoop 0.18.3.

Any feedback would be greatly appreciated.

thanks,
M