Posted to mapreduce-user@hadoop.apache.org by Sumeet Nikam <su...@gmail.com> on 2013/07/15 09:52:02 UTC

HDFS configuration on NFS

Hi,

I have an NFS partition of 6TB and I am using a (4+1) node Hadoop cluster.
Below is my current configuration for HDFS:
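
Roughly, each data node's hdfs-site.xml lists three sub-folders of the same
NFS mount in dfs.data.dir (dfs.datanode.data.dir on newer releases); the
/mnt/nfs/hdfs mount point below is only illustrative:

  <property>
    <name>dfs.data.dir</name>
    <!-- three directories, but all of them live on the same 6TB NFS mount -->
    <value>/mnt/nfs/hdfs/1,/mnt/nfs/hdfs/2,/mnt/nfs/hdfs/3</value>
  </property>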


My concern is that, since I am creating sub-folders under the same partition,
the fail-safe advantage of Hadoop is lost, because ultimately all of the
directories point to the same location. There is also the disadvantage that
Hadoop miscalculates the available space: even though the real space is 6TB,
for Hadoop it becomes 72TB.



The reason could be that Hadoop is executing a df command under every folder
(1, 2 & 3) and every time it gets 6TB, so for node1 the configured space
becomes 6TB * 3 = 18TB, and 18TB * 4 nodes = 72TB, which explains the
configured space.
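
For example, on node1 a df against the three configured directories would
show the same filesystem every time (mount point and usage figures are only
illustrative):

  $ df -h /mnt/nfs/hdfs/1 /mnt/nfs/hdfs/2 /mnt/nfs/hdfs/3
  Filesystem          Size  Used Avail Use% Mounted on
  filer:/export/hdfs  6.0T  1.2T  4.8T  20% /mnt/nfs/hdfs
  filer:/export/hdfs  6.0T  1.2T  4.8T  20% /mnt/nfs/hdfs
  filer:/export/hdfs  6.0T  1.2T  4.8T  20% /mnt/nfs/hdfs

Since every configured directory maps to the same 6TB filesystem, that 6TB
gets counted three times per node and twelve times across the cluster.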

My question is: having sub-folders does not look like best practice, as this
will misguide Hadoop in calculating the available space while running jobs.
So I think creating multiple logical partitions (1.5TB * 4, as there are 4
machines that will use this as data nodes), one per node, would be better
than using one single base directory (a rough sketch of what I mean is
below). But I do not have any strong theory on why this new configuration
should fare better than the earlier one.
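
What I have in mind is that each data node would mount only its own ~1.5TB
slice and point a single dfs.data.dir entry at it, e.g. on node1 (mount point
again illustrative):

  <property>
    <name>dfs.data.dir</name>
    <!-- only node1's own 1.5TB mount; node2-4 would each use their own mount -->
    <value>/mnt/nfs-node1/hdfs</value>
  </property>

That way the configured capacity would add up to 4 * 1.5TB = 6TB instead of
72TB, and each node would report only the space it can actually use.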

So I am looking for expert guidance in such a scenario.



-- 
Regards,
Sumeet