Posted to common-user@hadoop.apache.org by Null Ecksor <nu...@gmail.com> on 2010/04/10 05:39:52 UTC

Hadoop file distribution

Hey guys,

I am a new user of Hadoop.
I am running a MapReduce query on a relatively large file (3 GB). At first I
had a single-node Hadoop installation, where the query took approx. 200 seconds.
Now I have installed a Hadoop cluster on 10 machines and run the same
query. It took nearly 230 seconds this time.

The command I'm using to load the data into HDFS is:

hadoop dfs -put *.dat /data/

How can I check whether the file is distributed among the 10 machines? And how
can I distribute the file among the datanodes to make the query faster?
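
For reference, a minimal sketch of how the first question could be checked from
the command line. It assumes the data sits under /data as in the -put command
above, and that the cluster uses a 64 MB block size (the stock default at the
time, not something confirmed for this cluster). The helper name
expected_blocks is made up for illustration.

```shell
#!/bin/sh
# Rough number of HDFS blocks a file is split into, rounded up.
# $1 = file size in MB, $2 = block size in MB (64 MB is an assumed default).
expected_blocks() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# A 3 GB (3072 MB) file with 64 MB blocks:
expected_blocks 3072 64

# The remaining commands need a running cluster, so they are skipped
# when the hadoop command is not on the PATH.
if command -v hadoop >/dev/null 2>&1; then
    # List each file under /data, its blocks, and the datanodes
    # that hold a replica of each block.
    hadoop fsck /data -files -blocks -locations

    # Show capacity and usage for every datanode in the cluster.
    hadoop dfsadmin -report
fi
```

If the fsck output lists block locations on most of the 10 datanodes, the file
is already spread across the cluster, and the block count gives an upper bound
on how many map tasks can read the file in parallel.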
