Posted to common-dev@hadoop.apache.org by tienduc_dinh <ti...@yahoo.com> on 2009/01/07 00:20:56 UTC

TestDFSIO delivers bad values of "throughput" and "average IO rate"

Hello,

I'm using hadoop-0.18.0 and testing it on a cluster with 1 master and 4
slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because
the "throughput" and "average IO rate" values come out very similar, I'll
only post the "throughput" values from running the same command three times.

-> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1

+ with "dfs.replication = 1" => 33,60 / 31,48 / 30,95

+ with "dfs.replication = 2" => 26,40 / 20,99 / 21,70

I found something strange while reading the source code.

- The value of mapred.reduce.tasks is always forced to 1 in the source
code: job.setNumReduceTasks(1) in runIOTest(), and matching that,
reduceFile = new Path(WRITE_DIR, "part-00000") in analyzeResult().
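For reference, this is the relevant part of runIOTest() as I read it in
the 0.18 source (slightly simplified):

  private static void runIOTest(Class<? extends Mapper> mapperClass,
                                Path outputDir) throws IOException {
    JobConf job = new JobConf(fsConfig, TestDFSIO.class);

    FileInputFormat.setInputPaths(job, CONTROL_DIR);
    job.setInputFormat(SequenceFileInputFormat.class);

    job.setMapperClass(mapperClass);
    job.setReducerClass(AccumulatingReducer.class);

    FileOutputFormat.setOutputPath(job, outputDir);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // hard-coded single reducer, so all stats land in part-00000
    job.setNumReduceTasks(1);
    JobClient.runJob(job);
  }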

I tested with other values of mapred.reduce.tasks, e.g.
mapred.reduce.tasks = 2, and got almost the same result as with
mapred.reduce.tasks = 1.

- And I don't understand the line "double med = rate / 1000 / tasks".
Shouldn't it be "double med = rate * tasks / 1000"?
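For reference, here is how I understand the numbers flow (simplified from
the 0.18 source, so the details may be off): each map task emits its IO
rate pre-scaled by 1000, AccumulatingReducer sums those values, and
analyzeResult() then divides the sum back down:

  // in the mapper's collectStats(): one record per task,
  // with the per-task IO rate scaled by 1000 before summing
  float ioRateMbSec = (float)totalSize * 1000 / (execTime * MEGA);
  output.collect(new Text("l:tasks"), new Text(String.valueOf(1)));
  output.collect(new Text("f:rate"),
                 new Text(String.valueOf(ioRateMbSec * 1000)));
  output.collect(new Text("f:sqrate"),
                 new Text(String.valueOf(ioRateMbSec * ioRateMbSec * 1000)));

  // in analyzeResult(), after parsing part-00000:
  // rate is the sum of (ioRateMbSec * 1000) over all tasks
  double med = rate / 1000 / tasks;
  double stdDev = Math.sqrt(Math.abs(sqrate / 1000 / tasks - med * med));

If rate really is that sum over all tasks, then dividing by tasks would
give the mean IO rate, so maybe the line is correct after all, but I'd
appreciate confirmation.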

Can anyone give me a hint?

Any help would be appreciated, thanks a lot!