Posted to hdfs-user@hadoop.apache.org by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/10 15:56:22 UTC

Hadoop Smoke Test: TERASORT

Hi,

I am running the smoke test for Hadoop (2.4.1). My "terasort" test command is below. The map phase completed very quickly because it was split into many tasks, but the reduce phase takes a very long time and there is only 1 reduce task running. Is there a way to speed up the reduce phase by splitting the single large reduce into many smaller ones and running them across the cluster, like the map phase?


bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar  terasort /tmp/teragenout /tmp/terasortout


Job ID                  Name      State    Maps Total  Maps Completed  Reduce Total  Reduce Completed
job_1409876705457_0002  TeraSort  RUNNING  22352       22352           1             0


Regards
Arthur

Re: Hadoop Smoke Test: TERASORT

Posted by Rich Haase <rd...@gmail.com>.
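You can set the number of reducers for a Hadoop job that accepts generic
options (terasort does) from the command line with -Dmapred.reduce.tasks=XX.
In Hadoop 2.x the property is named mapreduce.job.reduces; the old name is
deprecated but still accepted.

e.g.  hadoop jar hadoop-mapreduce-examples.jar terasort -Dmapred.reduce.tasks=10 /terasort-input /terasort-output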
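
Applied to the command and paths from the original question, the suggestion above might look like the sketch below. The node count and reducers-per-node figures are hypothetical placeholders, not values from this thread, and the script only prints the command so it can be reviewed before running. It uses mapreduce.job.reduces, the Hadoop 2.x name for mapred.reduce.tasks.

```shell
#!/bin/sh
# Hypothetical cluster figures (assumptions); substitute your own.
NODES=5
REDUCERS_PER_NODE=2
REDUCES=$((NODES * REDUCERS_PER_NODE))

# The -D option goes after the program name (terasort) and before the paths.
# mapreduce.job.reduces is the Hadoop 2.x name for mapred.reduce.tasks.
CMD="bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort -Dmapreduce.job.reduces=${REDUCES} /tmp/teragenout /tmp/terasortout"

# Print the command for review; to run it from the Hadoop install directory: eval "$CMD"
echo "$CMD"
```

With more than one reducer, the "Reduce Total" column in the job list should show the requested count instead of 1.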
