Posted to common-user@hadoop.apache.org by jerrro <je...@gmail.com> on 2007/12/03 22:29:12 UTC

number of map tasks

hello,

I would like to be able to control the number of map tasks being run in
parallel for a certain job.
I found a property called mapred.map.tasks which seems to give some control
over that. However, the wiki
http://wiki.apache.org/lucene-hadoop/HowManyMapsAndReduces states that if
this value is smaller than the size of the file divided by the DFS block
size, it does not really have an effect, because it is only a hint and Hadoop
will probably choose to have as many map tasks as the number of DFS blocks
in the file. Can I set this to a lower number? What if I want to have just
one map task that goes through each line of the input file, without spawning
many map tasks? Is that possible?
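For example, would overriding isSplitable in the input format be the way to
force a single map per input file? This is just a guess at the approach, using
the org.apache.hadoop.mapred API (WholeFileTextInputFormat is a name I made up):

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.TextInputFormat;

    // Input format that refuses to split files, so each input file should be
    // read by a single map task regardless of how many DFS blocks it spans.
    public class WholeFileTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(FileSystem fs, Path file) {
            return false; // one split, hence one map task, per file
        }
    }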
I am also not sure whether mapred.map.tasks can be set within a specific job's
configuration, or only in the cluster-wide configuration such as hadoop-site.xml
or mapred-default.xml, which I don't have control over.
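In other words, would setting it in the job driver alone be enough? Something
like this sketch is what I had in mind (MyJob is just a placeholder for my
driver class):

    import org.apache.hadoop.mapred.JobConf;

    public class MyJob {
        public static void main(String[] args) {
            // Per-job configuration: the hope is that nothing in
            // hadoop-site.xml or mapred-default.xml needs to change.
            JobConf conf = new JobConf(MyJob.class);
            conf.setNumMapTasks(1);               // convenience setter for mapred.map.tasks
            // conf.set("mapred.map.tasks", "1"); // equivalent raw property
            // ... then set the mapper, input/output paths, and submit with JobClient.runJob(conf)
        }
    }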

Thanks.


Jerr.
