Posted to common-user@hadoop.apache.org by Felix Halim <fe...@gmail.com> on 2010/06/21 21:07:57 UTC

How to set the number of map tasks? (ver 0.20.2)

I'm using the new Job class:

http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html

There is a way to set the number of reduce tasks:

setNumReduceTasks(int tasks)

However, I don't see how to set the number of MAP tasks?

I tried to set it through mapred-site.xml :

	<property>
		<name>mapred.map.tasks</name>
		<value>500</value>
	</property>

That doesn't work either (the number of launched map tasks is still small).

I'm wondering, do I have to rename the prefix from "mapred" to
"mapreduce"? like this (for all configurations?):

	<property>
		<name>mapreduce.map.tasks</name>
		<value>500</value>
	</property>

I added both, and it still doesn't work.

However, the old way (using JobConf) works.

Is the new Job class intended not to support altering the number of
map tasks?

FYI, setting the number of reduce tasks seemed to work using the new Job class.

Felix Halim

Re: How to set the number of map tasks? (ver 0.20.2)

Posted by Hemanth Yamijala <yh...@gmail.com>.
Felix,

> I'm using the new Job class:
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html
>
> There is a way to set the number of reduce tasks:
>
> setNumReduceTasks(int tasks)
>
> However, I don't see how to set the number of MAP tasks?
>
> I tried to set it through mapred-site.xml :
>
>        <property>
>                <name>mapred.map.tasks</name>
>                <value>500</value>
>        </property>
>
> It doesn't work either (launched map task is still small).
>
> I'm wondering, do I have to rename the prefix from "mapred" to
> "mapreduce"? like this (for all configurations?):
>
>        <property>
>                <name>mapreduce.map.tasks</name>
>                <value>500</value>
>        </property>
>
> I added both, and it still doesn't work.

As documented in the mapreduce tutorial as well as the Java
documentation (http://bit.ly/9HKclu), the number of map tasks is
primarily determined by the number of input splits generated for the
input data.
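
To make the relationship concrete, here is a minimal sketch of how the
split count (and so the map count) could be estimated for a single
splittable input file. It assumes the file-based input format picks a
split size by clamping the block size between a minimum and a maximum
split size; the class name, method names, and the sizes used are
hypothetical and only for illustration. The real implementation may
treat the tail of a file slightly differently, so take the result as an
estimate.

```java
public class SplitEstimate {

    // Assumed split-size rule for file-based input: clamp the block size
    // between the configured minimum and maximum split sizes.
    public static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Rough split count for one splittable file (ceiling division).
    // The real input format may fold a small tail into the last split.
    public static long estimateSplits(long fileSize, long splitSize) {
        if (fileSize <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return fileSize / splitSize + (fileSize % splitSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long blockSize = 64 * MB;   // hypothetical HDFS block size
        long fileSize = 1000 * MB;  // hypothetical input file size

        // No cap on split size: one split per block, roughly.
        long defaultSplit = splitSize(blockSize, 1L, Long.MAX_VALUE);
        System.out.println("default maps ~ " + estimateSplits(fileSize, defaultSplit));

        // Capping the split size at 2 MB yields many more, smaller splits.
        long cappedSplit = splitSize(blockSize, 1L, 2 * MB);
        System.out.println("maps with 2 MB cap ~ " + estimateSplits(fileSize, cappedSplit));
    }
}
```

Under these assumptions, a lower maximum split size is what raises the
map count; mapred.map.tasks is documented as only a hint to the
InputFormat. I believe the new-API FileInputFormat exposes the cap
through setMaxInputSplitSize(Job, long), but please check the Javadoc
for your version.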

Thanks
Hemanth