Posted to common-user@hadoop.apache.org by Harsh J <ha...@cloudera.com> on 2013/01/18 12:44:09 UTC

Re: how to restrict the concurrent running map tasks?

You will need to use an alternative scheduler for this.

Look at the minMaps/maxMaps/etc. properties of the FairScheduler at
http://hadoop.apache.org/docs/stable/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
Alternatively, look at resource-based scheduling in the CapacityScheduler at
http://hadoop.apache.org/docs/stable/capacity_scheduler.html#Resource+based+scheduling
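
As a rough illustration of the FairScheduler route, an allocation file
capping a pool at 10 concurrent map slots might look like the sketch below.
The pool name "hadoop" is only a placeholder, and this assumes the JobTracker
has already been switched to the FairScheduler (mapred.jobtracker.taskScheduler
set to org.apache.hadoop.mapred.FairScheduler) with
mapred.fairscheduler.allocation.file pointing at this file, and that your
release supports maxMaps. Check the documentation linked above for your
version before relying on it.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (sketch): the pool name "hadoop" is a placeholder -->
<allocations>
  <pool name="hadoop">
    <!-- Upper bound on map slots this pool may occupy at any one time -->
    <maxMaps>10</maxMaps>
  </pool>
</allocations>
```

Note that the cap applies to the pool as a whole, so it only behaves like a
per-job limit if the job is the only one running in that pool.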

P.S. Do not use the general@ list for user-level queries. The right list is
user@hadoop.apache.org.


On Fri, Jan 18, 2013 at 3:52 PM, hwang <jo...@gmail.com> wrote:

> Hi all:
>
> My Hadoop version is 1.0.2. I want at most 10 map tasks running at the
> same time. I have found two parameters related to this question.
>
> a) mapred.job.map.capacity
>
> but in my Hadoop version, this parameter seems to have been abandoned.
>
> b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob
> (http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector/1.0.2/mapred-default.xml)
>
> I set this parameter as follows:
>
> Configuration conf = new Configuration();
> conf.set("date", date);
> conf.set("mapred.job.queue.name", "hadoop");
> conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");
>
> DistributedCache.createSymlink(conf);
> Job job = new Job(conf, "ConstructApkDownload_" + date);
> ...
>
> The problem is that it doesn't work. There are still more than 50 maps
> running when the job starts.
>
> I'm not sure whether I set this parameter in the wrong way or
> misunderstand it.
>
> After looking through the Hadoop documentation, I can't find another
> parameter to limit the number of concurrently running map tasks.
>
> I hope someone can help me. Thanks.
>



-- 
Harsh J