Posted to mapreduce-user@hadoop.apache.org by Erix Yao <ya...@gmail.com> on 2011/04/26 18:21:15 UTC

Questions about Fair scheduler in hadoop

Hi,
I have 8 machines in my Hadoop cluster: 1 namenode and 7 datanodes.
I want production jobs to have higher priority than user-defined jobs,
so I use the Fair Scheduler.

Why does a job submitted by the user "hadoop" sometimes start only 7 map tasks,
even though no other job in the cluster is running or waiting to run?

Here's my configuration section in mapred-site.xml:
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>conf/pools.xml</value>
</property>

And here is the pool.xml allocation file:
<?xml version="1.0"?>
<allocations>
  <pool name="hadoop">
    <minMaps>30</minMaps>
    <minReduces>30</minReduces>
    <weight>4.0</weight>
  </pool>
  <user name="hive">
    <maxRunningJobs>20</maxRunningJobs>
  </user>
  <userMaxJobsDefault>10</userMaxJobsDefault>
</allocations>
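
For reference, here is a minimal sketch for sanity-checking the allocation
file above. It parses the XML with Python's standard library and lists what
each pool and user is configured with. The function name
summarize_allocations is hypothetical and not part of Hadoop. It only reads
the file contents and says nothing about how the scheduler applies them.

```python
import xml.etree.ElementTree as ET

# The allocation file exactly as posted above.
ALLOCATIONS_XML = """<?xml version="1.0"?>
<allocations>
  <pool name="hadoop">
    <minMaps>30</minMaps>
    <minReduces>30</minReduces>
    <weight>4.0</weight>
  </pool>
  <user name="hive">
    <maxRunningJobs>20</maxRunningJobs>
  </user>
  <userMaxJobsDefault>10</userMaxJobsDefault>
</allocations>
"""


def summarize_allocations(xml_text):
    """Hypothetical helper: return the pool and user settings as plain dicts."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    if root.tag != "allocations":
        raise ValueError("root element must be <allocations>")

    # Each <pool> / <user> child element holds one numeric setting.
    pools = {
        pool.get("name"): {child.tag: float(child.text) for child in pool}
        for pool in root.findall("pool")
    }
    users = {
        user.get("name"): {child.tag: float(child.text) for child in user}
        for user in root.findall("user")
    }

    default_text = root.findtext("userMaxJobsDefault")
    user_default = int(default_text) if default_text is not None else None
    return {"pools": pools, "users": users, "userMaxJobsDefault": user_default}


if __name__ == "__main__":
    summary = summarize_allocations(ALLOCATIONS_XML)
    for name, settings in summary["pools"].items():
        print("pool", name, settings)
    for name, settings in summary["users"].items():
        print("user", name, settings)
    print("userMaxJobsDefault", summary["userMaxJobsDefault"])
```

Note that the file defines a pool named "hadoop" but does not say how jobs
are mapped to pools; that mapping comes from the scheduler configuration,
which this sketch does not inspect.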

Thanks!



-- 
haitao.yao@Beijing