Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2007/12/03 04:33:56 UTC
Controlling the number of simultaneous jobs per machine - 0.15.0
We have jobs that require different resources, and as such they saturate our
machines at different levels of parallelization.
What we want to do in the driver is set the number of simultaneous jobs
per node.
JobClient client = new JobClient();
Configuration configuration = new Configuration();
configuration.setInt("mapred.tasktracker.tasks.maximum", 7);
JobConf conf = new JobConf(configuration, MergeNewSeenDriver.class);
System.err.println("configured maximum tasks is "
    + conf.get("mapred.tasktracker.tasks.maximum"));
But this doesn't seem to work. The only success we have had is with the
multithreaded map runner, but then we don't get to run multiple reduces
at a time on each machine.
Any suggestions?
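[For reference, the multithreaded workaround mentioned above looks roughly like the sketch below, against the 0.15-era org.apache.hadoop.mapred API. The thread-count property name is an assumption for this era of Hadoop and may differ by version; MergeNewSeenDriver is the driver class from the post above.]

```java
// Sketch: run several map threads inside a single map task slot,
// instead of trying to raise the per-node slot count from the driver.
// Assumes the old org.apache.hadoop.mapred API; the property name
// "mapred.map.multithreadedrunner.threads" is an assumption here.
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

JobConf conf = new JobConf(MergeNewSeenDriver.class);
conf.setMapRunnerClass(MultithreadedMapRunner.class);
conf.setInt("mapred.map.multithreadedrunner.threads", 7);
```

As the post notes, this only parallelizes the map side within a task; it does not help with running multiple reduces per machine.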
Re: Controlling the number of simultaneous jobs per machine - 0.15.0
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Dec 2, 2007, at 11:53 PM, Espen Amble Kolstad wrote:
> AFAIK 0.15.x does not support different
> mapred.tasktracker.tasks.maximum per node. It's a per-cluster setting.
> So whatever's in your hadoop-site.xml is what will be used.
>
> I think this is something coming in 0.16.x though.
This change was committed as HADOOP-1245. Furthermore, HADOOP-1274
allows you to change the number of map slots independently from the
number of reduce slots, so that you can run more maps without
clobbering your cluster with reduces.
-- Owen
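[For later reference, the split described above surfaces as two separate TaskTracker properties. The names below are as they appeared from 0.16 onward per HADOOP-1274; the values are illustrative. These go in hadoop-site.xml on each node, not in the job's driver.]

```xml
<!-- hadoop-site.xml on each TaskTracker (0.16+), per HADOOP-1274 -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>7</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```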
Re: Controlling the number of simultaneous jobs per machine - 0.15.0
Posted by Espen Amble Kolstad <es...@trank.no>.
AFAIK 0.15.x does not support different
mapred.tasktracker.tasks.maximum per node. It's a per-cluster setting.
So whatever's in your hadoop-site.xml is what will be used.
I think this is something coming in 0.16.x though.
Espen
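[In 0.15, then, the only knob is the TaskTracker-side one, read from each node's hadoop-site.xml when the tracker starts; setting it from the driver has no effect. A minimal fragment, with an illustrative value:]

```xml
<!-- hadoop-site.xml on each TaskTracker node (0.15.x); requires a
     tracker restart to take effect. The value 7 is illustrative. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>7</value>
</property>
```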
Re: Controlling the number of simultaneous jobs per machine - 0.15.0
Posted by Michael Bieniosek <mi...@powerset.com>.
You might also want to look at HADOOP-2300.