Posted to user@hbase.apache.org by T Vinod Gupta <tv...@readypulse.com> on 2012/02/29 15:10:19 UTC

passing arguments to map reduce and queuing multiple jobs

hi,
what's the recommended way of passing arguments to m/r jobs? based on web
examples, the mapper and reducer classes are static classes. so if you have
to set some parameters that the individual mapper and reducer instances need
to access, what's the way? i'm trying to do it by creating a static initialize
method in the mapper and reducer and configuring static variables there. i'm
not sure if this is the best way.

this brings me to my next question - i need to create an always-running daemon
that schedules multiple m/r jobs every day. it can just queue multiple jobs
and wait for their completion. but then the above question becomes even
more relevant: if static variables are used, how can one queue
multiple jobs on the same mapper/reducer?

thanks

Re: passing arguments to map reduce and queuing multiple jobs

Posted by Brock Noland <br...@cloudera.com>.
Hi,

This question is for mapreduce-user not hbase-user.

+mapreduce-user
bcc hbase-user

On Wed, Feb 29, 2012 at 7:40 PM, T Vinod Gupta <tv...@readypulse.com> wrote:
> hi,
> what's the recommended way of passing arguments to m/r jobs? based on web
> examples, the mapper and reducer classes are static classes. so if you have
> to set some parameters that the individual mapper and reducer instances need
> to access, what's the way? i'm trying to do it by creating a static initialize
> method in the mapper and reducer and configuring static variables there. i'm
> not sure if this is the best way.

Look at the configure method:

http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Mapper.html

configure is passed the job conf. You can set your parameters on the job
conf when you submit each job and read them back in configure, so you don't
need static variables.
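A minimal sketch of this pattern, using the newer org.apache.hadoop.mapreduce API (where the equivalent of configure() is setup()). The class name ThresholdMapper, the key "my.app.threshold", and the filtering logic are all made up for illustration; Job.getInstance is from the newer API as well:

```java
// Sketch: passing a per-job parameter to a mapper via the Configuration
// instead of static variables. Key name "my.app.threshold" is illustrative.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class ThresholdMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {

    private long threshold;

    @Override
    protected void setup(Context context) {
        // setup() is the new-API counterpart of configure(): read the
        // parameter back out of this job's configuration.
        threshold = context.getConfiguration().getLong("my.app.threshold", 0L);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Example use of the parameter: emit only long-enough lines.
        if (value.getLength() > threshold) {
            context.write(value, key);
        }
    }

    // At submission time, each job carries its own Configuration, so two
    // queued jobs can use the same mapper class with different parameters.
    public static Job buildJob(long threshold) throws IOException {
        Configuration conf = new Configuration();
        conf.setLong("my.app.threshold", threshold);
        Job job = Job.getInstance(conf, "threshold-job");
        job.setJarByClass(ThresholdMapper.class);
        job.setMapperClass(ThresholdMapper.class);
        return job;
    }
}
```

Because the parameter travels with the job's own Configuration rather than a static field, each submitted job sees its own value even when many jobs share the same mapper class.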

> this brings me to my next question - i need to create an always-running daemon
> that schedules multiple m/r jobs every day. it can just queue multiple jobs
> and wait for their completion. but then the above question becomes even
> more relevant: if static variables are used, how can one queue
> multiple jobs on the same mapper/reducer?

I think you want to look at oozie:  http://incubator.apache.org/oozie/
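Short of a full Oozie workflow, a daemon that queues jobs can be as simple as a driver loop that builds one Configuration per job and waits on each in turn. This is a hedged sketch; the class name DailyDriver, the key "my.app.threshold", and the parameter values are illustrative:

```java
// Sketch: submitting several jobs in sequence, each with its own
// Configuration, so the same mapper/reducer classes run with
// different parameters per job.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DailyDriver {
    public static void main(String[] args) throws Exception {
        long[] thresholds = {10L, 100L};
        for (long t : thresholds) {
            // A fresh Configuration per job keeps parameters isolated.
            Configuration conf = new Configuration();
            conf.setLong("my.app.threshold", t);
            Job job = Job.getInstance(conf, "daily-job-" + t);
            job.setJarByClass(DailyDriver.class);
            // ... set mapper/reducer classes and input/output paths here ...

            // waitForCompletion blocks until the job finishes; use
            // job.submit() instead to queue jobs without blocking.
            boolean ok = job.waitForCompletion(true);
            if (!ok) {
                System.err.println("job failed for threshold " + t);
            }
        }
    }
}
```

Oozie adds scheduling, dependency handling, and retries on top of this, which is why it is the better fit for a daily pipeline.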

Brock

-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
