Posted to common-user@hadoop.apache.org by Mithila Nagendra <mn...@asu.edu> on 2009/08/11 23:59:24 UTC

Creating a job

Hello All

How do I create a job in Hadoop using the Job class, and how do I run it?
Generally JobClient.runJob(conf) is used, but its parameter is not of
type Job.

Also, how do I use the JobControl class? Can I create threads in Hadoop
(similar to multithreading in Java), where different threads launch different
Hadoop jobs? I guess JobControl is connected to all this in some way.

Thanks for your help
Mithila Nagendra

Re: Creating a job

Posted by Mithila Nagendra <mn...@asu.edu>.
Hello Jakob

Yes, I have gone through the job submission strategy in Hadoop, and it is
helpful. But I am looking at interdependent jobs: I was trying to switch
the state of a running job to waiting, which is why I was looking at
JobControl.

I have gone through the document you pointed out and was wondering if there is
a more comprehensive guide out there.

Thanks!
Mithila



On Tue, Aug 11, 2009 at 4:59 PM, Jakob Homan <jh...@yahoo-inc.com> wrote:

> Hey Mithila-
>   I would point you to the WordCount example (
> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html) for a
> basic example of how jobs are created by supplying a JobConf to the
> JobClient.  This will submit your conf to the cluster which will create and
> run the job.
>
> The JobControl class is meant to manage a series of jobs that are dependent on
> each other. Is this a situation you're facing? If not, the job submission
> strategy in the WordCount example should be sufficient.
>
> Regarding threading: writing multithreaded apps is generally not needed, as
> Hadoop provides parallelization via MapReduce.  However, there is a
> MultithreadedMapper for situations where you may not be maxing out the CPU
> in a specific Mapper.
>
> It sounds like it may be helpful to check out the job submission
> documentation:
> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring
> Let us know if anything is unclear after that.
>
> Thanks,
>
> Jakob Homan
> Yahoo!

Re: Creating a job

Posted by Jakob Homan <jh...@yahoo-inc.com>.
Hey Mithila-
    I would point you to the WordCount example 
(http://hadoop.apache.org/common/docs/current/mapred_tutorial.html) for 
a basic example of how jobs are created by supplying a JobConf to the 
JobClient.  This will submit your conf to the cluster which will create 
and run the job.

The JobControl class is meant to manage a series of jobs that are dependent 
on each other. Is this a situation you're facing? If not, the job 
submission strategy in the WordCount example should be sufficient.
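
To make the idea of dependent jobs concrete, here is a minimal plain-Java
sketch of dependency-ordered execution. It is not Hadoop's JobControl API:
SketchJob, runAll and demo are hypothetical names invented for illustration,
the "jobs" are plain Runnables, and everything runs synchronously, so there
is no RUNNING state. It only shows the core rule: a job stays WAITING until
every job it depends on has succeeded, and it is marked DEPENDENT_FAILED if
one of them fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DependentJobsSketch {

    enum State { WAITING, READY, SUCCESS, FAILED, DEPENDENT_FAILED }

    /** Hypothetical stand-in for a job: a unit of work plus its prerequisites. */
    static final class SketchJob {
        final String name;
        final Runnable work;
        final List<SketchJob> dependsOn = new ArrayList<>();
        State state = State.WAITING;

        SketchJob(String name, Runnable work) {
            this.name = name;
            this.work = work;
        }

        void addDependency(SketchJob other) {
            dependsOn.add(other);
        }

        /** A WAITING job becomes READY only when all dependencies succeeded. */
        void refresh() {
            if (state != State.WAITING) {
                return;
            }
            boolean allSucceeded = true;
            for (SketchJob dep : dependsOn) {
                if (dep.state == State.FAILED || dep.state == State.DEPENDENT_FAILED) {
                    state = State.DEPENDENT_FAILED;
                    return;
                }
                if (dep.state != State.SUCCESS) {
                    allSucceeded = false;
                }
            }
            if (allSucceeded) {
                state = State.READY;
            }
        }
    }

    /** Sweeps the job list until no state changes; returns attempted job names in order. */
    static List<String> runAll(List<SketchJob> jobs) {
        List<String> attempted = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (SketchJob job : jobs) {
                State before = job.state;
                job.refresh();
                if (job.state == State.READY) {
                    attempted.add(job.name);
                    try {
                        job.work.run();
                        job.state = State.SUCCESS;
                    } catch (RuntimeException e) {
                        job.state = State.FAILED;
                    }
                }
                if (job.state != before) {
                    progressed = true;
                }
            }
        }
        return attempted;
    }

    /** Three-job chain, registered in reverse order to show that dependencies decide the order. */
    static List<String> demo() {
        SketchJob extract = new SketchJob("extract", () -> { });
        SketchJob join = new SketchJob("join", () -> { });
        SketchJob report = new SketchJob("report", () -> { });
        join.addDependency(extract);
        report.addDependency(join);
        return runAll(Arrays.asList(report, join, extract));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [extract, join, report]
    }
}
```

As far as I recall, the corresponding calls in Hadoop's own
org.apache.hadoop.mapred.jobcontrol package are Job.addDependingJob(...) and
JobControl.addJob(...), with the JobControl instance run in its own thread;
please check the API docs for your version before relying on that.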

Regarding threading: writing multithreaded apps is generally not needed, 
as Hadoop provides parallelization via MapReduce.  However, there is a 
MultithreadedMapper for situations where you may not be maxing out the 
CPU in a specific Mapper.
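
As a rough illustration of that idea, the plain-Java sketch below runs a
slow per-record map function on a fixed thread pool inside a single
"mapper". It is not Hadoop's MultithreadedMapper: mapInParallel and
slowLength are hypothetical names, and the sleep only simulates a map
function that waits on something other than the CPU. Note that the map
function must be thread-safe for this pattern to be valid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class MultithreadedMapSketch {

    /** Applies mapFn to every record on a fixed pool; results keep the input order. */
    static <I, O> List<O> mapInParallel(List<I> records, Function<I, O> mapFn, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per record so several records are in flight at once.
            List<Future<O>> pending = new ArrayList<>();
            for (I record : records) {
                Callable<O> task = () -> mapFn.apply(record);
                pending.add(pool.submit(task));
            }
            // Collect results in submission order, which equals input order.
            List<O> out = new ArrayList<>();
            for (Future<O> future : pending) {
                try {
                    out.add(future.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("map function failed", e.getCause());
                }
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical map function that is slow but not CPU-bound. */
    static int slowLength(String line) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return line.length();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hadoop", "map", "reduce", "job");
        System.out.println(mapInParallel(lines, MultithreadedMapSketch::slowLength, 4));
        // prints [6, 3, 6, 3]
    }
}
```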

It sounds like it may be helpful to check out the job submission 
documentation: 
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring 
  Let us know if anything is unclear after that.

Thanks,

Jakob Homan
Yahoo!
