Posted to common-user@hadoop.apache.org by Arindam Choudhury <ar...@gmail.com> on 2012/04/25 10:44:20 UTC

understanding hadoop job submission

Hi,

I am new to Hadoop and I am trying to understand Hadoop job submission.

We submit the job using:

hadoop jar some.jar name input output

This in turn invokes RunJar. But in RunJar I cannot find any
JobSubmit() or any call to JobClient.

So how does the job get submitted to the JobTracker?

-Arindam

Re: understanding hadoop job submission

Posted by Jay Vyas <ja...@gmail.com>.
Yes, the job is submitted by the API calls in the MapReduce code.

On Wed, Apr 25, 2012 at 3:56 AM, Devaraj k <de...@huawei.com> wrote:

> Hi Arindam,
>
>    hadoop jar jarFileName MainClassName
>
> The above command does not itself submit the job. It only runs the jar
> file's main class: the Main-Class named in the jar's manifest if one is
> present, otherwise the class name passed as an argument (MainClassName in
> the command above). Any additional arguments on the command line are
> passed to the main class as its args.
>
>   The job submission code can live in the main class or in any other class
> in the jar file. Take a look at the WordCount example to see how a job is
> submitted.
>
>
> Thanks
> Devaraj



-- 
Jay Vyas
MMSB/UCHC

RE: understanding hadoop job submission

Posted by Devaraj k <de...@huawei.com>.
You can submit the job in either of the following ways:

1. If you submit the job using JobClient, you need to create a JobConf and submit the job using the JobClient.runJob(JobConf conf) API.

2. You can also submit the job by creating a Job instance, passing it a Configuration object, and calling submit() or waitForCompletion() on it, as in the code you posted below. In this case there is no need to create a JobConf instance.
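
A minimal sketch of the first option, assuming the old org.apache.hadoop.mapred API (Hadoop 1.x) is on the classpath. The class name JobClientSubmit is hypothetical, and the identity mapper and reducer are used only to keep the example short; a real job would plug in its own classes:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Hypothetical driver class: submits an identity job through JobClient.
public class JobClientSubmit {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: JobClientSubmit <in> <out>");
            System.exit(2);
        }

        // JobConf is created by hand; the class argument lets Hadoop
        // locate the jar that contains this driver.
        JobConf conf = new JobConf(JobClientSubmit.class);
        conf.setJobName("identity via JobClient");

        // With the default TextInputFormat, records arrive as
        // (LongWritable offset, Text line) and are passed through unchanged.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Submits the job and blocks until it finishes;
        // throws an IOException if the job fails.
        JobClient.runJob(conf);
    }
}
```

Packaged into a jar, this would be launched the same way as any other driver, for example: hadoop jar some.jar JobClientSubmit input output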

Thanks
Devaraj

________________________________________
From: Arindam Choudhury [arindamchoudhury0@gmail.com]
Sent: Wednesday, April 25, 2012 3:27 PM
To: common-user@hadoop.apache.org
Subject: Re: understanding hadoop job submission

Hi,

The code is:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

I understand it now. But is it possible to write a program that uses
JobClient to submit the Hadoop job?

To do that I would have to create a JobConf manually. Am I thinking right?

Arindam


Re: understanding hadoop job submission

Posted by Arindam Choudhury <ar...@gmail.com>.
Hi,

The code is:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

I understand it now. But is it possible to write a program that uses
JobClient to submit the Hadoop job?

To do that I would have to create a JobConf manually. Am I thinking right?

Arindam


RE: understanding hadoop job submission

Posted by Devaraj k <de...@huawei.com>.
Hi Arindam,

    hadoop jar jarFileName MainClassName

The above command does not itself submit the job. It only runs the jar file's main class: the Main-Class named in the jar's manifest if one is present, otherwise the class name passed as an argument (MainClassName in the command above). Any additional arguments on the command line are passed to the main class as its args.

   The job submission code can live in the main class or in any other class in the jar file. Take a look at the WordCount example to see how a job is submitted.
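
The behaviour described above can be sketched in plain Java. This is not Hadoop's RunJar source, only an illustration under stated assumptions: the names RunJarSketch, Launch, resolve, invokeMain and Echo are all hypothetical, and Echo stands in for a job driver such as WordCount. The real RunJar does more (for example, setting up a class loader over the jar's contents), which the sketch leaves out.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical illustration of "hadoop jar": pick a main class, pass the rest as args.
final class RunJarSketch {

    /** The class to launch and the arguments to hand to its main method. */
    static final class Launch {
        final String mainClass;
        final String[] args;

        Launch(String mainClass, String[] args) {
            this.mainClass = mainClass;
            this.args = args;
        }
    }

    /** Stand-in for a driver class; a real driver would submit the job here. */
    public static final class Echo {
        static String[] seen;

        public static void main(String[] args) {
            seen = args;
        }
    }

    /** Reads Main-Class from the jar's manifest; returns null if it is absent. */
    static String manifestMainClass(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            return manifest == null ? null : manifest.getMainAttributes().getValue("Main-Class");
        }
    }

    /** Manifest entry wins; otherwise the first argument after the jar is the class name. */
    static Launch resolve(String manifestMainClass, String[] argsAfterJar) {
        if (manifestMainClass != null) {
            return new Launch(manifestMainClass, argsAfterJar);
        }
        if (argsAfterJar.length == 0) {
            throw new IllegalArgumentException("no Main-Class in manifest and no class name given");
        }
        String[] rest = Arrays.copyOfRange(argsAfterJar, 1, argsAfterJar.length);
        return new Launch(argsAfterJar[0], rest);
    }

    /** Calls the static main(String[]) of the chosen class by reflection. */
    static void invokeMain(Launch launch, ClassLoader loader) throws Exception {
        Class<?> mainClass = Class.forName(launch.mainClass, true, loader);
        Method main = mainClass.getMethod("main", String[].class);
        main.invoke(null, (Object) launch.args);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("Usage: RunJarSketch jarFile [mainClass] args...");
            System.exit(2);
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        Launch launch = resolve(manifestMainClass(args[0]), rest);
        // Only reports the decision; loading classes from the jar is omitted.
        System.out.println("Would invoke " + launch.mainClass + ".main" + Arrays.toString(launch.args));
    }
}
```

Nothing in this launcher talks to the JobTracker. Whatever the invoked main method does, such as building a Job and calling waitForCompletion(), is where submission actually happens.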


Thanks
Devaraj
