Posted to common-user@hadoop.apache.org by madhu phatak <ph...@gmail.com> on 2011/07/26 11:58:46 UTC

Submitting and running hadoop jobs Programmatically

Hi,
  I am working on an open source project,
Nectar <https://github.com/zinnia-phatak-dev/Nectar>, where
I am trying to create Hadoop jobs based on user input. I have been
using the Java Process API to run the bin/hadoop shell script to submit
the jobs, but that does not seem like a good approach because the
process-creation model is not consistent across operating systems. Is
there a better way to submit jobs than invoking the shell script? I am
using hadoop-0.21.0, and I am running my program as the same user under
which Hadoop is installed. Some older threads said it would work if I
added the configuration files to the classpath, but I have not been able
to get it running that way. Has anyone tried this before? If so, could
you give detailed instructions on how to achieve it? Thanks in advance
for your help.

Regards,
Madhukara Phatak

Re: Submitting and running hadoop jobs Programmatically

Posted by Harsh J <ha...@cloudera.com>.
A simple job.submit(…) or JobClient.runJob(jobConf) submits your job
right from the Java API. Does this not work for you? If not, what
error do you face?

Forking out and launching from a system process is a bad idea unless
there's absolutely no other way.

-- 
Harsh J

Re: Submitting and running hadoop jobs Programmatically

Posted by madhu phatak <ph...@gmail.com>.
Thank you Harsh. I am able to run the jobs after ditching the '*'.


Re: Submitting and running hadoop jobs Programmatically

Posted by Harsh J <ha...@cloudera.com>.
Madhu,

Ditch the '*' in the classpath element that has the configuration
directory. The directory itself ought to be on the classpath, not the
files in it, AFAIK.

Try it and let us know if it then picks up the proper config (right
now, it's using local mode).
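[Editor's note: a sketch of the corrected launch command, with the paths copied from the java -cp invocation earlier in the thread; treating HADOOP_COMMON_HOME as defaulting to the install directory is an assumption. The reason conf/* failed: Java's classpath wildcard dir/* expands only to the JAR files in dir, so it never adds the XML config files; the conf directory itself must be listed.]

```shell
HADOOP_HOME=/home/hadoop/hadoop-for-nectar/hadoop-0.21.0
# Assumption: fall back to the install dir if HADOOP_COMMON_HOME is unset.
: "${HADOOP_COMMON_HOME:=$HADOOP_HOME}"
# The conf DIRECTORY goes on the classpath, with no trailing /*.
CP="Nectar-analytics-0.0.1-SNAPSHOT.jar:$HADOOP_HOME/conf:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*"
echo "$CP"
# java -cp "$CP" com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv kkk11fffrrw 1
```

With the directory on the classpath, Hadoop's Configuration can find core-site.xml and mapred-site.xml as classpath resources instead of silently falling back to local mode.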

-- 
Harsh J

Re: Submitting and running hadoop jobs Programmatically

Posted by madhu phatak <ph...@gmail.com>.
Thank you. I will have a look at it.


Re: Submitting and running hadoop jobs Programmatically

Posted by Steve Loughran <st...@apache.org>.
On 27/07/11 05:55, madhu phatak wrote:
> Hi
> I am submitting the job as follows
>
> java -cp
>   Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
> com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
> kkk11fffrrw 1

My code to submit jobs (via a declarative configuration) is up online

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/hadoop-components/hadoop-ops/src/org/smartfrog/services/hadoop/operations/components/submitter/SubmitterImpl.java?revision=8590&view=markup

It's LGPL, but ask nicely and I'll change the header to Apache.

That code doesn't set up the classpath by pushing out more JARs (I'm
planning to push out .groovy scripts instead), but it can also poll for
job completion, take a timeout (useful in small test runs), and do
other things. I currently use it mainly for testing.
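[Editor's note: a rough shell analogue of that poll-with-timeout idea. It is a sketch only: it assumes the bin/hadoop script is on PATH, the job id is a placeholder, and the "reduce() completion: 1.0" string match is an assumption about `hadoop job -status` output, not something verified against 0.21.]

```shell
# Hypothetical helper: poll a submitted job until it finishes or a
# timeout expires. Defined only; calling it requires a live cluster.
wait_for_job() {
  job_id=$1
  timeout=${2:-300}      # seconds to wait overall
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # Assumed output format; adjust the pattern to your Hadoop version.
    if hadoop job -status "$job_id" 2>/dev/null | grep -q 'reduce() completion: 1.0'; then
      return 0           # job finished
    fi
    sleep 5
    waited=$((waited + 5))
  done
  return 1               # timed out
}
# Usage (placeholder id):
# wait_for_job job_201107270000_0001 120 && echo "job finished"
```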


Re: Submitting and running hadoop jobs Programmatically

Posted by madhu phatak <ph...@gmail.com>.
Hi,
I am submitting the job as follows:

java -cp
 Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
kkk11fffrrw 1

I get the log in CLI as below

11/07/27 10:22:54 INFO security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
cacheTimeout=300000
11/07/27 10:22:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
11/07/27 10:22:54 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with
processName=JobTracker, sessionId= - already initialized
11/07/27 10:22:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
11/07/27 10:22:54 INFO mapreduce.JobSubmitter: Cleaning up the staging area
file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001

It doesn't create any job in Hadoop.
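[Editor's note: the two symptoms above can be checked mechanically. A sketch follows; the log line is copied from the output above, the conf path comes from the java -cp command, and which site files exist is installation-specific.]

```shell
# A job id containing "job_local" means the LocalJobRunner handled the
# job, i.e. the cluster configuration was never picked up.
LOG_LINE='file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001'
case "$LOG_LINE" in
  *job_local*) MODE=local ;;
  *)           MODE=cluster ;;
esac
echo "submission mode: $MODE"

# The conf directory on the classpath should contain the site files;
# verify before launching.
CONF_DIR=/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf
STATUS=""
for f in core-site.xml mapred-site.xml; do
  if [ -f "$CONF_DIR/$f" ]; then
    STATUS="$STATUS $f:present"
  else
    STATUS="$STATUS $f:missing"
  fi
done
echo "conf check:$STATUS"
```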

On Tue, Jul 26, 2011 at 5:11 PM, Devaraj K <de...@huawei.com> wrote:

> Madhu,
>
>  Can you check the client logs, whether any error/exception is coming while
> submitting the job?
>
> Devaraj K

Re: Submitting and running hadoop jobs Programmatically

Posted by Harsh J <ha...@cloudera.com>.
Yes. Internally, it calls regular submit APIs.

-- 
Harsh J

Re: Submitting and running hadoop jobs Programmatically

Posted by madhu phatak <ph...@gmail.com>.
I am using JobControl.add() to add a job, running the JobControl in a
separate thread, and using JobControl.allFinished() to see whether all
jobs have completed. Does this work the same way as Job.submit()?


Re: Submitting and running hadoop jobs Programmatically

Posted by Harsh J <ha...@cloudera.com>.
Madhu,

Do you get a specific error message / stack trace? Could you also
paste your JT logs?

-- 
Harsh J

Re: Submitting and running hadoop jobs Programmatically

Posted by madhu phatak <ph...@gmail.com>.
Hi,
 I am using the same APIs, but I am not able to run the jobs by just
adding the configuration files and jars. It never creates a job in
Hadoop; it just shows "cleaning up the staging area" and fails.

On Tue, Jul 26, 2011 at 3:46 PM, Devaraj K <de...@huawei.com> wrote:

> Hi Madhu,
>
>   You can submit jobs programmatically, from any system, using the
> Job API. The job submission code can be written this way:
>
>     // Create a new Job
>     Job job = new Job(new Configuration());
>     job.setJarByClass(MyJob.class);
>
>     // Specify various job-specific parameters
>     job.setJobName("myjob");
>
>     // In the new (mapreduce) API, paths are set via the lib classes
>     // FileInputFormat/FileOutputFormat, not on Job itself:
>     FileInputFormat.addInputPath(job, new Path("in"));
>     FileOutputFormat.setOutputPath(job, new Path("out"));
>
>     job.setMapperClass(MyJob.MyMapper.class);
>     job.setReducerClass(MyJob.MyReducer.class);
>
>     // Submit the job
>     job.submit();
>
>
>
> To submit this, you need to add the Hadoop jar files and
> configuration files to the classpath of the application from which
> you want to submit the job.
>
> You can refer to these docs for more info on the Job API:
>
> http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html
>
>
>
> Devaraj K