Posted to mapreduce-user@hadoop.apache.org by Joan <jo...@gmail.com> on 2011/02/10 20:33:51 UTC

Chain multiple jobs

Hi,

I have two jobs and I'm trying to control them with ControlledJob.

job2 depends on job1, and job2's input is job1's output. When I do this:

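          // job1 has no prerequisites.
          ControlledJob cjob1 = new ControlledJob(job1, null);

          // job2 must wait for job1 to finish.
          List<ControlledJob> dependingJobs = new ArrayList<ControlledJob>();
          dependingJobs.add(cjob1);

          ControlledJob cjob2 = new ControlledJob(job2, dependingJobs);

          JobControl theControl = new JobControl("name");

          theControl.addJob(cjob1);
          theControl.addJob(cjob2);

          // JobControl implements Runnable, so run it on its own thread.
          Thread theController = new Thread(theControl);
          theController.start();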



I get the exception:

Exception in thread "main" java.io.FileNotFoundException: File "output from
job1" does not exist.

I think this happens because job2's input does not exist yet when
"cjob2 = new ControlledJob(job2, dependingJobs);" is instantiated.

Can someone help me?

Thanks

Joan

Re: Chain multiple jobs

Posted by Joan <jo...@gmail.com>.
I don't understand why ControlledJob, when the depending job (job2) is
added, tries to load TextInputFormat with job1's output. Obviously job1's
output doesn't exist yet, because job1 hasn't run.

How do I do this? How can I add this dependency?

Schema:

job1: input (file [exists in HDFS]) --> output (file)
job2: input (job1's output) --> output (file)

How do I tell ControlledJob or JobControl not to add the new dependency
until job1 has finished?

Thanks

Joan
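
The schema above (job2 reads what job1 writes) can be illustrated with a
small plain-Java model. This is only a sketch of the idea of dependency-gated
execution and does not use or reproduce Hadoop's JobControl. The names
DependencySketch, Step, runAll and demo are hypothetical, and temporary local
files stand in for HDFS paths. The point it tries to show is that a step's
input path can be recorded when the step is constructed, while the file
itself is only opened once every step it depends on has finished.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; NOT Hadoop's ControlledJob/JobControl.
class DependencySketch {

    /** Stand-in for a job: reads an optional input file, writes an output file. */
    static class Step {
        final String name;
        final Path input;   // null for a step with no upstream input
        final Path output;
        final List<Step> dependsOn = new ArrayList<>();
        boolean done = false;

        Step(String name, Path input, Path output) {
            // The input path is only recorded here; the file is not opened.
            this.name = name;
            this.input = input;
            this.output = output;
        }

        /** A step is ready once every step it depends on has finished. */
        boolean ready() {
            for (Step s : dependsOn) {
                if (!s.done) return false;
            }
            return true;
        }

        /** The input file is read only now, when the step actually runs. */
        void run() throws IOException {
            String data = (input == null)
                    ? "seed"
                    : new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
            Files.write(output, (data + "->" + name).getBytes(StandardCharsets.UTF_8));
            done = true;
        }
    }

    /** Repeatedly runs every step that is ready until no more progress is made. */
    static List<String> runAll(List<Step> steps) throws IOException {
        List<String> order = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Step s : steps) {
                if (!s.done && s.ready()) {
                    s.run();
                    order.add(s.name);
                    progressed = true;
                }
            }
        }
        return order;
    }

    /** Builds the job1 -> job2 chain and returns the run order plus job2's output. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("chain");
            Path out1 = dir.resolve("job1-output");
            Path out2 = dir.resolve("job2-output");

            Step job1 = new Step("job1", null, out1);
            Step job2 = new Step("job2", out1, out2); // out1 does not exist yet
            job2.dependsOn.add(job1);

            // job2 is listed first on purpose: it still has to wait for job1.
            List<Step> steps = new ArrayList<>();
            steps.add(job2);
            steps.add(job1);

            List<String> order = runAll(steps);
            String result = new String(Files.readAllBytes(out2), StandardCharsets.UTF_8);
            return order + " " + result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[job1, job2] seed->job1->job2"
    }
}
```

In this model, anything that touches job2's input before job1 has run (for
example, code in the driver that opens or samples that path while the jobs
are still being configured) would fail in the same way as the exception
quoted above. Whether that is what happens in the original program is an
assumption; the stack trace would show which call is reading the path.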

