Posted to common-user@hadoop.apache.org by "nguyenhuynh.mr" <ng...@gmail.com> on 2009/04/21 11:02:11 UTC

How to run many jobs at the same time?

Hi all!


I have some jobs: job1, job2, job3, ... Each job works with a
group. To control the jobs, I have JobControllers; each JobController
controls the jobs for its specified group.


Example:

- There are 2 groups: g1 and g2

-> 2 JobControllers: jController1, jController2

  + jController1 contains jobs: job1, job2, job3, ...

  + jController2 contains jobs: job1, job2, job3, ...


* To run the jobs, I use:

for (int i = 0; i < 2; i++) {
    jCtrl[i] = new JobController(group[i]);
    jCtrl[i].run();
}


* I want jController1 and jController2 to run in parallel. But in practice,
jController2 only starts running after jController1 has finished.


Why?

Please help me!


* P.S.: JobController uses org.apache.hadoop.mapred.jobcontrol.JobControl


Thanks,


Cheers,

Nguyen.


Re: How to run many jobs at the same time?

Posted by "nguyenhuynh.mr" <ng...@gmail.com>.
Billy Pearson wrote:

> The only way I know of is to try using different scheduling queues for
> each group
>
> Billy

Thanks for all your help!

Could you describe your solution in detail and give me an example?

Thanks very much,

Best regards,
Nguyen.


Re: How to run many jobs at the same time?

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
The only way I know of is to try using different scheduling queues for each
group.

Billy



Re: How to run many jobs at the same time?

Posted by Tom White <to...@cloudera.com>.
The run() method on your JobController class waits until all its jobs
have finished. To get a parallel run you need to move the loop that
checks for completion out of the run() method. Or make JobController
Runnable, and launch it in two threads instead.

Tom
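
A minimal sketch of the second suggestion above (make the controller Runnable and launch one thread per group). It is not from the thread: RunnableJobController, launch() and the CountDownLatch are hypothetical stand-ins so the example runs without Hadoop. The latch takes the place of the blocking "wait until all jobs have finished" loop, and it can only be released when both controllers are running at the same time.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the poster's JobController, made Runnable.
public class RunnableJobController implements Runnable {
    private final String group;
    private final List<String> log;
    private final CountDownLatch allStarted;

    RunnableJobController(String group, List<String> log, CountDownLatch allStarted) {
        this.group = group;
        this.log = log;
        this.allStarted = allStarted;
    }

    @Override
    public void run() {
        log.add("start " + group);
        allStarted.countDown();
        try {
            // Stand-in for the blocking loop that waits for the group's jobs.
            // It is released only once every controller has started.
            allStarted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.add("end " + group);
    }

    /** Launches one thread per group and waits for all of them to finish. */
    static List<String> launch(String... groups) {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch allStarted = new CountDownLatch(groups.length);
        Thread[] threads = new Thread[groups.length];
        for (int i = 0; i < groups.length; i++) {
            threads[i] = new Thread(new RunnableJobController(groups[i], log, allStarted));
            // start(), not run(): calling run() directly would block the calling thread.
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(launch("g1", "g2"));
    }
}
```

Both "start" entries are logged before either "end" entry, which is the overlap the original loop was missing. In real code the body of run() would hold the JobControl setup and the allFinished() loop from the question.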


Re: How to run many jobs at the same time?

Posted by "nguyenhuynh.mr" <ng...@gmail.com>.
Tom White wrote:

> You need to start each JobControl in its own thread so they can run
> concurrently. Something like:
>
>     Thread t = new Thread(jobControl);
>     t.start();
>
> Then poll the jobControl.allFinished() method.
>
> Tom

Thanks for your response!

I used a Thread to start the JobControl, something like:

public class JobController {

    public JobController(String g) {
       .....
    }

    public void run() {
       Job j1 = new Job(..);
       Job j2 = new Job(..);
       JobControl jc = new JobControl("group1");

       Thread t = new Thread(jc);
       t.start();

       while (!jc.allFinished()) {
           // Display state ....
       }
    }
}

* To run the code, I do something like:
JobController[] jController = new JobController[2];
for (int i = 0; i < 2; i++) {
    jController[i] = new JobController(group[i]);
    jController[i].run();
}

* But they do not run in parallel :( !

Please help me!

Thanks,

Best regards,
Nguyen,


Re: How to run many jobs at the same time?

Posted by Tom White <to...@cloudera.com>.
You need to start each JobControl in its own thread so they can run
concurrently. Something like:

    Thread t = new Thread(jobControl);
    t.start();

Then poll the jobControl.allFinished() method.

Tom
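
The advice above can be sketched roughly as follows: start every control in its own thread first, and only then poll allFinished() on all of them from a single loop. This is an illustration, not code from the thread. StubJobControl is a hypothetical stand-in for Hadoop's JobControl so the example runs without a cluster, and runGroups() and the latch are likewise assumptions made for the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ParallelJobControls {

    /** Hypothetical stand-in for JobControl: a Runnable that exposes allFinished(). */
    static class StubJobControl implements Runnable {
        private final CountDownLatch allStarted;
        private volatile boolean overlapped = false;
        private volatile boolean finished = false;

        StubJobControl(CountDownLatch allStarted) {
            this.allStarted = allStarted;
        }

        @Override
        public void run() {
            allStarted.countDown();
            try {
                // True only if all controls were started before the timeout expired.
                overlapped = allStarted.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                finished = true;
            }
        }

        boolean allFinished() { return finished; }
        boolean overlapped() { return overlapped; }
    }

    /** Starts one thread per group, then polls every control from the calling thread. */
    static boolean runGroups(String... groups) {
        CountDownLatch allStarted = new CountDownLatch(groups.length);
        List<StubJobControl> controls = new ArrayList<>();
        for (String group : groups) {
            StubJobControl jc = new StubJobControl(allStarted);
            controls.add(jc);
            new Thread(jc, "jobcontrol-" + group).start(); // returns immediately
        }

        // A single polling loop, placed after all threads have been started.
        boolean done = false;
        while (!done) {
            done = true;
            for (StubJobControl jc : controls) {
                done &= jc.allFinished();
            }
            if (!done) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }

        boolean allOverlapped = true;
        for (StubJobControl jc : controls) {
            allOverlapped &= jc.overlapped();
        }
        return allOverlapped;
    }

    public static void main(String[] args) {
        System.out.println("ran concurrently: " + runGroups("g1", "g2"));
    }
}
```

With the real JobControl, the same shape would apply: create each JobControl, wrap it in a Thread, call start() on all of them, and then poll. As far as I know, JobControl's own run() loop keeps going until stop() is called, so each control would likely need a stop() once allFinished() returns true.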
