
Posted to common-user@hadoop.apache.org by kartheek muthyala <ka...@gmail.com> on 2011/09/16 17:08:53 UTC

Job Scheduler, Task Scheduler and Fair Scheduler

Hi all,
Can anyone explain the responsibilities of each scheduler? I am
interested in the flow of commands between these schedulers. Also, does
anyone have any information on how the job scheduler schedules a job based
on data locality? As far as I know, there is some heartbeat mechanism that
goes from the task scheduler to the job scheduler, and in response the job
scheduler finds the node where the data is most closely located and
schedules the task on that node. Is there a detailed explanation of this
area? Any help will be greatly appreciated.
Thanks and Regards,
Kartheek.

Re: Job Scheduler, Task Scheduler and Fair Scheduler

Posted by kartheek muthyala <ka...@gmail.com>.

Hey Arun,
Thanks for the information, and sorry for my previous mail asking for
updates! I just wanted to emphasize the importance of the query. I couldn't
find enough time to go through the code, which is why I approached you
guys, as you are experts in this area.
Thanks & Regards,
Kartheek.


Re: Job Scheduler, Task Scheduler and Fair Scheduler

Posted by Arun C Murthy <ac...@hortonworks.com>.

On Sep 16, 2011, at 11:26 PM, kartheek muthyala wrote:

> Any updates!!

A bit of patience will help. It also helps to do some homework and ask specific questions.

I don't know if you have looked at any of the code, but there are 3 schedulers:
JobQueueTaskScheduler (aka default scheduler or fifo scheduler)
Capacity Scheduler (CS)
Fair Scheduler (FS).

TaskScheduler is just an interface for all schedulers (default, CS, FS).

Then there is JobInProgress, which handles scheduling of map tasks for an individual job based on data locality (JobInProgress.obtainNew*MapTask).
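
To make the data-locality point concrete, here is a minimal sketch of how a job might pick its next map task for a given tracker: prefer a task whose input is on that node, then one on the same rack, then anything else. This is not the Hadoop code. LocalitySketch, MapTask, rackOf, distance and the "rackN-nodeM" host naming are all hypothetical names invented for illustration; only the name obtainNewMapTask echoes the JobInProgress.obtainNew*MapTask methods mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of locality-aware map task selection.
// None of these types are Hadoop classes.
final class LocalitySketch {

    /** A pending map task and the hosts that hold replicas of its input split. */
    static final class MapTask {
        final String id;
        final List<String> replicaHosts;

        MapTask(String id, String... replicaHosts) {
            this.id = id;
            this.replicaHosts = Arrays.asList(replicaHosts);
        }
    }

    /** Assumed host naming "rackN-nodeM": the rack is the part before the dash. */
    static String rackOf(String host) {
        return host.substring(0, host.indexOf('-'));
    }

    /** Returns 0 for node-local, 1 for rack-local, 2 for off-rack. */
    static int distance(MapTask task, String trackerHost) {
        int best = 2;
        for (String host : task.replicaHosts) {
            if (host.equals(trackerHost)) {
                return 0;
            }
            if (rackOf(host).equals(rackOf(trackerHost))) {
                best = 1;
            }
        }
        return best;
    }

    /** Removes and returns the pending task closest to trackerHost, or null if none is left. */
    static MapTask obtainNewMapTask(List<MapTask> pending, String trackerHost) {
        MapTask bestTask = null;
        int bestDistance = Integer.MAX_VALUE;
        for (MapTask task : pending) {
            int d = distance(task, trackerHost);
            if (d < bestDistance) {
                bestDistance = d;
                bestTask = task;
            }
        }
        if (bestTask != null) {
            pending.remove(bestTask);
        }
        return bestTask;
    }

    public static void main(String[] args) {
        List<MapTask> pending = new ArrayList<>();
        pending.add(new MapTask("m0", "rack2-node1", "rack2-node2"));
        pending.add(new MapTask("m1", "rack1-node2", "rack3-node1"));
        pending.add(new MapTask("m2", "rack1-node1", "rack2-node3"));

        // A tracker on rack1-node1 asks for work three times.
        System.out.println(obtainNewMapTask(pending, "rack1-node1").id); // m2 (node-local)
        System.out.println(obtainNewMapTask(pending, "rack1-node1").id); // m1 (rack-local)
        System.out.println(obtainNewMapTask(pending, "rack1-node1").id); // m0 (off-rack)
    }
}
```

The sketch scans all pending tasks on every request, which is fine for illustration; a real implementation would more likely keep per-node and per-rack indexes so the lookup does not grow with the number of tasks.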

Other than that, each of the schedulers (default, CS, FS) uses different criteria for picking which job to offer a 'slot' on a given TaskTracker (TT) when one becomes available.
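
As a rough illustration of the split described above (one common scheduler interface, with each scheduler applying its own job-selection criteria), the sketch below plugs two toy policies into the same slot-assignment call. Everything here is an assumption made for illustration: SchedulerSketch, Job, SlotPolicy, FifoPolicy, FewestRunningPolicy and assignSlot are hypothetical names, not the Hadoop TaskScheduler API, and the "fewest running tasks" rule is only a loose stand-in for fairness, not the actual Fair Scheduler or Capacity Scheduler algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real scheduler classes in Hadoop differ.
final class SchedulerSketch {

    static final class Job {
        final String name;
        final long submitTime;
        int runningTasks;
        int pendingTasks;

        Job(String name, long submitTime, int runningTasks, int pendingTasks) {
            this.name = name;
            this.submitTime = submitTime;
            this.runningTasks = runningTasks;
            this.pendingTasks = pendingTasks;
        }
    }

    /** The part each scheduler supplies: which job is offered the free slot? */
    interface SlotPolicy {
        Job pickJob(List<Job> jobs);
    }

    /** FIFO-style: the earliest-submitted job that still has pending tasks. */
    static final class FifoPolicy implements SlotPolicy {
        public Job pickJob(List<Job> jobs) {
            Job best = null;
            for (Job job : jobs) {
                if (job.pendingTasks == 0) continue;
                if (best == null || job.submitTime < best.submitTime) best = job;
            }
            return best;
        }
    }

    /** Fairness-style toy rule: the job with pending work that runs the fewest tasks. */
    static final class FewestRunningPolicy implements SlotPolicy {
        public Job pickJob(List<Job> jobs) {
            Job best = null;
            for (Job job : jobs) {
                if (job.pendingTasks == 0) continue;
                if (best == null || job.runningTasks < best.runningTasks) best = job;
            }
            return best;
        }
    }

    /** Called when a tracker reports a free slot; returns the chosen job's name, or null. */
    static String assignSlot(SlotPolicy policy, List<Job> jobs) {
        Job job = policy.pickJob(jobs);
        if (job == null) return null;
        job.pendingTasks--;
        job.runningTasks++;
        return job.name;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("A", 100L, 5, 3)); // submitted first, already running 5 tasks
        jobs.add(new Job("B", 200L, 1, 4)); // submitted later, running 1 task

        System.out.println(assignSlot(new FifoPolicy(), jobs));          // A
        System.out.println(assignSlot(new FewestRunningPolicy(), jobs)); // B
    }
}
```

The same free slot goes to a different job depending only on the policy object, which is the sense in which the schedulers share an interface but differ in their criteria.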

All this has changed radically and completely with MRv2, which is now in branch-0.23 and trunk, to allow MR and non-MR apps on the same Hadoop cluster:
http://wiki.apache.org/hadoop/NextGenMapReduce

Arun


Fwd: Job Scheduler, Task Scheduler and Fair Scheduler

Posted by kartheek muthyala <ka...@gmail.com>.

Any updates!!
