Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2008/07/12 02:27:31 UTC

[jira] Issue Comment Edited: (HADOOP-3412) Refactor the scheduler out of the JobTracker

    [ https://issues.apache.org/jira/browse/HADOOP-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613024#action_12613024 ] 

acmurthy edited comment on HADOOP-3412 at 7/11/08 5:27 PM:
----------------------------------------------------------------

Brice/Tom, this is looking good. A couple of brief comments; pardon me for jumping in late, I've only recently started looking at this and other related jiras.

1. Let me get the minor one out of the way first: can we call the main scheduler interface 'Scheduler' rather than 'TaskScheduler'? 'TaskScheduler' might be confusing vis-a-vis JobScheduler. *smile* 

Ok, the serious stuff:

2. I propose we update TaskScheduler.assignTask to reflect that a TaskTracker might have multiple slots free (HADOOP-3136 has very important utilization benefits). With that change it becomes explicit that the TaskTracker could have multiple map/reduce slots available and the scheduler services that request by giving it tasks from possibly different jobs.

{noformat}
public List<Task> assignTasks(TaskTrackerId taskTracker);
{noformat}

Oh, this might be a good time to introduce a notion of TaskTrackerId, similar to what Enis did for Job/Task/TaskAttempt (HADOOP-544)?

3. The one major comment I had is to help quickly resolve the gap between this and HADOOP-3445. The major change coming with HADOOP-3445 is the notion of Queues. I'm inclined to believe that it will be beneficial to explicitly state the notion of Queues (and multiple Queues) in the Scheduler interface. To that effect I propose a minor change to the jobAdded/jobRemoved/jobUpdated apis:

{noformat}
public void jobAdded(QueueId queue, Job job);
public void jobRemoved(QueueId queue, Job job);
public void jobUpdated(QueueId queue, Job job);
{noformat}

This will pave the way for HADOOP-3445 to get in quite easily.

4. Also, we'd need some query capabilities given the notion of Queues:

{noformat}
List<Queue> getQueues();
List<Job> getJobs(QueueId queue, State state); // State is RUNNING/PENDING/COMPLETE
{noformat}
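
To make points 2 to 4 a little more concrete, here is a rough sketch of how the pieces might fit together. It is only an illustration, not what the patch does: QueueId, TaskTrackerId, Job, Task, State and FifoScheduler are all hypothetical stand-ins, getQueues() returns QueueIds rather than a Queue type to keep things short, and the free-slot count is supplied by a made-up callback rather than by the real TaskTracker status.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical stand-ins for types the real code would provide.
enum State { PENDING, RUNNING, COMPLETE }

final class QueueId {
    final String name;
    QueueId(String name) { this.name = name; }
    @Override public boolean equals(Object o) {
        return o instanceof QueueId && ((QueueId) o).name.equals(name);
    }
    @Override public int hashCode() { return name.hashCode(); }
}

final class TaskTrackerId {
    final String host;
    TaskTrackerId(String host) { this.host = host; }
}

final class Job {
    final String name;
    State state = State.PENDING;
    int pendingTasks;
    Job(String name, int pendingTasks) { this.name = name; this.pendingTasks = pendingTasks; }
}

final class Task {
    final Job job;
    Task(Job job) { this.job = job; }
}

// The interface shape proposed in points 2 to 4.
interface Scheduler {
    void jobAdded(QueueId queue, Job job);
    void jobRemoved(QueueId queue, Job job);
    void jobUpdated(QueueId queue, Job job);
    List<Task> assignTasks(TaskTrackerId taskTracker);
    List<QueueId> getQueues();
    List<Job> getJobs(QueueId queue, State state);
}

// Toy FIFO policy: walk queues and jobs in arrival order, filling free slots.
final class FifoScheduler implements Scheduler {
    private final Map<QueueId, List<Job>> queues = new LinkedHashMap<>();
    private final ToIntFunction<TaskTrackerId> freeSlots; // assumed slot lookup

    FifoScheduler(ToIntFunction<TaskTrackerId> freeSlots) { this.freeSlots = freeSlots; }

    public void jobAdded(QueueId queue, Job job) {
        queues.computeIfAbsent(queue, q -> new ArrayList<>()).add(job);
    }

    public void jobRemoved(QueueId queue, Job job) {
        List<Job> jobs = queues.get(queue);
        if (jobs != null) jobs.remove(job);
    }

    public void jobUpdated(QueueId queue, Job job) {
        // Nothing is cached per job in this sketch, so there is nothing to refresh.
    }

    public List<Task> assignTasks(TaskTrackerId taskTracker) {
        int slots = freeSlots.applyAsInt(taskTracker);
        List<Task> assigned = new ArrayList<>();
        for (List<Job> jobs : queues.values()) {
            for (Job job : jobs) {
                // One tracker request may be served by tasks from several jobs.
                while (assigned.size() < slots && job.pendingTasks > 0) {
                    job.pendingTasks--;
                    job.state = State.RUNNING;
                    assigned.add(new Task(job));
                }
            }
        }
        return assigned;
    }

    public List<QueueId> getQueues() { return new ArrayList<>(queues.keySet()); }

    public List<Job> getJobs(QueueId queue, State state) {
        List<Job> result = new ArrayList<>();
        for (Job job : queues.getOrDefault(queue, new ArrayList<Job>())) {
            if (job.state == state) result.add(job);
        }
        return result;
    }
}
```

With a tracker reporting three free slots, a two-task job in one queue and a five-task job in another, a single assignTasks call in this sketch hands back two tasks from the first job and one from the second, which is the "tasks from possibly different jobs" behaviour described in point 2.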

----

bq. I'm not comfortable with just making all of these classes public without thinking through the interfaces, since we have to maintain these public interfaces, and be careful (and backwards compatible) with evolution. So I suggest we keep them package private for the first release, and figure how to open it up later.

I'm inclined to go with Tom on keeping these interfaces package-private for the first release, but I do realise Matei and others might be eager to run with it right away!



  
> Refactor the scheduler out of the JobTracker
> --------------------------------------------
>
>                 Key: HADOOP-3412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3412
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Brice Arnould
>            Assignee: Brice Arnould
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: JobScheduler-v9.1.patch, JobScheduler-v9.2.patch, JobScheduler-v9.patch, JobScheduler.patch, JobScheduler_v2.patch, JobScheduler_v3.patch, JobScheduler_v3b.patch, JobScheduler_v4.patch, JobScheduler_v5.patch, JobScheduler_v6.1.patch, JobScheduler_v6.2.patch, JobScheduler_v6.3.patch, JobScheduler_v6.4.patch, JobScheduler_v6.patch, JobScheduler_v7.1.patch, JobScheduler_v7.patch, JobScheduler_v8.patch, RackAwareJobScheduler.java, SimpleResourceAwareJobScheduler.java
>
>
> First I would like to warn you that my proposal is probably very naive. I just hope that reading it won't waste your time.
> h4. The aim
> It seems to me that improving Hadoop scheduling could be very profitable. But it is hard to implement and compare schedulers, because the scheduling logic is mixed in with the rest of the JobTracker.
> This bug is the first step of an attempt to improve the Hadoop scheduler. It re-implements the current scheduling algorithm in a separate class called JobScheduler. This new class is instantiated in the JobTracker.
> h4. Bugs fixed as a side effect
> This patch probably cannot be submitted as it is.
> A first difficulty is that it does not have exactly the same behaviour as the current JobTracker. More precisely, it doesn't re-implement things like code that seems never to be called, or concurrency problems.
> I wrote TOCONFIRM wherever my proposal differs from the current implementation, so you can find those places easily.
> I know that fixing bugs silently is bad. So, independently of what you decide about this patch, I will open issues for bugs that you confirm.
> h4. Other side effects
> Another side effect of this patch is to add documentation about each step of the scheduling. I hope that it will help future improvement by lowering the level required to contribute to the scheduler.
> It also reduces the complexity and the granularity of the JobTracker (making it more parallel).
> h4. The future
> If you feel that this is a step in the right direction, I will try to propose a JobSchedulerInterface that many JobSchedulers could implement, and to propose alternatives to the current « FifoJobScheduler ».  If some of you have ideas about that, please tell me ^^ I will also open issues for things marked as FIXME in the patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.