You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2008/10/06 13:39:44 UTC

[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637087#action_12637087 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

Some comments:

JobChangeEvent: 
- Should be package-private.
- As a convention, should we extend {{java.util.Event}} ? Source can be the JIP.

JobStatusChangeEvent:
- Should be package-private.
- Javadoc seems to indicate an event is raised for progress and finish time changes as well, when it is not. This should be removed.
- The enum {{Events}} is actually specifying type of events - so should it read {{EventType}} ?
- IMO, doing the {{clone}} inside the methods of this class, while convenient, can lead to erroneous usage. For e.g. if someone changes the JobStatus before creating the event object, the old status is lost. And this cannot be captured anywhere else except in documentation, which could be missed. I think the usage will become more clear if {{JobStatusChangeEvent}} is written like this:
{code}
class JobStatusChangeEvent extends JobChangeEvent {
  public JobStatusChangeEvent(JobInProgress source, 
                                List<EventType> events,
                                JobStatus oldStatus, JobStatus newStatus) {
    super(source);
    this.events = events
    this.oldStatus = oldStatus;
    this.newStatus = newStatus;
  }
  // ... other APIs
}
{code}
and leave the responsibility of taking a snapshot to the callers. This means the callers write a bit more code, but it is less error prone. Also, I checked some of the event classes in the java API, and they seem to have a similar structure. For e.g. look at {{javax.naming.event.NamingEvent}} or {{java.util.prefs.NodeChangeEvent}}

Other JobInProgressListener sub-classes:
- So that future code is easier to write, we need to check for the type of event being {{JobStatusChangeEvent}} and the events enum is of the type we are interested in (time changes and priority changes) to default to the current implementation.
- I think it is OK to handle run state changes also in this JIRA, and behave similarly to {{jobRemoved}} atleast for the {{JobQueueJobInProgressListener}} and {{EagerTaskInitializationListener}}.

JobQueuesManager:
- It would be nice to add a comment in {{jobRemoved}} mentioning we already handle removals when the run state changes.
- It may be safer to check for {{runState}} to be RUNNING before calling {{promoteJob}}. And maybe {{promoteJob}} should be called {{makeJobRunning}} or something.

JobTracker:
- In {{RecoveryManager}}, previously {{jobUpdated}} was handled in the scheduler only once. Now, since we add separate events for PRIORITY and START_TIME, the same code would be executed twice. I think this should be avoided, maybe in the implementation of {{JobQueuesManager.jobUpdated}}.

Tests:
- There are some spurious System.out.printlns, which can change to LOG.debug
- We can use {{scheduler.getJobs()}} and check the job is not present in the list. This would make sure that the UI also will reflect the change.
- Can we add a test case for job priority change and the {{promoteJob}} code path as well ? 

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.