Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2008/09/02 06:08:44 UTC

[jira] Created: (HADOOP-4053) Schedulers need to know when a job has completed

Schedulers need to know when a job has completed
------------------------------------------------

                 Key: HADOOP-4053
                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
             Project: Hadoop Core
          Issue Type: Improvement
            Reporter: Vivek Ratan


The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634761#action_12634761 ] 

Owen O'Malley commented on HADOOP-4053:
---------------------------------------

I think that jobUpdated() should be called on all state changes of the job. So 

init -> run -> complete 

should all generate calls to jobUpdated(). Then it would solve the problem, right?

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639538#action_12639538 ] 

Hadoop QA commented on HADOOP-4053:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12392097/HADOOP-4053-v5.4.patch
  against trunk revision 704536.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3458/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3458/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3458/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3458/console

This message is automatically generated.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v5.5.patch

Attaching a patch that fixes {{TestJobQueueInformation}}. The test used to check queue-info after the job finished; it now checks queue-info while the job is running. _test-patch_ and _ant test_ passed on my box.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637528#action_12637528 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

Hemanth, why do you want the event classes to be package private? I'm going to listen to them and then feed the events off to other things that may be interested. Making them package private forces me to write my classes in the hadoop package, and prevents me relaying the events to code in different packages.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v3.1.patch

Attaching a patch that implements the {{JobChangeEvent}} concept. Here is how it is implemented.

_Assumptions :_
Everything that has the potential to change a job's state is captured and bundled under {{JobStatus}}. Hence taking a snapshot of the job's status before and after the event should be sufficient to determine the state change.

_Working :_
1) {{JobInProgressListener.jobUpdated()}} now takes {{JobChangeEvent}} as a parameter.

2) {{JobChangeEvent}} is an abstract class that has just one api, {{getJobInProgress()}}.

3) For the task at hand, i.e. handling _priority-change_, _start-time-change_ and _job-runstate-change_, I have extended {{JobChangeEvent}} to {{JobStatusChangeEvent}}.

4) {{JobStatusChangeEvent}} hosts a set of _sub-events_ that can lead to a job-status change. These are fields from {{JobStatus}} that have the potential to change for a given job, such as _priority_, _start-time_ and _run-state_. While composing an event, one can specify which _sub-events_ constitute the state change. Note that the order in which the _sub-events_ are specified is preserved.

5) For capacity-scheduler, an appropriate action is performed based on the _sub-events_ constituting the state transition. For now the actions are
    - promote a job from the waiting queue to the running queue
    - remove a job upon job completion
    - re-position the job in the queue when the parameters that decide where the job is positioned have changed

6) If {{JobStatusChangeEvent}} fails to capture all the events, then {{JobChangeEvent}} can be extended to cater to that case.

7) Other listener implementations remain unchanged as they just require {{jobInProgress}} which is obtained from {{JobChangeEvent}}.

Tested the patch with capacity scheduler and it works fine. The web-ui doesn't show completed jobs in the job queue, which means that the job is removed upon completion. _test-patch_ and _ant test_ pass on my box. The rest of the listener implementations should not be affected.
This patch is meant for 0.19.
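
As a rough illustration of the design described above (not the code from the attached patch), the event hierarchy and a listener reacting to sub-events might be sketched as follows. {{Job}}, {{SubEvent}}, {{SketchListener}} and the action strings are hypothetical stand-ins; only the names {{JobChangeEvent}}, {{JobStatusChangeEvent}}, {{getJobInProgress()}} and {{jobUpdated()}} come from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; all types here are simplified stand-ins.
class JobChangeEventSketch {

    // Hypothetical stand-in for JobInProgress: just an id.
    static class Job {
        final String id;
        Job(String id) { this.id = id; }
    }

    // Base event with a single accessor, as in point 2.
    abstract static class JobChangeEvent {
        private final Job job;
        JobChangeEvent(Job job) { this.job = job; }
        Job getJobInProgress() { return job; }
    }

    // Hypothetical sub-event names, after the changes listed in point 3.
    enum SubEvent { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }

    // Status-change event holding its sub-events in the order given (point 4).
    static class JobStatusChangeEvent extends JobChangeEvent {
        private final List<SubEvent> subEvents;
        JobStatusChangeEvent(Job job, SubEvent... subEvents) {
            super(job);
            this.subEvents = new ArrayList<SubEvent>(Arrays.asList(subEvents));
        }
        List<SubEvent> getSubEvents() { return subEvents; }
    }

    // Toy listener that records the action it would take per sub-event (point 5).
    static class SketchListener {
        final List<String> actions = new ArrayList<String>();

        void jobUpdated(JobChangeEvent event) {
            if (!(event instanceof JobStatusChangeEvent)) {
                return; // other event types are ignored in this sketch
            }
            String id = event.getJobInProgress().id;
            for (SubEvent s : ((JobStatusChangeEvent) event).getSubEvents()) {
                if (s == SubEvent.RUN_STATE_CHANGED) {
                    actions.add("move-or-remove:" + id);
                } else {
                    actions.add("reposition:" + id);
                }
            }
        }
    }

    public static void main(String[] args) {
        SketchListener listener = new SketchListener();
        listener.jobUpdated(new JobStatusChangeEvent(new Job("job_1"),
                SubEvent.PRIORITY_CHANGED, SubEvent.RUN_STATE_CHANGED));
        System.out.println(listener.actions);
    }
}
```

Listeners that only need the job itself can keep calling {{getJobInProgress()}} on the base type and ignore the sub-events.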

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4053:
-------------------------------------

    Priority: Blocker  (was: Major)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638729#action_12638729 ] 

Amar Kamat commented on HADOOP-4053:
------------------------------------

I had an offline discussion with Hemanth and Devaraj and here is what we all agreed upon :

bq. Do we need JobStatusChangeEvent(JobInProgress jip, EventType eventType, JobStatus status) ?
No. A job's state is completely manipulated and controlled by an external entity (JobTracker, JobInProgressListener etc.) and hence we should always have an _old_ and a _new_ state. A job's state changes via well-known apis and hence we should capture the state changes whenever these apis are invoked. Following is the list of known events with respect to a job:
||event-type||api||
|Job added|JIP constructor|
|Job inited|JIP.initTasks()|
|Job priority change|JobTracker.setJobPriority()|
|Running Job killed|JobTracker's heartbeat (cleanup tasks)|
|Queued job killed|JobTracker.killJob() --> JIP.terminateJob()|
|Job completed|JobTracker's heartbeat|
|Job removed|JobTracker.finalizeJob()/JobTracker's job expiry thread|

As of now the following changes are tracked 
- job added
- job inited
- job priority change

We just need to track the job-state change in:
- JobTracker.updateTaskStatuses() (job-completion, running-job-kill [external-kill-leading-to-cleanup-tasks, too-many-failures])
- JobTracker.killJob() (external-queued-job-kill)

bq. Seems OK to have JobInProgress.initTasks() do the event notification
We can't just change the method signature as it's a public api. Hence I introduced {{JobInProgress.initTasksAndReportChange()}}. Looking at Hemanth's comment#2 [here|https://issues.apache.org/jira/browse/HADOOP-4053?focusedCommentId=12638478#action_12638478], I feel it's ok to make the init-event notification inline, wherever needed. We can factor it out later, if needed.
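
To illustrate why carrying both an _old_ and a _new_ status is enough to classify a change, here is a minimal sketch. {{RunState}}, {{Status}} and {{StatusChangeEvent}} are simplified, hypothetical stand-ins and do not mirror the real {{JobStatus}} fields or constants.

```java
// Illustrative sketch only; RunState and Status are simplified stand-ins.
class OldNewStatusSketch {

    enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Immutable snapshot of the fields that may change.
    static final class Status {
        final RunState runState;
        final int priority;
        Status(RunState runState, int priority) {
            this.runState = runState;
            this.priority = priority;
        }
    }

    // Event carrying both the old and the new snapshot.
    static final class StatusChangeEvent {
        final Status oldStatus;
        final Status newStatus;
        StatusChangeEvent(Status oldStatus, Status newStatus) {
            this.oldStatus = oldStatus;
            this.newStatus = newStatus;
        }

        boolean isPriorityChange() {
            return oldStatus.priority != newStatus.priority;
        }

        // True when the job moved from a non-terminal to a terminal run state.
        boolean isCompletion() {
            return !isTerminal(oldStatus.runState) && isTerminal(newStatus.runState);
        }
    }

    static boolean isTerminal(RunState s) {
        return s == RunState.SUCCEEDED || s == RunState.FAILED || s == RunState.KILLED;
    }

    public static void main(String[] args) {
        Status queued = new Status(RunState.PREP, 1);
        Status killed = new Status(RunState.KILLED, 1);
        // A queued job that is killed counts as a completion in this sketch.
        System.out.println(new StatusChangeEvent(queued, killed).isCompletion());
    }
}
```

With this shape the listener derives the kind of change from the two snapshots, so it does not have to keep its own copy of the previous state.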


> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v3.2.patch

Removed some unnecessary changes.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634820#action_12634820 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

Owen, the jobUpdated API currently passes the JobInProgress argument. If this is called at every state change, the listeners would need to keep track of what actually changed. I think that would be a bit expensive, because it would mean the schedulers need to keep a copy of the old state, doubling the memory requirements.

A better API would have been something like {{jobStateChanged(ChangeEvent e)}} where the ChangeEvent captures the state transition in some way. This would keep it generic and allow for additional state transitions to be added in the future. If done this way, it would solve the problem.

Looking at the way it is currently implemented, it seems like the intent is to capture each state change via a separate call, and somehow the job completion event was missed (or thought unnecessary). In that case, possibly adding jobCompleted is the right approach. This does have the problem that additional state changes that may come in the future (e.g. paused?) may need changes to the interface again.

So, either way, I think we would need to change the current interface a bit. And it seems like adding jobCompleted is less intrusive at this stage. Comments?
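
The two API shapes being compared can be sketched side by side. This is only an illustration under assumed names: {{State}}, {{ChangeEvent}}'s fields, {{GenericListener}}, {{SpecificListener}} and {{adapt()}} are hypothetical and not part of the existing interface.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types exist in Hadoop under these names.
class ListenerApiSketch {

    enum State { PREP, RUNNING, COMPLETED }

    // Captures one state transition of one job.
    static final class ChangeEvent {
        final String jobId;
        final State from;
        final State to;
        ChangeEvent(String jobId, State from, State to) {
            this.jobId = jobId;
            this.from = from;
            this.to = to;
        }
    }

    // Generic shape: one method, new states need no new methods.
    interface GenericListener {
        void jobStateChanged(ChangeEvent e);
    }

    // Specific shape: one method per event of interest.
    interface SpecificListener {
        void jobCompleted(String jobId);
    }

    // The generic shape can express the specific one, but not the other way round.
    static GenericListener adapt(final SpecificListener target) {
        return new GenericListener() {
            public void jobStateChanged(ChangeEvent e) {
                if (e.to == State.COMPLETED) {
                    target.jobCompleted(e.jobId);
                }
            }
        };
    }

    public static void main(String[] args) {
        final List<String> completed = new ArrayList<String>();
        GenericListener listener = adapt(new SpecificListener() {
            public void jobCompleted(String jobId) {
                completed.add(jobId);
            }
        });
        listener.jobStateChanged(new ChangeEvent("job_7", State.PREP, State.RUNNING));
        listener.jobStateChanged(new ChangeEvent("job_7", State.RUNNING, State.COMPLETED));
        System.out.println(completed);
    }
}
```

Adding a state such as a paused one would only extend the enum in the generic shape, whereas the specific shape would need another interface method.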


> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634434#action_12634434 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

The default implementation for {{jobCompleted}} is calling {{jobRemoved}}. This would result in {{jobRemoved}} being called twice: once when the job completes, and once when it retires. Hence all implementations of {{jobRemoved}} need to work correctly even if called twice. For example, in the {{CapacityTaskScheduler}} we clear some data structures, and I see that this may cause problems if it is called twice: {{qsi.numJobsByUser.get(job.getProfile().getUser());}} may return null when called a second time, which is not being checked. I think it is safer to have {{jobCompleted}} similar to other APIs, that is, as an abstract method that is implemented in the schedulers. This way, we will be forced to check that each implementation is correct.
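
To make the double-call hazard concrete, here is a minimal sketch of a removal path that tolerates being invoked twice for the same job. It is not the {{CapacityTaskScheduler}} code; the class, the {{trackedJobs}} set and the method signatures are hypothetical, and only the idea of a per-user job count map is taken from the comment above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; not the CapacityTaskScheduler implementation.
class IdempotentRemovalSketch {

    // Hypothetical bookkeeping: jobs currently tracked, and job counts per user.
    private final Set<String> trackedJobs = new HashSet<String>();
    private final Map<String, Integer> numJobsByUser = new HashMap<String, Integer>();

    void jobAdded(String jobId, String user) {
        if (trackedJobs.add(jobId)) {
            Integer n = numJobsByUser.get(user);
            numJobsByUser.put(user, n == null ? 1 : n + 1);
        }
    }

    // Safe to call more than once for the same job.
    void jobRemoved(String jobId, String user) {
        if (!trackedJobs.remove(jobId)) {
            return; // already removed, e.g. on completion and again on retirement
        }
        Integer n = numJobsByUser.get(user);
        if (n == null) {
            return; // guard against the null described above
        }
        if (n <= 1) {
            numJobsByUser.remove(user);
        } else {
            numJobsByUser.put(user, n - 1);
        }
    }

    int jobsFor(String user) {
        Integer n = numJobsByUser.get(user);
        return n == null ? 0 : n;
    }

    public static void main(String[] args) {
        IdempotentRemovalSketch s = new IdempotentRemovalSketch();
        s.jobAdded("job_1", "alice");
        s.jobAdded("job_2", "alice");
        s.jobRemoved("job_1", "alice"); // job completes
        s.jobRemoved("job_1", "alice"); // job retires later
        System.out.println(s.jobsFor("alice"));
    }
}
```

Without the membership check, the second call would decrement the user's count a second time or dereference a missing entry.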

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4053:
--------------------------------

    Component/s: mapred

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v5.2.patch

Attaching a new patch with 
1) updates to listeners when the job completes. These updates are done in {{JobTracker.updateTaskStatuses()}} and {{JobTracker.killJob()}}.
2) a test case to check if the _job-completion_ events are propagated properly.

Result of _test-patch_ on my box
{code}
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 12 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.

{code}

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634452#action_12634452 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

bq. 1. I'd expect jobCompleted() to be a no-op, and have jobRemoved() get called when the job is removed.

I can only speak from the {{CapacityTaskScheduler}}'s perspective. It cares about jobCompleted, and doesn't care about jobRemoved. Actually, I would have thought jobCompleted is the more important event for a scheduler. While the number of implementations is still small (only four right now), I would still advocate an abstract method and leave the schedulers to actually deal with the implementation.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch


[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v5.3.patch

Attaching a new patch with the changes. _test-patch_ result
{code}
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{code}

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch


[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638486#action_12638486 ] 

Amar Kamat commented on HADOOP-4053:
------------------------------------

bq. Do we need JobStatusChangeEvent(JobInProgress jip, EventType eventType, JobStatus status) ? 
The reason for providing this API is to handle the case where the previous state or the next state doesn't matter, for example {{finalizeJob()}}. In such a case the end state is what ultimately matters, and passing a _null_ is a hack. I am OK with changing this to pass both the _old_ and _new_ status, but the case I mentioned earlier made sense to me.

bq. I don't think initTasksAndReportChange should be a public API.
I wanted to keep it in sync with {{JobInProgress.initTasks()}}. But I think, for now, we should keep it package private. +1

bq. In fact, it seems to me that it is better to have this wrapper API where needed.
I wanted to keep all the code close to the source (here _initTasks()_). The authority to _init_ lies with the {{JobInProgress}} and hence I thought the authority to _init_ and _report-change_ should also go to {{JobInProgress}}.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627573#action_12627573 ] 

Vivek Ratan commented on HADOOP-4053:
-------------------------------------

A job is marked complete in JobTracker.finalizeJob(). At this point, a JobInProgressListener listener should be notified of this change. I can see at least two options to do this: 
1. Call JobInProgressListener.jobUpdated() at this point. Ideally, this method should include two arguments/objects: one capturing the old state and one the new state, otherwise the listener has to figure out how the job has been updated. 
2. Add a new method to JobInProgressListener called jobCompleted(), and call that from finalizeJob(). 

The second is cleaner, IMO, as a job's completion is a distinct, well understood event in the system. 
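
For comparison, the two options could look roughly like the following. This is only a minimal sketch, not the Hadoop API: {{RunState}}, {{JobSnapshot}}, {{UpdateListener}} and {{CompletionListener}} are hypothetical stand-ins invented for illustration. With option 1 the listener has to infer completion from the old and new state; with option 2 the framework states it directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not Hadoop classes.
enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

final class JobSnapshot {
  final String jobId;
  final RunState state;

  JobSnapshot(String jobId, RunState state) {
    this.jobId = jobId;
    this.state = state;
  }

  // A job is complete once it has succeeded, failed, or been killed.
  boolean isComplete() {
    return state == RunState.SUCCEEDED
        || state == RunState.FAILED
        || state == RunState.KILLED;
  }
}

// Option 1: a generic update callback that carries the old and new state.
interface UpdateListener {
  void jobUpdated(JobSnapshot oldState, JobSnapshot newState);
}

// Option 2: a dedicated callback for completion.
interface CompletionListener {
  void jobCompleted(JobSnapshot finalState);
}

final class ListenerOptionsDemo {
  public static void main(String[] args) {
    final List<String> log = new ArrayList<String>();

    // Option 1: the listener infers completion from the transition.
    UpdateListener byUpdate = new UpdateListener() {
      public void jobUpdated(JobSnapshot oldState, JobSnapshot newState) {
        if (!oldState.isComplete() && newState.isComplete()) {
          log.add("option 1: " + newState.jobId + " -> " + newState.state);
        }
      }
    };

    // Option 2: no inference is needed.
    CompletionListener byCompletion = new CompletionListener() {
      public void jobCompleted(JobSnapshot finalState) {
        log.add("option 2: " + finalState.jobId + " -> " + finalState.state);
      }
    };

    JobSnapshot running = new JobSnapshot("job_1", RunState.RUNNING);
    JobSnapshot killed = new JobSnapshot("job_1", RunState.KILLED);
    byUpdate.jobUpdated(running, killed);
    byCompletion.jobCompleted(killed);

    for (String line : log) {
      System.out.println(line);
    }
  }
}
```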

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: Vivek Ratan
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sameer Paranjpye updated HADOOP-4053:
-------------------------------------

    Fix Version/s: 0.19.0

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634449#action_12634449 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

bq. It may be wise to have a bundled implementation that queues up the events and returns fast, so that people who implement this interface (myself included) have a base implementation that works the way the JobTracker expects;

Are you suggesting having something like the following:

{code}
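public class DefaultJobInProgressListener extends JobInProgressListener {

  // Completed jobs wait here until a consumer picks them up.
  Queue<JobInProgress> completedJobs;
  // ...

  @Override
  public void jobCompleted(JobInProgress jip) {
    // Only enqueue, so the caller's notification returns quickly.
    completedJobs.add(jip);
  }

  // ...
}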
{code}

And others would extend this class to take events off the queue and act on them, perhaps in a separate thread?
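
A minimal, self-contained sketch of that pattern follows, assuming a single worker thread drains the queue. The names {{QueueingCompletionListener}} and {{QueueingDemo}} are hypothetical, and plain job IDs stand in for {{JobInProgress}} objects; none of this is taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: job IDs stand in for JobInProgress objects.
class QueueingCompletionListener {
  private final BlockingQueue<String> completedJobs =
      new LinkedBlockingQueue<String>();

  // Called on the notifying thread; it only enqueues, so it returns quickly.
  public void jobCompleted(String jobId) {
    completedJobs.add(jobId);
  }

  // Called by a worker thread; blocks until a completion event is available.
  public String takeCompleted() throws InterruptedException {
    return completedJobs.take();
  }

  // Number of completion events not yet consumed.
  public int pending() {
    return completedJobs.size();
  }
}

class QueueingDemo {
  public static void main(String[] args) throws InterruptedException {
    final QueueingCompletionListener listener = new QueueingCompletionListener();
    final List<String> handled = new ArrayList<String>();

    // The slow work happens here, off the notifying thread.
    Thread worker = new Thread(new Runnable() {
      public void run() {
        try {
          for (int i = 0; i < 2; i++) {
            handled.add(listener.takeCompleted());
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    });
    worker.start();

    listener.jobCompleted("job_1");
    listener.jobCompleted("job_2");
    worker.join(); // join() makes the worker's writes visible to this thread
    System.out.println(handled);
  }
}
```

The queue is unbounded in this sketch, so {{jobCompleted()}} never blocks; a bounded queue would trade that guarantee for a cap on memory use.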

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Status: Patch Available  (was: Open)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634448#action_12634448 ] 

Amar Kamat commented on HADOOP-4053:
------------------------------------

bq. 1. I'd expect jobCompleted() to be a no-op, and have jobRemoved() get called when the job is removed.
I would expect {{jobRemoved()}} to be a no-op and {{jobCompleted()}} to do what is currently done by {{jobRemoved()}}. For all the current schedulers, {{jobRemoved()}} and {{jobCompleted()}} are essentially the same. {{jobRemoved()}} matters to schedulers for which the presence of a job in memory makes a difference, e.g., memory-sensitive schedulers. For the others, the only thing that matters is the run-state of a job, so it makes more sense to clean up completed jobs as soon as possible.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637514#action_12637514 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

Some questions on the changes:

- In the implementation of jobUpdated, we check that the event is an instance of {{JobStatusChangeEvent}}, but not that it is one of the relevant event types. So, if we add more event types later, we might need to change this implementation. I think it is safer to check that the event types are ones we are interested in.
- In {{JobQueueManager}}, I wonder whether it's a good idea to retain the implementation in {{jobRemoved}}, just in case.
- I couldn't find a test case that checks that priority changes are handled correctly. Has one been added?

Rest of the changes look ok.
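
To illustrate the first point, a listener could filter on both the event class and the event type, roughly as sketched below. The classes are simplified stand-ins that borrow names used in this thread; the enum constants, fields and counters are assumptions made for illustration and are not the code in the patch.

```java
// Simplified stand-ins that borrow names from this thread; the enum
// constants and fields are assumptions, not the patch's actual code.
enum EventType { RUN_STATE_CHANGED, PRIORITY_CHANGED, START_TIME_CHANGED }

class JobChangeEvent {
  final String jobId;

  JobChangeEvent(String jobId) {
    this.jobId = jobId;
  }
}

class JobStatusChangeEvent extends JobChangeEvent {
  final EventType eventType;

  JobStatusChangeEvent(String jobId, EventType eventType) {
    super(jobId);
    this.eventType = eventType;
  }
}

class FilteringListener {
  public int runStateChanges = 0;
  public int priorityChanges = 0;

  public void jobUpdated(JobChangeEvent event) {
    // First filter on the event class...
    if (!(event instanceof JobStatusChangeEvent)) {
      return;
    }
    JobStatusChangeEvent statusEvent = (JobStatusChangeEvent) event;
    // ...then on the event type, so types added later are ignored by default.
    switch (statusEvent.eventType) {
      case RUN_STATE_CHANGED:
        runStateChanges++;
        break;
      case PRIORITY_CHANGED:
        priorityChanges++;
        break;
      default:
        break; // not an event this listener is interested in
    }
  }

  public static void main(String[] args) {
    FilteringListener listener = new FilteringListener();
    listener.jobUpdated(new JobStatusChangeEvent("job_1", EventType.PRIORITY_CHANGED));
    listener.jobUpdated(new JobStatusChangeEvent("job_1", EventType.START_TIME_CHANGED));
    System.out.println("priority changes handled: " + listener.priorityChanges);
  }
}
```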

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v1.patch

Attaching a patch that adds a {{jobCompleted()}} API to {{JobInProgressListener}}, which defaults to {{jobRemoved()}}. {{JobTracker}} now calls {{jobCompleted()}} once the job finishes. Changed {{TestCapacityScheduler}} to check that the job is removed from the queues. Testing in progress.
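
The "defaults to {{jobRemoved()}}" behaviour can be pictured with the sketch below. It is a hypothetical illustration rather than the patch itself: {{SketchJobListener}} and {{SketchQueueListener}} are invented names, and job IDs stand in for {{JobInProgress}} objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener base class; job IDs stand in for JobInProgress.
abstract class SketchJobListener {
  public abstract void jobAdded(String jobId);

  public abstract void jobRemoved(String jobId);

  // Default: treat completion like removal, so listeners that only override
  // jobRemoved() still clean up as soon as the job finishes.
  public void jobCompleted(String jobId) {
    jobRemoved(jobId);
  }
}

// A toy scheduler queue that relies on the default jobCompleted().
class SketchQueueListener extends SketchJobListener {
  public final List<String> queue = new ArrayList<String>();

  @Override
  public void jobAdded(String jobId) {
    queue.add(jobId);
  }

  @Override
  public void jobRemoved(String jobId) {
    queue.remove(jobId); // List.remove(Object): no-op if already gone
  }
}

class DefaultCompletionDemo {
  public static void main(String[] args) {
    SketchQueueListener listener = new SketchQueueListener();
    listener.jobAdded("job_1");
    listener.jobAdded("job_2");
    listener.jobCompleted("job_1"); // leaves the queue at completion time
    listener.jobRemoved("job_1");   // later retirement is then a no-op
    System.out.println(listener.queue);
  }
}
```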

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637531#action_12637531 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

Steve, the {{JobInProgressListener}} class which is going to listen to these events is package private. So, wouldn't that require you to have classes in the hadoop.mapred package anyway? And AFAIK, we have kept the scheduler-related APIs package private where possible as we are still testing the waters. HADOOP-3822 would be when we open this up, I guess.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638478#action_12638478 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

We are getting close. But I missed two points from the last patch which I think are important. Sorry!

- Do we need {{JobStatusChangeEvent(JobInProgress jip, EventType eventType, JobStatus status)}}? It is confusing to set {{oldStatus}} and {{newStatus}} to the same object. It might be better to say that {{oldStatus}} is null and not available, but that is also confusing: why would it not be available? I think the main reason is that in {{JobTracker.finalizeJob}}, by the time we can raise the event, the status may already have changed. If that's the case, we should clone the status before changing it. This may be a bit more code, but it seems clearer to keep the model that {{JobStatusChangeEvent}} always has a correct {{oldStatus}} and {{newStatus}}.

- I don't think {{initTasksAndReportChange}} should be a public API. It seems like a helpful wrapper - that's all. Also, I'm not sure if what it's doing is 'reporting' a change. In fact, it seems to me that it is better to have this wrapper API where needed. Currently, I think it is in {{JobTracker}} and {{CapacityTaskScheduler}}. This is a little code duplication, but since the choice of raising the event in this case is flexible, it seems OK to have this duplication.
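
The clone-before-change idea from the first point could look roughly like this. It is a minimal sketch under stated assumptions: {{SketchStatus}}, {{SketchStatusChange}} and {{SketchFinalizer}} are hypothetical, and the run state is a plain string rather than Hadoop's {{JobStatus}}.

```java
// Hypothetical, simplified status object; not Hadoop's JobStatus.
class SketchStatus implements Cloneable {
  public String runState;

  SketchStatus(String runState) {
    this.runState = runState;
  }

  @Override
  public SketchStatus clone() {
    try {
      return (SketchStatus) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // cannot happen: the class is Cloneable
    }
  }
}

// Event that always carries two distinct snapshots.
class SketchStatusChange {
  public final SketchStatus oldStatus;
  public final SketchStatus newStatus;

  SketchStatusChange(SketchStatus oldStatus, SketchStatus newStatus) {
    this.oldStatus = oldStatus;
    this.newStatus = newStatus;
  }
}

class SketchFinalizer {
  // Marks the job finished and returns an event with correct old/new status.
  public static SketchStatusChange finalizeJob(SketchStatus live, String finalState) {
    SketchStatus before = live.clone(); // snapshot taken before mutating
    live.runState = finalState;
    return new SketchStatusChange(before, live.clone());
  }

  public static void main(String[] args) {
    SketchStatus live = new SketchStatus("RUNNING");
    SketchStatusChange change = finalizeJob(live, "SUCCEEDED");
    System.out.println(change.oldStatus.runState + " -> " + change.newStatus.runState);
  }
}
```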



> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638471#action_12638471 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

bq. how about it is kept package private for .19, for .20 it is moved and made public-but-internal and intended for monitoring above and beyond scheduling? 

+1 to this suggestion.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639944#action_12639944 ] 

Hadoop QA commented on HADOOP-4053:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12392190/HADOOP-4053-v5.5.patch
  against trunk revision 704989.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs to fail.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3467/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3467/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3467/console

This message is automatically generated.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627575#action_12627575 ] 

Amar Kamat commented on HADOOP-4053:
------------------------------------

I think {{JobInProgressListener.jobUpdated()}} should take {{JobStatus}} as a parameter, which will save us from writing new, specific APIs. Internally, one can keep a map of the old statuses and infer what has changed.
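
A rough sketch of the "map of old statuses" approach follows. Everything in it is an assumption made for illustration: {{StatusTrackingListener}} is a hypothetical class, and run states are plain strings keyed by job ID rather than {{JobStatus}} objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; run states are plain strings instead of JobStatus.
class StatusTrackingListener {
  private final Map<String, String> lastRunState = new HashMap<String, String>();

  // Records the new state and returns a description of what changed,
  // or null if the run state is the same as the last one seen.
  public String jobUpdated(String jobId, String newRunState) {
    String previous = lastRunState.put(jobId, newRunState);
    if (newRunState.equals(previous)) {
      return null;
    }
    return jobId + ": " + (previous == null ? "(new)" : previous)
        + " -> " + newRunState;
  }

  // Forget a retired job so the map does not grow without bound.
  public void jobRemoved(String jobId) {
    lastRunState.remove(jobId);
  }

  public static void main(String[] args) {
    StatusTrackingListener listener = new StatusTrackingListener();
    System.out.println(listener.jobUpdated("job_1", "PREP"));
    System.out.println(listener.jobUpdated("job_1", "RUNNING"));
    System.out.println(listener.jobUpdated("job_1", "SUCCEEDED"));
  }
}
```

The cost of this design is that every listener carries its own copy of the previous state, which is the bookkeeping that passing both old and new status to the callback would avoid.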

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: Vivek Ratan
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v5.4.patch

With HADOOP-4261, a job moves to the {{RUNNING}} state only after the _setup_ tasks are complete. Hence invoking {{jobUpdated()}} after {{initTasks()}} is useless, as there is no _run-state_ change. Attaching a patch that correctly captures the run-state changes in a job and informs the listeners about them.

Result of _test-patch_ on my box :

{code}
     [exec] +1 overall.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{code}

Testing in progress.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Assigned: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala reassigned HADOOP-4053:
----------------------------------------

    Assignee: Amar Kamat

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4053:
-------------------------------------

    Affects Version/s: 0.19.0
        Fix Version/s: 0.19.0

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>             Fix For: 0.19.0
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640197#action_12640197 ] 

Hudson commented on HADOOP-4053:
--------------------------------

Integrated in Hadoop-trunk #635 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/635/])
    HADOOP-4426. TestCapacityScheduler broke due to the two commits  and HADOOP-4373. This patch fixes that. Contributed by Hemanth Yamijala.
. Schedulers must be notified when jobs complete. (Amar Kamat via omalley)


> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638582#action_12638582 ] 

Vivek Ratan commented on HADOOP-4053:
-------------------------------------

bq. Do we need JobStatusChangeEvent(JobInProgress jip, EventType eventType, JobStatus status) ?

I also feel we don't need this. Whether the old state or the new state matters is a decision for the code that consumes the event; the creator of the event shouldn't make it. I think we should ensure that there is always an old state and a new state (so 'null' should not be used). _finalizeJob()_ should also provide an old and a new state. If the Capacity Scheduler chooses to ignore the old state, so be it, but some other listener may not. So I'm basically recommending the following: 
* Have only one constructor in JobStatusChangeEvent, which accepts both an old state and a new state. 
* Neither of these states should be null. 
* Can these two states be the same object? I don't see why not. I can't think of a situation where they would be, but forcing the restriction that the states be different seems unnecessary. 

Seems OK to have _JobInProgress.initTasks()_ do the event notification. _initTasks()_ changes the state of a job, so the event notification should always happen. 
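
The three recommendations above can be sketched roughly as follows. This is only an illustration in present-day Java, not the code from any attached patch: the class names SketchJobStatus and SketchJobStatusChangeEvent are hypothetical stand-ins, and a plain String job id replaces JobInProgress so that the example is self-contained.

```java
import java.util.Objects;

/** Hypothetical stand-in for Hadoop's JobStatus; only the run state is modelled. */
final class SketchJobStatus {
  final String runState;

  SketchJobStatus(String runState) {
    this.runState = runState;
  }
}

/** Sketch of an event with exactly one constructor that takes both states. */
final class SketchJobStatusChangeEvent {
  private final String jobId; // assumption: a String id instead of JobInProgress
  private final SketchJobStatus oldStatus;
  private final SketchJobStatus newStatus;

  SketchJobStatusChangeEvent(String jobId,
                             SketchJobStatus oldStatus,
                             SketchJobStatus newStatus) {
    this.jobId = Objects.requireNonNull(jobId, "jobId");
    // Neither state may be null; each listener decides which one it cares about.
    this.oldStatus = Objects.requireNonNull(oldStatus, "oldStatus");
    this.newStatus = Objects.requireNonNull(newStatus, "newStatus");
    // Deliberately no check that oldStatus and newStatus are different objects.
  }

  String getJobId() { return jobId; }
  SketchJobStatus getOldStatus() { return oldStatus; }
  SketchJobStatus getNewStatus() { return newStatus; }
}
```

A listener that ignores the old state simply never calls getOldStatus(); the event creator does not have to know that.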

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4053:
--------------------------------

    Fix Version/s:     (was: 0.19.0)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634460#action_12634460 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

My needs aren't so much job scheduling as workflow integration. I'm just listening for job lifecycle events so that I can match that lifecycle in remote code. As of yesterday I have simple MR jobs being deployed against a dynamically instantiated set of hadoop processes, using job.getStatus() to poll the state of the job and detecting success/failure when the job declares itself completed. But already I can see that my tests get into trouble here, as they tear down the processes once the job is finished, and I see error messages in the test log complaining that the trackers can't write their task/job histories because the filesystem has gone down. I need to 
 - consider moving from polling to notifications to check job state (these would be RMI calls or something similar, hence slow)
 - wait until the job and task trackers are completely done with processing the jobs before pulling out the results and shutting down the cluster

so: no expectation that the base methods do anything, I'm just relaying events to other programs that may or may not care

For the queue, I'd have a single queue of job events {{Queue<JobLifecycleEvent> events}} and handle
{{{
  public void jobCompleted(JobInProgress jip) {
    events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jip));
  }
}}} then the queue thread would forward these off to whatever remote entity cared. 

Given that schedulers and other listeners behave differently, I'm now not so sure about a base class. The javadocs for the listener need to make it clear that blocking isn't allowed so that anyone providing a listener knows to do async work if needed.
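
A minimal sketch of the queue-and-forward idea described above, written in present-day Java rather than the Java 5/6 of the time. All names here (LifecycleType, LifecycleEvent, QueueingJobListener) are hypothetical, and a String job id stands in for JobInProgress; the real listener API may look quite different.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

enum LifecycleType { ADDED, COMPLETED, REMOVED }

/** Hypothetical event: what happened, and to which job. */
final class LifecycleEvent {
  final LifecycleType type;
  final String jobId; // assumption: a String id instead of JobInProgress

  LifecycleEvent(LifecycleType type, String jobId) {
    this.type = type;
    this.jobId = jobId;
  }
}

/** Listener whose callback only enqueues; a separate thread does the slow work. */
final class QueueingJobListener {
  private final BlockingQueue<LifecycleEvent> events = new LinkedBlockingQueue<>();
  private final Thread forwarder;

  QueueingJobListener(Consumer<LifecycleEvent> sink) {
    forwarder = new Thread(() -> {
      try {
        while (true) {
          // Potentially slow forwarding (RMI or similar) happens on this thread,
          // outside any lock held by the caller of jobCompleted().
          sink.accept(events.take());
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // exit quietly on shutdown
      }
    }, "job-event-forwarder");
    forwarder.setDaemon(true);
    forwarder.start();
  }

  /** Called by the framework; adding to an unbounded queue does not block. */
  public void jobCompleted(String jobId) {
    events.add(new LifecycleEvent(LifecycleType.COMPLETED, jobId));
  }

  void shutdown() {
    forwarder.interrupt();
  }
}
```

The queue is unbounded here, so a slow consumer trades blocking for memory growth; a bounded queue with a drop or log policy would be another option.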

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637860#action_12637860 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

How about keeping it package-private for 0.19, and then in 0.20 moving it and making it public-but-internal, intended for monitoring above and beyond scheduling?

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v2.patch

Attaching a patch that
- adds an abstract {{jobCompleted()}} api to {{JobInProgressListener}}
- changes the behaviour of the capacity scheduler to update itself on a completed job event
- adds no-op implementations to the other {{JobInProgressListener}} implementations
- changes the capacity scheduler test case to test the desired behaviour
- updates the listeners upon job completion from JobTracker.

_test-patch_ and _ant test_ pass on my box. 


> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635446#action_12635446 ] 

Owen O'Malley commented on HADOOP-4053:
---------------------------------------

I propose that we protect ourselves from API changes in the future, by making the API:

{code}
  public void jobUpdated(JobChangeEvent e);
{code}

and the JobChangeEvent includes a getJobInProgress() method. That will allow us to add new events in the future without having to change all of the schedulers.
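
To illustrate why a single event-typed callback shields schedulers from later additions, here is a rough sketch. Every name in it (SketchJobChangeEvent, SketchPriorityChangeEvent, SketchCompletionEvent, SketchJobListener, PriorityOnlyScheduler) is hypothetical, and getJobInProgress() returns a String id instead of a real JobInProgress so that the example stands alone.

```java
import java.util.ArrayList;
import java.util.List;

/** Base event; carries the job it refers to. */
abstract class SketchJobChangeEvent {
  private final String job; // assumption: a String id instead of JobInProgress

  SketchJobChangeEvent(String job) {
    this.job = job;
  }

  String getJobInProgress() {
    return job;
  }
}

/** An event type that existed from the start. */
final class SketchPriorityChangeEvent extends SketchJobChangeEvent {
  SketchPriorityChangeEvent(String job) { super(job); }
}

/** An event type added later; the listener interface does not change. */
final class SketchCompletionEvent extends SketchJobChangeEvent {
  SketchCompletionEvent(String job) { super(job); }
}

interface SketchJobListener {
  void jobUpdated(SketchJobChangeEvent e);
}

/** A scheduler written before SketchCompletionEvent existed still compiles and runs. */
final class PriorityOnlyScheduler implements SketchJobListener {
  final List<String> reprioritized = new ArrayList<>();

  @Override
  public void jobUpdated(SketchJobChangeEvent e) {
    if (e instanceof SketchPriorityChangeEvent) {
      reprioritized.add(e.getJobInProgress());
    }
    // Event types this scheduler does not know about are simply ignored.
  }
}
```

With one method per event kind, adding the completion callback would have forced an edit in every listener implementation; here only listeners that care about the new event need to change.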

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637828#action_12637828 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

I agree with Owen about not making this public as yet. We could put it in a different package, but more fundamentally, I think it is still too early to expose this API and these classes publicly. I recommend we keep this package-private for this release.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Status: Patch Available  (was: Open)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634444#action_12634444 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

1. I'd expect jobCompleted() to be a no-op, and have jobRemoved() get called when the job is removed.

2. There's a big assumption in JobTracker that the callbacks are fast and don't block on anything, as the calls are synchronized on the JobTracker, the jobs and the taskScheduler (always in that order). It may be wise to have a bundled implementation that queues up the events and returns fast, so that people who implement this interface (myself included) have a base implementation that works the way the JobTracker expects, and so that things we add, like postings to tuple-spaces or other notifications, happen outside the synchronized block.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Status: Open  (was: Patch Available)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637087#action_12637087 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

Some comments:

JobChangeEvent: 
- Should be package-private.
- As a convention, should we extend {{java.util.EventObject}} ? Source can be the JIP.

JobStatusChangeEvent:
- Should be package-private.
- Javadoc seems to indicate an event is raised for progress and finish time changes as well, when it is not. This should be removed.
- The enum {{Events}} is actually specifying type of events - so should it read {{EventType}} ?
- IMO, doing the {{clone}} inside the methods of this class, while convenient, can lead to erroneous usage. For example, if someone changes the JobStatus before creating the event object, the old status is lost. And this cannot be captured anywhere else except in documentation, which could be missed. I think the usage will become clearer if {{JobStatusChangeEvent}} is written like this:
{code}
class JobStatusChangeEvent extends JobChangeEvent {
  public JobStatusChangeEvent(JobInProgress source, 
                                List<EventType> events,
                                JobStatus oldStatus, JobStatus newStatus) {
    super(source);
    this.events = events;
    this.oldStatus = oldStatus;
    this.newStatus = newStatus;
  }
  // ... other APIs
}
{code}
and leave the responsibility of taking a snapshot to the callers. This means the callers write a bit more code, but it is less error-prone. Also, I checked some of the event classes in the Java API, and they seem to have a similar structure. For example, look at {{javax.naming.event.NamingEvent}} or {{java.util.prefs.NodeChangeEvent}}.

Other JobInProgressListener sub-classes:
- So that future code is easier to write, we need to check for the type of event being {{JobStatusChangeEvent}} and the events enum is of the type we are interested in (time changes and priority changes) to default to the current implementation.
- I think it is OK to handle run state changes also in this JIRA, and behave similarly to {{jobRemoved}} at least for the {{JobQueueJobInProgressListener}} and {{EagerTaskInitializationListener}}.

JobQueuesManager:
- It would be nice to add a comment in {{jobRemoved}} mentioning we already handle removals when the run state changes.
- It may be safer to check for {{runState}} to be RUNNING before calling {{promoteJob}}. And maybe {{promoteJob}} should be called {{makeJobRunning}} or something.

JobTracker:
- In {{RecoveryManager}}, previously {{jobUpdated}} was handled in the scheduler only once. Now, since we add separate events for PRIORITY and START_TIME, the same code would be executed twice. I think this should be avoided, maybe in the implementation of {{JobQueuesManager.jobUpdated}}.

Tests:
- There are some spurious System.out.println calls, which can be changed to LOG.debug
- We can use {{scheduler.getJobs()}} and check the job is not present in the list. This would make sure that the UI also will reflect the change.
- Can we add a test case for job priority change and the {{promoteJob}} code path as well ? 
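
The point above about leaving the snapshot to the caller can be sketched as follows. This is a hedged illustration only: MutableStatus, StatusChange and SnapshotDemo are hypothetical names, and a single priority field stands in for the real JobStatus.

```java
/** Hypothetical mutable status, standing in for JobStatus. */
final class MutableStatus implements Cloneable {
  int priority;

  MutableStatus(int priority) {
    this.priority = priority;
  }

  @Override
  public MutableStatus clone() {
    try {
      return (MutableStatus) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // cannot happen: the class is Cloneable
    }
  }
}

/** The event just stores what it is given; it never clones on its own. */
final class StatusChange {
  final MutableStatus oldStatus;
  final MutableStatus newStatus;

  StatusChange(MutableStatus oldStatus, MutableStatus newStatus) {
    this.oldStatus = oldStatus;
    this.newStatus = newStatus;
  }
}

final class SnapshotDemo {
  /** The caller owns the ordering: snapshot, mutate, snapshot again. */
  static StatusChange changePriority(MutableStatus live, int newPriority) {
    MutableStatus before = live.clone(); // 1. capture the old status first
    live.priority = newPriority;         // 2. only then mutate the live object
    MutableStatus after = live.clone();  // 3. capture the new status
    return new StatusChange(before, after);
  }
}
```

If the clone happened inside the event constructor instead, a caller that mutated the status before constructing the event would silently lose the old value, which is the failure mode the comment warns about.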

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Status: Patch Available  (was: Open)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637683#action_12637683 ] 

Owen O'Malley commented on HADOOP-4053:
---------------------------------------

I'm not comfortable with making the classes public in the mapred package. Could we make them public in another package? Or is the visibility wrong? I just don't want the public javadoc to include these events, which absolutely don't matter to users of map/reduce. If we move them to another package, we should turn off the public javadoc for that package.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637124#action_12637124 ] 

Vivek Ratan commented on HADOOP-4053:
-------------------------------------

I had a few questions/comments on _JobStatusChangeEvent_.

- agree with Hemanth that the old JobStatus and new JobStatus should be passed in explicitly. Otherwise there are hidden dependencies in the calling sequence. 
- It's not clear to me how we're naming the enum values for Events in _JobStatusChangeEvent_. What does RUN_STATE mean? Does it mean an event that causes a Job's run state to change? If so, do you mean that the job was in a running state and changed to something else, or that its state changed to a running state? I see the same enum value used for both. In CapacityScheduler.getTaskFromQueue(), you add a RUN_STATE event when the job's state changes from PREP to RUNNING. In JobTracker.finalizeJob(), you add a RUN_STATE event when the job's state changes from RUNNING to something else. I think you need to use separate events and name the events a little more consistently. Or else, just rename the enum value to STATE_CHANGE, which can be used for any state change. This should be OK, given that you have an old and new job status and can figure out how the state changed. In general, the enum values should be verbs: FINISH_TIME_CHANGED, rather than FINISH_TIME. 
- I don't feel very comfortable with the fact that _JobStatusChangeEvent_ can contain multiple Events. I see that the only use case is in job recovery, when more than one attribute of a job status has changed. But, abstractly, having a single _JobStatusChangeEvent_ object handle multiple events is not intuitive. Each event changes the job status. Since _JobStatusChangeEvent_ only tracks a single pair of old and new JobStatus objects, what you're really saying is that you can add events as long as each one independently changes the job status without affecting the other events. What prevents a user, for example, from adding two RUN_STATE events? Each one changes the job status, but you can only keep track of two of them. I think conceptually, a _JobStatusChangeEvent_ object should map to a single event change, which in turn maps to a single pair of JobStatus objects. That's much cleaner. During the normal running of the JobTracker, you only create a _JobStatusChangeEvent_ object for a single event. It's only in that one use case for recovering jobs where you apply multiple changes to a job status, and I think it's OK to call updateJobListeners() multiple times. Otherwise, you muddle up the semantics of a _JobStatusChangeEvent_ object. 
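
One way to read the two suggestions above (verb-style names, and one event type per event object with the direction of the change derived from the old and new states) is sketched below. The names SketchRunState, SketchEventType and SingleChangeEvent are hypothetical, and the run states listed are an assumption made for illustration rather than a copy of the real JobStatus constants.

```java
/** Assumed set of run states, for illustration only. */
enum SketchRunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

/** Verb-style names: each value says what changed. */
enum SketchEventType { RUN_STATE_CHANGED, PRIORITY_CHANGED, START_TIME_CHANGED }

/** Exactly one event type and one old/new pair per event object. */
final class SingleChangeEvent {
  final SketchEventType type;
  final SketchRunState oldState;
  final SketchRunState newState;

  SingleChangeEvent(SketchEventType type,
                    SketchRunState oldState,
                    SketchRunState newState) {
    this.type = type;
    this.oldState = oldState;
    this.newState = newState;
  }

  /** PREP to RUNNING: the direction comes from the pair, not the enum name. */
  boolean startedRunning() {
    return type == SketchEventType.RUN_STATE_CHANGED
        && oldState == SketchRunState.PREP
        && newState == SketchRunState.RUNNING;
  }

  /** Any run-state change that ends in a terminal state. */
  boolean finished() {
    return type == SketchEventType.RUN_STATE_CHANGED
        && (newState == SketchRunState.SUCCEEDED
            || newState == SketchRunState.FAILED
            || newState == SketchRunState.KILLED);
  }
}
```

Under this reading, job recovery would raise two separate events (for example one PRIORITY_CHANGED and one START_TIME_CHANGED) and notify listeners once for each, instead of packing both into one object.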



> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v4.patch

Attaching a patch that incorporates Hemanth's and Vivek's comments.

bq. JobChangeEvent:
{{JobChangeEvent}} clearly reflects that a job in Hadoop has changed. So any external entity that needs to monitor Hadoop jobs should expect a {{JobChangeEvent}}. I think for now we can keep it simple and extend {{JobChangeEvent}} if required in the future.
bq. JobStatusChangeEvent:
+1
bq. Other JobInProgressListener sub-classes:
+1
bq. JobQueuesManager:
+1
bq. JobTracker:
Considering Vivek's comments, we would probably need to call {{jobUpdated()}} twice.
bq. Tests:
+1
bq. What does RUN_STATE mean?
_RUN_STATE_ is used throughout the framework to indicate the state of the object. It has nothing to do with the object being in the _RUNNING_ state. In order to be consistent, I think we should keep it as _RUN_STATE_. Appending _CHANGED_ to the event names looks OK to me.
bq. I don't feel very comfortable with the fact that JobStatusChangeEvent can contain multiple Events?
+1. Will make it a 1-1 mapping from a {{JobStatusChangeEvent}} object to an {{EventType}}.

_test patch_ and _ant test_ passed on my box.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4053:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amar!

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v4.1.patch

Attaching a patch incorporating Hemanth's comments.
bq. I think it is safe to check that the event types are ones we are interested in.
Done.

bq. In JobQueueManager, I am thinking if it's a good idea to retain the implementation in jobRemoved just in case.
Keeping the {{jobRemoved}} implementation might mask errors, since we are sure that the job should already have been removed upon job completion.

bq. I couldn't find a test case that tests priority changes are handled correctly. Is this added ?
Added now.

_ant -test-patch_ result from my box :
{code}
+1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{code}

_ant test_ passed on my box.
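
As an illustration of the event-type check discussed above, here is a minimal sketch of a queue-managing listener. All names in it ({{QueueListenerSketch}}, {{Event}}, {{EventType}}, {{RunState}}) are hypothetical stand-ins and not the classes in the patch. It assumes the listener only cares about run-state and priority changes, removes a job as soon as it reaches a terminal state, and ignores everything else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a listener that filters on event type; not the patch's code.
class QueueListenerSketch {
  enum EventType { RUN_STATE_CHANGED, PRIORITY_CHANGED, START_TIME_CHANGED }
  enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

  // Simplified event: the job, what changed, and the job's new state and priority.
  static final class Event {
    final String jobId;
    final EventType type;
    final RunState newState;
    final int newPriority;

    Event(String jobId, EventType type, RunState newState, int newPriority) {
      this.jobId = jobId;
      this.type = type;
      this.newState = newState;
      this.newPriority = newPriority;
    }
  }

  // Jobs this listener is tracking, mapped to their current priority.
  private final Map<String, Integer> tracked = new HashMap<>();

  void jobAdded(String jobId, int priority) {
    tracked.put(jobId, priority);
  }

  void jobUpdated(Event e) {
    switch (e.type) {
      case RUN_STATE_CHANGED:
        // Drop the job at completion rather than waiting for it to be retired.
        if (isComplete(e.newState)) {
          tracked.remove(e.jobId);
        }
        break;
      case PRIORITY_CHANGED:
        if (tracked.containsKey(e.jobId)) {
          tracked.put(e.jobId, e.newPriority);
        }
        break;
      default:
        // Not an event type this listener is interested in.
        break;
    }
  }

  static boolean isComplete(RunState s) {
    return s == RunState.SUCCEEDED || s == RunState.FAILED || s == RunState.KILLED;
  }

  boolean isTracked(String jobId) {
    return tracked.containsKey(jobId);
  }

  int priorityOf(String jobId) {
    return tracked.get(jobId);
  }

  public static void main(String[] args) {
    QueueListenerSketch q = new QueueListenerSketch();
    q.jobAdded("job_1", 1);
    q.jobUpdated(new Event("job_1", EventType.PRIORITY_CHANGED, RunState.RUNNING, 5));
    System.out.println("priority after change: " + q.priorityOf("job_1"));
    q.jobUpdated(new Event("job_1", EventType.RUN_STATE_CHANGED, RunState.KILLED, 5));
    System.out.println("still tracked: " + q.isTracked("job_1"));
  }
}
```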

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634847#action_12634847 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

I haven't implemented anything, so it's not intrusive to me. I would end up serializing something like a {{ChangeEvent}} regardless of which approach was taken.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634462#action_12634462 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

bq. The javadocs for the listener need to make it clear that blocking isn't allowed so that anyone providing a listener knows to do async work if needed.

+1
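
One way a listener could honour the no-blocking rule is to hand each notification to its own worker thread and return immediately. The sketch below assumes a much simplified, hypothetical {{Listener}} interface (it is not {{JobInProgressListener}}) and uses a single-threaded executor so that notifications are still processed in the order they arrived.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a non-blocking listener wrapper; names are illustrative only.
class AsyncListenerSketch {
  // Simplified stand-in for a job listener callback.
  interface Listener {
    void jobUpdated(String jobId, String change);
  }

  // Wraps a potentially slow listener so the caller's thread is never blocked by it.
  static final class AsyncRelay implements Listener, AutoCloseable {
    // A single worker thread preserves the order of notifications.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Listener slowDelegate;

    AsyncRelay(Listener slowDelegate) {
      this.slowDelegate = slowDelegate;
    }

    @Override
    public void jobUpdated(String jobId, String change) {
      // Queue the work and return at once; the delegate runs on the worker thread.
      worker.execute(() -> slowDelegate.jobUpdated(jobId, change));
    }

    @Override
    public void close() {
      // Stop accepting work and wait briefly for queued notifications to drain.
      worker.shutdown();
      try {
        worker.awaitTermination(5, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    AsyncRelay relay = new AsyncRelay(
        (jobId, change) -> System.out.println(jobId + " " + change));
    relay.jobUpdated("job_1", "RUN_STATE_CHANGED");
    relay.close();
  }
}
```

The trade-off is that the wrapped listener sees changes slightly later than the caller made them, so this only suits listeners that relay or record events rather than ones that must act before the caller continues.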

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637540#action_12637540 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

Well, if you make the events public, I can have a listener in the package that relays them out. This is what I do today, with a source tree that has some classes in org.apache.hadoop and the rest in org.smartfrog:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/

Keeping it private just increases the amount of marshalling I have to do.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631672#action_12631672 ] 

Hemanth Yamijala commented on HADOOP-4053:
------------------------------------------

The {{JobInProgressListener}} states it is for capturing lifecycle of a job. Given that, a job's state change when it completes is a lifecycle change. And therefore, it should be notified by a separate API such as {{jobCompleted()}}. So +1 for that approach.

Since there are schedulers already implemented, should we make this method a concrete one in {{JobInProgressListener}} that's a no-op, and allow the individual schedulers to override and implement the desired functionality?
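
A minimal sketch of the no-op idea follows. It assumes the listener is an abstract class, since the method needs a body; the class names ({{JobListener}}, {{LegacyScheduler}}, {{CompletionAwareScheduler}}) and the String job ids are hypothetical simplifications and not the real {{JobInProgressListener}} signatures.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a new callback added as a concrete no-op so old subclasses keep compiling.
class ListenerDefaultSketch {
  static abstract class JobListener {
    abstract void jobAdded(String jobId);

    abstract void jobRemoved(String jobId);

    // New hook. The empty body means existing schedulers need no changes.
    void jobCompleted(String jobId) { }
  }

  // An existing scheduler written before jobCompleted() existed.
  static class LegacyScheduler extends JobListener {
    final Set<String> tracked = new HashSet<>();

    @Override
    void jobAdded(String jobId) {
      tracked.add(jobId);
    }

    @Override
    void jobRemoved(String jobId) {
      tracked.remove(jobId);
    }
  }

  // A scheduler that opts in and stops tracking a job as soon as it completes.
  static class CompletionAwareScheduler extends LegacyScheduler {
    @Override
    void jobCompleted(String jobId) {
      tracked.remove(jobId);
    }
  }

  public static void main(String[] args) {
    LegacyScheduler legacy = new LegacyScheduler();
    CompletionAwareScheduler aware = new CompletionAwareScheduler();
    for (LegacyScheduler s : new LegacyScheduler[] { legacy, aware }) {
      s.jobAdded("job_1");
      s.jobCompleted("job_1");
    }
    System.out.println("legacy still tracks job_1: " + legacy.tracked.contains("job_1"));
    System.out.println("aware still tracks job_1: " + aware.tracked.contains("job_1"));
  }
}
```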

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Status: Open  (was: Patch Available)

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch, HADOOP-4053-v5.3.patch, HADOOP-4053-v5.4.patch, HADOOP-4053-v5.5.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 



[jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639341#action_12639341 ] 

Amar Kamat commented on HADOOP-4053:
------------------------------------

Changes to {{TestJobTrackerRestart}} should be ignored.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch, HADOOP-4053-v3.2.patch, HADOOP-4053-v4.1.patch, HADOOP-4053-v4.patch, HADOOP-4053-v5.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed). 
