Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2009/01/20 12:41:59 UTC

[jira] Commented: (HADOOP-5049) Jobs with 0 maps will never get removed from the default scheduler

    [ https://issues.apache.org/jira/browse/HADOOP-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665408#action_12665408 ] 

Vivek Ratan commented on HADOOP-5049:
-------------------------------------

Since initTasks() is called by each scheduler, and it can result in a state change for the job without any events being raised, this issue potentially affects every scheduler. I can see the following ways of fixing the problem:

# JobInProgress.initTasks() should notify all listeners via JobInProgressListener.jobUpdated(). This seems clean and the right way to do things. The only problem is that multi-threaded schedulers need to be careful of synchronization issues: the scheduler calls initTasks(), which calls the scheduler back through the JobInProgressListener interface. Another, minor, issue is that the JT needs to expose the listeners to JobInProgress, which, I think, is inevitable, given that the JobInProgress code has a whole lot of state changes.
# Amar/Sreekanth suggested another, slightly different, approach which limits any state change notifications to be raised by the JT. Either JobInProgress.initTasks() lets the JT know of a state change in the job and the JT propagates that to the listeners, or initTasks() does not set the job to completed; rather, the JT, when looking at jobs in PREP state to detect running of a setup job, detects that a job has 0 maps, causes it to change state, and propagates that change to the listeners. This is not very different from the previous approach - we're still making the JT/JobInProgress responsible for propagating job state changes, but you do allow the JT to keep its listeners private.
# Another approach is for the Schedulers to know that initTasks() can change the state of a job without raising an event, and deal with that. Amar's patch for the default scheduler does just that. As he points out, the Fair Scheduler doesn't really care. But the Capacity Scheduler will need to deal with this. You could argue that this is less clean since the schedulers are aware of what goes on in initTasks(), but it all depends on who you think 'owns' initTasks() - the schedulers or the framework. 

Personally, I think #1 is the best option as it ensures that any job state changes are propagated to the Schedulers through the listeners, but it does have its drawbacks too. 
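
To make option #1 concrete, here is a minimal, self-contained sketch. It is not the actual Hadoop code: the Job, Listener and QueueListener classes below are hypothetical, simplified stand-ins for JobInProgress, JobInProgressListener and JobQueueJobInProgressListener, and the notifyListeners flag exists only so that the current behaviour and the proposed one can be compared side by side.

```java
import java.util.ArrayList;
import java.util.List;

public class InitTasksNotifySketch {
    enum State { PREP, RUNNING, SUCCEEDED }

    // Hypothetical, simplified stand-in for JobInProgressListener.
    interface Listener {
        void jobAdded(Job job);
        void jobUpdated(Job job, State oldState, State newState);
    }

    // Hypothetical, simplified stand-in for JobInProgress.
    static class Job {
        final int numMaps;
        final boolean notifyListeners; // false = current behaviour, true = option #1
        final List<Listener> listeners;
        State state = State.PREP;

        Job(int numMaps, boolean notifyListeners, List<Listener> listeners) {
            this.numMaps = numMaps;
            this.notifyListeners = notifyListeners;
            this.listeners = listeners;
        }

        // A job with 0 maps completes during initialization, while still in PREP.
        void initTasks() {
            if (numMaps == 0) {
                State old = state;
                state = State.SUCCEEDED;
                if (notifyListeners) {
                    // Option #1: raise the state change so that every listener sees it.
                    for (Listener l : listeners) {
                        l.jobUpdated(this, old, state);
                    }
                }
            }
        }
    }

    // Stand-in for a queue listener that only drops jobs when told about a state change.
    static class QueueListener implements Listener {
        final List<Job> queue = new ArrayList<>();

        public void jobAdded(Job job) {
            queue.add(job);
        }

        public void jobUpdated(Job job, State oldState, State newState) {
            if (newState == State.SUCCEEDED) {
                queue.remove(job);
            }
        }
    }

    // Returns how many jobs are still queued after one job has been initialized.
    static int queuedAfterInit(int numMaps, boolean notifyListeners) {
        QueueListener queueListener = new QueueListener();
        List<Listener> listeners = new ArrayList<>();
        listeners.add(queueListener);
        Job job = new Job(numMaps, notifyListeners, listeners);
        queueListener.jobAdded(job);
        job.initTasks();
        return queueListener.queue.size();
    }

    public static void main(String[] args) {
        System.out.println("0 maps, no event:   " + queuedAfterInit(0, false));
        System.out.println("0 maps, with event: " + queuedAfterInit(0, true));
    }
}
```

Without the event the 0-map job stays queued forever; with it the queue listener removes the job. The sketch is single-threaded, so it does not show the synchronization concern from option #1: in a real scheduler the callback would arrive on the thread that called initTasks(), possibly while that thread already holds scheduler locks.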

> Jobs with 0 maps will never get removed from the default scheduler
> ------------------------------------------------------------------
>
>                 Key: HADOOP-5049
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5049
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-5049-v1.1.patch
>
>
> Jobs with 0 maps finish/succeed in the init phase, i.e. while the job is in the _PREP_ state. {{EagerTaskInitializationListener}} removes the job after initing, but {{JobQueueJobInProgressListener}} waits for a job-state change event to be raised and only then removes the job from the queue; hence the job will stay forever with the {{JobQueueJobInProgressListener}}. Looks like {{FairScheduler}} periodically scans the job list and removes completed jobs. {{CapacityScheduler}} has a concept of waiting jobs and scans the waiting queue for completed jobs and purges them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.