Posted to common-dev@hadoop.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2009/05/15 05:43:45 UTC

[jira] Commented: (HADOOP-4981) Prior code fix in Capacity Scheduler prevents speculative execution in jobs

    [ https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709688#action_12709688 ] 

Hemanth Yamijala commented on HADOOP-4981:
------------------------------------------

In general, we would like to have some support from JobInProgress itself. However, given that we are not *finding* a speculative task, only checking whether one exists, and the impact of this decision is not drastic, I am fine with this approach for now. I've also spoken to Devaraj to see if HADOOP-2141 can provide such an API in JobInProgress.

That said, a few comments on the current patch:

- Use getMapTasks(), getReduceTasks() and getStatus() instead of directly accessing the members.
- We should check whether this job has speculation enabled, via JobInProgress.getJobConf().getMapSpeculativeExecution() and the reduce equivalent.
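The two points above can be sketched together. This is a hypothetical helper, not the actual patch: JobInProgress and TaskInProgress are stubbed to just the methods the check needs, since the real Hadoop classes carry far more state.

```java
// Hypothetical sketch: check whether a job has a map task worth
// speculating, going through accessors rather than raw members, and
// bailing out early when speculation is disabled in the job conf.
class SpeculationCheck {
    // Minimal stand-ins for the real Hadoop classes (assumptions).
    interface TaskInProgress {
        boolean hasSpeculativeTask(long currentTime, double averageProgress);
    }
    interface JobConfView {
        boolean getMapSpeculativeExecution();
    }
    interface JobView {
        JobConfView getJobConf();
        TaskInProgress[] getMapTasks();
    }

    // True if speculation is on and some running map TIP qualifies.
    static boolean hasSpeculativeMap(JobView job, long now, double avgProgress) {
        if (!job.getJobConf().getMapSpeculativeExecution()) {
            return false;
        }
        for (TaskInProgress tip : job.getMapTasks()) {
            if (tip.hasSpeculativeTask(now, avgProgress)) {
                return true;
            }
        }
        return false;
    }
}
```

A reduce-side version would be symmetric, using getReduceTasks() and getReduceSpeculativeExecution().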

In test cases:
- For speculative tasks, is the attempt number the same, or does it increment? If it increments, then when the new TaskAttemptID is created, shouldn't the last two params be interchanged?
- Rather than extending Fake[Job|Task]InProgress for the speculative case, we should just enable speculation in the job conf of the job and have the speculation code paths in the appropriate methods check for this condition and behave accordingly. This will make the class easier to change without breaking subclasses.
- Add documentation summarizing the new tests.
- Assert that pending tasks are 0 before checking for speculative assignment, so we can be sure the hasSpeculativeTask code path is traversed.
- I think the test case for high-RAM jobs should be as follows:
-- Submit a high-RAM job with speculation enabled. Give it no reduces, so that once its map is scheduled, pending counts become 0 for both maps and reduces. Then submit a normal job, without speculation, that would fit after the high-RAM job.
-- Schedule the task from the high-RAM job on a TT.
-- The same TT comes back; it should not get a task to run, even though the normal job's task would fit.
-- Another TT comes in, and the speculative task should get executed.
-- Then finish the first task; the other job should get scheduled.
- Add a test case along the same lines for speculative reduces.
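On the attempt-number question above, a toy formatter illustrates the id layout. This is a simplified stand-in for org.apache.hadoop.mapred.TaskAttemptID, not the real class: for a speculative attempt the task id stays fixed while the attempt counter increments, so swapping the last two arguments would produce a different id.

```java
// Toy stand-in for TaskAttemptID's string form (assumption: layout is
// attempt_<jtIdentifier>_<jobId>_<m|r>_<taskId>_<attempt>).
class AttemptIds {
    static String attemptId(String jtId, int jobId, boolean isMap,
                            int taskId, int attempt) {
        return String.format("attempt_%s_%04d_%s_%06d_%d",
                jtId, jobId, isMap ? "m" : "r", taskId, attempt);
    }
}
```

With this layout, the original attempt and its speculative sibling share everything up to the final counter.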

> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-4981
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4981
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>         Attachments: 4981.1.patch, 4981.2.patch, HADOOP-4981-1.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask()) only if the number of pending tasks for a job is greater than zero (see the if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending tasks and only has running tasks, it will never be given a slot, and will never have a chance to run a speculative task. 
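The guard described in the report can be sketched as follows. These are hypothetical names, not the actual TaskSchedulingMgr code, which is more involved: the original check looked only at pending tasks, and widening it with a speculative-task probe lets a job whose tasks are all running still obtain a slot.

```java
// Sketch of the scheduling guard (hypothetical method and parameter
// names). Old behavior: obtain a task only when pendingTasks > 0,
// which starves jobs that have only running tasks of speculative slots.
class GuardSketch {
    static boolean shouldObtainNewTask(int pendingTasks,
                                       boolean hasSpeculativeTask) {
        return pendingTasks > 0 || hasSpeculativeTask;
    }
}
```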

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.