Posted to mapreduce-issues@hadoop.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2010/07/08 00:30:54 UTC

[jira] Commented: (MAPREDUCE-1924) Mappers running when reducers have finished

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886127#action_12886127 ] 

Joydeep Sen Sarma commented on MAPREDUCE-1924:
----------------------------------------------

I don't think this can be done by default, because mappers can have side effects. But an option seems desirable.

> Mappers running when reducers have finished
> -------------------------------------------
>
>                 Key: MAPREDUCE-1924
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1924
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Adam Kramer
>
> Occasionally, I will run jobs for which some reducers are able to finish but there are still mappers running. I understand why sometimes mappers restart themselves even after the reduce phase has begun--too many fetch-failures, for example. But in today's case, ALL of the reducers have succeeded and are done, so these mappers really ARE unnecessary...so it is a bug that they are running.
> Then, I killed one of them to see what was up--it just restarted itself. So, it is another bug that mappers don't know they're unnecessary when they're killed.
> My guess is that if one of these map tasks, which clearly finished at least once, were to die randomly a few times, it would take the whole job with it--even though the job has completed.
> Whenever all reduce tasks are complete, Hadoop should kill ALL remaining map tasks and immediately move to finish the job.
