Posted to mapreduce-issues@hadoop.apache.org by "Joe Mudd (JIRA)" <ji...@apache.org> on 2014/06/06 15:40:01 UTC

[jira] [Commented] (MAPREDUCE-5860) Hadoop pipes Combiner is closed before all of its reduce calls

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019845#comment-14019845 ] 

Joe Mudd commented on MAPREDUCE-5860:
-------------------------------------

Bumped up to Major since this issue could cause rows to be lost, or a crash, because close() cleans up before all of the Combiner's reduce() calls have been made.
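
For anyone reading along, here is a minimal sketch of the ordering being described. The class and function names (CombineRunnerSketch, closeAllSketch), the Reducer interface shape, and the buffering/spill details are illustrative assumptions only, not the actual HadoopPipes.cc code:

    // Sketch only: simplified stand-ins to show the close ordering in question.
    #include <cstddef>
    #include <map>
    #include <string>
    #include <vector>

    struct Reducer {                      // stands in for the pipes Reducer/Combiner
      virtual void reduce(const std::string& key,
                          const std::vector<std::string>& values) = 0;
      virtual void close() = 0;
      virtual ~Reducer() {}
    };

    // CombineRunner-like wrapper: buffers map output and runs the Combiner's
    // reduce() when it spills, including one final spill from its own close().
    class CombineRunnerSketch {
    public:
      explicit CombineRunnerSketch(Reducer* combiner) : combiner(combiner) {}
      void emit(const std::string& key, const std::string& value) {
        buffer[key].push_back(value);     // held until the next spill
      }
      void close() { spill(); }           // the final spill happens here
    private:
      void spill() {
        for (std::map<std::string, std::vector<std::string> >::iterator it =
                 buffer.begin(); it != buffer.end(); ++it) {
          combiner->reduce(it->first, it->second);  // may run after combiner->close()
        }
        buffer.clear();
      }
      Reducer* combiner;                  // not owned by the runner today
      std::map<std::string, std::vector<std::string> > buffer;
    };

    // closeAll() ordering as described in the issue: the Combiner (held as the
    // reducer member) is closed before the CombineRunner, so the runner's final
    // spill calls reduce() on an already-closed Combiner.
    void closeAllSketch(Reducer* reducer, CombineRunnerSketch* writer) {
      if (reducer != NULL) { reducer->close(); }  // Combiner cleaned up first
      if (writer != NULL)  { writer->close(); }   // ...then reduce() is still called
    }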

> Hadoop pipes Combiner is closed before all of its reduce calls
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-5860
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5860
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: pipes
>    Affects Versions: 0.23.0
>         Environment: 0.23.0 on 64 bit linux
>            Reporter: Joe Mudd
>         Attachments: HadoopPipes.cc.patch, MAPREDUCE-5860.patch
>
>
> When a Combiner is specified to runTask(), its reduce() method may be called after its close() method has been called, because the Combiner's containing object, CombineRunner, is closed after the TaskContextImpl's reducer member is closed (see TaskContextImpl::closeAll()).
> I believe the fix is to transfer ownership of the Combiner to CombineRunner, making it responsible for calling the Combiner's close() method and deleting the Combiner instance (sketched below).
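
A rough sketch of the ownership transfer proposed above, reusing the hypothetical Reducer interface from the earlier sketch. This is an illustration of the idea only, not the attached MAPREDUCE-5860.patch; the OwningCombineRunnerSketch name and buffer details are assumptions:

    // Sketch of the proposed fix: the runner owns the Combiner, so the final
    // spill, the Combiner's close(), and its deletion all happen in one place,
    // in the right order.
    class OwningCombineRunnerSketch {
    public:
      explicit OwningCombineRunnerSketch(Reducer* combiner) : combiner(combiner) {}
      void emit(const std::string& key, const std::string& value) {
        buffer[key].push_back(value);
      }
      void close() {
        spill();            // final reduce() calls happen first...
        combiner->close();  // ...then the Combiner is closed
      }
      ~OwningCombineRunnerSketch() {
        delete combiner;    // ...and deleted by its owner
      }
    private:
      void spill() {
        for (std::map<std::string, std::vector<std::string> >::iterator it =
                 buffer.begin(); it != buffer.end(); ++it) {
          combiner->reduce(it->first, it->second);
        }
        buffer.clear();
      }
      Reducer* combiner;    // owned by the runner under this proposal
      std::map<std::string, std::vector<std::string> > buffer;
    };

With ownership moved into the runner, TaskContextImpl::closeAll() would no longer close or delete the Combiner itself, so the final spill's reduce() calls can no longer land after the Combiner's cleanup.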



--
This message was sent by Atlassian JIRA
(v6.2#6252)