Posted to common-dev@hadoop.apache.org by "Yoram Arnon (JIRA)" <ji...@apache.org> on 2006/09/20 20:48:24 UTC

[jira] Updated: (HADOOP-444) In streaming with a NONE reducer, you get duplicate files if a mapper fails, is restarted, and succeeds next time.

     [ http://issues.apache.org/jira/browse/HADOOP-444?page=all ]

Yoram Arnon updated HADOOP-444:
-------------------------------

    Status: Patch Available  (was: Open)

This issue is resolved as part of the patch submitted for HADOOP-542.
Please mark it as fixed.
Thanks,
Yoram

> In streaming with a NONE reducer, you get duplicate files if a mapper fails, is restarted, and succeeds next time.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-444
>                 URL: http://issues.apache.org/jira/browse/HADOOP-444
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.5.0
>            Reporter: Dick King
>         Assigned To: Michel Tourn
>
> When the dust settled after a streaming run, the directory ended up looking like this:
>   /user/dking/<project-name>/K-HTML-UTF8-2006-08-09-rescued-abstracted/task_0026_m_007384_0	<r 3>	10563406
>   /user/dking/<project-name>/K-HTML-UTF8-2006-08-09-rescued-abstracted/task_0026_m_007384_1	<r 3>	10563406
> Future processing will receive duplicated data.
> -dk
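
The duplicate outputs in the listing above can be spotted mechanically. Below is a minimal sketch, not part of the original report or of the HADOOP-542 patch. It assumes output file names follow the pattern task_<job>_<m|r>_<task>_<attempt>, with the trailing number being the attempt index, as the two names in the listing suggest. The function name find_duplicate_attempts and the third file name in the sample listing are hypothetical, added only for illustration.

```python
import re
from collections import defaultdict

# Assumed naming pattern: task_<job>_<m|r>_<task>_<attempt>.
# The trailing number is taken to be the attempt index (an assumption
# based on the two file names shown in the report).
ATTEMPT_RE = re.compile(r"^(task_\d+_[mr]_\d+)_(\d+)$")


def find_duplicate_attempts(file_names):
    """Return {task_id: [file names]} for tasks with more than one output file."""
    by_task = defaultdict(list)
    for name in file_names:
        match = ATTEMPT_RE.match(name)
        if match:
            # Group by the task id, i.e. the name without the attempt suffix.
            by_task[match.group(1)].append(name)
    return {task: sorted(names) for task, names in by_task.items() if len(names) > 1}


if __name__ == "__main__":
    listing = [
        "task_0026_m_007384_0",
        "task_0026_m_007384_1",
        "task_0026_m_007385_0",  # hypothetical extra entry, not from the report
    ]
    for task, names in find_duplicate_attempts(listing).items():
        print(task, names)
```

A check like this only flags the symptom in an output directory; it does not prevent the failed attempt's file from being left behind in the first place.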

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira