Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2010/07/12 13:14:50 UTC

[jira] Resolved: (MAPREDUCE-613) Streaming should allow to re-start the command if it failed in the middle of input

     [ https://issues.apache.org/jira/browse/MAPREDUCE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu resolved MAPREDUCE-613.
-----------------------------------------------

    Resolution: Duplicate

This can be achieved through the skipping bad records feature.

> Streaming should allow to re-start the command if it failed in the middle of input
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-613
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-613
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: arkady borkovsky
>
> Sometimes, we need to use imperfect programs to process data.
> Recently, I used a public domain program that did what I needed, but crashed after processing a few million records (in my case, more than half of the mappers would succeed, with the rest failing at different completion percentages).
> It would be nice to be able to tell the Streaming Framework:
>      if the streaming command fails at some input record (and you get "pipe broken" from it), 
>      restart the command and continue feeding it the data.
>      Please log the failing record.
> In text mining, quite often, losing a few records of the input makes no difference at all.
> Of course, this feature should be disabled by default and should have some "are you really sure" provision (an expert feature).
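
The behavior requested above (tolerate a command that crashes on some input, log the offending record, and keep processing the rest) can be approximated outside the framework. What follows is a minimal standalone sketch, not how Hadoop Streaming or its skipping bad records feature is implemented. It assumes the command handles each record independently, so it re-runs the command on halves of a failing batch until the crashing record is isolated. The names `run_skipping_bad_records` and `FLAKY_COMMAND` are hypothetical and exist only for this illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for an "imperfect program": it upper-cases each input
# line and exits with an error as soon as it sees a line containing "bad".
FLAKY_COMMAND = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'bad' in line:\n"
    "        sys.exit(1)\n"
    "    print(line.strip().upper())\n",
]


def run_skipping_bad_records(command, records, bad_log):
    """Run `command` over `records`, isolating the records that crash it.

    Returns the output lines of the runs that exited with status 0. Any record
    that still crashes the command when fed on its own is appended to
    `bad_log` and contributes no output.

    Assumption: the command treats records independently, so running it on
    two halves yields the same output as running it on the whole batch.
    """
    if not records:
        return []
    result = subprocess.run(
        command,
        input="".join(record + "\n" for record in records),
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return result.stdout.splitlines()
    if len(records) == 1:
        # A single record that crashes the command: log it and move on.
        bad_log.append(records[0])
        return []
    # The batch failed somewhere; discard its partial output and retry halves.
    mid = len(records) // 2
    return (run_skipping_bad_records(command, records[:mid], bad_log)
            + run_skipping_bad_records(command, records[mid:], bad_log))


if __name__ == "__main__":
    bad = []
    out = run_skipping_bad_records(FLAKY_COMMAND, ["a", "b", "bad", "c"], bad)
    print("output:", out)
    print("skipped:", bad)
```

The trade-off in this sketch is repeated work: every failing batch is re-run in halves, so a crash costs extra passes over part of the input. That may be acceptable when, as described above, losing a few records makes no difference.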

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.