Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2010/07/12 13:14:50 UTC
[jira] Resolved: (MAPREDUCE-613) Streaming should allow to re-start the command if it failed in the middle of input
[ https://issues.apache.org/jira/browse/MAPREDUCE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu resolved MAPREDUCE-613.
-----------------------------------------------
Resolution: Duplicate
This can be achieved through the skipping bad records feature.
> Streaming should allow to re-start the command if it failed in the middle of input
> ----------------------------------------------------------------------------------
>
> Key: MAPREDUCE-613
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-613
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: arkady borkovsky
>
> Sometimes, we need to use imperfect programs to process data.
> Recently, I used a public domain program that did what I needed, but crashed after processing a few million records (in my case, more than half of the mappers would succeed, with the rest failing at different percentages of completion).
> It would be nice to be able to tell the Streaming framework:
> if the streaming command fails at some input record (and you get "pipe broken" from it),
> restart the command and continue feeding it the data.
> Please log the failing record.
> In text mining, quite often, losing a few records of the input makes no difference at all.
> Of course, this feature should be disabled by default, and should have some "are you really sure" provision (an expert feature).
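
For illustration only, the behaviour requested above (restart the command when it dies, log the failing record, keep feeding the rest) could be sketched as a small standalone wrapper. This is not Hadoop Streaming code and not how the skipping bad records feature is implemented. The function name feed_with_restart and its parameters are hypothetical, and the sketch assumes the command writes exactly one flushed output line per input line, which is what lets a missing reply identify the failing record.

```python
import subprocess
import sys


def _start(cmd):
    # Line-buffered text pipes so that each record is handed over immediately.
    return subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            text=True, bufsize=1)


def _stop(proc):
    # Close both pipes (ignoring errors from an already-dead child), then reap it.
    for stream in (proc.stdin, proc.stdout):
        try:
            stream.close()
        except OSError:
            pass
    try:
        proc.wait(timeout=5)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()


def feed_with_restart(cmd, records, max_restarts=10):
    """Feed records to cmd line by line, restarting cmd whenever it dies.

    Assumption (hypothetical protocol): cmd emits one flushed output line
    per input line. Returns (outputs, skipped_records).
    """
    outputs, skipped, restarts = [], [], 0
    proc = _start(cmd)
    try:
        for record in records:
            reply = ""
            try:
                proc.stdin.write(record + "\n")
                proc.stdin.flush()
                reply = proc.stdout.readline()  # "" means the command exited
            except BrokenPipeError:
                pass  # the command was already gone when we tried to write
            if reply:
                outputs.append(reply.rstrip("\n"))
                continue
            # No reply: treat this record as the one that broke the command.
            print(f"skipping failing record: {record!r}", file=sys.stderr)
            skipped.append(record)
            restarts += 1
            if restarts > max_restarts:
                raise RuntimeError("command failed too many times; giving up")
            _stop(proc)
            proc = _start(cmd)
    finally:
        _stop(proc)
    return outputs, skipped
```

A call such as feed_with_restart(["some_tool"], lines) would return the collected output lines together with the records that were skipped. The max_restarts limit plays the role of the "are you really sure" provision: the wrapper gives up rather than silently dropping an unbounded share of the input.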
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.