Posted to common-dev@hadoop.apache.org by "Rick Cox (JIRA)" <ji...@apache.org> on 2007/10/15 18:58:50 UTC

[jira] Created: (HADOOP-2057) streaming should optionally treat a non-zero exit status of a child process as a failed task

streaming should optionally treat a non-zero exit status of a child process as a failed task
--------------------------------------------------------------------------------------------

                 Key: HADOOP-2057
                 URL: https://issues.apache.org/jira/browse/HADOOP-2057
             Project: Hadoop
          Issue Type: Improvement
          Components: contrib/streaming
    Affects Versions: 0.14.2
            Reporter: Rick Cox


The exit status of the external processes spawned by streaming tasks is currently logged, but not used to indicate success or failure of the task. While this is reasonable for some UNIX tools (e.g. grep), many programs will indicate failure by a non-zero exit status. (Also, even for custom programs, intentionally indicating the failure of a streaming task is currently rather tricky.)

This could be supported by adding a new job-configuration setting, 'stream.non.zero.exit.is.failure'. If true, a non-zero exit status of a child process would cause PipeMapRed to throw an exception, failing the task. The default setting of false would preserve the current behavior.
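
As a rough illustration of the proposed check (not the actual Java code in PipeMapRed), the logic might look like the Python sketch below. The names run_child and non_zero_exit_is_failure are hypothetical; the keyword argument stands in for the 'stream.non.zero.exit.is.failure' setting, and RuntimeError stands in for whatever exception the task would throw.

```python
import subprocess
import sys


def run_child(argv, input_bytes, non_zero_exit_is_failure=False):
    """Run a child process, feed it input, and inspect its exit status.

    Hypothetical helper: non_zero_exit_is_failure plays the role of the
    proposed 'stream.non.zero.exit.is.failure' job setting.
    """
    proc = subprocess.run(argv, input=input_bytes, stdout=subprocess.PIPE)
    if proc.returncode != 0:
        # Behavior described as current: the status is only logged.
        print("child exited with status %d" % proc.returncode, file=sys.stderr)
        if non_zero_exit_is_failure:
            # Proposed behavior: raise, so the caller can fail the task.
            raise RuntimeError(
                "child failed with exit status %d" % proc.returncode
            )
    return proc.stdout
```

With the flag left at its default of False, a failing child only produces a log line and its output is returned as before; with the flag set to True, the same exit status raises.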

This would allow streaming tasks to easily indicate failure, even if all input has already been consumed.
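
For example, a child program could read all of its input and only then decide that the run was bad. The sketch below is a hypothetical Python mapper body, not part of the patch; the function name mapper and the "every record must contain a tab" rule are assumptions made purely for illustration.

```python
import sys


def mapper(lines, out):
    """Copy well-formed records to out and return an exit status.

    Hypothetical validity rule: a record must contain a tab character.
    All input is consumed before the status is decided.
    """
    bad_records = 0
    for line in lines:
        record = line.rstrip("\n")
        if "\t" not in record:
            bad_records += 1
            continue
        out.write(record + "\n")
    # Non-zero status signals failure, even though every line was read.
    return 1 if bad_records else 0
```

A script built around this function would finish with sys.exit(mapper(sys.stdin, sys.stdout)). Under the proposed setting, the non-zero status would fail the task; without it, the status would only be logged.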

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2057) streaming should optionally treat a non-zero exit status of a child process as a failed task

Posted by "Rick Cox (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Cox updated HADOOP-2057:
-----------------------------

    Attachment:     (was: exit-status-2057.patch)

> streaming should optionally treat a non-zero exit status of a child process as a failed task
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2057
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2057
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.14.3, 0.15.0
>            Reporter: Rick Cox
>             Fix For: 0.16.0
>
>         Attachments: exit-status-2057-0.16.patch



[jira] Updated: (HADOOP-2057) streaming should optionally treat a non-zero exit status of a child process as a failed task

Posted by "Rick Cox (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Cox updated HADOOP-2057:
-----------------------------

        Fix Version/s: 0.16.0
    Affects Version/s: 0.14.0
                       0.14.1
                       0.14.3
                       0.15.0

> streaming should optionally treat a non-zero exit status of a child process as a failed task
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2057
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2057
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.14.3, 0.15.0
>            Reporter: Rick Cox
>             Fix For: 0.16.0
>
>         Attachments: exit-status-2057-0.16.patch



[jira] Updated: (HADOOP-2057) streaming should optionally treat a non-zero exit status of a child process as a failed task

Posted by "Rick Cox (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Cox updated HADOOP-2057:
-----------------------------

    Attachment: exit-status-2057.patch

If the premise of this change is acceptable, then I'd like to request a code review of this patch.

It:
* supports the stream.non.zero.exit.is.failure job configuration setting in PipeMapRed
* mentions that setting in the -info text in StreamJob
* adds a test case
* adds stream.non.zero.exit.is.failure to hadoop-default.xml, with a backwards-compatible default of false


> streaming should optionally treat a non-zero exit status of a child process as a failed task
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2057
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2057
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.14.2
>            Reporter: Rick Cox
>         Attachments: exit-status-2057.patch



[jira] Updated: (HADOOP-2057) streaming should optionally treat a non-zero exit status of a child process as a failed task

Posted by "Rick Cox (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Cox updated HADOOP-2057:
-----------------------------

    Attachment: exit-status-2057-0.16.patch

Here's a patch updated for the latest trunk (just removes the CHANGES.txt entry).

> streaming should optionally treat a non-zero exit status of a child process as a failed task
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2057
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2057
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.14.2
>            Reporter: Rick Cox
>         Attachments: exit-status-2057-0.16.patch, exit-status-2057.patch
