Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2008/07/26 00:01:34 UTC

[jira] Created: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Hadoop streaming does not use progress reporting to detect hung tasks
---------------------------------------------------------------------

                 Key: HADOOP-3837
                 URL: https://issues.apache.org/jira/browse/HADOOP-3837
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/streaming
    Affects Versions: 0.17.0
            Reporter: dhruba borthakur


Hadoop streaming (StreamJob.java) sets mapred.task.timeout to 0. This effectively means that if tasks hang (due to either bad user code or a machine-related issue), they never encounter a timeout, which causes the entire job to hang.

I propose that Hadoop streaming not set the timeout to 0. By default, the settings in hadoop-default.xml should apply to streaming jobs.
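
The effect of the zero timeout described above can be sketched in plain Java. This is a minimal model, not Hadoop code: the class TaskTimeoutSketch and the method isHung are hypothetical names, and the 600000 ms default is an assumption about the value shipped in hadoop-default.xml.

```java
// Illustrative model only: these names are hypothetical and are not Hadoop classes.
final class TaskTimeoutSketch {

    /** Assumed default of mapred.task.timeout, in milliseconds (10 minutes). */
    static final long DEFAULT_TIMEOUT_MS = 600_000L;

    /**
     * Returns true when a task should be treated as hung.
     * A timeout of 0 disables the check, which is the behavior the issue describes.
     */
    static boolean isHung(long nowMs, long lastProgressMs, long timeoutMs) {
        if (timeoutMs <= 0) {
            return false; // timeout disabled: a silent task is never killed
        }
        return nowMs - lastProgressMs > timeoutMs;
    }

    public static void main(String[] args) {
        long lastProgress = 0L;
        long now = 3_600_000L; // one hour with no progress report
        System.out.println("timeout=0      -> hung? " + isHung(now, lastProgress, 0L));
        System.out.println("timeout=600000 -> hung? " + isHung(now, lastProgress, DEFAULT_TIMEOUT_MS));
    }
}
```

With the timeout forced to 0, a task that has been silent for an hour is still not flagged, so nothing ever kills it. With a non-zero timeout the same task is flagged and can be terminated and retried.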

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3837:
-------------------------------------

    Attachment: streamTimeout.patch

With this patch, the streaming code no longer sets the task timeout to 0.
The patch applies to trunk as well as the 0.18 branch.



[jira] Commented: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618341#action_12618341 ] 

Devaraj Das commented on HADOOP-3837:
-------------------------------------

+1 on the patch and +1 on including this in 0.18



[jira] Assigned: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-3837:
----------------------------------------

    Assignee: dhruba borthakur



[jira] Updated: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-3837:
------------------------------------

    Release Note: Changed streaming tasks to adhere to task timeout value specified in the job configuration.  (was: Streaming tasks adhere to task timeout value specified in the job configuration.)
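
What adhering to the configured timeout means can be sketched as a layered lookup: a value set on the job wins, otherwise the shipped default applies. This is only an illustration under stated assumptions. EffectiveTimeoutSketch and effectiveTimeoutMs are hypothetical names, java.util.Properties stands in for the job configuration, and 600000 ms is an assumed default.

```java
import java.util.Properties;

// Illustrative only: java.util.Properties stands in for the job configuration.
final class EffectiveTimeoutSketch {

    static final String KEY = "mapred.task.timeout";

    /** Resolves the timeout: a value set on the job wins, otherwise the default applies. */
    static long effectiveTimeoutMs(Properties defaults, Properties job) {
        Properties merged = new Properties(defaults); // falls back to defaults on a miss
        for (String name : job.stringPropertyNames()) {
            merged.setProperty(name, job.getProperty(name));
        }
        String value = merged.getProperty(KEY);
        if (value == null) {
            throw new IllegalStateException(KEY + " is not configured");
        }
        return Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty(KEY, "600000"); // assumed default, in milliseconds

        Properties job = new Properties();   // job sets nothing: the default applies
        System.out.println(effectiveTimeoutMs(defaults, job));

        job.setProperty(KEY, "0");           // job explicitly opts out of the timeout
        System.out.println(effectiveTimeoutMs(defaults, job));
    }
}
```

Under this model a streaming job that genuinely needs an unbounded task would have to set the value itself rather than inherit it silently, which is consistent with the change being flagged as incompatible.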



[jira] Updated: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3837:
-------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.



[jira] Commented: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619158#action_12619158 ] 

Doug Cutting commented on HADOOP-3837:
--------------------------------------

> I will commit this to trunk and 0.18 by the end of this week. 

The week is almost up...



[jira] Updated: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3837:
-------------------------------------

    Attachment: streamTimeout.patch



[jira] Commented: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617615#action_12617615 ] 

Hadoop QA commented on HADOOP-3837:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387058/streamTimeout.patch
  against trunk revision 679930.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2962/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2962/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2962/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2962/console

This message is automatically generated.



[jira] Commented: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618551#action_12618551 ] 

dhruba borthakur commented on HADOOP-3837:
------------------------------------------

I will commit this to trunk and 0.18 by the end of this week. 



[jira] Updated: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3837:
-------------------------------------

    Fix Version/s: 0.19.0
                   0.18.0
     Release Note: Streaming tasks adhere to task timeout value specified in the job configuration.
     Hadoop Flags: [Incompatible change]
           Status: Patch Available  (was: Open)

I would like to apply this patch to trunk as well as 0.18. 



[jira] Commented: (HADOOP-3837) Hadoop streaming does not use progress reporting to detect hung tasks

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624673#action_12624673 ] 

Hudson commented on HADOOP-3837:
--------------------------------

Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])
