Posted to common-dev@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2007/10/02 07:27:50 UTC

[jira] Created: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out
-----------------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-1983
                 URL: https://issues.apache.org/jira/browse/HADOOP-1983
             Project: Hadoop
          Issue Type: Bug
          Components: pipes
    Affects Versions: 0.15.0
            Reporter: Christian Kunz


When using the C++ pipes interface, mappers/reducers that do not emit any key-value pairs but run longer than 'mapred.task.timeout' might time out even when they send periodic status updates upstream.

The cause of the problem is that all upstream messages are buffered. Progress and status updates should be flushed immediately.
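
To illustrate the buffering behaviour described above, here is a minimal sketch. It is not the pipes code: `Receiver`, `BufferedUplink`, the message text, and the buffer capacity are all hypothetical names and values chosen for the example. It only shows that a small update written to a buffered channel does not reach the other side until the buffer fills or is flushed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the Java side of the connection:
// it only sees bytes that were actually written out of the buffer.
struct Receiver {
  std::string received;
};

// Hypothetical buffered upstream channel (not the real pipes class).
// Bytes stay in buffer_ until it reaches capacity_ or flush() is called.
class BufferedUplink {
 public:
  BufferedUplink(Receiver& sink, std::size_t capacity)
      : sink_(sink), capacity_(capacity) {
    assert(capacity > 0);
  }

  // Append a message; it is delivered only once the buffer is full.
  void write(const std::string& message) {
    buffer_ += message;
    if (buffer_.size() >= capacity_) {
      flush();
    }
  }

  // Deliver everything that is pending and empty the buffer.
  void flush() {
    sink_.received += buffer_;
    buffer_.clear();
  }

  // Number of bytes written but not yet delivered.
  std::size_t pending() const { return buffer_.size(); }

 private:
  Receiver& sink_;
  std::size_t capacity_;
  std::string buffer_;
};
```

With a large capacity, a task that writes only short progress messages leaves them sitting in `buffer_`, so the receiver never sees them unless `flush()` is called after each update.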

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532015 ] 

Owen O'Malley commented on HADOOP-1983:
---------------------------------------

Would adding a flush on the command socket when sending the progress address the problem? It is already limited to 1/second, so it shouldn't cause excessive load.
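
A rough sketch of why a once-per-second limit keeps the flush cheap, under stated assumptions: `ProgressThrottle` and `maybeSend` are hypothetical names, not part of the pipes library, and the time is passed in by the caller only to keep the example deterministic.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical throttle (illustration only): runs the supplied
// send-and-flush action at most once per interval.
class ProgressThrottle {
 public:
  using Clock = std::chrono::steady_clock;

  explicit ProgressThrottle(std::chrono::milliseconds interval)
      : interval_(interval) {
    assert(interval.count() >= 0);
  }

  // Returns true if the update was sent, false if it was suppressed
  // because the previous one was sent less than one interval ago.
  bool maybeSend(Clock::time_point now,
                 const std::function<void()>& sendAndFlush) {
    if (sentOnce_ && now - lastSent_ < interval_) {
      return false;
    }
    sendAndFlush();
    lastSent_ = now;
    sentOnce_ = true;
    return true;
  }

 private:
  std::chrono::milliseconds interval_;
  Clock::time_point lastSent_{};
  bool sentOnce_ = false;
};
```

Because the flush happens inside the throttled action, the number of flushes is bounded by the number of intervals elapsed, regardless of how often the task reports progress.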



[jira] Commented: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532305 ] 

Hadoop QA commented on HADOOP-1983:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12367040/pipe-flush.patch
against trunk revision r581745.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/879/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/879/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/879/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/879/console

This message is automatically generated.



[jira] Updated: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-1983:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532025 ] 

Christian Kunz commented on HADOOP-1983:
----------------------------------------

That was my workaround to pinpoint the failure, and it resolved the issue.



[jira] Commented: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532307 ] 

Christian Kunz commented on HADOOP-1983:
----------------------------------------

+1
One could argue that the status message should be flushed as well, so that users see the updated C++ application status in the JobTracker GUI as soon as possible.
Or even better, move
          uplink->progress(progressFloat);
after
          uplink->status(status);




[jira] Updated: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-1983:
----------------------------------

    Fix Version/s: 0.15.0
         Assignee: Owen O'Malley



[jira] Commented: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532855 ] 

Hudson commented on HADOOP-1983:
--------------------------------

Integrated in Hadoop-Nightly #263 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/263/])



[jira] Updated: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-1983:
----------------------------------

    Attachment: pipe-flush.patch

This patch adds a flush method to the stream interface and causes the progress method to call it.



[jira] Updated: (HADOOP-1983) jobs using pipes interface with tasks not using java output format have a good chance of not updating progress and timing out

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-1983:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this, with Christian's suggestion for sending the status before the flush instead of after.
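
The effect of that ordering can be sketched as follows. This is an illustration under assumptions, not the committed patch: `OutStream`, `RecordingStream`, `Uplink`, and the message strings are hypothetical. The sketch assumes only what the thread states, namely that the stream has a flush method and that sending progress triggers it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stream interface with an explicit flush().
class OutStream {
 public:
  virtual ~OutStream() = default;
  virtual void write(const std::string& message) = 0;
  virtual void flush() = 0;
};

// In-memory stream that records which messages have been delivered.
class RecordingStream : public OutStream {
 public:
  void write(const std::string& message) override {
    pending_.push_back(message);
  }
  void flush() override {
    delivered_.insert(delivered_.end(), pending_.begin(), pending_.end());
    pending_.clear();
  }
  const std::vector<std::string>& delivered() const { return delivered_; }
  const std::vector<std::string>& pending() const { return pending_; }

 private:
  std::vector<std::string> pending_;
  std::vector<std::string> delivered_;
};

// Hypothetical uplink: status() only buffers its message,
// progress() buffers its message and then flushes the stream.
class Uplink {
 public:
  explicit Uplink(OutStream& stream) : stream_(stream) {}

  void status(const std::string& text) { stream_.write("status " + text); }

  void progress(float fraction) {
    assert(fraction >= 0.0f && fraction <= 1.0f);
    stream_.write("progress " + std::to_string(fraction));
    stream_.flush();
  }

 private:
  OutStream& stream_;
};
```

Calling `status()` and then `progress()` delivers both messages with a single flush. In the reverse order, the status message stays buffered until the next progress update.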
