Posted to common-dev@hadoop.apache.org by "Michel Tourn (JIRA)" <ji...@apache.org> on 2006/09/23 02:27:22 UTC

[jira] Created: (HADOOP-556) Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records

Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records
--------------------------------------------------------------------------------------------

                 Key: HADOOP-556
                 URL: http://issues.apache.org/jira/browse/HADOOP-556
             Project: Hadoop
          Issue Type: Improvement
          Components: contrib/streaming
            Reporter: Michel Tourn


Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records.
A first version of this was already implemented but was not satisfactory.

Now we check whether 10 seconds have passed:

-after reading a *line* of the Application's stderr
 (so please use cerr << ".\n" as your stderr heartbeat, not cerr << ".")

-and after outputting a record in the mapper / combiner / reducer.

If 10 seconds have passed, we set the Reporter status.
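
To make the timing rule above concrete, here is a minimal sketch of a "report at most once per interval" check. It is not the code from the attached patch: the class name TimedStatus and its method due() are hypothetical, and the sketch is written in C++ only to match the cerr example above. The caller would invoke due() after each stderr line and after each output record, and set the Reporter status whenever it returns true.

```cpp
#include <chrono>

// Hypothetical helper (an assumption for illustration, not part of the patch):
// answers "has the interval elapsed since the last status update?".
class TimedStatus {
public:
    using Clock = std::chrono::steady_clock;

    explicit TimedStatus(std::chrono::seconds interval = std::chrono::seconds(10),
                         Clock::time_point start = Clock::now())
        : interval_(interval), last_(start) {}

    // Call after every stderr line read and after every output record.
    // Returns true, and restarts the timer, once the interval has elapsed;
    // the caller should then set the Reporter status.
    bool due(Clock::time_point now = Clock::now()) {
        if (now - last_ < interval_) {
            return false;  // too soon, skip this update
        }
        last_ = now;
        return true;
    }

private:
    std::chrono::seconds interval_;
    Clock::time_point last_;
};
```

Because the check only runs when a stderr line or an output record arrives, a silent Application never triggers a status update, which is consistent with the "no artificial heartbeat" point in this issue.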

Effects:

-the Reporter status changes more often and provides useful feedback in the Web UI or in another client.
-a Task will not time out after 10 minutes just because it outputs records slowly.

No artificial heartbeat is introduced in this proposal.
The streaming Application still has to show activity (either on stdout or on stderr).
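
As an illustration of "showing activity" on stderr, here is a minimal sketch of a streaming Application that emits one newline-terminated heartbeat per input line while producing very little stdout. The function names heartbeat() and run_mapper() are hypothetical, and the line-counting logic is only a stand-in for a slow job.

```cpp
#include <iostream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write one complete stderr line. The trailing "\n"
// matters because activity is checked after a whole *line* is read.
void heartbeat(std::ostream& err) {
    err << ".\n";
    err.flush();
}

// Hypothetical slow mapper: it emits a single output record at the very end,
// so the stderr heartbeats are its only sign of life while it runs.
void run_mapper(std::istream& in, std::ostream& out, std::ostream& err) {
    std::string line;
    long count = 0;
    while (std::getline(in, line)) {
        ++count;
        heartbeat(err);  // one heartbeat line per input line
    }
    out << "lines\t" << count << "\n";
}
```

A real Application would call run_mapper(std::cin, std::cout, std::cerr) from main. Writing "." without the newline would leave the line incomplete, so it would not count as activity until a newline eventually followed.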

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HADOOP-556) Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records

Posted by "Michel Tourn (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-556?page=all ]

Michel Tourn updated HADOOP-556:
--------------------------------

    Attachment: hadoop-timedreporter.patch



        

[jira] Updated: (HADOOP-556) Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records

Posted by "Michel Tourn (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-556?page=all ]

Michel Tourn updated HADOOP-556:
--------------------------------

    Status: Patch Available  (was: Open)

The attached hadoop-timedreporter.patch is the patch.




        

[jira] Updated: (HADOOP-556) Streaming should send keep-alive signals to Reporter every 10 seconds, not every 100 records

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-556?page=all ]

Doug Cutting updated HADOOP-556:
--------------------------------

           Status: Resolved  (was: Patch Available)
    Fix Version/s: 0.7.0
       Resolution: Fixed

I just committed this.  Thanks, Michel!

