Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2009/05/08 09:11:45 UTC

[jira] Created: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

The length of the heartbeat cycle should be configurable.
---------------------------------------------------------

                 Key: HADOOP-5784
                 URL: https://issues.apache.org/jira/browse/HADOOP-5784
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Owen O'Malley


Currently, the heartbeat cycle is set to (# nodes / 100) seconds. This can be too long for clusters that need to run low-latency jobs. We should make the number of heartbeats that should arrive per second configurable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713435#action_12713435 ] 

Amar Kamat commented on HADOOP-5784:
------------------------------------

I am wondering which one is more intuitive, _number-of-heartbeats-per-sec_ or _heartbeat-interval_. The title says _heartbeat-interval_ should be configurable, whereas the description states _number-of-heartbeats-per-sec_ should be configurable. I personally think _heartbeat-interval_ is easier to set and play around with. Thoughts?

Regarding the test case, can't we spoof a tasktracker status and invoke JobTracker.heartbeat()? This way we can increment the tracker count and query the jobtracker for the current heartbeat interval. Thoughts?
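
The testing idea above could be sketched roughly as follows. This is not Hadoop code: StubJobTracker is a hypothetical stand-in for the real JobTracker, and the interval rule it applies (tracker count divided by heartbeats per second, with a 3-second minimum, in whole seconds) is an assumption based on the numbers discussed in this issue.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: StubJobTracker is a hypothetical stand-in, not Hadoop's JobTracker.
final class StubJobTracker {
    private final int heartbeatsInSecond;
    private final Set<String> trackers = new HashSet<>();

    StubJobTracker(int heartbeatsInSecond) {
        this.heartbeatsInSecond = heartbeatsInSecond;
    }

    /** Records the (spoofed) tracker and returns the interval, in seconds, it should wait. */
    int heartbeat(String trackerName) {
        trackers.add(trackerName); // a repeated name does not grow the cluster size
        return Math.max(trackers.size() / heartbeatsInSecond, 3);
    }

    public static void main(String[] args) {
        StubJobTracker jt = new StubJobTracker(100);
        int interval = 0;
        for (int i = 0; i < 1000; i++) {
            interval = jt.heartbeat("tracker_" + i); // spoofed status from a new tracker
        }
        System.out.println(interval); // 1000 / 100 = 10
    }
}
```

A real test would build spoofed tasktracker statuses and read the interval from the heartbeat response; the stub only shows the shape of the assertion.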



[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5784:
--------------------------------------------

    Status: Patch Available  (was: Open)

test-patch result:
{noformat}
     [exec]
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
{noformat}
ant test passed on my machine.



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715065#action_12715065 ] 

Amareshwari Sriramadasu commented on HADOOP-5784:
-------------------------------------------------

The test failures are not related to the patch. All tests passed on my machine.



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711466#action_12711466 ] 

Amareshwari Sriramadasu commented on HADOOP-5784:
-------------------------------------------------

The current heartbeat interval is set to {{clusterSize / 100}} seconds, with a minimum interval of 3 seconds.
This assumes that the JT can process 100 heartbeats per second. See http://issues.apache.org/jira/browse/HADOOP-1900?focusedCommentId=12542530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12542530

Now, if we make the number of heartbeats that should arrive in a second configurable (with a default value of 100), the heartbeat interval can be calculated as
{code}
 // heartbeatsInSecond is the proposed configurable value (default 100); intervals are in seconds
 heartbeatInterval = Math.max(clusterSize / heartbeatsInSecond, HEARTBEAT_INTERVAL_MIN);
{code}

Thoughts?
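
As a minimal sketch of the calculation proposed above (not the code in the attached patch), the formula can be written in Java as follows. The class and method names are illustrative, and whole-second integer arithmetic is an assumption.

```java
// Sketch only: names below are illustrative, not taken from patch-5784.txt.
final class HeartbeatIntervalSketch {
    // Minimum interval from the comment above: 3 seconds.
    static final int HEARTBEAT_INTERVAL_MIN_SECONDS = 3;

    /** Interval in seconds, given the heartbeats the JT should receive per second. */
    static int intervalSeconds(int clusterSize, int heartbeatsInSecond) {
        if (heartbeatsInSecond <= 0) {
            throw new IllegalArgumentException("heartbeatsInSecond must be positive");
        }
        // Integer division, as in the pseudocode; the minimum applies afterwards.
        return Math.max(clusterSize / heartbeatsInSecond, HEARTBEAT_INTERVAL_MIN_SECONDS);
    }

    public static void main(String[] args) {
        System.out.println(intervalSeconds(2000, 100)); // 20
        System.out.println(intervalSeconds(2000, 400)); // 5
        System.out.println(intervalSeconds(200, 100));  // 3 (minimum applies)
    }
}
```

With the default of 100, a 2000-node cluster heartbeats every 20 seconds; raising the value to 400 brings that down to 5 seconds, which is the low-latency case the issue describes.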



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713487#action_12713487 ] 

Hadoop QA commented on HADOOP-5784:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12408937/patch-5784.txt
  against trunk revision 778994.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 release audit.  The applied patch generated 493 release audit warnings (more than the trunk's current 492 warnings).

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/410/console

This message is automatically generated.



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713438#action_12713438 ] 

Owen O'Malley commented on HADOOP-5784:
---------------------------------------

This looks good, but I wish there was a good way to set up a test case. I guess the best way would be to create a JobTracker and call the heartbeat method and observe the requested heartbeat interval.



[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5784:
--------------------------------------------

    Fix Version/s: 0.21.0
           Status: Patch Available  (was: Open)

test-patch result:
{noformat}
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{noformat}

It is difficult to write a unit test for this.
Tested the patch by running sort on 500 nodes with mapred.heartbeats.in.second=200. 
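
For illustration only, here is roughly what that setting implies under the formula proposed in this issue. java.util.Properties stands in for Hadoop's configuration object, and the default of 100, the 3-second minimum, and the whole-second integer arithmetic are assumptions taken from the discussion rather than from the patch itself.

```java
import java.util.Properties;

// Sketch only: java.util.Properties stands in for Hadoop's configuration object.
final class HeartbeatsInSecondSketch {
    static final String KEY = "mapred.heartbeats.in.second"; // property named above
    static final int DEFAULT_HEARTBEATS_IN_SECOND = 100;     // default proposed in this issue
    static final int MIN_INTERVAL_SECONDS = 3;               // minimum cited in this issue

    static int intervalSeconds(Properties conf, int clusterSize) {
        int perSecond = Integer.parseInt(
            conf.getProperty(KEY, String.valueOf(DEFAULT_HEARTBEATS_IN_SECOND)));
        return Math.max(clusterSize / perSecond, MIN_INTERVAL_SECONDS);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(intervalSeconds(conf, 500)); // default of 100: 500 / 100 = 5
        conf.setProperty(KEY, "200");
        System.out.println(intervalSeconds(conf, 500)); // 500 / 200 = 2, raised to the minimum of 3
    }
}
```

Under these assumptions, a 500-node cluster moves from a 5-second interval to the 3-second minimum when the value is raised from 100 to 200.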



[jira] Assigned: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HADOOP-5784:
-----------------------------------------------

    Assignee: Amareshwari Sriramadasu



[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5784:
--------------------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5784:
--------------------------------------------

    Attachment: patch-5784-1.txt

Patch updated with a test case.



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714512#action_12714512 ] 

Hadoop QA commented on HADOOP-5784:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12409148/patch-5784-1.txt
  against trunk revision 779944.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/426/console

This message is automatically generated.



[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5784:
--------------------------------------------

    Attachment: patch-5784.txt

Patch making the number of heartbeats that arrive at the JobTracker per second configurable.




[jira] Updated: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5784:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!



[jira] Commented: (HADOOP-5784) The length of the heartbeat cycle should be configurable.

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707966#action_12707966 ] 

Steve Loughran commented on HADOOP-5784:
----------------------------------------

Is the goal here to detect failures of data nodes, or to keep an up-to-date record of which task trackers have capacity?


