Posted to common-dev@hadoop.apache.org by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org> on 2007/10/17 13:23:50 UTC

[jira] Commented: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

    [ https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535509 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1900:
--------------------------------------------------

Here is a proposal for changing the heartbeat interval dynamically.

We will initialize the heartbeat interval to HEARTBEAT_INTERVAL (10 secs) in the TaskTracker.
Each time the TaskTracker sends a heartbeat, the JobTracker's response will carry the next heartbeat interval.
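
To make the exchange concrete, here is a minimal Java sketch of the TaskTracker side. It is only an illustration: HeartbeatResponse, JobTrackerStub and the method names are hypothetical stand-ins I made up, not the existing Hadoop classes or RPC signatures.

```java
// Sketch only: all type and method names here are hypothetical, not Hadoop's API.
final class TaskTrackerHeartbeatSketch {
    // Initial interval used until the first response arrives (10 secs, per the proposal).
    static final long HEARTBEAT_INTERVAL_MS = 10_000L;

    // Hypothetical response object carrying the interval chosen by the JobTracker.
    static final class HeartbeatResponse {
        final long nextIntervalMs;

        HeartbeatResponse(long nextIntervalMs) {
            this.nextIntervalMs = nextIntervalMs;
        }
    }

    // Hypothetical stand-in for the JobTracker side of the heartbeat call.
    interface JobTrackerStub {
        HeartbeatResponse heartbeat();
    }

    private long intervalMs = HEARTBEAT_INTERVAL_MS;

    long currentIntervalMs() {
        return intervalMs;
    }

    // Send one heartbeat and adopt the interval returned in the response.
    void sendHeartbeat(JobTrackerStub jobTracker) {
        HeartbeatResponse response = jobTracker.heartbeat();
        intervalMs = response.nextIntervalMs;
    }
}
```

The TaskTracker would then wait currentIntervalMs() before sending its next heartbeat, so the JobTracker can change the rate on every round trip.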

The JobTracker calculates the next heartbeat interval as follows:

1. Using the cluster size (number of task trackers):
    nextInterval = 10 secs * (cluster_size/500 + 1)
    i.e. for every additional 500 nodes we increase the heartbeat interval by 10 secs.

2. Cap the result at the maximum:
    if (nextInterval > HEARTBEAT_INTERVAL_MAX) nextInterval = HEARTBEAT_INTERVAL_MAX;
    i.e. if the result exceeds HEARTBEAT_INTERVAL_MAX (60 secs), the next heartbeat interval is 60 secs.

3. If the drop count of heartbeats is greater than a threshold drop count, increase the interval by 10 more secs.
    The threshold drop count can be 'cluster_size/10', i.e. if there are 500 nodes in the cluster and more than 50 heartbeats are dropped, we increase the next heartbeat interval by 10 more secs. The cap from step 2 is applied after this increase.

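   Thus, the next interval calculation can be the following:
{code}
// busy if more than a tenth of the cluster's heartbeats were dropped
threshold_dropcount = cluster_size / 10;
isBusy = dropCount > threshold_dropcount ? 1 : 0;
// one extra HEARTBEAT_INTERVAL per 500 nodes, plus one more when busy
nextInterval = HEARTBEAT_INTERVAL * (cluster_size / 500 + 1)
             + HEARTBEAT_INTERVAL * isBusy;
// never exceed the maximum
if (nextInterval > HEARTBEAT_INTERVAL_MAX) nextInterval = HEARTBEAT_INTERVAL_MAX;
{code}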
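
For anyone who wants to try the numbers, the calculation above can be sketched as a small self-contained Java class. The class name, method name and the millisecond units are my own assumptions for illustration; only the constants (10 secs, 60 secs, 500 nodes, cluster size / 10) come from the proposal.

```java
// Sketch only: class and method names are hypothetical, not part of Hadoop.
final class HeartbeatIntervalSketch {
    static final int HEARTBEAT_INTERVAL_MS = 10_000;      // 10 secs, from the proposal
    static final int HEARTBEAT_INTERVAL_MAX_MS = 60_000;  // 60 secs cap, from the proposal
    static final int NODES_PER_STEP = 500;                // one extra interval per 500 nodes

    // Returns the next heartbeat interval in milliseconds.
    static int nextIntervalMs(int clusterSize, int dropCount) {
        int thresholdDropCount = clusterSize / 10;
        int isBusy = dropCount > thresholdDropCount ? 1 : 0;
        int next = HEARTBEAT_INTERVAL_MS * (clusterSize / NODES_PER_STEP + 1)
                 + HEARTBEAT_INTERVAL_MS * isBusy;
        // Apply the cap last so the busy increment cannot push past the maximum.
        return Math.min(next, HEARTBEAT_INTERVAL_MAX_MS);
    }

    public static void main(String[] args) {
        System.out.println(nextIntervalMs(100, 0));   // prints 10000
        System.out.println(nextIntervalMs(500, 50));  // prints 20000 (50 is not more than 50)
        System.out.println(nextIntervalMs(500, 51));  // prints 30000 (busy)
        System.out.println(nextIntervalMs(3000, 0));  // prints 60000 (capped from 70000)
    }
}
```

Note that with integer division a cluster of exactly 500 nodes already gets 20 secs, and any cluster of 2500 nodes or more sits at the 60 secs cap whether or not heartbeats are being dropped.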

Currently, MapEventsFetcherThread polls the JobTracker for completed map tasks every 5 secs (MIN_POLL_INTERVAL). Should we also change the polling interval in a similar fashion to the heartbeat interval? The drawback is that some reduce tasks could sit idle for a longer time.

Any thoughts?


> the heartbeat and task event queries interval should be set dynamically by the JobTracker
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1900
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>
> The JobTracker should scale the intervals that the TaskTrackers use to contact it dynamically, based on how busy it is and the size of the cluster.
