Posted to common-dev@hadoop.apache.org by "Sharad Agarwal (JIRA)" <ji...@apache.org> on 2009/06/04 12:27:07 UTC

[jira] Commented: (HADOOP-5931) Collect information about number of tasks succeeded / total per time unit for a tasktracker.

    [ https://issues.apache.org/jira/browse/HADOOP-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716222#action_12716222 ] 

Sharad Agarwal commented on HADOOP-5931:
----------------------------------------

To collect stats for the last hour/day, we can keep a moving window for each time period. A moving window consists of multiple time slots, and the slot size decides the granularity at which the window moves and is updated. The slot size can differ between windows. For example, the hour window could use 5-minute slots and the day window 1-hour slots. In that case the hour window would hold stats in 12 slots of 5 minutes each, and the day window in 24 slots of 1 hour each.

Once the end of the last slot is crossed, a new slot is added and the oldest one is dropped, which moves the window forward by one slot.
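
A rough sketch of such a window in Java, just to make the slot mechanics concrete. The class and method names (MovingWindowSketch, recordTask, total, succeeded) are made up for illustration and are not existing Hadoop code. It assumes non-negative timestamps that never go backwards, and it only creates a slot when something is recorded or queried, so old slots are dropped by age rather than strictly one per new slot.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical slot-based moving window; an illustration, not Hadoop code. */
class MovingWindowSketch {
    /** Counters for a single time slot. */
    private static final class Slot {
        final long startMillis;
        int total;
        int succeeded;
        Slot(long startMillis) { this.startMillis = startMillis; }
    }

    private final long slotMillis;   // slot size, e.g. 5 minutes
    private final int maxSlots;      // slots per window, e.g. 12 for one hour
    private final Deque<Slot> slots = new ArrayDeque<>();

    MovingWindowSketch(long slotMillis, int maxSlots) {
        this.slotMillis = slotMillis;
        this.maxSlots = maxSlots;
    }

    /** Counts one finished task in the slot that covers nowMillis. */
    void recordTask(long nowMillis, boolean success) {
        advance(nowMillis);
        Slot current = slots.peekLast();
        current.total++;
        if (success) {
            current.succeeded++;
        }
    }

    /** Number of tasks seen in the window ending at nowMillis. */
    int total(long nowMillis) {
        advance(nowMillis);
        int sum = 0;
        for (Slot s : slots) {
            sum += s.total;
        }
        return sum;
    }

    /** Number of successful tasks in the window ending at nowMillis. */
    int succeeded(long nowMillis) {
        advance(nowMillis);
        int sum = 0;
        for (Slot s : slots) {
            sum += s.succeeded;
        }
        return sum;
    }

    /** Opens a new slot if the last one has been crossed, then drops expired slots. */
    private void advance(long nowMillis) {
        long slotStart = nowMillis - (nowMillis % slotMillis);
        Slot last = slots.peekLast();
        if (last == null || last.startMillis < slotStart) {
            slots.addLast(new Slot(slotStart));
        }
        // The window covers maxSlots slots, the current one included.
        long oldestAllowed = slotStart - (maxSlots - 1) * slotMillis;
        while (slots.peekFirst().startMillis < oldestAllowed) {
            slots.removeFirst();
        }
    }
}
```

With this shape, the hour window would be new MovingWindowSketch(5 * 60 * 1000L, 12) and the day window new MovingWindowSketch(60 * 60 * 1000L, 24).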

A simple strategy could be to collect this information in the TaskTracker and report it to the JobTracker via TaskTrackerStatus. A subclass could be added to TaskTrackerStatus with fields, say:
tasksSinceStarted, tasksSucceededSinceStarted,
tasksInLastHour, tasksSucceededInLastHour,
tasksInLastDay, tasksSucceededInLastDay

To keep the heartbeat small, we need not send the above fields with every heartbeat. They could be reported only at a certain interval (typically the minimum slot size, 5 minutes in the example above).
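
One possible way to do the interval check, as a minimal sketch. HeartbeatStatsThrottle and shouldAttachStats are hypothetical names, not part of the TaskTracker heartbeat code, and the caller is assumed to pass in a non-negative current time in milliseconds.

```java
/** Hypothetical helper that decides when the stats ride along with a heartbeat. */
class HeartbeatStatsThrottle {
    private final long reportIntervalMillis; // e.g. the minimum slot size
    private long lastReportMillis = -1;      // -1 means nothing has been sent yet

    HeartbeatStatsThrottle(long reportIntervalMillis) {
        this.reportIntervalMillis = reportIntervalMillis;
    }

    /**
     * Returns true if the stats should be attached to the heartbeat sent at
     * nowMillis, and remembers that time as the last report.
     */
    boolean shouldAttachStats(long nowMillis) {
        boolean due = lastReportMillis < 0
                || nowMillis - lastReportMillis >= reportIntervalMillis;
        if (due) {
            lastReportMillis = nowMillis;
        }
        return due;
    }
}
```

The first heartbeat always carries the stats, and later ones carry them only once a full interval has passed since the last report.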

An alternative would be to compute all this in the JobTracker. My vote goes to doing it in the TaskTracker, as this mostly concerns the individual TaskTracker and doesn't need any global information.

Thoughts?


> Collect information about number of tasks succeeded / total per time unit for a tasktracker. 
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5931
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5931
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Hemanth Yamijala
>
> Collecting information of number of tasks succeeded / total per tasktracker and being able to see these counts per hour, day and since start time will help reason about things like the blacklisting strategy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.