Posted to mapreduce-issues@hadoop.apache.org by "Dick King (JIRA)" <ji...@apache.org> on 2010/08/27 23:05:54 UTC

[jira] Created: (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds
-----------------------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-2037
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2037
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Dick King
            Assignee: Dick King
             Fix For: 0.22.0


We would like to capture the following information at certain progress thresholds as a task runs:

   * Time taken so far
   * CPU load [either at the time the data are taken, or exponentially smoothed]
   * Memory load [also either at the time the data are taken, or exponentially smoothed]

These measurements would be taken at intervals that depend on the task's progress plateaus.  For example, reducers have three progress ranges -- [0, 1/3], (1/3, 2/3], and (2/3, 1] -- in which fundamentally different activities happen.  Mappers, I understand, have different boundaries that are not symmetrically placed.  Data capture boundaries should coincide with these activity boundaries.  For the state information [CPU and memory], we should average over the covered interval.
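
As a rough illustration of the capture scheme described above, here is a minimal Python sketch. It is not Hadoop code: the class name ThresholdRecorder, its method names, and the sample values are all hypothetical, and the thresholds are simply the reducer boundaries mentioned in the text. It emits one snapshot per crossed threshold, with CPU and memory averaged over the interval since the previous snapshot.

```python
class ThresholdRecorder:
    """Hypothetical helper: one snapshot per crossed progress threshold."""

    def __init__(self, thresholds=(1 / 3, 2 / 3, 1.0)):
        self.thresholds = thresholds  # reducer phase boundaries from the text
        self.snapshots = []           # (threshold, elapsed_ms, avg_cpu, avg_mem)
        self._next = 0                # index of the next threshold to cross
        self._reset()

    def _reset(self):
        # Start accumulating a fresh interval.
        self._cpu_sum = 0.0
        self._mem_sum = 0.0
        self._count = 0

    def sample(self, elapsed_ms, progress, cpu_load, mem_bytes):
        """Feed one periodic observation (for example, once per heartbeat)."""
        self._cpu_sum += cpu_load
        self._mem_sum += mem_bytes
        self._count += 1
        avg_cpu = self._cpu_sum / self._count
        avg_mem = self._mem_sum / self._count
        crossed = False
        # One observation may jump past several thresholds at once.
        while (self._next < len(self.thresholds)
               and progress >= self.thresholds[self._next]):
            self.snapshots.append(
                (self.thresholds[self._next], elapsed_ms, avg_cpu, avg_mem))
            self._next += 1
            crossed = True
        if crossed:
            self._reset()


rec = ThresholdRecorder()
# Made-up observations: (elapsed_ms, progress, cpu_load, mem_bytes)
for obs in [(1000, 0.10, 0.5, 100), (2000, 0.40, 1.0, 300),
            (3000, 0.50, 0.2, 200), (4000, 0.70, 0.4, 400),
            (5000, 1.00, 0.9, 500)]:
    rec.sample(*obs)
for snap in rec.snapshots:
    print(snap)
```

Averaging within each interval, rather than reporting the instantaneous value at the boundary, keeps a single noisy reading from standing in for a whole phase.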

This data would flow in with the heartbeats.  It would be placed in the job history as part of the task attempt completion event, so it could be processed by rumen or some similar tool and could drive a benchmark engine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909988#action_12909988 ] 

Dick King commented on MAPREDUCE-2037:
--------------------------------------

Benchmarks to support more realistic validation of putative scheduler improvements would benefit from a gridmix3-like tool that can simulate the CPU usage patterns of the tasks of the emulated jobs.  That includes both the average loads of the various tasks, and also the time variation.  In order to develop this information, we need to capture the CPU usage of each task over time.

Fortunately, on Linux systems there is a way to capture this.  The {{/proc/n/stat}} file, where n is the process ID, appears to contain everything I need.
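
For readers unfamiliar with that file, the following Python sketch shows one way to pull the relevant counters out of a stat line. It is only an illustration, not the plumbing proposed here: the names ProcStat, parse_proc_stat, and read_proc_stat are hypothetical, and the sample line is invented. The field positions (14 utime, 15 stime, 20 num_threads, 23 vsize, 24 rss) follow the Linux proc(5) manual page.

```python
from typing import NamedTuple


class ProcStat(NamedTuple):
    utime_ticks: int   # user-mode CPU time, in clock ticks
    stime_ticks: int   # kernel-mode CPU time, in clock ticks
    num_threads: int
    vsize_bytes: int   # virtual memory size
    rss_pages: int     # resident set size, in pages


def parse_proc_stat(line: str) -> ProcStat:
    """Parse the contents of a /proc/<pid>/stat file."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split on the LAST ')' before tokenising the remainder.
    rest = line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field k sits at rest[k - 3].
    return ProcStat(
        utime_ticks=int(rest[14 - 3]),
        stime_ticks=int(rest[15 - 3]),
        num_threads=int(rest[20 - 3]),
        vsize_bytes=int(rest[23 - 3]),
        rss_pages=int(rest[24 - 3]),
    )


def read_proc_stat(pid: int) -> ProcStat:
    """Read the live counters for a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())


# Invented sample line; note the space inside the command name.
sample = ("4242 (java worker) S 1 4242 4242 0 -1 4194304 1200 0 3 0 "
          "1500 250 0 0 20 0 17 0 98765 2147483648 51200")
print(parse_proc_stat(sample))
```

The tick counts are cumulative, so CPU usage over an interval is the difference between two readings; on Linux the tick rate can be obtained with os.sysconf("SC_CLK_TCK").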

I would plumb this using {{LinuxResourceCalculatorPlugin}} and {{TaskStatus}}.

The information will be placed in the job history files, in the task attempt end records.  It might be stored as an encoded character string of a few dozen characters.


[jira] Commented: (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907007#action_12907007 ] 

Hong Tang commented on MAPREDUCE-2037:
--------------------------------------

-1 on using EWMA to capture CPU usage. It is more useful to track the aggregated CPU tick counter as raw data: we can always calculate an EWMA from that later, but not vice versa. It would also be useful to capture the number of threads included in the calculation, so each entry would look like this: <time, cpu-tick-counter, #threads>. I'd also like to capture the CPU MHz figure for the task tracker, so that I can tell whether we are saturating the CPU.
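
To make the "later, but not vice versa" point concrete, here is a small Python sketch that derives per-interval utilization, and then an EWMA, from raw entries of the form <time, cpu-tick-counter, #threads>. The function names, the sample data, the smoothing factor of 0.5, and the tick rate of 100 ticks per second are all assumptions made for illustration.

```python
def utilization(samples, ticks_per_second=100):
    """Per-interval CPU utilization (1.0 = one core fully busy).

    samples: list of (time_seconds, cumulative_cpu_ticks, num_threads).
    ticks_per_second is assumed to be 100 here; check the real value
    on the target system.
    """
    result = []
    for (t0, c0, _), (t1, c1, _) in zip(samples, samples[1:]):
        cpu_seconds = (c1 - c0) / ticks_per_second
        result.append(cpu_seconds / (t1 - t0))
    return result


def ewma(values, alpha=0.5):
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed = []
    current = None
    for v in values:
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Made-up raw entries: (time in seconds, cumulative CPU ticks, thread count)
raw = [(0, 0, 1), (10, 500, 1), (20, 1500, 2), (30, 1750, 1)]
per_interval = utilization(raw)
print(per_interval)        # utilization for each of the three intervals
print(ewma(per_interval))  # one possible smoothing, chosen after the fact
```

Because the raw counters are kept, the smoothing factor can be changed, or a different statistic computed, without rerunning the job. The smoothed series alone would not allow the counters to be recovered.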


[jira] Commented: (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906680#action_12906680 ] 

Vinod K V commented on MAPREDUCE-2037:
--------------------------------------

I didn't realize this before, but MAPREDUCE-220 captures the CPU/memory load at the time of task completion. So the core functionality is already there in trunk.

But the load at the time of task completion isn't really a useful stat. +1 for either exponential smoothing or a simpler capture of the highest, lowest, and average loads for CPU and memory.
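
The simpler alternative needs only a few running values per metric. A minimal Python sketch, with the hypothetical class name LoadSummary and made-up load values:

```python
class LoadSummary:
    """Hypothetical running highest/lowest/average tracker for one metric."""

    def __init__(self):
        self.lowest = float("inf")
        self.highest = float("-inf")
        self.total = 0.0
        self.count = 0

    def add(self, load):
        # Update the extremes and the running sum in constant space.
        self.lowest = min(self.lowest, load)
        self.highest = max(self.highest, load)
        self.total += load
        self.count += 1

    @property
    def average(self):
        # Report 0.0 when nothing has been recorded yet.
        return self.total / self.count if self.count else 0.0


cpu = LoadSummary()
for load in (0.25, 1.0, 0.5):
    cpu.add(load)
print(cpu.lowest, cpu.highest, cpu.average)
```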


[jira] Commented: (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910397#action_12910397 ] 

Dick King commented on MAPREDUCE-2037:
--------------------------------------

Worker tasks seldom have multiple threads.  Streaming and its friends spawn a task, and of course users can write whatever code they want, but most tasks burn their CPU time in their sole thread.

Of course, when we do have streaming we need to capture the info from the slave task...

