Posted to dev@hive.apache.org by "Kevin Wilfong (Created) (JIRA)" <ji...@apache.org> on 2011/10/04 19:11:39 UTC

[jira] [Created] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats object.

Log more Hadoop task counter values in the MapRedStats object.
--------------------------------------------------------------

                 Key: HIVE-2479
                 URL: https://issues.apache.org/jira/browse/HIVE-2479
             Project: Hive
          Issue Type: Improvement
            Reporter: Kevin Wilfong
            Assignee: Kevin Wilfong


We should log more of the Hadoop task tracker counters in the MapRedStats object, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:

    MAP_SPILL_CPU,
    MAP_SPILL_WALLCLOCK,
    MAP_SPILL_NUMBER,
    MAP_SPILL_BYTES,
    MAP_MEM_SORT_CPU,
    MAP_MEM_SORT_WALLCLOCK,
    MAP_MERGE_CPU,
    MAP_MERGE_WALLCLOCK,
    REDUCE_SHUFFLE_BYTES,
    REDUCE_COPY_WALLCLOCK,
    REDUCE_COPY_CPU,
    REDUCE_SORT_WALLCLOCK,
    REDUCE_SORT_CPU,
    MAP_TASK_WALLCLOCK,
    REDUCE_TASK_WALLCLOCK,
    MAP_INPUT_BYTES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Attachment: HIVE-2479.4.patch.txt
    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt, HIVE-2479.4.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.
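One way to picture the "configurable counters" idea is a comma-separated setting that lists the counter names to collect. The sketch below is only an illustration under that assumption: the class CounterConfigSketch and the method parseCounterNames are hypothetical, and the thread does not say how the patch actually reads its configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the patch may read its configuration differently.
class CounterConfigSketch {

  // Splits a comma-separated setting into trimmed, non-empty counter names.
  static List<String> parseCounterNames(String setting) {
    List<String> names = new ArrayList<>();
    if (setting == null) {
      return names; // nothing configured
    }
    for (String part : setting.split(",")) {
      String name = part.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Counter names taken from the list in the original issue description.
    System.out.println(parseCounterNames("MAP_SPILL_BYTES, REDUCE_SHUFFLE_BYTES,"));
  }
}
```

With a list like this, a deployment on a Hadoop version that lacks some counter could simply leave that name out of the setting.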


        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Description: 
We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:

    MAP_SPILL_CPU,
    MAP_SPILL_WALLCLOCK,
    MAP_SPILL_NUMBER,
    MAP_SPILL_BYTES,
    MAP_MEM_SORT_CPU,
    MAP_MEM_SORT_WALLCLOCK,
    MAP_MERGE_CPU,
    MAP_MERGE_WALLCLOCK,
    REDUCE_SHUFFLE_BYTES,
    REDUCE_COPY_WALLCLOCK,
    REDUCE_COPY_CPU,
    REDUCE_SORT_WALLCLOCK,
    REDUCE_SORT_CPU,
    MAP_TASK_WALLCLOCK,
    REDUCE_TASK_WALLCLOCK,
    MAP_INPUT_BYTES

  was:
We should log more of the Hadoop task tracker counters in the MapRedStats object, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:

    MAP_SPILL_CPU,
    MAP_SPILL_WALLCLOCK,
    MAP_SPILL_NUMBER,
    MAP_SPILL_BYTES,
    MAP_MEM_SORT_CPU,
    MAP_MEM_SORT_WALLCLOCK,
    MAP_MERGE_CPU,
    MAP_MERGE_WALLCLOCK,
    REDUCE_SHUFFLE_BYTES,
    REDUCE_COPY_WALLCLOCK,
    REDUCE_COPY_CPU,
    REDUCE_SORT_WALLCLOCK,
    REDUCE_SORT_CPU,
    MAP_TASK_WALLCLOCK,
    REDUCE_TASK_WALLCLOCK,
    MAP_INPUT_BYTES

        Summary: Log more Hadoop task counter values in the MapRedStats class.  (was: Log more Hadoop task counter values in the MapRedStats object.)
    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Resolved] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "He Yongqiang (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang resolved HIVE-2479.
--------------------------------

    Resolution: Fixed

Committed, thanks Kevin Wilfong!
                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt, HIVE-2479.4.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "He Yongqiang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120302#comment-13120302 ] 

He Yongqiang commented on HIVE-2479:
------------------------------------

Let's put all job counters into a map in MapRedStats (not just the ones listed above), so that a hook can do anything it wants with them.
                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Description: 
We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.

We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.

  was:
We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:

    MAP_SPILL_CPU,
    MAP_SPILL_WALLCLOCK,
    MAP_SPILL_NUMBER,
    MAP_SPILL_BYTES,
    MAP_MEM_SORT_CPU,
    MAP_MEM_SORT_WALLCLOCK,
    MAP_MERGE_CPU,
    MAP_MERGE_WALLCLOCK,
    REDUCE_SHUFFLE_BYTES,
    REDUCE_COPY_WALLCLOCK,
    REDUCE_COPY_CPU,
    REDUCE_SORT_WALLCLOCK,
    REDUCE_SORT_CPU,
    MAP_TASK_WALLCLOCK,
    REDUCE_TASK_WALLCLOCK,
    MAP_INPUT_BYTES

    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120303#comment-13120303 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/#review2316
-----------------------------------------------------------



trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
<https://reviews.apache.org/r/2167/#comment5322>

    Don't put the counter names here; let's use a map and pass it to the hook.


- Yongqiang


On 2011-10-04 17:27:43, Kevin Wilfong wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2167/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-04 17:27:43)
bq.  
bq.  
bq.  Review request for hive, Ramkumar Vadali and Yongqiang He.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.
bq.  
bq.  I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is unchanged; in particular, the getter returns the value of a counter if it was set, and -1 otherwise.
bq.  
bq.  
bq.  This addresses bug Hive-2479.
bq.      https://issues.apache.org/jira/browse/Hive-2479
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 
bq.  
bq.  Diff: https://reviews.apache.org/r/2167/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I ran some queries to verify the counters were being populated.
bq.  
bq.  I also ran a few of the unit test queries to verify I hadn't broken anything.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120651#comment-13120651 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/#review2335
-----------------------------------------------------------



trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
<https://reviews.apache.org/r/2167/#comment5396>

    Remove the enum; either put all the counter objects into this MapRedStats, or put a map<counterGrpAndName, counterInst> here.
    
    For simplicity, we can just put the counters object here and hooks can do anything they want with it.
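The map suggested in this comment could look roughly like the following. This is a minimal sketch, not the committed code: the class CounterMapSketch, the method flatten, and the "group::name" key format are hypothetical, a nested java.util.Map stands in for Hadoop's counters object so that the example compiles on its own, and it stores counter values rather than counter instances.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the nested map stands in for Hadoop's counters object.
class CounterMapSketch {

  // Flattens group -> (name -> value) into one map keyed by "group::name".
  static Map<String, Long> flatten(Map<String, Map<String, Long>> groups) {
    Map<String, Long> flat = new LinkedHashMap<>();
    if (groups == null) {
      return flat; // no counters were reported
    }
    for (Map.Entry<String, Map<String, Long>> group : groups.entrySet()) {
      for (Map.Entry<String, Long> counter : group.getValue().entrySet()) {
        flat.put(group.getKey() + "::" + counter.getKey(), counter.getValue());
      }
    }
    return flat;
  }

  public static void main(String[] args) {
    Map<String, Long> task = new LinkedHashMap<>();
    task.put("MAP_INPUT_BYTES", 2048L);
    Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
    groups.put("ExampleGroup", task); // group name is made up for the example
    System.out.println(flatten(groups));
  }
}
```

Because every reported counter ends up in the map, a hook does not depend on a fixed list of names compiled into MapRedStats.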


- Yongqiang


On 2011-10-04 22:58:24, Kevin Wilfong wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2167/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-04 22:58:24)
bq.  
bq.  
bq.  Review request for hive, Ramkumar Vadali and Yongqiang He.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.
bq.  
bq.  I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is unchanged; in particular, the getter returns the value of a counter if it was set, and -1 otherwise.
bq.  
bq.  
bq.  This addresses bug Hive-2479.
bq.      https://issues.apache.org/jira/browse/Hive-2479
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612 
bq.    trunk/conf/hive-default.xml 1178612 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 
bq.  
bq.  Diff: https://reviews.apache.org/r/2167/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I ran some queries to verify the counters were being populated.
bq.  
bq.  I also ran a few of the unit test queries to verify I hadn't broken anything.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "He Yongqiang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121302#comment-13121302 ] 

He Yongqiang commented on HIVE-2479:
------------------------------------

Looks good!

One small comment; also, can you rebase?


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121393#comment-13121393 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-05 19:38:10.946692)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Changes
-------

Nice catch Yongqiang, I'm sorry I let that slip through.  I rebased and made the change.


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is unchanged; in particular, the getter returns the value of a counter if it was set, and -1 otherwise.
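As a rough illustration of the design described in this summary, here is a minimal, self-contained sketch. It is not the code from the patch: the class MapRedStatsSketch, the enum TaskCounter (abbreviated to three of the counters named in the issue), and the method names are hypothetical. Only the behaviour comes from the description: a map from enum values to counter values, with a getter that returns -1 for a counter that was never set.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for MapRedStats; not the class from the patch.
class MapRedStatsSketch {

  // Abbreviated subset of the counters named in the issue.
  enum TaskCounter { MAP_SPILL_BYTES, REDUCE_SHUFFLE_BYTES, MAP_INPUT_BYTES }

  // Holds a value only for counters that were actually set.
  private final Map<TaskCounter, Long> values = new EnumMap<>(TaskCounter.class);

  void setCounter(TaskCounter counter, long value) {
    values.put(counter, value);
  }

  // Returns the counter value if it was set, otherwise -1.
  long getCounter(TaskCounter counter) {
    Long value = values.get(counter);
    return value == null ? -1L : value;
  }

  public static void main(String[] args) {
    MapRedStatsSketch stats = new MapRedStatsSketch();
    stats.setCounter(TaskCounter.MAP_INPUT_BYTES, 1024L);
    // Loop over the enum, as the summary says HadoopJobExecHelper does.
    for (TaskCounter c : TaskCounter.values()) {
      System.out.println(c + " = " + stats.getCounter(c));
    }
  }
}
```

Keeping the values in one map means that adding a counter only requires a new enum constant rather than a new field with its own getter and setter.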


This addresses bug Hive-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs (updated)
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1179378 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1179378 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt, HIVE-2479.4.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Attachment: HIVE-2479.3.patch.txt
    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120300#comment-13120300 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-04 17:27:43.657392)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is unchanged; in particular, the getter returns the value of a counter if it was set, and -1 otherwise.


This addresses bug Hive-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121304#comment-13121304 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/#review2355
-----------------------------------------------------------



trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
<https://reviews.apache.org/r/2167/#comment5419>

    We may get an NPE here, so use code like this:
    
    Counter ctr = null;
    if (counters != null) {
      ctr = counters.findCounter(...);
      if (ctr != null) {
        ......
      }
    }
    Or put this in a common method.
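The "common method" variant could be sketched as follows. This is an illustration only: CounterSketch and CountersSketch are hypothetical stand-ins for Hadoop's counter types so that the example compiles without Hadoop on the classpath, and the helper name valueOrMinusOne and the -1 fallback are assumptions chosen to match the getter behaviour described in the review request.

```java
// Hypothetical stand-ins for Hadoop's counter types.
interface CounterSketch {
  long getValue();
}

interface CountersSketch {
  // May return null when the counter is unknown.
  CounterSketch findCounter(String group, String name);
}

class CounterLookup {

  // Returns the counter value, or -1 if the counters object or the
  // requested counter is missing, instead of throwing an NPE.
  static long valueOrMinusOne(CountersSketch counters, String group, String name) {
    if (counters == null) {
      return -1L;
    }
    CounterSketch ctr = counters.findCounter(group, name);
    return ctr == null ? -1L : ctr.getValue();
  }

  public static void main(String[] args) {
    // A counters object that knows exactly one counter.
    CountersSketch counters = (group, name) -> {
      if ("MAP_INPUT_BYTES".equals(name)) {
        return () -> 512L;
      }
      return null;
    };
    System.out.println(valueOrMinusOne(counters, "ExampleGroup", "MAP_INPUT_BYTES"));
    System.out.println(valueOrMinusOne(counters, "ExampleGroup", "NO_SUCH_COUNTER"));
    System.out.println(valueOrMinusOne(null, "ExampleGroup", "MAP_INPUT_BYTES"));
  }
}
```

Putting both null checks in one helper keeps each call site to a single line.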
    
    


- Yongqiang


On 2011-10-05 17:58:04, Kevin Wilfong wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2167/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-05 17:58:04)
bq.  
bq.  
bq.  Review request for hive, Ramkumar Vadali and Yongqiang He.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.
bq.  
bq.  I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is unchanged; in particular, the getter returns the value of a counter if it was set, and -1 otherwise.
bq.  
bq.  
bq.  This addresses bug Hive-2479.
bq.      https://issues.apache.org/jira/browse/Hive-2479
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 
bq.  
bq.  Diff: https://reviews.apache.org/r/2167/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I ran some queries to verify the counters were being populated.
bq.  
bq.  I also ran a few of the unit test queries to verify I hadn't broken anything.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120311#comment-13120311 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-04 17:38:18.076011)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Changes
-------

I added everything from Task$Counter except CPU_MILLISECONDS because currently that receives special treatment in HadoopJobExecHelper.


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum with all of the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is the same.  In particular, the getter returns the value of a counter if it was set, and -1 otherwise.
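
The enum-keyed map and the getter that falls back to -1 can be sketched roughly as follows. This is an illustration of the idea in the summary, not the code from the patch: the class name TaskCounterStats, the method names, and the three-value enum are assumptions chosen for brevity.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the counter-holding part of MapRedStats.
final class TaskCounterStats {
  // Illustrative subset of the counters listed in the issue description.
  enum TaskCounter { MAP_INPUT_BYTES, MAP_SPILL_BYTES, REDUCE_SHUFFLE_BYTES }

  // Only counters that were actually set get an entry in the map.
  private final Map<TaskCounter, Long> values = new EnumMap<>(TaskCounter.class);

  void setCounterValue(TaskCounter counter, long value) {
    values.put(counter, value);
  }

  // Returns the stored value, or -1 if the counter was never set.
  long getCounterValue(TaskCounter counter) {
    Long stored = values.get(counter);
    return stored != null ? stored : -1L;
  }
}
```

A collector in the style of HadoopJobExecHelper could then loop over TaskCounter.values() and call setCounterValue only for the counters it manages to read.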


This addresses bug Hive-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs (updated)
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121286#comment-13121286 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-05 17:58:04.685771)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Changes
-------

Thanks Yongqiang, that's a way better idea.  I implemented it, again leaving cpuMsec because of the special logic for it in HadoopJobExecHelper, but converting all other counters to use it.


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum with all of the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is the same.  In particular, the getter returns the value of a counter if it was set, and -1 otherwise.


This addresses bug Hive-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs (updated)
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121714#comment-13121714 ] 

Hudson commented on HIVE-2479:
------------------------------

Integrated in Hive-trunk-h0.21 #995 (See [https://builds.apache.org/job/Hive-trunk-h0.21/995/])
    HIVE-2479: Log more Hadoop task counter values in the MapRedStats class (Kevin Wilfong via He Yongqiang)

heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179493
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java

                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt, HIVE-2479.4.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.


        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Attachment: HIVE-2479.2.patch.txt
    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Updated] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "Kevin Wilfong (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-2479:
--------------------------------

    Attachment: HIVE-2479.1.patch.txt
    
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120563#comment-13120563 ] 

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-04 22:58:24.529007)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Changes
-------

MapRedStats provides an enum which contains just those counters that were being used before this diff.  I also made it take a generic enum, and made the task counter enum used by HadoopJobExecHelper configurable.  This way, if one Hadoop version defines more counters than another, the counters to be logged can be changed by creating a new enum.
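
One way a configurable counter enum could be resolved at runtime is sketched below, assuming the configuration supplies the enum's fully qualified class name. The class ConfigurableCounters, the method counterNames, and the DefaultTaskCounter enum are hypothetical; the actual configuration property and loading code in the patch are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class ConfigurableCounters {
  private ConfigurableCounters() {}

  // Hypothetical default set; a different Hadoop version could ship its own enum.
  enum DefaultTaskCounter { MAP_INPUT_BYTES, REDUCE_SHUFFLE_BYTES }

  // Loads the enum class named in the configuration and returns the names
  // of its constants, which are the counters the collector should look up.
  static List<String> counterNames(String enumClassName) {
    Class<?> cls;
    try {
      cls = Class.forName(enumClassName);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException("Unknown counter enum: " + enumClassName, e);
    }
    if (!cls.isEnum()) {
      throw new IllegalArgumentException(enumClassName + " is not an enum");
    }
    List<String> names = new ArrayList<>();
    for (Object constant : cls.getEnumConstants()) {
      names.add(((Enum<?>) constant).name());
    }
    return names;
  }
}
```

Working with counter names rather than a fixed enum type is what lets the same collection loop serve whichever enum the configuration points at.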


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum with all of the task counters we want to collect; it is a subset of the enum in Task$Counter, which is unavailable because Task is package private.  MapRedStats also contains a map from the enum values to the values of the counters that were set.  HadoopJobExecHelper loops over the enum values and tries to get a value for each counter.  As long as the new getter and setter methods are used, the functionality is the same.  In particular, the getter returns the value of a counter if it was set, and -1 otherwise.


This addresses bug Hive-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs (updated)
-----

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612 
  trunk/conf/hive-default.xml 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.  Specifically these are the counters we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES


        

[jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.

Posted by "He Yongqiang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121527#comment-13121527 ] 

He Yongqiang commented on HIVE-2479:
------------------------------------

+1, will commit after tests pass.
                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2479.1.patch.txt, HIVE-2479.2.patch.txt, HIVE-2479.3.patch.txt, HIVE-2479.4.patch.txt
>
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.
> We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.
