Posted to common-dev@hadoop.apache.org by "Albert Chern (JIRA)" <ji...@apache.org> on 2007/03/01 13:24:51 UTC

[jira] Commented: (HADOOP-492) Global counters

    [ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476907 ] 

Albert Chern commented on HADOOP-492:
-------------------------------------

I played around with this a bit today and I have a few questions:

1) Why does the method to increment a counter take an enum whereas the method to read the value takes a String?  Wouldn't it be more convenient if Counters.getCounter() also took an enum?

2) As a test, I created an enum with the value MY_COUNTER and placed a call to reporter.incrCounter(MY_COUNTER, 1) at the very beginning of map().  Surprisingly, the final value was slightly less than MapTask's INPUT_RECORDS (120925196 vs. 120926095, a difference of 899).  Am I missing something here, or is this potentially a bug?
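
To make question 1 concrete, here is a minimal sketch of what an enum-taking read method could look like. It is not the Hadoop Counters class: EnumCounters, MyCounter and nameOf() are hypothetical names, and the rule for deriving a String name from an enum (declaring class name plus constant name) is only an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a counters container; NOT the real Hadoop Counters class.
class EnumCounters {

    // Example user-defined counter enum, as in the test described above.
    enum MyCounter { MY_COUNTER }

    // Values are stored under a String name, as a String-based read API implies.
    private final Map<String, Long> values = new HashMap<>();

    // Assumed naming rule: declaring class name + "." + constant name.
    static String nameOf(Enum<?> key) {
        return key.getDeclaringClass().getName() + "." + key.name();
    }

    // Increment by enum, mirroring the shape of reporter.incrCounter(MY_COUNTER, 1).
    void incrCounter(Enum<?> key, long amount) {
        values.merge(nameOf(key), amount, Long::sum);
    }

    // Existing style of read: by String name. Unknown names read as 0.
    long getCounter(String name) {
        return values.getOrDefault(name, 0L);
    }

    // The overload asked about: read by enum, delegating to the String version.
    long getCounter(Enum<?> key) {
        return getCounter(nameOf(key));
    }
}
```

With an overload like this, the caller would increment and read with the same enum constant and would never need to know how the String name is built.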

> Global counters
> ---------------
>
>                 Key: HADOOP-492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>         Assigned To: David Bowen
>             Fix For: 0.12.0
>
>         Attachments: counters1.patch, counters2.patch, counters3.patch
>
>
> It would be nice to have a map/reduce job keep aggregated counts for arbitrary events occurring in its tasks -- the number of records processed, the number of exceptions of a specific type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to the jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so that the jobtracker stores the latest values from each task and aggregates them on request.  It should also make the aggregated values available at the job end.  The value for a task would be flushed when the task fails.
> #491 and #490 may be related to this one.
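
The aggregation scheme in the description above can be sketched roughly as follows. This is an illustration only, not the code in the attached patches: CounterAggregator and its method names are hypothetical, task ids are plain strings, and "flushed when the task fails" is assumed here to mean that the failed task's values are discarded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical jobtracker-side aggregator; all names are illustrative assumptions.
class CounterAggregator {

    // task id -> (counter name -> latest value reported by that task)
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    // A task periodically reports its current <name, value> pairs.
    // The new snapshot replaces the previous one; it is not added to it.
    void report(String taskId, Map<String, Long> counters) {
        latestByTask.put(taskId, new HashMap<>(counters));
    }

    // Assumption: a failed task's values are dropped from the totals.
    void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    // Aggregate on request: sum the latest value of each counter across tasks.
    Map<String, Long> aggregate() {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> snapshot : latestByTask.values()) {
            for (Map.Entry<String, Long> e : snapshot.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Keeping only the latest snapshot per task means that a repeated or delayed report cannot double-count, and calling aggregate() at job end yields the final totals.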
