Posted to common-dev@hadoop.apache.org by "David Bowen (JIRA)" <ji...@apache.org> on 2007/03/01 05:00:50 UTC

[jira] Updated: (HADOOP-1041) Counter names are ugly

     [ https://issues.apache.org/jira/browse/HADOOP-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Bowen updated HADOOP-1041:
--------------------------------

    Attachment: 1041.patch


This patch implements the grouped display of counters.  The individual counters are in the order that they were declared in the originating Enum class, so the developer has full control over the ordering.

Optionally, a ResourceBundle may be provided.  (I had coded this before people started to vote that it wasn't necessary.)  The resource bundle goes in the same package directory as the class containing the enum, and must have the name <class name>_<enum name>.properties.  E.g. for the MapReduce counters, I created an enum called Counter in Task.java, so the bundle is Task_Counter.properties.  See that file for how to customize both the group name and the counter names.
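To illustrate the naming rule above, here is a minimal sketch (not the code from the patch) of how a display name could be looked up. It assumes that replacing the '$' in the nested enum's binary class name with '_' yields the bundle base name, and that each counter's label is stored under a key of the form <ENUM_CONSTANT>.name. The key scheme, the CounterNames helper, and the stand-in Task class are all hypothetical; the real format is whatever Task_Counter.properties in the patch uses.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical stand-in for a class that declares a counter enum.
final class Task {
    enum Counter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }
}

final class CounterNames {
    private CounterNames() {}

    // "pkg.Task$Counter" becomes "pkg.Task_Counter", i.e. the file
    // Task_Counter.properties in the same package directory as Task.
    static String bundleBaseName(Class<?> enumClass) {
        return enumClass.getName().replace('$', '_');
    }

    // Returns the bundle's label for the counter, or the raw enum name
    // when no bundle (or no matching key) is available.
    static String displayName(Enum<?> counter) {
        Class<?> enumClass = counter.getDeclaringClass();
        ClassLoader loader = enumClass.getClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            ResourceBundle bundle = ResourceBundle.getBundle(
                    bundleBaseName(enumClass), Locale.getDefault(), loader);
            // Assumed key scheme: "<ENUM_CONSTANT>.name"
            return bundle.getString(counter.name() + ".name");
        } catch (MissingResourceException e) {
            return counter.name();
        }
    }
}
```

Because the bundle is optional, the lookup falls back to the enum constant's own name, so an enum without a properties file still displays something reasonable.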

There are also some changes to address HADOOP-1048.  Now the task counters are not summed every time a task status gets updated.  Instead, they are summed when someone - either a client, a JSP page, or a callback from the metrics package - requests them.  I changed JobSubmissionProtocol to allow fetching the counters on demand.  I bumped up its version, which I guess I should also have done in the previous patch, in which the Counters object was included in the JobStatus.  (I could have kept it there, but it seemed a bit inconsistent now that the counters are computed on demand, since everything else in the JobStatus is kept up-to-date.)
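The idea of summing on demand rather than on every status update can be sketched as follows. This is an illustration only, not the JobTracker code: the LazyJobCounters class, its method names, and the use of plain string keys are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "store on update, sum on request".
final class LazyJobCounters {
    // Latest counter snapshot reported by each task, keyed by task id.
    private final Map<String, Map<String, Long>> perTask = new HashMap<>();

    // Called for every task status update: keeps the snapshot, does no summing.
    synchronized void updateTask(String taskId, Map<String, Long> counters) {
        perTask.put(taskId, new HashMap<>(counters));
    }

    // Called only when a client, a JSP page, or a metrics callback asks for totals.
    synchronized Map<String, Long> getCounters() {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> snapshot : perTask.values()) {
            for (Map.Entry<String, Long> e : snapshot.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

With this shape, the cost of a status update no longer depends on the number of tasks in the job; the full pass over all tasks happens only when totals are actually requested.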

A related JobTracker efficiency change was to stop updating the job metrics (via the metrics package) every time a task update is received.  Instead this is now only done when the metrics timer-based callback occurs (see JobTrackerMetrics.doUpdates()).  This means that this callback needs to get a list of the running jobs - I think I implemented that with the correct locking in the new method getRunningJobs, but someone might want to double check.

Changes affecting TaskTracker:

The incrementCounter method should now be somewhat more efficient because it doesn't (normally) involve any String or Long construction.  The Counters object now holds a map of maps, where the enum class name is the key into the outer map.  (Previously it was just a single map, so the key had to be constructed by string concatenation.)  The serialized form of Counter is a bit more concise, since the enum class name is only written once.
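A minimal sketch of the map-of-maps layout, under stated assumptions: the GroupedCounters class, the one-element long[] used as a mutable cell (to avoid boxing a new Long per increment), and the DemoCounter enum are all hypothetical and are not taken from the patch. The describe method also shows how iterating over the enum's constants yields the declaration order mentioned at the top of this comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical user-defined counter enum.
enum DemoCounter { BYTES_WRITTEN, RECORDS_WRITTEN }

final class GroupedCounters {
    // Outer key: enum class name (the group). Inner key: the enum constant.
    // The long[1] is a mutable cell, so incrementing allocates nothing.
    private final Map<String, Map<Enum<?>, long[]>> groups = new HashMap<>();

    void increment(Enum<?> key, long amount) {
        // The group name is looked up, not built by concatenating strings.
        String groupName = key.getDeclaringClass().getName();
        Map<Enum<?>, long[]> group = groups.get(groupName);
        if (group == null) {
            group = new HashMap<>();
            groups.put(groupName, group);
        }
        long[] cell = group.get(key);
        if (cell == null) {
            cell = new long[1];
            group.put(key, cell);
        }
        cell[0] += amount;
    }

    long get(Enum<?> key) {
        Map<Enum<?>, long[]> group = groups.get(key.getDeclaringClass().getName());
        long[] cell = (group == null) ? null : group.get(key);
        return (cell == null) ? 0L : cell[0];
    }

    // Lists one group's counters in the order the enum declares them.
    <E extends Enum<E>> List<String> describe(Class<E> enumClass) {
        List<String> lines = new ArrayList<>();
        for (E constant : enumClass.getEnumConstants()) {
            lines.add(constant.name() + "=" + get(constant));
        }
        return lines;
    }
}
```

Allocation happens only the first time a group or counter is seen, which matches the "(normally)" qualifier above.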

I didn't change the PROGRESS_INTERVAL (1 second) at which MapTasks report their progress to the TaskTracker, because I don't think it is relevant to HADOOP-1048, which is a JobTracker problem.

JSP stuff:

There is a change that was requested in HADOOP-1038, namely to show both the map and reduce phase counters on the main jobdetails page.

I added a refresh parameter to jobdetails so that a refresh time in seconds can be specified (it was refreshing every 60 seconds).  I changed the jobtracker page to use the refresh parameter so that running jobs by default get refreshed every 10 seconds, but completed and failed jobs don't get refreshed.

The counts are now right-justified and decimal-formatted.






> Counter names are ugly
> ----------------------
>
>                 Key: HADOOP-1041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1041
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>         Assigned To: David Bowen
>             Fix For: 0.12.0
>
>         Attachments: 1041.patch
>
>
> Having the complete class name in the counter names makes them unique, but they are ugly to present to non-developers. It would be nice to have some way to have a nicer string presented to the user. Currently, the Enum is converted to a name like:
> key.getDeclaringClass().getName() + "#" + key.toString()
> which gives counter names like "org.apache.hadoop.examples.RandomWriter$Counters#BYTES_WRITTEN"
> which is unique, but not very user friendly. Perhaps, we should strip off the class name for presenting to the users, which would allow them to make nice names. In particular, you could define an enum type that overloaded toString to print a nice user friendly string.
> Thoughts?
