Posted to common-dev@hadoop.apache.org by "David Bowen (JIRA)" <ji...@apache.org> on 2007/03/23 23:18:32 UTC

[jira] Updated: (HADOOP-1146) "Reduce input records" counter name is misleading

     [ https://issues.apache.org/jira/browse/HADOOP-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Bowen updated HADOOP-1146:
--------------------------------

    Attachment: 1146.patch


This patch:

   1. Renames the counter Reduce Input Records to Reduce Input Groups, since that is what it counts.

   2. Adds a new counter called Reduce Input Records that does count the records.

   3. While testing on WordCount, I noticed that Map Output Records and Reduce Input Records were not the same because of the use of a Combiner, so I added two new counters to show this: Combine Input Records and Combine Output Records.

I'm not sure if we really need these Combine Input/Output record counters.  At the end of the job, they should be the same as Map Output Records and Reduce Input Records respectively, but they are possibly interesting to watch as the job proceeds.
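
To illustrate how the counters relate, here is a minimal sketch in plain Python. It is not the Hadoop API and not the code in 1146.patch; the function run_wordcount and its arguments are hypothetical names chosen for illustration. It assumes a word-count job with one map task per input split and a combiner that runs exactly once per map task, which a real job may not guarantee.

```python
from collections import Counter, defaultdict


def run_wordcount(splits, use_combiner=True):
    """Simulate a word-count job and tally the counters discussed above.

    `splits` is a list of input splits, each a list of text lines.
    Assumption: the combiner runs exactly once per map task.
    """
    counters = Counter()
    reduce_input = []  # (word, count) records that reach the reduce side

    for lines in splits:  # one simulated map task per split
        map_output = [(word, 1) for line in lines for word in line.split()]
        counters["Map Output Records"] += len(map_output)

        if use_combiner:
            # The combiner consumes every map output record...
            counters["Combine Input Records"] += len(map_output)
            partial = defaultdict(int)
            for word, n in map_output:
                partial[word] += n
            combined = list(partial.items())
            # ...and emits one record per distinct word in this map task.
            counters["Combine Output Records"] += len(combined)
            reduce_input.extend(combined)
        else:
            reduce_input.extend(map_output)

    # Reduce side: group records by key, counting records and groups.
    groups = defaultdict(list)
    for word, n in reduce_input:
        groups[word].append(n)
        counters["Reduce Input Records"] += 1
    counters["Reduce Input Groups"] = len(groups)

    totals = {word: sum(values) for word, values in groups.items()}
    return counters, totals


if __name__ == "__main__":
    counters, totals = run_wordcount([["a b a", "b a"], ["a c"]])
    for name in sorted(counters):
        print(name, counters[name])
```

With the combiner enabled in this sketch, Combine Input Records equals Map Output Records and Combine Output Records equals Reduce Input Records, while Reduce Input Groups counts distinct keys only. Calling it with use_combiner=False makes Reduce Input Records equal Map Output Records again.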

Comments welcome.


> "Reduce input records" counter name is misleading
> -------------------------------------------------
>
>                 Key: HADOOP-1146
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1146
>             Project: Hadoop
>          Issue Type: Bug
>            Reporter: David Bowen
>         Assigned To: David Bowen
>         Attachments: 1146.patch
>
>
> It has been pointed out that the counter name "reduce input records" is misleading; this number should be called "reduce input keys" or "reduce input groups".  It could also be useful to have the actual number of reduce input records, which should be the same as the number of map output records.
