Posted to common-commits@hadoop.apache.org by om...@apache.org on 2008/11/21 01:10:48 UTC
svn commit: r719431 - in /hadoop/core/trunk: CHANGES.txt
src/mapred/org/apache/hadoop/mapred/JobConf.java
Author: omalley
Date: Thu Nov 20 16:10:47 2008
New Revision: 719431
URL: http://svn.apache.org/viewvc?rev=719431&view=rev
Log:
HADOOP-4668. Improve documentation for setCombinerClass to clarify the
restrictions on combiners. (omalley)
Modified:
hadoop/core/trunk/CHANGES.txt
hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/JobConf.java
Modified: hadoop/core/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/CHANGES.txt?rev=719431&r1=719430&r2=719431&view=diff
==============================================================================
--- hadoop/core/trunk/CHANGES.txt (original)
+++ hadoop/core/trunk/CHANGES.txt Thu Nov 20 16:10:47 2008
@@ -121,6 +121,9 @@
it down by monitoring for cumulative memory usage across tasks.
(Vinod Kumar Vavilapalli via yhemanth)
+ HADOOP-4668. Improve documentation for setCombinerClass to clarify the
+ restrictions on combiners. (omalley)
+
OPTIMIZATIONS
HADOOP-3293. Fixes FileInputFormat to do provide locations for splits
Modified: hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/JobConf.java
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/JobConf.java?rev=719431&r1=719430&r2=719431&view=diff
==============================================================================
--- hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/JobConf.java (original)
+++ hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/JobConf.java Thu Nov 20 16:10:47 2008
@@ -775,11 +775,20 @@
* Set the user-defined <i>combiner</i> class used to combine map-outputs
* before being sent to the reducers.
*
- * <p>The combiner is a task-level aggregation operation which, in some cases,
- * helps to cut down the amount of data transferred from the {@link Mapper} to
- * the {@link Reducer}, leading to better performance.</p>
- *
- * <p>Typically the combiner is same as the the <code>Reducer</code> for the
+ * <p>The combiner is an application-specified aggregation operation, which
+ * can help cut down the amount of data transferred between the
+ * {@link Mapper} and the {@link Reducer}, leading to better performance.</p>
+ *
+ * <p>The framework may invoke the combiner 0, 1, or multiple times, in both
+ * the mapper and reducer tasks. In general, the combiner is called as the
+ * sort/merge result is written to disk. The combiner must:
+ * <ul>
+ * <li> be side-effect free</li>
+ * <li> have the same input and output key types and the same input and
+ * output value types</li>
+ * </ul></p>
+ *
+ * <p>Typically the combiner is same as the <code>Reducer</code> for the
* job i.e. {@link #setReducerClass(Class)}.</p>
*
* @param theClass the user-defined combiner class used to combine
@@ -1155,7 +1164,7 @@
/**
* Set whether the system should collect profiler information for some of
- * the tasks in this job? The information is stored in the the user log
+ * the tasks in this job? The information is stored in the user log
* directory.
* @param newValue true means it should be gathered
*/
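
Illustration (not part of the commit): the new setCombinerClass Javadoc above says the framework may invoke the combiner 0, 1, or multiple times, and that the combiner must be side-effect free with matching input and output types. The sketch below shows why those restrictions matter. It is plain Java and deliberately does not use the Hadoop API; the class CombinerSketch, its methods combine and runJob, and the chunk-of-two grouping are all hypothetical names and choices made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a combiner; this is NOT the Hadoop API.
final class CombinerSketch {

    // A side-effect-free combiner whose input and output value types match
    // (Integer in, Integer out): it only sums the values it is handed.
    static int combine(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Simulates a framework that applies the combiner `passes` times to the
    // values of a single key (grouping them two at a time, an arbitrary
    // choice for this sketch), then runs the final reduce step.
    static int runJob(List<Integer> mapOutputs, int passes) {
        List<Integer> current = new ArrayList<>(mapOutputs);
        for (int p = 0; p < passes; p++) {
            List<Integer> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += 2) {
                int end = Math.min(i + 2, current.size());
                next.add(combine(current.subList(i, end)));
            }
            current = next;
        }
        // The reducer here is the same function as the combiner.
        return combine(current);
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 1, 1, 1, 1);
        // The final result should not depend on how often the combiner ran.
        for (int passes = 0; passes <= 3; passes++) {
            System.out.println(passes + " combiner passes -> " + runJob(counts, passes));
        }
    }
}
```

Because the output of combine has the same type as its input, it can be fed back into combine any number of times, and because it has no side effects, the final total is the same for every number of passes. A combiner that, for example, incremented an external counter per call would not have this property.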