Posted to common-commits@hadoop.apache.org by cu...@apache.org on 2006/10/04 19:25:11 UTC
svn commit: r452945 - in /lucene/hadoop/trunk: CHANGES.txt
src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java
Author: cutting
Date: Wed Oct 4 10:25:10 2006
New Revision: 452945
URL: http://svn.apache.org/viewvc?view=rev&rev=452945
Log:
HADOOP-343. Fix mapred copying so that a failed tasktracker does not slow other copies. Contributed by Sameer.
Modified:
lucene/hadoop/trunk/CHANGES.txt
lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java
Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?view=diff&rev=452945&r1=452944&r2=452945
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Wed Oct 4 10:25:10 2006
@@ -132,6 +132,9 @@
permits, e.g., TextInputFormat to again operate on non-UTF-8 data.
(Hairong and Mahadev via cutting)
+32. HADOOP-343. Fix mapred copying so that a failed tasktracker
+ doesn't cause other copies to slow. (Sameer Paranjpye via cutting)
+
Release 0.6.2 - 2006-09-18
Modified: lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java?view=diff&rev=452945&r1=452944&r2=452945
==============================================================================
--- lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java (original)
+++ lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java Wed Oct 4 10:25:10 2006
@@ -455,6 +455,20 @@
LOG.warn(reduceTask.getTaskId() + " adding host " +
cr.getHost() + " to penalty box, next contact in " +
((nextContact-currentTime)/1000) + " seconds");
+
+ // other outputs from the failed host may be present in the
+ // knownOutputs cache, purge them. This is important in case
+ // the failure is due to a lost tasktracker (causes many
+ // unnecessary backoffs). If not, we only take a small hit
+ // polling the jobtracker a few more times
+ ListIterator locIt = knownOutputs.listIterator();
+ while (locIt.hasNext()) {
+ MapOutputLocation loc = (MapOutputLocation)locIt.next();
+ if (cr.getHost().equals(loc.getHost())) {
+ locIt.remove();
+ neededOutputs.add(new Integer(loc.getMapId()));
+ }
+ }
}
uniqueHosts.remove(cr.getHost());
numInFlight--;
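
The purge step added in the hunk above can be illustrated in isolation. The following is a minimal sketch, not the committed code: the names PenaltyBoxSketch, Location and purgeHost are hypothetical, Location stands in for MapOutputLocation with only a host and a map id, and neededOutputs is assumed to be a Set of map ids (the diff shows only that it accepts Integer values). It uses generics rather than the raw ListIterator and cast in the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PenaltyBoxSketch {

    /** Hypothetical stand-in for MapOutputLocation: a host plus a map id. */
    static final class Location {
        final String host;
        final int mapId;

        Location(String host, int mapId) {
            this.host = host;
            this.mapId = mapId;
        }
    }

    /**
     * Drops every cached location served by failedHost and puts its map id
     * back into neededOutputs, so that fresh locations are requested later
     * instead of retrying (and backing off on) a host that may be gone.
     * Returns the number of cached locations that were purged.
     */
    static int purgeHost(String failedHost,
                         List<Location> knownOutputs,
                         Set<Integer> neededOutputs) {
        int purged = 0;
        // Remove through the iterator; calling knownOutputs.remove() inside
        // the loop would throw ConcurrentModificationException.
        Iterator<Location> it = knownOutputs.iterator();
        while (it.hasNext()) {
            Location loc = it.next();
            if (failedHost.equals(loc.host)) {
                it.remove();
                neededOutputs.add(loc.mapId);
                purged++;
            }
        }
        return purged;
    }

    public static void main(String[] args) {
        List<Location> known = new ArrayList<>();
        known.add(new Location("hostA", 1));
        known.add(new Location("hostB", 2));
        known.add(new Location("hostA", 3));
        Set<Integer> needed = new TreeSet<>();

        int purged = purgeHost("hostA", known, needed);
        System.out.println("purged=" + purged
            + " remaining=" + known.size()
            + " needed=" + needed);
    }
}
```

As the comment in the patch notes, the trade-off is that if the host has not actually been lost, the only cost is polling the jobtracker a few more times for locations that were still valid.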