Posted to mapreduce-issues@hadoop.apache.org by "Gera Shegalov (JIRA)" <ji...@apache.org> on 2014/02/06 20:14:17 UTC

[jira] [Commented] (MAPREDUCE-5744) Job hangs because RMContainerAllocator$AssignedRequests.preemptReduce() violates the comparator contract

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893678#comment-13893678 ] 

Gera Shegalov commented on MAPREDUCE-5744:
------------------------------------------

We suggest the following change:

{code}
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java b/hadoop-ma
index 176945a..a6e898e 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
@@ -1112,7 +1112,7 @@ void preemptReduce(int toPreempt) {
         public int compare(TaskAttemptId o1, TaskAttemptId o2) {
           float p = getJob().getTask(o1.getTaskId()).getAttempt(o1).getProgress() -
               getJob().getTask(o2.getTaskId()).getAttempt(o2).getProgress();
-          return p >= 0 ? 1 : -1;
+          return (int)p;
         }
       });
{code}
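
One caveat about the cast, as a minimal sketch rather than a tested patch: assuming getProgress() returns a fraction between 0 and 1, the difference p lies strictly between -1 and 1 for almost every pair, so (int)p truncates to 0 and the sort no longer orders attempts by progress. Float.compare is a standard JDK method that returns a consistent sign and returns 0 only for equal values. The class and helper names below are hypothetical and stand in for the anonymous comparator in preemptReduce(); plain floats stand in for attempt progress.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-alone sketch; not part of RMContainerAllocator.
public class ProgressComparatorSketch {

    // Original logic: never returns 0, so two equal progress values
    // each compare as "greater" than the other.
    static int oldCompare(float a, float b) {
        float p = a - b;
        return p >= 0 ? 1 : -1;
    }

    // Logic from the diff above: the cast truncates toward zero, so any
    // difference with magnitude below 1 becomes 0.
    static int castCompare(float a, float b) {
        float p = a - b;
        return (int) p;
    }

    // Alternative: delegate to Float.compare, which is antisymmetric.
    static int floatCompare(float a, float b) {
        return Float.compare(a, b);
    }

    // Sorts a copy of the given progress values in ascending order.
    static List<Float> sortedByProgress(List<Float> progress) {
        List<Float> copy = new ArrayList<Float>(progress);
        Collections.sort(copy, new Comparator<Float>() {
            @Override
            public int compare(Float a, Float b) {
                return floatCompare(a, b);
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        // Equal values: the old comparator reports 1 in both directions.
        System.out.println(oldCompare(0.5f, 0.5f));
        // Clearly different values: the cast still reports "equal".
        System.out.println(castCompare(0.9f, 0.1f));
        // Float.compare reports opposite signs when the arguments swap.
        System.out.println(floatCompare(0.1f, 0.9f) + " " + floatCompare(0.9f, 0.1f));
        System.out.println(sortedByProgress(Arrays.asList(0.7f, 0.2f, 0.2f, 0.9f)));
    }
}
```

If this assumption about the progress range holds, replacing the return statement with a Float.compare call on the two progress values would keep the contract and preserve the intended ordering.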

> Job hangs because RMContainerAllocator$AssignedRequests.preemptReduce() violates the comparator contract
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5744
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5744
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.0.5-alpha
>            Reporter: Sangjin Lee
>            Assignee: Gera Shegalov
>            Priority: Blocker
>
> We ran into a situation where tasks are not getting assigned because RMContainerAllocator$AssignedRequests.preemptReduce() fails repeatedly with the following exception:
> {code}
> 2014-02-06 16:43:45,183 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
>      at java.util.TimSort.mergeLo(TimSort.java:747)
>      at java.util.TimSort.mergeAt(TimSort.java:483)
>      at java.util.TimSort.mergeCollapse(TimSort.java:408)
>      at java.util.TimSort.sort(TimSort.java:214)
>      at java.util.TimSort.sort(TimSort.java:173)
>      at java.util.Arrays.sort(Arrays.java:659)
>      at java.util.Collections.sort(Collections.java:217)
>      at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$AssignedRequests.preemptReduce(RMContainerAllocator.java:1106)
>      at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.preemptReducesIfNeeded(RMContainerAllocator.java:416)
>      at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:230)
>      at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:252)
>      at java.lang.Thread.run(Thread.java:744)
> {code}
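> This happens because the comparator defined in this method violates the Comparator contract when p == 0: compare(o1, o2) and compare(o2, o1) both return 1, whereas the contract requires sgn(compare(x, y)) == -sgn(compare(y, x)).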
> Comparator.compare(): http://docs.oracle.com/javase/7/docs/api/java/util/Comparator.html#compare(T, T)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)