Posted to mapreduce-issues@hadoop.apache.org by "lbkzman (JIRA)" <ji...@apache.org> on 2015/02/05 15:44:35 UTC

[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lbkzman updated MAPREDUCE-6245:
-------------------------------
    Affects Version/s: 2.6.0
         Release Note: 
index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool  {
       r.setSeed(seed);
       LOG.debug("seed: " + seed);
       // shuffle splits
-      for (int i = 0; i < splits.size(); ++i) {
-        InputSplit tmp = splits.get(i);
-        int j = r.nextInt(splits.size());
-        splits.set(i, splits.get(j));
-        splits.set(j, tmp);
-      }
+      Collections.shuffle(splits);      
+
       // our target rate is in terms of the maximum number of sample splits,
       // but we accept the possibility of sampling additional splits to hit
       // the target sample keyset

               Status: Patch Available  (was: Open)
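
For readers comparing the two shuffles in the diff above, here is a minimal, self-contained sketch. It is not part of the patch. The class name ShuffleSketch and both method names are hypothetical, and plain lists stand in for the InputSplit list. The removed loop swaps each position with an index drawn from the whole list, which yields n^n equally likely swap sequences; for n > 2 these cannot map evenly onto the n! permutations, so some orderings come up more often than others. Collections.shuffle avoids that bias. Note that the one-argument Collections.shuffle(splits) in the patch draws from its own source of randomness rather than from the seeded r; if reproducibility from the logged seed matters, the two-argument overload that accepts a Random is one option, shown here as an assumption about intent rather than as the committed fix.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of Hadoop.
public class ShuffleSketch {

  /** Mirrors the loop removed by the patch: swap each slot with any slot. */
  static <T> List<T> naiveShuffle(List<T> input, long seed) {
    List<T> items = new ArrayList<>(input);
    Random r = new Random(seed);
    for (int i = 0; i < items.size(); ++i) {
      T tmp = items.get(i);
      int j = r.nextInt(items.size()); // j ranges over the whole list: biased
      items.set(i, items.get(j));
      items.set(j, tmp);
    }
    return items;
  }

  /** Library shuffle driven by a caller-supplied seed, so runs are repeatable. */
  static <T> List<T> seededShuffle(List<T> input, long seed) {
    List<T> items = new ArrayList<>(input);
    Collections.shuffle(items, new Random(seed));
    return items;
  }

  public static void main(String[] args) {
    List<Integer> splits = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
      splits.add(i);
    }
    System.out.println("naive:  " + naiveShuffle(splits, 42L));
    System.out.println("seeded: " + seededShuffle(splits, 42L));
  }
}
```

Both methods copy their input, so the caller's list is left untouched; the patch itself shuffles the splits list in place.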

> Fixed split shuffling.
> ----------------------
>
>                 Key: MAPREDUCE-6245
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: lbkzman
>