Posted to mapreduce-issues@hadoop.apache.org by "lbkzman (JIRA)" <ji...@apache.org> on 2015/02/05 15:44:35 UTC
[jira] [Updated] (MAPREDUCE-6245) Fixed split shuffling.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lbkzman updated MAPREDUCE-6245:
-------------------------------
Affects Version/s: 2.6.0
Release Note:
index 72b47f2..8b89782 100644
--- src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
+++ src/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java
@@ -203,12 +203,8 @@ public class InputSampler<K,V> extends Configured implements Tool {
       r.setSeed(seed);
       LOG.debug("seed: " + seed);
       // shuffle splits
-      for (int i = 0; i < splits.size(); ++i) {
-        InputSplit tmp = splits.get(i);
-        int j = r.nextInt(splits.size());
-        splits.set(i, splits.get(j));
-        splits.set(j, tmp);
-      }
+      Collections.shuffle(splits);
+
       // our target rate is in terms of the maximum number of sample splits,
       // but we accept the possibility of sampling additional splits to hit
       // the target sample keyset
Status: Patch Available (was: Open)
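
For readers of the patch above, here is a minimal sketch of why the removed loop is worth replacing. It swaps each position with an index drawn from the whole list, so for n elements it has n^n equally likely paths, and for n = 3 those 27 paths cannot spread evenly over the 3! = 6 possible orderings. The class and method names (ShuffleSketch, enumerateNaive, seededShuffle) are hypothetical and are not part of InputSampler. Note also that the one-argument Collections.shuffle(list) uses a default source of randomness, so a seed set on a separate Random would not influence it. The two-argument overload Collections.shuffle(list, rnd) accepts the Random explicitly. Whether the project wants the seeded form here is an assumption on my part, not something the patch states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration class; not part of Hadoop.
class ShuffleSketch {

  // Enumerates every path the removed loop can take for a list of size n
  // (each step i swaps position i with some index j in [0, n)) and counts
  // how often each final ordering is produced.
  static Map<List<Integer>, Integer> enumerateNaive(int n) {
    Map<List<Integer>, Integer> counts = new HashMap<>();
    int paths = 1;
    for (int i = 0; i < n; i++) {
      paths *= n; // n^n possible sequences of j values
    }
    for (int p = 0; p < paths; p++) {
      List<Integer> list = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        list.add(i);
      }
      int code = p; // decode p into the j chosen at each step
      for (int i = 0; i < n; i++) {
        int j = code % n;
        code /= n;
        Collections.swap(list, i, j);
      }
      counts.merge(list, 1, Integer::sum);
    }
    return counts;
  }

  // Shuffles a copy of the input with an explicitly seeded Random, so the
  // same seed always yields the same order.
  static <T> List<T> seededShuffle(List<T> in, long seed) {
    List<T> copy = new ArrayList<>(in);
    Collections.shuffle(copy, new Random(seed));
    return copy;
  }

  public static void main(String[] args) {
    // Path counts per ordering for the removed loop with n = 3.
    System.out.println(enumerateNaive(3));

    // Two shuffles with the same seed produce the same order.
    List<String> splits = Arrays.asList("split0", "split1", "split2", "split3");
    System.out.println(seededShuffle(splits, 42L));
    System.out.println(seededShuffle(splits, 42L));
  }
}
```

If reproducible sampling for a logged seed matters, the seeded overload keeps that property while still using the library shuffle.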
> Fixed split shuffling.
> ----------------------
>
> Key: MAPREDUCE-6245
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6245
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: lbkzman
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)