Posted to commits@accumulo.apache.org by kt...@apache.org on 2022/10/06 14:00:19 UTC

[accumulo-testing] branch main updated: fixes bug with bulk RW file partition point creation (#236)

This is an automated email from the ASF dual-hosted git repository.

kturner pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/accumulo-testing.git


The following commit(s) were added to refs/heads/main by this push:
     new 0f7201f  fixes bug with bulk RW file partition point creation (#236)
0f7201f is described below

commit 0f7201fdf69dcf839ce4066dfaf4d3f4f5a3da2e
Author: Keith Turner <kt...@apache.org>
AuthorDate: Thu Oct 6 15:00:14 2022 +0100

    fixes bug with bulk RW file partition point creation (#236)
    
    The code originally did the following:
    
        TreeSet<Integer> startRows = new TreeSet<>();
        startRows.add(0);
        while (startRows.size() < parts)
          startRows.add(rand.nextInt(LOTS));
    
    Commit 7453c37 replaced the above loop with a stream that did not
    fully capture the loop's behavior. The loop guarantees that `parts`
    unique random numbers are generated, always including zero; if the
    random number generator returns zero or a duplicate, that value is
    deduplicated and another one is drawn. The stream did not handle
    the RNG returning duplicates or zero, so it could produce fewer
    than `parts` start rows. This change makes the stream match the
    loop's behavior.
---
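
The difference described in the commit message can be reproduced outside of Accumulo. The following is a minimal sketch, not code from the repository: the class name `StartRowsSketch`, the method names, and the fixed list of draws are hypothetical, and a `Supplier<Integer>` stands in for `env.getRandom().nextInt(LOTS)` so the result is repeatable. The three method bodies mirror the loop from the commit message and the removed and added lines of the diff.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StartRowsSketch {

  // Original loop: keeps drawing until the set holds `parts` unique values, 0 included.
  static TreeSet<Integer> loop(int parts, Supplier<Integer> rng) {
    TreeSet<Integer> startRows = new TreeSet<>();
    startRows.add(0);
    while (startRows.size() < parts)
      startRows.add(rng.get());
    return startRows;
  }

  // Stream removed by this commit: draws exactly parts - 1 values, so a drawn
  // duplicate or a drawn 0 leaves the set with fewer than `parts` entries.
  static TreeSet<Integer> streamBefore(int parts, Supplier<Integer> rng) {
    TreeSet<Integer> startRows = Stream.generate(rng).limit(parts - 1)
        .collect(Collectors.toCollection(TreeSet::new));
    startRows.add(0);
    return startRows;
  }

  // Stream added by this commit: 0 comes first, duplicates are dropped before
  // the limit is applied, so exactly `parts` unique values are collected.
  static TreeSet<Integer> streamAfter(int parts, Supplier<Integer> rng) {
    return Stream.concat(Stream.of(0), Stream.generate(rng)).distinct().limit(parts)
        .collect(Collectors.toCollection(TreeSet::new));
  }

  // Hypothetical stand-in for the RNG: a fixed sequence containing a 0 and a duplicate.
  static Supplier<Integer> fixedDraws() {
    Iterator<Integer> draws = Arrays.asList(5, 0, 5, 7, 9, 11).iterator();
    return draws::next;
  }

  public static void main(String[] args) {
    System.out.println(loop(4, fixedDraws()));         // prints [0, 5, 7, 9]
    System.out.println(streamBefore(4, fixedDraws())); // prints [0, 5]
    System.out.println(streamAfter(4, fixedDraws()));  // prints [0, 5, 7, 9]
  }
}
```

With `parts` set to 4, the removed stream consumes only the draws 5, 0, 5 and ends up with two start rows, while the loop and the new stream both keep drawing until four unique values are collected.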
 .../org/apache/accumulo/testing/randomwalk/bulk/BulkPlusOne.java  | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/BulkPlusOne.java b/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/BulkPlusOne.java
index 448a7e2..54e8209 100644
--- a/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/BulkPlusOne.java
+++ b/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/BulkPlusOne.java
@@ -59,9 +59,11 @@ public class BulkPlusOne extends BulkImportTest {
     log.debug("Bulk loading from {}", dir);
     final int parts = env.getRandom().nextInt(10) + 1;
 
-    TreeSet<Integer> startRows = Stream.generate(() -> env.getRandom().nextInt(LOTS))
-        .limit(parts - 1).collect(Collectors.toCollection(TreeSet::new));
-    startRows.add(0);
+    // The set created below should always contain 0. So it's very important that zero is first in
+    // concat below.
+    TreeSet<Integer> startRows = Stream
+        .concat(Stream.of(0), Stream.generate(() -> env.getRandom().nextInt(LOTS))).distinct()
+        .limit(parts).collect(Collectors.toCollection(TreeSet::new));
 
     List<String> printRows = startRows.stream().map(row -> String.format(FMT, row))
         .collect(Collectors.toList());