You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@accumulo.apache.org by mw...@apache.org on 2018/04/09 17:03:37 UTC

[accumulo-examples] branch master updated: Update batch example to use new Connector builder (#14)

This is an automated email from the ASF dual-hosted git repository.

mwalch pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/accumulo-examples.git


The following commit(s) were added to refs/heads/master by this push:
     new 4094d00  Update batch example to use new Connector builder (#14)
4094d00 is described below

commit 4094d0086a4f6841ecee14844d30fa47600b4cc0
Author: Mike Walch <mw...@apache.org>
AuthorDate: Mon Apr 9 13:03:35 2018 -0400

    Update batch example to use new Connector builder (#14)
    
    * Also update README.md instructions telling users to set up
      accumulo-client.properties
---
 README.md                                          |   5 +-
 docs/batch.md                                      |  62 +++----
 .../examples/client/CountingVerifyingReceiver.java |   6 +-
 .../examples/client/RandomBatchScanner.java        | 201 +++++++--------------
 .../examples/client/SequentialBatchWriter.java     |  68 ++++---
 5 files changed, 137 insertions(+), 205 deletions(-)

diff --git a/README.md b/README.md
index c4450eb..dbea091 100644
--- a/README.md
+++ b/README.md
@@ -34,10 +34,11 @@ Before running any of the examples, the following steps must be performed.
         git clone https://github.com/apache/accumulo-examples.git
         mvn clean package
 
-4. Specify Accumulo connection information.  All examples read connection information from a 
-   properties file. Copy the template and edit it.
+4. Specify Accumulo connection information in `conf/accumulo-client.properties`.  Some old examples
+   still read connection information from an examples.conf file so that should also be configured.
 
         cd accumulo-examples
+        nano conf/accumulo-client.properties
         cp examples.conf.template examples.conf
         nano examples.conf
 
diff --git a/docs/batch.md b/docs/batch.md
index 19acf84..305a732 100644
--- a/docs/batch.md
+++ b/docs/batch.md
@@ -16,42 +16,40 @@ limitations under the License.
 -->
 # Apache Accumulo Batch Writing and Scanning Example
 
-This tutorial uses the following Java classes:
-
- * [SequentialBatchWriter.java] - writes mutations with sequential rows and random values
- * [RandomBatchWriter.java] - used by SequentialBatchWriter to generate random values
- * [RandomBatchScanner.java] - reads random rows and verifies their values
-
 This is an example of how to use the BatchWriter and BatchScanner.
 
-First, you must ensure that the user you are running with (i.e `myuser` below) has the
-`exampleVis` authorization.
-
-    $ accumulo shell -u root -e "setauths -u myuser -s exampleVis"
-
-Second, you must create the table, batchtest1, ahead of time.
-
-    $ accumulo shell -u root -e "createtable batchtest1"
-
-The command below adds 10000 entries with random 50 bytes values to Accumulo.
+This tutorial uses the following Java classes.
 
-    $ ./bin/runex client.SequentialBatchWriter -c ./examples.conf -t batchtest1 --start 0 --num 10000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20 --vis exampleVis
-
-The command below will do 100 random queries.
-
-    $ ./bin/runex client.RandomBatchScanner -c ./examples.conf -t batchtest1 --num 100 --min 0 --max 10000 --size 50 --scanThreads 20 --auths exampleVis
-
-    07 11:33:11,103 [client.CountingVerifyingReceiver] INFO : Generating 100 random queries...
-    07 11:33:11,112 [client.CountingVerifyingReceiver] INFO : finished
-    07 11:33:11,260 [client.CountingVerifyingReceiver] INFO : 694.44 lookups/sec   0.14 secs
-
-    07 11:33:11,260 [client.CountingVerifyingReceiver] INFO : num results : 100
-
-    07 11:33:11,364 [client.CountingVerifyingReceiver] INFO : Generating 100 random queries...
-    07 11:33:11,370 [client.CountingVerifyingReceiver] INFO : finished
-    07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : 2173.91 lookups/sec   0.05 secs
+ * [SequentialBatchWriter.java] - writes mutations with sequential rows and random values
+ * [RandomBatchScanner.java] - reads random rows and verifies their values
 
-    07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : num results : 100
+Run `SequentialBatchWriter` to add 10000 entries with random 50 bytes values to Accumulo.
+
+    $ ./bin/runex client.SequentialBatchWriter
+
+Verify data was ingested by scanning the table using the Accumulo shell:
+
+    $ accumulo shell
+    root@instance> table batch
+    root@instance batch> scan
+
+Run `RandomBatchScanner` to perform 1000 random queries and verify the results.
+
+    $ ./bin/runex client.RandomBatchScanner
+    16:04:05,950 [examples.client.RandomBatchScanner] INFO : Generating 1000 random ranges for BatchScanner to read
+    16:04:06,020 [examples.client.RandomBatchScanner] INFO : Reading ranges using BatchScanner
+    16:04:06,283 [examples.client.RandomBatchScanner] TRACE: 100 lookups
+    16:04:06,290 [examples.client.RandomBatchScanner] TRACE: 200 lookups
+    16:04:06,294 [examples.client.RandomBatchScanner] TRACE: 300 lookups
+    16:04:06,297 [examples.client.RandomBatchScanner] TRACE: 400 lookups
+    16:04:06,301 [examples.client.RandomBatchScanner] TRACE: 500 lookups
+    16:04:06,304 [examples.client.RandomBatchScanner] TRACE: 600 lookups
+    16:04:06,307 [examples.client.RandomBatchScanner] TRACE: 700 lookups
+    16:04:06,309 [examples.client.RandomBatchScanner] TRACE: 800 lookups
+    16:04:06,316 [examples.client.RandomBatchScanner] TRACE: 900 lookups
+    16:04:06,320 [examples.client.RandomBatchScanner] TRACE: 1000 lookups
+    16:04:06,330 [examples.client.RandomBatchScanner] INFO : Scan finished! 3246.75 lookups/sec, 0.31 secs, 1000 results
+    16:04:06,331 [examples.client.RandomBatchScanner] INFO : All expected rows were scanned
 
 [SequentialBatchWriter.java]: ../src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java
 [RandomBatchWriter.java]:  ../src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java
diff --git a/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java b/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
index 51fc370..ac5eb11 100644
--- a/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
+++ b/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
@@ -35,9 +35,9 @@ class CountingVerifyingReceiver {
 
   long count = 0;
   int expectedValueSize = 0;
-  HashMap<Text,Boolean> expectedRows;
+  HashMap<String,Boolean> expectedRows;
 
-  CountingVerifyingReceiver(HashMap<Text,Boolean> expectedRows, int expectedValueSize) {
+  CountingVerifyingReceiver(HashMap<String,Boolean> expectedRows, int expectedValueSize) {
     this.expectedRows = expectedRows;
     this.expectedValueSize = expectedValueSize;
   }
@@ -56,7 +56,7 @@ class CountingVerifyingReceiver {
     if (!expectedRows.containsKey(key.getRow())) {
       log.error("Got unexpected key " + key);
     } else {
-      expectedRows.put(key.getRow(), true);
+      expectedRows.put(key.getRow().toString(), true);
     }
 
     count++;
diff --git a/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java b/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
index ac32827..9a88d39 100644
--- a/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
+++ b/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
@@ -16,179 +16,102 @@
  */
 package org.apache.accumulo.examples.client;
 
+import static java.nio.charset.StandardCharsets.UTF_8;
 import static org.apache.accumulo.examples.client.RandomBatchWriter.abs;
 
+import java.util.Arrays;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.Map.Entry;
 import java.util.Random;
-import java.util.concurrent.TimeUnit;
 
 import org.apache.accumulo.core.client.AccumuloException;
 import org.apache.accumulo.core.client.AccumuloSecurityException;
 import org.apache.accumulo.core.client.BatchScanner;
 import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.TableExistsException;
 import org.apache.accumulo.core.client.TableNotFoundException;
 import org.apache.accumulo.core.data.Key;
 import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.examples.cli.BatchScannerOpts;
-import org.apache.accumulo.examples.cli.ClientOnRequiredTable;
-import org.apache.hadoop.io.Text;
+import org.apache.accumulo.core.security.Authorizations;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import com.beust.jcommander.Parameter;
-
 /**
  * Simple example for reading random batches of data from Accumulo.
  */
 public class RandomBatchScanner {
-  private static final Logger log = LoggerFactory.getLogger(RandomBatchScanner.class);
-
-  /**
-   * Generate a number of ranges, each covering a single random row.
-   *
-   * @param num
-   *          the number of ranges to generate
-   * @param min
-   *          the minimum row that will be generated
-   * @param max
-   *          the maximum row that will be generated
-   * @param r
-   *          a random number generator
-   * @param ranges
-   *          a set in which to store the generated ranges
-   * @param expectedRows
-   *          a map in which to store the rows covered by the ranges (initially mapped to false)
-   */
-  static void generateRandomQueries(int num, long min, long max, Random r, HashSet<Range> ranges, HashMap<Text,Boolean> expectedRows) {
-    log.info(String.format("Generating %,d random queries...", num));
-    while (ranges.size() < num) {
-      long rowid = (abs(r.nextLong()) % (max - min)) + min;
 
-      Text row1 = new Text(String.format("row_%010d", rowid));
+  private static final Logger log = LoggerFactory.getLogger(RandomBatchScanner.class);
 
-      Range range = new Range(new Text(row1));
-      ranges.add(range);
-      expectedRows.put(row1, false);
+  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
+    Connector connector = Connector.builder().usingProperties("conf/accumulo-client.properties").build();
+    try {
+      connector.tableOperations().create("batch");
+    } catch (TableExistsException e) {
+      // ignore
     }
 
-    log.info("finished");
-  }
-
-  /**
-   * Prints a count of the number of rows mapped to false.
-   *
-   * @return boolean indicating "were all the rows found?"
-   */
-  private static boolean checkAllRowsFound(HashMap<Text,Boolean> expectedRows) {
-    int count = 0;
-    boolean allFound = true;
-    for (Entry<Text,Boolean> entry : expectedRows.entrySet())
-      if (!entry.getValue())
-        count++;
-
-    if (count > 0) {
-      log.warn("Did not find " + count + " rows");
-      allFound = false;
+    int totalLookups = 1000;
+    int totalEntries = 10000;
+    Random r = new Random();
+    HashSet<Range> ranges = new HashSet<>();
+    HashMap<String,Boolean> expectedRows = new HashMap<>();
+    log.info("Generating {} random ranges for BatchScanner to read", totalLookups);
+    while (ranges.size() < totalLookups) {
+      long rowId = abs(r.nextLong()) % totalEntries;
+      String row = String.format("row_%010d", rowId);
+      ranges.add(new Range(row));
+      expectedRows.put(row, false);
     }
-    return allFound;
-  }
-
-  /**
-   * Generates a number of random queries, verifies that the key/value pairs returned were in the queried ranges and that the values were generated by
-   * {@link RandomBatchWriter#createValue(long, int)}. Prints information about the results.
-   *
-   * @param num
-   *          the number of queries to generate
-   * @param min
-   *          the min row to query
-   * @param max
-   *          the max row to query
-   * @param evs
-   *          the expected size of the values
-   * @param r
-   *          a random number generator
-   * @param tsbr
-   *          a batch scanner
-   * @return boolean indicating "did the queries go fine?"
-   */
-  static boolean doRandomQueries(int num, long min, long max, int evs, Random r, BatchScanner tsbr) {
-
-    HashSet<Range> ranges = new HashSet<>(num);
-    HashMap<Text,Boolean> expectedRows = new java.util.HashMap<>();
-
-    generateRandomQueries(num, min, max, r, ranges, expectedRows);
-
-    tsbr.setRanges(ranges);
-
-    CountingVerifyingReceiver receiver = new CountingVerifyingReceiver(expectedRows, evs);
 
     long t1 = System.currentTimeMillis();
-
-    for (Entry<Key,Value> entry : tsbr) {
-      receiver.receive(entry.getKey(), entry.getValue());
+    long lookups = 0;
+
+    log.info("Reading ranges using BatchScanner");
+    try (BatchScanner scan = connector.createBatchScanner("batch", Authorizations.EMPTY, 20)) {
+      scan.setRanges(ranges);
+      for (Entry<Key, Value> entry : scan) {
+        Key key = entry.getKey();
+        Value value = entry.getValue();
+        String row = key.getRow().toString();
+        long rowId = Integer.parseInt(row.split("_")[1]);
+
+        Value expectedValue = SequentialBatchWriter.createValue(rowId);
+
+        if (!Arrays.equals(expectedValue.get(), value.get())) {
+          log.error("Unexpected value for key: {} expected: {} actual: {}", key,
+              new String(expectedValue.get(), UTF_8), new String(value.get(), UTF_8));
+        }
+
+        if (!expectedRows.containsKey(key.getRow().toString())) {
+          log.error("Encountered unexpected key: {} ", key);
+        } else {
+          expectedRows.put(key.getRow().toString(), true);
+        }
+
+        lookups++;
+        if (lookups % 100 == 0) {
+          log.trace("{} lookups", lookups);
+        }
+      }
     }
 
     long t2 = System.currentTimeMillis();
+    log.info(String.format("Scan finished! %6.2f lookups/sec, %.2f secs, %d results",
+        lookups / ((t2 - t1) / 1000.0), ((t2 - t1) / 1000.0), lookups));
 
-    log.info(String.format("%6.2f lookups/sec %6.2f secs%n", num / ((t2 - t1) / 1000.0), ((t2 - t1) / 1000.0)));
-    log.info(String.format("num results : %,d%n", receiver.count));
-
-    return checkAllRowsFound(expectedRows);
-  }
-
-  public static class Opts extends ClientOnRequiredTable {
-    @Parameter(names = "--min", description = "miniumum row that will be generated")
-    long min = 0;
-    @Parameter(names = "--max", description = "maximum ow that will be generated")
-    long max = 0;
-    @Parameter(names = "--num", required = true, description = "number of ranges to generate")
-    int num = 0;
-    @Parameter(names = "--size", required = true, description = "size of the value to write")
-    int size = 0;
-    @Parameter(names = "--seed", description = "seed for pseudo-random number generator")
-    Long seed = null;
-  }
-
-  /**
-   * Scans over a specified number of entries to Accumulo using a {@link BatchScanner}. Completes scans twice to compare times for a fresh query with those for
-   * a repeated query which has cached metadata and connections already established.
-   */
-  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
-    Opts opts = new Opts();
-    BatchScannerOpts bsOpts = new BatchScannerOpts();
-    opts.parseArgs(RandomBatchScanner.class.getName(), args, bsOpts);
-
-    Connector connector = opts.getConnector();
-    BatchScanner batchReader = connector.createBatchScanner(opts.getTableName(), opts.auths, bsOpts.scanThreads);
-    batchReader.setTimeout(bsOpts.scanTimeout, TimeUnit.MILLISECONDS);
-
-    Random r;
-    if (opts.seed == null)
-      r = new Random();
-    else
-      r = new Random(opts.seed);
-
-    // do one cold
-    boolean status = doRandomQueries(opts.num, opts.min, opts.max, opts.size, r, batchReader);
-
-    System.gc();
-    System.gc();
-    System.gc();
-
-    if (opts.seed == null)
-      r = new Random();
-    else
-      r = new Random(opts.seed);
-
-    // do one hot (connections already established, metadata table cached)
-    status = status && doRandomQueries(opts.num, opts.min, opts.max, opts.size, r, batchReader);
-
-    batchReader.close();
-    if (!status) {
+    int count = 0;
+    for (Entry<String,Boolean> entry : expectedRows.entrySet()) {
+      if (!entry.getValue()) {
+        count++;
+      }
+    }
+    if (count > 0) {
+      log.warn("Did not find {} rows", count);
       System.exit(1);
     }
+    log.info("All expected rows were scanned");
   }
 }
diff --git a/src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java b/src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java
index 9b57739..6f630f9 100644
--- a/src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java
+++ b/src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java
@@ -16,53 +16,63 @@
  */
 package org.apache.accumulo.examples.client;
 
+import java.util.Random;
+
 import org.apache.accumulo.core.client.AccumuloException;
 import org.apache.accumulo.core.client.AccumuloSecurityException;
 import org.apache.accumulo.core.client.BatchWriter;
 import org.apache.accumulo.core.client.Connector;
-import org.apache.accumulo.core.client.MutationsRejectedException;
+import org.apache.accumulo.core.client.TableExistsException;
 import org.apache.accumulo.core.client.TableNotFoundException;
 import org.apache.accumulo.core.data.Mutation;
-import org.apache.accumulo.core.security.ColumnVisibility;
-import org.apache.accumulo.examples.cli.BatchWriterOpts;
-import org.apache.accumulo.examples.cli.ClientOnRequiredTable;
+import org.apache.accumulo.core.data.Value;
 
-import com.beust.jcommander.Parameter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * Simple example for writing random data in sequential order to Accumulo.
  */
 public class SequentialBatchWriter {
 
-  static class Opts extends ClientOnRequiredTable {
-    @Parameter(names = "--start")
-    long start = 0;
-    @Parameter(names = "--num", required = true)
-    long num = 0;
-    @Parameter(names = "--size", required = true, description = "size of the value to write")
-    int valueSize = 0;
-    @Parameter(names = "--vis", converter = VisibilityConverter.class)
-    ColumnVisibility vis = new ColumnVisibility();
+  private static final Logger log = LoggerFactory.getLogger(SequentialBatchWriter.class);
+
+  public static Value createValue(long rowId) {
+    Random r = new Random(rowId);
+    byte value[] = new byte[50];
+
+    r.nextBytes(value);
+
+    // transform to printable chars
+    for (int j = 0; j < value.length; j++) {
+      value[j] = (byte) (((0xff & value[j]) % 92) + ' ');
+    }
+
+    return new Value(value);
   }
 
   /**
-   * Writes a specified number of entries to Accumulo using a {@link BatchWriter}. The rows of the entries will be sequential starting at a specified number.
-   * The column families will be "foo" and column qualifiers will be "1". The values will be random byte arrays of a specified size.
+   * Writes 1000 entries to Accumulo using a {@link BatchWriter}. The rows of the entries will be sequential starting from 0.
+   * The column families will be "foo" and column qualifiers will be "1". The values will be random 50 byte arrays.
    */
-  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException, MutationsRejectedException {
-    Opts opts = new Opts();
-    BatchWriterOpts bwOpts = new BatchWriterOpts();
-    opts.parseArgs(SequentialBatchWriter.class.getName(), args, bwOpts);
-    Connector connector = opts.getConnector();
-    BatchWriter bw = connector.createBatchWriter(opts.getTableName(), bwOpts.getBatchWriterConfig());
-
-    long end = opts.start + opts.num;
-
-    for (long i = opts.start; i < end; i++) {
-      Mutation m = RandomBatchWriter.createMutation(i, opts.valueSize, opts.vis);
-      bw.addMutation(m);
+  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
+    Connector connector = Connector.builder().usingProperties("conf/accumulo-client.properties").build();
+    try {
+      connector.tableOperations().create("batch");
+    } catch (TableExistsException e) {
+      // ignore
     }
 
-    bw.close();
+    try (BatchWriter bw = connector.createBatchWriter("batch")) {
+      for (int i = 0; i < 10000; i++) {
+        Mutation m = new Mutation(String.format("row_%010d", i));
+        // create a random value that is a function of row id for verification purposes
+        m.put("foo", "1", createValue(i));
+        bw.addMutation(m);
+        if (i % 1000 == 0) {
+          log.trace("wrote {} entries", i);
+        }
+      }
+    }
   }
 }

-- 
To stop receiving notification emails like this one, please contact
mwalch@apache.org.