Posted to commits@accumulo.apache.org by el...@apache.org on 2015/08/27 18:06:15 UTC

[06/12] accumulo git commit: ACCUMULO-3959 Rewrite BatchWriter javadoc

ACCUMULO-3959 Rewrite BatchWriter javadoc


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/d6427e1c
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/d6427e1c
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/d6427e1c

Branch: refs/heads/master
Commit: d6427e1ccd6cab7c9f40cc188d5258dbeb71c97c
Parents: 01dd7e3
Author: Dylan Hutchison <dh...@mit.edu>
Authored: Mon Aug 24 19:01:23 2015 -0400
Committer: Dylan Hutchison <dh...@mit.edu>
Committed: Mon Aug 24 19:01:23 2015 -0400

----------------------------------------------------------------------
 .../accumulo/core/client/BatchScanner.java      | 25 +++++++++++++++-----
 1 file changed, 19 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/d6427e1c/core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java b/core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
index bd7eb88..af0fd85 100644
--- a/core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
+++ b/core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
@@ -22,15 +22,28 @@ import java.util.Collection;
 import java.util.concurrent.TimeUnit;
 
 /**
- * Implementations of BatchScanner support efficient lookups of many ranges in accumulo.
- * BatchScanners are also appropriate for large, single ranges,
- * as a BatchScanner will break those ranges up into separate RPCs
- * provided the range spans more than one tablet
- * and there are sufficiently many scan threads available.
+ * In exchange for possibly <b>returning scanned entries out of order</b>,
+ * BatchScanner implementations may scan an Accumulo table more efficiently by
+ * <ul>
+ *   <li>Looking up multiple ranges in parallel.
+ *   Parallelism is constrained by the number of threads available to the BatchScanner, set in its constructor.</li>
+ *   <li>Breaking up large ranges into subranges.
+ *   Often the number and boundaries of subranges are determined by a table's split points.</li>
+ *   <li>Combining multiple ranges into a single RPC call to a tablet server.</li>
+ * </ul>
  *
- * Only use this when you do not care about returned data being in sorted order.
+ * The above techniques lead to better performance than a {@link Scanner} in use cases such as
+ * <ul>
+ *   <li>Retrieving many small ranges</li>
+ *   <li>Scanning a large range that returns many entries</li>
+ *   <li>Running server-side iterators that perform computation,
+ *   even if few entries are returned from the scan itself</li>
+ * </ul>
+ *
+ * To re-emphasize, only use a BatchScanner when you do not care whether returned data is in sorted order.
  * Use a {@link Scanner} instead when sorted order is important.
  *
+ * <p>
  * A BatchScanner instance will use no more threads than provided in the construction of the BatchScanner
  * implementation. Multiple invocations of <code>iterator()</code> will all share the same resources of the instance.
  * A new BatchScanner instance should be created to allocate additional threads.
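
The trade-off described in the rewritten Javadoc (parallel range lookups bounded by a fixed thread count, at the cost of result ordering) can be illustrated with a small standalone sketch. It does not use the Accumulo API: the class BatchLookupSketch, the KeyRange type, and the lookupAll method are hypothetical names invented for illustration, and a java.util.TreeMap stands in for a sorted table. Each range is looked up as a separate task on a fixed-size pool, and results are collected in completion order, so entries are sorted within a range but not necessarily across ranges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only; this is not Accumulo's BatchScanner implementation.
public class BatchLookupSketch {

  /** Hypothetical stand-in for a key range: start inclusive, end exclusive. */
  public static final class KeyRange {
    final String start;
    final String end;

    public KeyRange(String start, String end) {
      this.start = start;
      this.end = end;
    }
  }

  /**
   * Looks up every range using at most numThreads worker threads.
   * The table is only read, never modified, while the lookups run.
   * Results are appended in the order the lookups complete, so the
   * overall list is not guaranteed to be sorted.
   */
  public static List<String> lookupAll(NavigableMap<String, String> table,
      List<KeyRange> ranges, int numThreads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      ExecutorCompletionService<List<String>> completed =
          new ExecutorCompletionService<>(pool);
      for (KeyRange r : ranges) {
        // One task per range; keys inside a single range stay sorted.
        completed.submit(() -> new ArrayList<String>(
            table.subMap(r.start, true, r.end, false).keySet()));
      }
      List<String> results = new ArrayList<>();
      for (int i = 0; i < ranges.size(); i++) {
        // take() returns whichever lookup finishes next, not the next in submission order.
        results.addAll(completed.take().get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    NavigableMap<String, String> table = new TreeMap<>();
    for (String k : new String[] {"a", "b", "c", "d", "e", "f"}) {
      table.put(k, "value-" + k);
    }
    List<KeyRange> ranges =
        Arrays.asList(new KeyRange("a", "c"), new KeyRange("d", "f"));
    // The order of the printed keys may differ from run to run.
    System.out.println(lookupAll(table, ranges, 2));
  }
}
```

A caller that needs globally sorted output would have to sort the combined list afterwards or look up the ranges one at a time, which mirrors the Javadoc's advice to use a Scanner when sorted order is important.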