You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@accumulo.apache.org by el...@apache.org on 2015/04/28 05:11:33 UTC

[1/3] accumulo git commit: ACCUMULO-3646 Update iterator doc on entries past seek()

Repository: accumulo
Updated Branches:
  refs/heads/1.7 66075c3ea -> 481b11922
  refs/heads/master cf0bcb3fb -> e9ad5345f


ACCUMULO-3646 Update iterator doc on entries past seek()

Add warning to SKVI javadoc and to iterator chapter in Accumulo manual.
Fix inconsistency with Javadoc in iterator chapter.
Add warning on aliasing of getTopKey()/getTopValue() to chapter.

Closes #33

Signed-off-by: Josh Elser <el...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/481b1192
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/481b1192
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/481b1192

Branch: refs/heads/1.7
Commit: 481b11922ff8aec26168a5afebcca5fe72c5ec4b
Parents: 66075c3
Author: Dylan Hutchison <dh...@mit.edu>
Authored: Sat Apr 25 20:02:04 2015 -0400
Committer: Josh Elser <el...@apache.org>
Committed: Mon Apr 27 23:03:55 2015 -0400

----------------------------------------------------------------------
 .../core/iterators/SortedKeyValueIterator.java  |  8 ++++++
 .../main/asciidoc/chapters/iterator_design.txt  | 27 +++++++++++++++-----
 2 files changed, 29 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
index 232dc2a..f6d3170 100644
--- a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
+++ b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
@@ -106,6 +106,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V extend
    * Returns top key. Can be called 0 or more times without affecting behavior of next() or hasTop(). Note that in minor compaction scope and in non-full major
    * compaction scopes the iterator may see deletion entries. These entries should be preserved by all iterators except ones that are strictly scan-time
    * iterators that will never be configured for the minc or majc scopes. Deletion entries are only removed during full major compactions.
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopKey</tt> when {@link #next()} is called, changing the data
+   * that the object references. Iterators that need to save an object returned by <tt>getTopKey</tt> ought to copy the object's data into a new object
+   * in order to avoid aliasing bugs.
    *
    * @return <tt>K</tt>
    * @exception IllegalStateException
@@ -117,6 +121,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V extend
 
   /**
    * Returns top value. Can be called 0 or more times without affecting behavior of next() or hasTop().
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopValue</tt> when {@link #next()} is called, changing the
+   * underlying data that the object references. Iterators that need to save an object returned by <tt>getTopValue</tt> ought to copy the object's data
+   * into a new object in order to avoid aliasing bugs.
    *
    * @return <tt>V</tt>
    * @exception IllegalStateException

http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/docs/src/main/asciidoc/chapters/iterator_design.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/iterator_design.txt b/docs/src/main/asciidoc/chapters/iterator_design.txt
index b4c1c69..02c9973 100644
--- a/docs/src/main/asciidoc/chapters/iterator_design.txt
+++ b/docs/src/main/asciidoc/chapters/iterator_design.txt
@@ -119,8 +119,9 @@ Range. For example, a regular expression Iterator would consume all records whic
 pattern before returning from `seek`.
 
 It is important to retain the original Range passed to this method to know when this Iterator should stop
-reading more Key-Value pairs. Ignoring this will typically not result in errors; however, it will result
-in wasted time from unnecessary computation.
+reading more Key-Value pairs. Ignoring this typically does not affect scans from a Scanner, but it
+will result in duplicate keys emitting from a BatchScan if the scanned table has more than one tablet.
+Best practice is to never emit entries outside the seek range.
 
 ==== `next`
 
@@ -145,8 +146,21 @@ alter the internal state of the Iterator.
 
 These methods simply return the current Key-Value pair for this iterator. If `hasTop` returns true,
 both of these methods should return non-null objects. If `hasTop` returns false, it is undefined
-what these methods should return. Multiple calls to these methods should not alter the state
-of the Iterator like `hasTop`.
+what these methods should return. Like `hasTop`, multiple calls to these methods should not alter
+the state of the Iterator.
+
+Users should take caution when either
+
+1. caching the Key/Value from `getTopKey`/`getTopValue`, for use after calling `next` on the source iterator.
+In this case, the cached Key/Value object is aliased to the reference returned by the source iterator.
+Iterators may reuse the same Key/Value object in a `next` call for performance reasons, changing the data
+that the cached Key/Value object references and resulting in a logic bug.
+2. modifying the Key/Value from `getTopKey`/`getTopValue`. If the source iterator reuses data stored in the Key/Value,
+then the source iterator may use the modified data that the Key/Value references. This may/may not result in a logic bug.
+
+In both cases, copying the Key/Value's data into a new object ensures iterator correctness. If neither case applies,
+it is safe to not copy the Key/Value.  The general guideline is to be aware of who else may use Key/Value objects
+returned from `getTopKey`/`getTopValue`.
 
 ==== `deepCopy`
 
@@ -154,8 +168,9 @@ The `deepCopy` method is similar to the `clone` method from the Java `Cloneable`
 Implementations of this method should return a new object of the same type as the Accumulo Iterator
 instance it was called on. Any internal state from the instance `deepCopy` was called
 on should be carried over to the returned copy. The returned copy should be ready to have
-`seek` called on it. It is unspecified if `init` will be called on the original object or the
-copy before or after `deepCopy` is called.
+`seek` called on it. The SortedKeyValueIterator interface guarantees that `init` will be called on
+an iterator before `deepCopy` and that `init` will not be called on the iterator returned by
+`deepCopy`.
 
 Typically, implementations of `deepCopy` call a copy-constructor which will initialize
 internal data structures. As with `seek`, it is common for the `IteratorEnvironment`


[3/3] accumulo git commit: Merge branch '1.7'

Posted by el...@apache.org.
Merge branch '1.7'


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/e9ad5345
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/e9ad5345
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/e9ad5345

Branch: refs/heads/master
Commit: e9ad5345f7e009e055fd74b3c33ec5225c55291a
Parents: cf0bcb3 481b119
Author: Josh Elser <el...@apache.org>
Authored: Mon Apr 27 23:11:23 2015 -0400
Committer: Josh Elser <el...@apache.org>
Committed: Mon Apr 27 23:11:23 2015 -0400

----------------------------------------------------------------------
 .../core/iterators/SortedKeyValueIterator.java  |  8 ++++++
 .../main/asciidoc/chapters/iterator_design.txt  | 27 +++++++++++++++-----
 2 files changed, 29 insertions(+), 6 deletions(-)
----------------------------------------------------------------------



[2/3] accumulo git commit: ACCUMULO-3646 Update iterator doc on entries past seek()

Posted by el...@apache.org.
ACCUMULO-3646 Update iterator doc on entries past seek()

Add warning to SKVI javadoc and to iterator chapter in Accumulo manual.
Fix inconsistency with Javadoc in iterator chapter.
Add warning on aliasing of getTopKey()/getTopValue() to chapter.

Closes #33

Signed-off-by: Josh Elser <el...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/481b1192
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/481b1192
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/481b1192

Branch: refs/heads/master
Commit: 481b11922ff8aec26168a5afebcca5fe72c5ec4b
Parents: 66075c3
Author: Dylan Hutchison <dh...@mit.edu>
Authored: Sat Apr 25 20:02:04 2015 -0400
Committer: Josh Elser <el...@apache.org>
Committed: Mon Apr 27 23:03:55 2015 -0400

----------------------------------------------------------------------
 .../core/iterators/SortedKeyValueIterator.java  |  8 ++++++
 .../main/asciidoc/chapters/iterator_design.txt  | 27 +++++++++++++++-----
 2 files changed, 29 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
index 232dc2a..f6d3170 100644
--- a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
+++ b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
@@ -106,6 +106,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V extend
    * Returns top key. Can be called 0 or more times without affecting behavior of next() or hasTop(). Note that in minor compaction scope and in non-full major
    * compaction scopes the iterator may see deletion entries. These entries should be preserved by all iterators except ones that are strictly scan-time
    * iterators that will never be configured for the minc or majc scopes. Deletion entries are only removed during full major compactions.
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopKey</tt> when {@link #next()} is called, changing the data
+   * that the object references. Iterators that need to save an object returned by <tt>getTopKey</tt> ought to copy the object's data into a new object
+   * in order to avoid aliasing bugs.
    *
    * @return <tt>K</tt>
    * @exception IllegalStateException
@@ -117,6 +121,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V extend
 
   /**
    * Returns top value. Can be called 0 or more times without affecting behavior of next() or hasTop().
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopValue</tt> when {@link #next()} is called, changing the
+   * underlying data that the object references. Iterators that need to save an object returned by <tt>getTopValue</tt> ought to copy the object's data
+   * into a new object in order to avoid aliasing bugs.
    *
    * @return <tt>V</tt>
    * @exception IllegalStateException

http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/docs/src/main/asciidoc/chapters/iterator_design.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/iterator_design.txt b/docs/src/main/asciidoc/chapters/iterator_design.txt
index b4c1c69..02c9973 100644
--- a/docs/src/main/asciidoc/chapters/iterator_design.txt
+++ b/docs/src/main/asciidoc/chapters/iterator_design.txt
@@ -119,8 +119,9 @@ Range. For example, a regular expression Iterator would consume all records whic
 pattern before returning from `seek`.
 
 It is important to retain the original Range passed to this method to know when this Iterator should stop
-reading more Key-Value pairs. Ignoring this will typically not result in errors; however, it will result
-in wasted time from unnecessary computation.
+reading more Key-Value pairs. Ignoring this typically does not affect scans from a Scanner, but it
+will result in duplicate keys emitting from a BatchScan if the scanned table has more than one tablet.
+Best practice is to never emit entries outside the seek range.
 
 ==== `next`
 
@@ -145,8 +146,21 @@ alter the internal state of the Iterator.
 
 These methods simply return the current Key-Value pair for this iterator. If `hasTop` returns true,
 both of these methods should return non-null objects. If `hasTop` returns false, it is undefined
-what these methods should return. Multiple calls to these methods should not alter the state
-of the Iterator like `hasTop`.
+what these methods should return. Like `hasTop`, multiple calls to these methods should not alter
+the state of the Iterator.
+
+Users should take caution when either
+
+1. caching the Key/Value from `getTopKey`/`getTopValue`, for use after calling `next` on the source iterator.
+In this case, the cached Key/Value object is aliased to the reference returned by the source iterator.
+Iterators may reuse the same Key/Value object in a `next` call for performance reasons, changing the data
+that the cached Key/Value object references and resulting in a logic bug.
+2. modifying the Key/Value from `getTopKey`/`getTopValue`. If the source iterator reuses data stored in the Key/Value,
+then the source iterator may use the modified data that the Key/Value references. This may/may not result in a logic bug.
+
+In both cases, copying the Key/Value's data into a new object ensures iterator correctness. If neither case applies,
+it is safe to not copy the Key/Value.  The general guideline is to be aware of who else may use Key/Value objects
+returned from `getTopKey`/`getTopValue`.
 
 ==== `deepCopy`
 
@@ -154,8 +168,9 @@ The `deepCopy` method is similar to the `clone` method from the Java `Cloneable`
 Implementations of this method should return a new object of the same type as the Accumulo Iterator
 instance it was called on. Any internal state from the instance `deepCopy` was called
 on should be carried over to the returned copy. The returned copy should be ready to have
-`seek` called on it. It is unspecified if `init` will be called on the original object or the
-copy before or after `deepCopy` is called.
+`seek` called on it. The SortedKeyValueIterator interface guarantees that `init` will be called on
+an iterator before `deepCopy` and that `init` will not be called on the iterator returned by
+`deepCopy`.
 
 Typically, implementations of `deepCopy` call a copy-constructor which will initialize
 internal data structures. As with `seek`, it is common for the `IteratorEnvironment`