Posted to commits@accumulo.apache.org by ib...@apache.org on 2019/06/11 21:03:57 UTC

[accumulo-website] branch master updated: fixes #183: Added a chapter on yielding and fixed pseudocode (#185)

This is an automated email from the ASF dual-hosted git repository.

ibella pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 915b78b  fixes #183: Added a chapter on yielding and fixed pseudocode (#185)
915b78b is described below

commit 915b78bae83630b4520a164a287152b517255b29
Author: Ivan Bella <iv...@bella.name>
AuthorDate: Tue Jun 11 17:03:52 2019 -0400

    fixes #183: Added a chapter on yielding and fixed pseudocode (#185)
    
    * fixes #183: Added a chapter on yielding and fixed pseudocode
---
 _docs-2/development/iterators.md | 47 ++++++++++++++++++++++++++++++++--------
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/_docs-2/development/iterators.md b/_docs-2/development/iterators.md
index d27fdde..cbcd6fe 100644
--- a/_docs-2/development/iterators.md
+++ b/_docs-2/development/iterators.md
@@ -8,7 +8,7 @@ Accumulo [SortedKeyValueIterators][SortedKeyValueIterator], commonly referred to
 that allow users to implement custom retrieval or computational purpose within Accumulo TabletServers.  The name rightly
 brings forward similarities to the Java Iterator interface; however, Accumulo Iterators are more complex than Java
 Iterators. Notably, in addition to the expected methods to retrieve the current element and advance to the next element
-in the iteration, Accumulo Iterators must also support the ability to "move" (`seek`) to an specified point in the
+in the iteration, Accumulo Iterators must also support the ability to "move" (`seek`) to a specified point in the
 iteration (the Accumulo table). Accumulo Iterators are designed to be concatenated together, similar to applying a
 series of transformations to a list of elements. Accumulo Iterators can duplicate their underlying source to create
 multiple "pointers" over the same underlying data (which is extremely powerful since each stream is sorted) or they can
@@ -18,7 +18,7 @@ are not designed to act as triggers nor are they designed to operate outside of
 
 Understanding how TabletServers invoke the methods on a [SortedKeyValueIterator] can be obtuse as the actual code is
 buried within the implementation of the TabletServer; however, it is generally unnecessary to have a strong
-understanding of this as the interface provides clear definitions about what each action each method should take. This
+understanding of this as the interface provides clear definitions about what action each method should take. This
 chapter aims to provide a more detailed description of how Iterators are invoked, some best practices and some common
 pitfalls.
 
@@ -170,6 +170,25 @@ early programming assignments which implement their own tree data structures. `d
 copy on its sources (the children), copies itself, attaches the copies of the children, and
 then returns itself.
 
+## Yielding Interface
+
+If your iterator's `next` or `seek` call can run long enough to starve other scans
+within the same thread pool, consider implementing the optional
+`YieldingKeyValueIterator` interface, which `SortedKeyValueIterator` extends.
+
+```java
+default void enableYielding(YieldCallback callback) { }
+```
+
+### enableYielding
+
+An implementation of this method should simply cache the supplied callback as a member of
+the iterator. The iterator can then call `yield(Key key)` on the callback within a `next` or
+`seek` call when it needs to yield control. The supplied key is used as the start key of the
+range in a follow-on `seek` call, allowing the iterator to continue where it left off. Note
+that when an iterator yields, `hasTop()` must return false. Also note that `enableYielding`
+is not called in isolation mode.
+
 ## TabletServer invocation of Iterators
 
 The following code is a general outline for how TabletServers invoke Iterators.
@@ -187,21 +206,34 @@ while (!overSizeLimit(batch)) {
         source = iter;
     }
 
-    // read a batch of data to return to client
+    // read a batch of data to return to client from
     // the last iterator, the "top"
     SortedKeyValueIterator topIter = source;
-    topIter.seek(getRangeFromUser(), ...)
+
+    YieldCallback cb = new YieldCallback();
+    topIter.enableYielding(cb)
+
+    topIter.seek(range, ...)
 
     while (topIter.hasTop() && !overSizeLimit(batch)) {
         key = topIter.getTopKey()
         val = topIter.getTopValue()
         batch.add(new KeyValue(key, val)
+        // remember the last key returned
+        setLastKeyReturned(key);
         if (systemDataSourcesChanged()) {
             // code does not show isolation case, which will
             // keep using same data sources until a row boundary is hit
             range = new Range(key, false, range.endKey(), range.endKeyInclusive());
             break;
         }
+        topIter.next()
+    }
+
+    if (cb.hasYielded()) {
+        // remember the yield key as the last key returned
+        setLastKeyReturned(cb.getKey());
+        break;
     }
 }
 //return batch of key values to client
@@ -213,15 +245,12 @@ Additionally, the obtuse "re-seek" case can be outlined as the following:
 // Given the above
 List<KeyValue> batch = getNextBatch();
 
-// Store off lastKeyReturned for this client
-lastKeyReturned = batch.get(batch.size() - 1).getKey();
-
 // thread goes away (client stops asking for the next batch).
 
 // Eventually client comes back
 // Setup as before...
-Range userRange = getRangeFromUser();
-Range actualRange = new Range(lastKeyReturned, false, userRange.getEndKey(), userRange.isEndKeyInclusive());
+Range userRange = getRangeFromClient();
+Range actualRange = new Range(getLastKeyReturned(), false, userRange.getEndKey(), userRange.isEndKeyInclusive());
 
 // Use the actualRange, not the user provided one
 topIter.seek(actualRange);
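
The yield and re-seek handshake described in the Yielding Interface section of this diff can be sketched in isolation. The block below is a toy model, not Accumulo code: `Callback`, `SparseMatchIterator`, and `scan` are hypothetical stand-ins that use plain integer keys, and the per-call work budget `maxExaminedPerCall` is an assumption made for illustration. It only shows the contract from the text: cache the callback, yield with a key, report no top after yielding, and resume from a seek that starts just after the yield key.

```java
import java.util.ArrayList;
import java.util.List;

class YieldingSketch {

    // Hypothetical stand-in for a yield callback; not the Accumulo class.
    static final class Callback {
        private Integer yieldKey;
        void yield(int key) { yieldKey = key; }
        boolean hasYielded() { return yieldKey != null; }
        int getKey() { return yieldKey; }
    }

    // Toy iterator over sorted int keys that only returns multiples of 10.
    // Matches are sparse, so a single seek/next could scan for a long time.
    static final class SparseMatchIterator {
        private final int[] sortedKeys;
        private final int maxExaminedPerCall; // assumed work budget per call
        private Callback callback;
        private int pos;
        private boolean hasTop;

        SparseMatchIterator(int[] sortedKeys, int maxExaminedPerCall) {
            this.sortedKeys = sortedKeys;
            this.maxExaminedPerCall = maxExaminedPerCall;
        }

        // Simply cache the callback, as the text recommends.
        void enableYielding(Callback cb) { this.callback = cb; }

        // Position on the first match strictly after startExclusive.
        void seek(int startExclusive) {
            pos = 0;
            while (pos < sortedKeys.length && sortedKeys[pos] <= startExclusive) {
                pos++;
            }
            findTop();
        }

        void next() { pos++; findTop(); }
        boolean hasTop() { return hasTop; }
        int getTopKey() { return sortedKeys[pos]; }

        private void findTop() {
            hasTop = false; // stays false if we yield or run out of data
            int examined = 0;
            while (pos < sortedKeys.length) {
                if (sortedKeys[pos] % 10 == 0) { hasTop = true; return; }
                examined++;
                if (callback != null && examined >= maxExaminedPerCall) {
                    // Give up the thread; this key has already been rejected,
                    // so resuming just after it loses nothing.
                    callback.yield(sortedKeys[pos]);
                    return;
                }
                pos++;
            }
        }
    }

    // Toy driver: rebuilds the iterator and re-seeks after every yield.
    static List<Integer> scan(int[] keys, int budget) {
        List<Integer> results = new ArrayList<>();
        int lastKey = Integer.MIN_VALUE;
        boolean done = false;
        while (!done) {
            SparseMatchIterator iter = new SparseMatchIterator(keys, budget);
            Callback cb = new Callback();
            iter.enableYielding(cb);
            iter.seek(lastKey); // exclusive start, as in the re-seek outline
            while (iter.hasTop()) {
                lastKey = iter.getTopKey();
                results.add(lastKey);
                iter.next();
            }
            if (cb.hasYielded()) {
                lastKey = cb.getKey(); // remember the yield key
            } else {
                done = true; // source exhausted without yielding
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int[] keys = new int[25];
        for (int i = 0; i < keys.length; i++) keys[i] = i + 1;
        System.out.println(scan(keys, 3));
    }
}
```

Because each yield key is strictly greater than the previous start key, every re-seek makes progress and the toy scan terminates. A real iterator would decide when to yield based on elapsed time or work done rather than a fixed count.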