Posted to commits@datasketches.apache.org by le...@apache.org on 2020/06/01 22:01:14 UTC

[incubator-datasketches-java] branch Theta_AnotB_consistency created (now e735213)

This is an automated email from the ASF dual-hosted git repository.

leerho pushed a change to branch Theta_AnotB_consistency
in repository https://gitbox.apache.org/repos/asf/incubator-datasketches-java.git.


      at e735213  Very minor corrections to code comments.

This branch includes the following new commits:

     new e735213  Very minor corrections to code comments.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@datasketches.apache.org
For additional commands, e-mail: commits-help@datasketches.apache.org


[incubator-datasketches-java] 01/01: Very minor corrections to code comments.

Posted by le...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

leerho pushed a commit to branch Theta_AnotB_consistency
in repository https://gitbox.apache.org/repos/asf/incubator-datasketches-java.git

commit e7352139dad71bb08c80ac4f483fc95611d66a78
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Mon Jun 1 15:00:47 2020 -0700

    Very minor corrections to code comments.
---
 src/main/java/org/apache/datasketches/theta/CompactSketch.java | 3 ++-
 src/main/java/org/apache/datasketches/theta/HeapAnotB.java     | 6 +++---
 src/main/java/org/apache/datasketches/theta/SetOperation.java  | 1 +
 src/main/javadoc/resources/dictionary.html                     | 5 +++--
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/main/java/org/apache/datasketches/theta/CompactSketch.java b/src/main/java/org/apache/datasketches/theta/CompactSketch.java
index 43c7504..dab61af 100644
--- a/src/main/java/org/apache/datasketches/theta/CompactSketch.java
+++ b/src/main/java/org/apache/datasketches/theta/CompactSketch.java
@@ -75,7 +75,8 @@ public abstract class CompactSketch extends Sketch {
 
   /**
    * Compact the given array. The source cache can be a hash table with interstitial zeros or
-   * "dirty" values.
+   * "dirty" values, which are hash values greater than theta. These can be generated by the
+   * Alpha sketch.
    * @param srcCache anything
    * @param curCount must be correct
    * @param thetaLong The correct
diff --git a/src/main/java/org/apache/datasketches/theta/HeapAnotB.java b/src/main/java/org/apache/datasketches/theta/HeapAnotB.java
index a316f37..2e1275d 100644
--- a/src/main/java/org/apache/datasketches/theta/HeapAnotB.java
+++ b/src/main/java/org/apache/datasketches/theta/HeapAnotB.java
@@ -153,10 +153,10 @@ final class HeapAnotB extends AnotB {
     //    A sketch in stored form can be in one of 5 states.
     //    Null is not actually a state, but is included for completeness.
     //    Null is interpreted as {Theta = 1.0, count = 0, empty = true}.
-    //    The empty state may have Theta < 1.0 but it is ignored; count must be zero.
+    //    In some cases the empty state may have Theta < 1.0 but it is ignored; count must be zero.
     //    State:
-    //      0 N Null
-    //      1 E Empty
+    //      0 N Null or instance of EmptyCompactSketch
+    //      1 E Empty bit set
     //      2 C Compact, not ordered
     //      3 O Compact Ordered
     //      4 H Hash-Table
diff --git a/src/main/java/org/apache/datasketches/theta/SetOperation.java b/src/main/java/org/apache/datasketches/theta/SetOperation.java
index 22b4434..3c5312d 100644
--- a/src/main/java/org/apache/datasketches/theta/SetOperation.java
+++ b/src/main/java/org/apache/datasketches/theta/SetOperation.java
@@ -250,6 +250,7 @@ public abstract class SetOperation {
       }
       return sk;
     }
+    //Not Empty
     if ((thetaLong == Long.MAX_VALUE) && (curCount == 1)) {
       final SingleItemSketch sis = new SingleItemSketch(compactCache[0], seedHash);
       if ((dstMem != null) && (dstMem.getCapacity() >= 16)) {
diff --git a/src/main/javadoc/resources/dictionary.html b/src/main/javadoc/resources/dictionary.html
index b20a5ab..ebfa7c3 100644
--- a/src/main/javadoc/resources/dictionary.html
+++ b/src/main/javadoc/resources/dictionary.html
@@ -80,9 +80,10 @@ See <a href="#validHash">Valid Hash</a>.
 <h3><a name="empty">isEmpty()</a></h3>
 In Theta Sketches, the state <i>isEmpty()</i> for a sketch means that the sketch cache has zero hash values and that none of the
 update methods have been called with valid data.  In other words, the sketch has never seen any data.  
-This state is equivalent to "null" in the sense that it is safe to exclude empty sketches from set operations.
+This state is equivalent to "null" in the sense that it is safe to exclude empty sketches from union operations. However, an empty sketch
+will impact intersections and difference set operations.
 
-<p>Note that <i>isEmpty()</i> does not mean that theta is 1.0 because if <i>p</i> &lt; 1.0, theta will be set 
+<p>Note that <i>isEmpty()</i> does not always mean that theta is 1.0 because if <i>p</i> &lt; 1.0, theta will be set 
 equal to <i>p</i> during construction. 
 Also, a cache of zero values (<i>getRetainedEntries(true) = 0</i>) does not mean that the sketch is <i>Empty</i> since 
 set intersection or difference operations can result in a sketch with zero values. 

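The isEmpty() wording in the dictionary.html hunk above can be illustrated with a toy model. This is a minimal sketch and not the DataSketches API: ToySketch, its fields, and the union, intersect and aNotB helpers are hypothetical names, and theta is left out entirely, so the real library may apply further rules that this model does not capture. It only shows why an empty input can be skipped in a union but not in an intersection or difference, and why zero retained entries does not by itself imply empty.

```java
import java.util.Set;
import java.util.TreeSet;

public class EmptySemanticsDemo {

  /** Hypothetical toy model: a set of hashes plus an explicit "empty" flag. */
  static final class ToySketch {
    final Set<Long> hashes;
    final boolean empty; // true only if the sketch has never seen any data

    ToySketch(Set<Long> hashes, boolean empty) {
      this.hashes = hashes;
      this.empty = empty;
    }

    static ToySketch ofEmpty() {
      return new ToySketch(new TreeSet<>(), true);
    }

    static ToySketch of(long... values) {
      Set<Long> s = new TreeSet<>();
      for (long v : values) { s.add(v); }
      return new ToySketch(s, false);
    }
  }

  // Union: an empty input contributes nothing, so it is safe to exclude.
  static ToySketch union(ToySketch a, ToySketch b) {
    Set<Long> s = new TreeSet<>(a.hashes);
    s.addAll(b.hashes);
    return new ToySketch(s, a.empty && b.empty);
  }

  // Intersection: an empty input forces an empty result, so it cannot be skipped.
  static ToySketch intersect(ToySketch a, ToySketch b) {
    Set<Long> s = new TreeSet<>(a.hashes);
    s.retainAll(b.hashes);
    return new ToySketch(s, a.empty || b.empty);
  }

  // Difference (A and not B): an empty A gives an empty result; an empty B leaves A unchanged.
  static ToySketch aNotB(ToySketch a, ToySketch b) {
    Set<Long> s = new TreeSet<>(a.hashes);
    s.removeAll(b.hashes);
    return new ToySketch(s, a.empty);
  }

  public static void main(String[] args) {
    ToySketch a = ToySketch.of(1L, 2L);
    ToySketch c = ToySketch.of(3L);
    ToySketch e = ToySketch.ofEmpty();

    System.out.println(union(a, e).hashes);     // same entries as a alone
    System.out.println(intersect(a, e).empty);  // true: the empty input matters
    ToySketch disjoint = intersect(a, c);
    // Zero retained entries, yet not "empty": both inputs had seen data.
    System.out.println(disjoint.hashes.size() + " " + disjoint.empty);
  }
}
```

In this model, only the flag distinguishes "never saw data" from "saw data but retained nothing", which is the distinction the dictionary entry draws.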

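The Javadoc change in the CompactSketch.java hunk above describes compacting a cache that may hold interstitial zeros and "dirty" values. A rough sketch of that idea follows. The class and method names are hypothetical and this is not the library's implementation. It assumes that a zero (or negative) slot means "unused" and that a value equal to theta is treated as dirty along with values greater than theta; the commit text only says "greater than".

```java
import java.util.Arrays;

public class CompactCacheDemo {

  /**
   * Hypothetical helper: copies the valid hashes out of srcCache.
   * Assumption: a valid hash v satisfies 0 < v < thetaLong.
   *
   * @param srcCache  hash table that may contain zeros and dirty values
   * @param curCount  number of valid hashes; must be correct
   * @param thetaLong theta expressed as a long
   * @param ordered   if true, the result is sorted ascending
   */
  static long[] compact(long[] srcCache, int curCount, long thetaLong, boolean ordered) {
    long[] out = new long[curCount];
    int j = 0;
    for (long v : srcCache) {
      if (v <= 0L || v >= thetaLong) { continue; } // skip unused slots and dirty values
      if (j == curCount) {
        throw new IllegalArgumentException("curCount is smaller than the number of valid hashes");
      }
      out[j++] = v;
    }
    if (j != curCount) {
      throw new IllegalArgumentException("curCount is larger than the number of valid hashes");
    }
    if (ordered) { Arrays.sort(out); }
    return out;
  }

  public static void main(String[] args) {
    // Zeros are unused slots; 95 and 120 are dirty because theta is 90.
    long[] table = {0L, 40L, 0L, 95L, 10L, 0L, 70L, 120L};
    long[] compacted = compact(table, 3, 90L, true);
    System.out.println(Arrays.toString(compacted)); // prints [10, 40, 70]
  }
}
```

The explicit count checks reflect the "curCount must be correct" requirement in the Javadoc: a mismatch is reported rather than silently producing a padded or truncated array.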