Posted to commits@datasketches.apache.org by le...@apache.org on 2021/01/10 23:47:24 UTC

[datasketches-java] branch ReqExperiment updated (bfde9d8 -> 4056806)

This is an automated email from the ASF dual-hosted git repository.

leerho pushed a change to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git.


 discard bfde9d8  Merge remote-tracking branch 'origin/master' into ReqExperiment
    omit 8e8439e  Revert "superfluous calls of buf.sort() -- calling sort() in FloatBuffer.getEvensOrOdds is sufficient"
     new 69db4d1  Merge branch 'master' into ReqExperiment
     new c12bc9d  Fix conflicts with master
     new 827d25b  Fix conflicts with Master
     new eb97863  Fix conflicts with ReqExperiment
     new 3cb69ef  Fix conflicts with ReqExperimental
     new 47bdb28  Merge branch 'master' into ReqExperiment
     new 1a61267  set lazy compression = false
     new 4056806  Merge branch 'master' into ReqExperiment

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (bfde9d8)
            \
             N -- N -- N   refs/heads/ReqExperiment (4056806)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 8 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/org/apache/datasketches/req/ReqCompactor.java     |  4 ++--
 src/main/java/org/apache/datasketches/req/ReqSketch.java   | 14 +++++++-------
 2 files changed, 9 insertions(+), 9 deletions(-)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@datasketches.apache.org
For additional commands, e-mail: commits-help@datasketches.apache.org


[datasketches-java] 07/08: set lazy compression = false

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 1a61267c882abd747fcf4d986df4270ee218557c
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 15:40:55 2021 -0800

    set lazy compression = false
---
 src/main/java/org/apache/datasketches/req/ReqSketch.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/main/java/org/apache/datasketches/req/ReqSketch.java b/src/main/java/org/apache/datasketches/req/ReqSketch.java
index 81039ac..70059da 100644
--- a/src/main/java/org/apache/datasketches/req/ReqSketch.java
+++ b/src/main/java/org/apache/datasketches/req/ReqSketch.java
@@ -77,7 +77,7 @@ public class ReqSketch extends BaseReqSketch {
   static byte INIT_NUMBER_OF_SECTIONS = 3; // TODO: restore to final after eval
   static int MIN_K = 4; // TODO: restore to final after eval
   static float NOM_CAP_MULT = 2f; // TODO: restore to final after eval
-  private static boolean LAZY_COMPRESSION = true; //TODO: restore to final after eval
+  private static boolean LAZY_COMPRESSION = false; //TODO: restore to final after eval
   private static double relRseFactor; //TODO: restore final: = sqrt(0.0512 / INIT_NUMBER_OF_SECTIONS);
   private static final double fixRseFactor = .06;
   //finals




[datasketches-java] 06/08: Merge branch 'master' into ReqExperiment

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 47bdb288084a58a2b495414209f143cd7428d3c3
Merge: 827d25b 3cb69ef
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 14:47:00 2021 -0800

    Merge branch 'master' into ReqExperiment





[datasketches-java] 04/08: Fix conflicts with ReqExperimental

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 3cb69ef60a8535c5a6af60f181b1defbb4546130
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 14:43:55 2021 -0800

    Fix conflicts with ReqExperimental
---
 src/main/java/org/apache/datasketches/req/ReqCompactor.java | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/main/java/org/apache/datasketches/req/ReqCompactor.java b/src/main/java/org/apache/datasketches/req/ReqCompactor.java
index 5c88410..80e0f6c 100644
--- a/src/main/java/org/apache/datasketches/req/ReqCompactor.java
+++ b/src/main/java/org/apache/datasketches/req/ReqCompactor.java
@@ -23,6 +23,7 @@ import static java.lang.Math.round;
 import static org.apache.datasketches.Util.numberOfTrailingOnes;
 import static org.apache.datasketches.req.ReqSketch.INIT_NUMBER_OF_SECTIONS;
 import static org.apache.datasketches.req.ReqSketch.MIN_K;
+import static org.apache.datasketches.req.ReqSketch.NOM_CAP_MULT;
 
 import java.util.Random;
 
@@ -37,7 +38,7 @@ import org.apache.datasketches.req.ReqSketch.CompactorReturn;
 class ReqCompactor {
   //finals
   private static final double SQRT2 = Math.sqrt(2.0);
-  private static final int NOM_CAP_MULT = 2;
+  //private static final int NOM_CAP_MULT = 2;
   private final byte lgWeight;
   private final boolean hra;
   //state variables
@@ -170,7 +171,7 @@ class ReqCompactor {
    * @return the current nominal capacity of this compactor.
    */
   int getNomCapacity() {
-    return NOM_CAP_MULT * numSections * sectionSize;
+    return (int)(NOM_CAP_MULT * numSections * sectionSize);
   }
 
   /**
@@ -231,7 +232,7 @@ class ReqCompactor {
   private boolean ensureEnoughSections() {
     final float szf;
     final int ne;
-    if (state >= 1L << numSections - 1
+    if (state >= 1L << numSections - 1 //TODO try adding: && sectionSize > MIN_K
         && sectionSize > MIN_K
         && (ne = nearestEven(szf = (float)(sectionSizeFlt / SQRT2))) >= MIN_K)
     {
@@ -253,6 +254,7 @@ class ReqCompactor {
   private long computeCompactionRange(final int secsToCompact) {
     final int bufLen = buf.getCount();
     int nonCompact = getNomCapacity() / 2 + (numSections - secsToCompact) * sectionSize;
+    // TODO: alternative: int nonCompact = (2 * numSections - secsToCompact) * sectionSize;
     //make compacted region even:
     nonCompact = (bufLen - nonCompact & 1) == 1 ? nonCompact + 1 : nonCompact;
     final long low =  hra ? 0                   : nonCompact;




[datasketches-java] 05/08: Fix conflicts with Master

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 827d25b196927c1675253602637367df956f2cb0
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 14:45:08 2021 -0800

    Fix conflicts with Master
---
 src/main/java/org/apache/datasketches/req/ReqCompactor.java | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/main/java/org/apache/datasketches/req/ReqCompactor.java b/src/main/java/org/apache/datasketches/req/ReqCompactor.java
index ca7d886..80e0f6c 100644
--- a/src/main/java/org/apache/datasketches/req/ReqCompactor.java
+++ b/src/main/java/org/apache/datasketches/req/ReqCompactor.java
@@ -232,12 +232,8 @@ class ReqCompactor {
   private boolean ensureEnoughSections() {
     final float szf;
     final int ne;
-<<<<<<< HEAD
     if (state >= 1L << numSections - 1 //TODO try adding: && sectionSize > MIN_K
-=======
-    if (state >= 1L << numSections - 1
         && sectionSize > MIN_K
->>>>>>> refs/heads/master
         && (ne = nearestEven(szf = (float)(sectionSizeFlt / SQRT2))) >= MIN_K)
     {
       sectionSizeFlt = szf;




[datasketches-java] 01/08: Merge branch 'master' into ReqExperiment

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 69db4d112c47fbaf3364c27dc186fc2d3aa0c43c
Merge: 07a11ea d768b80
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 13:54:07 2021 -0800

    Merge branch 'master' into ReqExperiment
    
    Conflicts:
    	src/main/java/org/apache/datasketches/req/ReqCompactor.java
    	src/main/java/org/apache/datasketches/req/ReqSketch.java

 .asf.yaml                                                 | 15 ---------------
 pom.xml                                                   |  1 +
 .../java/org/apache/datasketches/req/ReqCompactor.java    |  6 +++++-
 src/main/java/org/apache/datasketches/req/ReqSketch.java  | 15 +++++++++++----
 4 files changed, 17 insertions(+), 20 deletions(-)

diff --cc src/main/java/org/apache/datasketches/req/ReqCompactor.java
index ef7145d,5c88410..ca7d886
--- a/src/main/java/org/apache/datasketches/req/ReqCompactor.java
+++ b/src/main/java/org/apache/datasketches/req/ReqCompactor.java
@@@ -233,7 -231,8 +232,12 @@@ class ReqCompactor 
    private boolean ensureEnoughSections() {
      final float szf;
      final int ne;
++<<<<<<< HEAD
 +    if (state >= 1L << numSections - 1 //TODO try adding: && sectionSize > MIN_K
++=======
+     if (state >= 1L << numSections - 1
+         && sectionSize > MIN_K
++>>>>>>> refs/heads/master
          && (ne = nearestEven(szf = (float)(sectionSizeFlt / SQRT2))) >= MIN_K)
      {
        sectionSizeFlt = szf;
diff --cc src/main/java/org/apache/datasketches/req/ReqSketch.java
index d0790bc,9b99717..18d3aa6
--- a/src/main/java/org/apache/datasketches/req/ReqSketch.java
+++ b/src/main/java/org/apache/datasketches/req/ReqSketch.java
@@@ -39,14 -39,17 +39,17 @@@ import org.apache.datasketches.memory.M
   * <ul>
   * <li>The algorithm requires no upper bound on the stream length.
   * Instead, each relative-compactor counts the number of compaction operations performed
-  * so far (variable numCompactions). Initially, the relative-compactor starts with 3 sections.
-  * Each time the numCompactions exceeds 2^{numSections - 1}, we double numSections.</li>
 - * so far (via variable state). Initially, the relative-compactor starts with 3 sections.
 - * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double numSections.
 - * Note that after merging the sketch with another one variable state may not correspond to the number of
 - * compactions performed at a particular level, however, since the state variable never exceeds
 - * the number of compactions, the guarantees of the sketch remain valid.</li>
++ * so far (via variable state). Initially, the relative-compactor starts with INIT_NUMBER_OF_SECTIONS.
++ * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double 
++ * numSections. Note that after merging the sketch with another one variable state may not correspond 
++ * to the number of compactions performed at a particular level, however, since the state variable 
++ * never exceeds the number of compactions, the guarantees of the sketch remain valid.</li>
   *
   * <li>The size of each section (variable k and sectionSize in the code and parameter k in
   * the paper) is initialized with a value set by the user via variable k.
   * When the number of sections doubles, we decrease sectionSize by a factor of sqrt(2).
   * This is applied at each level separately. Thus, when we double the number of sections, the
-  * nominal compactor size increases by a factor of sqrt(2) (up to +-1 after rounding).</li>
 - * nominal compactor size increases by a factor of approx. sqrt(2) (up to rounding issues).</li>
++ * nominal compactor size increases by a factor of approx. sqrt(2) (+- rounding).</li>
   *
   * <li>The merge operation here does not perform "special compactions", which are used in the paper
   * to allow for a tight mathematical analysis of the sketch.</li>
@@@ -212,7 -185,6 +215,7 @@@ public class ReqSketch extends BaseReqS
          compactors.get(h + 1).getBuffer().mergeSortIn(promoted);
          retItems += cReturn.deltaRetItems;
          maxNomSize += cReturn.deltaNomSize;
-         if (LAZY_COMPRESSION && retItems < maxNomSize) { break; }
++        //if (LAZY_COMPRESSION && retItems < maxNomSize) { break; }
        }
      }
      aux = null;
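
The Javadoc in the diff above states that doubling numSections while shrinking sectionSize by sqrt(2) grows the nominal compactor size by roughly sqrt(2), subject to rounding. A small worked example may help. The sketch below assumes a capacity multiplier of 2 and assumes that nearestEven rounds to the closest even integer; the body of nearestEven is not shown in these emails, so the version here is a guess, and the class name SectionGrowthDemo is hypothetical.

```java
/** Worked example of the section-doubling rule; helper bodies are assumptions. */
final class SectionGrowthDemo {
  static final double SQRT2 = Math.sqrt(2.0);

  /** Assumed behavior: round x to the nearest even integer. */
  static int nearestEven(final double x) {
    return ((int) Math.round(x / 2.0)) << 1;
  }

  /** Nominal capacity with a multiplier of 2, as in the original constant. */
  static int nomCapacity(final int numSections, final int sectionSize) {
    return 2 * numSections * sectionSize;
  }

  public static void main(final String[] args) {
    int numSections = 3;           // initial number of sections
    double sectionSizeFlt = 12.0;  // k = 12, tracked as a real number
    int sectionSize = 12;
    for (int step = 0; step < 3; step++) {
      System.out.println(numSections + " sections x " + sectionSize
          + " -> nominal capacity " + nomCapacity(numSections, sectionSize));
      sectionSizeFlt /= SQRT2;                    // shrink by sqrt(2)
      sectionSize = nearestEven(sectionSizeFlt);  // keep section size even
      numSections *= 2;                           // double the sections
    }
  }
}
```

Starting from 3 sections of size 12 (capacity 72), the first doubling gives 6 sections of size 8 (capacity 96). The ratio 96/72 is about 1.33 rather than 1.414, which is the rounding effect the Javadoc mentions.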




[datasketches-java] 03/08: Fix conflicts with master

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit c12bc9daec82ab1a8889cd9f476418703b62e000
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 14:25:35 2021 -0800

    Fix conflicts with master
---
 src/main/java/org/apache/datasketches/req/ReqSketch.java | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/main/java/org/apache/datasketches/req/ReqSketch.java b/src/main/java/org/apache/datasketches/req/ReqSketch.java
index 18d3aa6..81039ac 100644
--- a/src/main/java/org/apache/datasketches/req/ReqSketch.java
+++ b/src/main/java/org/apache/datasketches/req/ReqSketch.java
@@ -40,16 +40,16 @@ import org.apache.datasketches.memory.Memory;
  * <li>The algorithm requires no upper bound on the stream length.
  * Instead, each relative-compactor counts the number of compaction operations performed
  * so far (via variable state). Initially, the relative-compactor starts with INIT_NUMBER_OF_SECTIONS.
- * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double 
- * numSections. Note that after merging the sketch with another one variable state may not correspond 
- * to the number of compactions performed at a particular level, however, since the state variable 
+ * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double
+ * numSections. Note that after merging the sketch with another one variable state may not correspond
+ * to the number of compactions performed at a particular level, however, since the state variable
  * never exceeds the number of compactions, the guarantees of the sketch remain valid.</li>
  *
  * <li>The size of each section (variable k and sectionSize in the code and parameter k in
  * the paper) is initialized with a value set by the user via variable k.
  * When the number of sections doubles, we decrease sectionSize by a factor of sqrt(2).
  * This is applied at each level separately. Thus, when we double the number of sections, the
- * nominal compactor size increases by a factor of approx. sqrt(2) (+- rounding).</li>
+ * nominal compactor size increases by a factor of approx. sqrt(2) (+/- rounding).</li>
  *
  * <li>The merge operation here does not perform "special compactions", which are used in the paper
  * to allow for a tight mathematical analysis of the sketch.</li>
@@ -215,7 +215,7 @@ public class ReqSketch extends BaseReqSketch {
         compactors.get(h + 1).getBuffer().mergeSortIn(promoted);
         retItems += cReturn.deltaRetItems;
         maxNomSize += cReturn.deltaNomSize;
-        //if (LAZY_COMPRESSION && retItems < maxNomSize) { break; }
+        if (LAZY_COMPRESSION && retItems < maxNomSize) { break; }
       }
     }
     aux = null;




[datasketches-java] 08/08: Merge branch 'master' into ReqExperiment

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit 4056806c2334f63b128d73980a18c613c4957cf4
Merge: 47bdb28 1a61267
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 15:41:54 2021 -0800

    Merge branch 'master' into ReqExperiment

 src/main/java/org/apache/datasketches/req/ReqSketch.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)




[datasketches-java] 02/08: Fix conflicts with ReqExperiment

Posted by le...@apache.org.

leerho pushed a commit to branch ReqExperiment
in repository https://gitbox.apache.org/repos/asf/datasketches-java.git

commit eb97863d0a790de8feec06a3b79e9c09d891dfad
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sun Jan 10 14:24:27 2021 -0800

    Fix conflicts with ReqExperiment
---
 .../org/apache/datasketches/req/ReqSketch.java     | 50 ++++++++++++++++++----
 1 file changed, 41 insertions(+), 9 deletions(-)

diff --git a/src/main/java/org/apache/datasketches/req/ReqSketch.java b/src/main/java/org/apache/datasketches/req/ReqSketch.java
index 9b99717..81039ac 100644
--- a/src/main/java/org/apache/datasketches/req/ReqSketch.java
+++ b/src/main/java/org/apache/datasketches/req/ReqSketch.java
@@ -39,17 +39,17 @@ import org.apache.datasketches.memory.Memory;
  * <ul>
  * <li>The algorithm requires no upper bound on the stream length.
  * Instead, each relative-compactor counts the number of compaction operations performed
- * so far (via variable state). Initially, the relative-compactor starts with 3 sections.
- * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double numSections.
- * Note that after merging the sketch with another one variable state may not correspond to the number of
- * compactions performed at a particular level, however, since the state variable never exceeds
- * the number of compactions, the guarantees of the sketch remain valid.</li>
+ * so far (via variable state). Initially, the relative-compactor starts with INIT_NUMBER_OF_SECTIONS.
+ * Each time the number of compactions (variable state) exceeds 2^{numSections - 1}, we double
+ * numSections. Note that after merging the sketch with another one variable state may not correspond
+ * to the number of compactions performed at a particular level, however, since the state variable
+ * never exceeds the number of compactions, the guarantees of the sketch remain valid.</li>
  *
  * <li>The size of each section (variable k and sectionSize in the code and parameter k in
  * the paper) is initialized with a value set by the user via variable k.
  * When the number of sections doubles, we decrease sectionSize by a factor of sqrt(2).
  * This is applied at each level separately. Thus, when we double the number of sections, the
- * nominal compactor size increases by a factor of approx. sqrt(2) (up to rounding issues).</li>
+ * nominal compactor size increases by a factor of approx. sqrt(2) (+/- rounding).</li>
  *
  * <li>The merge operation here does not perform "special compactions", which are used in the paper
  * to allow for a tight mathematical analysis of the sketch.</li>
@@ -74,9 +74,11 @@ import org.apache.datasketches.memory.Memory;
 public class ReqSketch extends BaseReqSketch {
   //static finals
   private static final String LS = System.getProperty("line.separator");
-  static final int INIT_NUMBER_OF_SECTIONS = 3;
-  static final int MIN_K = 4;
-  private static final double relRseFactor = sqrt(0.0512 / INIT_NUMBER_OF_SECTIONS);
+  static byte INIT_NUMBER_OF_SECTIONS = 3; // TODO: restore to final after eval
+  static int MIN_K = 4; // TODO: restore to final after eval
+  static float NOM_CAP_MULT = 2f; // TODO: restore to final after eval
+  private static boolean LAZY_COMPRESSION = true; //TODO: restore to final after eval
+  private static double relRseFactor; //TODO: restore final: = sqrt(0.0512 / INIT_NUMBER_OF_SECTIONS);
   private static final double fixRseFactor = .06;
   //finals
   private final int k;  //user config, default is 12 (1% @ 95% Conf)
@@ -96,6 +98,34 @@ public class ReqSketch extends BaseReqSketch {
   private final CompactorReturn cReturn = new CompactorReturn(); //used in compress()
 
   /**
+   * Temporary ctor for evaluation
+   * @param k blah
+   * @param highRankAccuracy blah
+   * @param reqDebug blah
+   * @param initNumSections blah
+   * @param minK blah
+   * @param nomCapMult blah
+   * @param lazyCompression blah
+   */
+  public ReqSketch(final int k, final boolean highRankAccuracy, final ReqDebug reqDebug,
+      final byte initNumSections, final int minK, final float nomCapMult,
+      final boolean lazyCompression) {
+    checkK(k);
+    this.k = k;
+    hra = highRankAccuracy;
+    retItems = 0;
+    maxNomSize = 0;
+    totalN = 0;
+    this.reqDebug = reqDebug;
+    INIT_NUMBER_OF_SECTIONS = initNumSections; //was 3
+    relRseFactor = sqrt(0.0512 / initNumSections);
+    MIN_K = minK; //was 4
+    NOM_CAP_MULT = nomCapMult; //was 2
+    LAZY_COMPRESSION = lazyCompression; //was true
+    grow();
+  }
+
+  /**
    * Normal Constructor used by ReqSketchBuilder.
    * @param k Controls the size and error of the sketch. It must be even and in the range
    * [4, 1024], inclusive.
@@ -185,6 +215,7 @@ public class ReqSketch extends BaseReqSketch {
         compactors.get(h + 1).getBuffer().mergeSortIn(promoted);
         retItems += cReturn.deltaRetItems;
         maxNomSize += cReturn.deltaNomSize;
+        if (LAZY_COMPRESSION && retItems < maxNomSize) { break; }
       }
     }
     aux = null;
@@ -514,6 +545,7 @@ public class ReqSketch extends BaseReqSketch {
     retItems++;
     totalN++;
     if (retItems >= maxNomSize) {
+      buf.sort();
       compress();
     }
     aux = null;
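
The temporary evaluation constructor in the diff above assigns its arguments to static fields (INIT_NUMBER_OF_SECTIONS, MIN_K, NOM_CAP_MULT, LAZY_COMPRESSION). A likely consequence, sketched below under the assumption of a simplified stand-in class named StaticConfigDemo, is that the most recently constructed instance sets these values for every instance in the JVM. This appears acceptable for a single-configuration evaluation run and is presumably why the TODO comments ask to restore final afterwards.

```java
/** Simplified stand-in showing how a constructor that writes a static field behaves. */
final class StaticConfigDemo {
  static int MIN_K = 4;  // shared by all instances, non-final as in the evaluation code

  private final int k;

  StaticConfigDemo(final int k, final int minK) {
    this.k = k;
    MIN_K = minK;        // overwrites the value seen by every existing instance
  }

  int getK() { return k; }

  int getMinK() { return MIN_K; }

  public static void main(final String[] args) {
    final StaticConfigDemo a = new StaticConfigDemo(12, 4);
    final StaticConfigDemo b = new StaticConfigDemo(12, 8);
    // Both lines report the value passed to the second constructor call.
    System.out.println("a sees MIN_K = " + a.getMinK());
    System.out.println("b sees MIN_K = " + b.getMinK());
  }
}
```

Comparing two configurations side by side in one process would therefore need per-instance fields instead of statics.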

