Posted to commits@datasketches.apache.org by gi...@apache.org on 2022/08/12 08:25:36 UTC

[datasketches-website] branch asf-site updated: Automatic Site Publish by Buildbot

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datasketches-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new d89fa7fd Automatic Site Publish by Buildbot
d89fa7fd is described below

commit d89fa7fdb7d887e79f21801e7f3f4d0990ce5748
Author: buildbot <us...@infra.apache.org>
AuthorDate: Fri Aug 12 08:25:32 2022 +0000

    Automatic Site Publish by Buildbot
---
 .../SketchingQuantilesAndRanksTutorial.html        | 22 +++++++++++-----------
 output/docs/Tuple/TupleEngagementExample.html      |  2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/output/docs/Quantiles/SketchingQuantilesAndRanksTutorial.html b/output/docs/Quantiles/SketchingQuantilesAndRanksTutorial.html
index e0b044e5..3d62ea39 100644
--- a/output/docs/Quantiles/SketchingQuantilesAndRanksTutorial.html
+++ b/output/docs/Quantiles/SketchingQuantilesAndRanksTutorial.html
@@ -509,16 +509,16 @@
 -->
 <h1 id="sketching-quantiles-and-ranks-the-basics">Sketching Quantiles and Ranks, the Basics</h1>
 <p>Streaming quantiles algorithms, or quantiles sketches, enable us to analyze the distributions 
-of massive data very quickly using only a small amout of space.<br />
-They allow us to compute a quantile values given a desired rank, or compute a rank given
+of massive data very quickly using only a small amount of space.<br />
+They allow us to compute quantile values given a desired rank, or compute a rank given
 a quantile value. Quantile sketches enable us to plot the CDF, PMF or histograms of a distribution.</p>
 
-<p>The goal of this short tutorial it to introduce to the reader some of the basic concepts 
+<p>The goal of this short tutorial it to introduce the reader to some of the basic concepts 
 of quantiles, ranks and their functions.</p>
 
 <h2 id="what-is-a-rank">What is a rank?</h2>
 
-<h3 id="a-rank-identifies-the-numeric-position-of-a-specific-value-in-an-enumerated-ordered-set-if-values">A <strong><em>rank</em></strong> identifies the numeric position of a specific value in an enumerated, ordered set if values.</h3>
+<h3 id="a-rank-identifies-the-numeric-position-of-a-specific-value-in-an-enumerated-ordered-set-of-values">A <strong><em>rank</em></strong> identifies the numeric position of a specific value in an enumerated, ordered set of values.</h3>
 
 <p>The actual enumeration can be done in several ways, but for our use here we will define the two common ways that <em>rank</em> can be specified and that we will use.</p>
 
@@ -534,7 +534,7 @@ of quantiles, ranks and their functions.</p>
 
 <h3 id="rank-and-mass">Rank and Mass</h3>
 
-<p><em>Normalized rank</em> is closely associated with the concept of <em>mass</em>. The value associated with the rank 0.5 represents the median value, or the center of <em>mass</em> of the entire set, where half of the values are below the median and half are above. The concept of mass is important to understanding the Prabability Mass Function (PMF) offered by all the quantile sketches in the library.</p>
+<p><em>Normalized rank</em> is closely associated with the concept of <em>mass</em>. The value associated with the rank 0.5 represents the median value, or the center of <em>mass</em> of the entire set, where half of the values are below the median and half are above. The concept of mass is important to understanding the Probability Mass Function (PMF) offered by all the quantile sketches in the library.</p>
 
 <h2 id="what-is-a-quantile">What is a quantile?</h2>
 
@@ -546,7 +546,7 @@ To wit:</p>
 <ul>
   <li>A percentile is a quantile where the rank domain is divided into hundredths. For example, “An SAT Math score of 740 is at the 95th percentile”. The score of 740 is the quantile and .95 is the normalized rank.</li>
   <li>A decile is a quantile where the rank domain is divided into tenths. For example, “An SAT Math score of 690 is at the 9th decile (rank = 0.9).</li>
-  <li>A quartile is a quantile where the rank domain is divided into forths. For example, “An SAT Math score of 600 is at the third quartile (rank = 0.75).</li>
+  <li>A quartile is a quantile where the rank domain is divided into fourths. For example, “An SAT Math score of 600 is at the third quartile (rank = 0.75).</li>
   <li>The median is a quantile that splits the rank domain in half. For example, “An SAT Math score of 520 is at the median (rank = 0.5).</li>
 </ul>
 
@@ -647,7 +647,7 @@ To wit:</p>
 the function <em>r(q)</em> is ambiguous. We will see how to resolve this shortly.</p>
 
 <h2 id="the-challenge-of-approximation">The challenge of approximation</h2>
-<p>By definiton, sketching algorithms are approximate, and they achieve their high performance by discarding  data.  Suppose you feed <em>n</em> items into a sketch that retains only <em>m &lt; n</em> items. This means <em>n-m</em> values were discarded.  The sketch must track the value <em>n</em> used for computing the rank and quantile functions. When the sketch reconstructs the relationship between ranks and values <em>n-m</em> rank values are missing creating holes in the sequence of [...]
+<p>By definition, sketching algorithms are approximate, and they achieve their high performance by discarding  data.  Suppose you feed <em>n</em> items into a sketch that retains only <em>m &lt; n</em> items. This means <em>n-m</em> values were discarded.  The sketch must track the value <em>n</em> used for computing the rank and quantile functions. When the sketch reconstructs the relationship between ranks and values <em>n-m</em> rank values are missing creating holes in the sequence o [...]
 
 <p>The raw data might look like this, with its associated natural ranks.</p>
 
@@ -711,7 +711,7 @@ the function <em>r(q)</em> is ambiguous. We will see how to resolve this shortly
 
 <p>When the sketch deletes values it adjusts the associated ranks by effectively increasing the “weight” of adjacent items so that they are positionally approximately correct and the top rank corresponds to <em>n</em>.</p>
 
-<p>How do we resove <em>q(3)</em> or <em>r(20)</em>?</p>
+<p>How do we resolve <em>q(3)</em> or <em>r(20)</em>?</p>
 
 <h2 id="the-need-for-inequality-search">The need for inequality search</h2>
 <p>The quantile sketch algorithms discussed in the literature primarily differ by how they choose which values in the stream should be discarded. After the elimination process, all of the quantiles sketch implementations are left with the challenge of how to reconstruct the actual distribution, approximately and with good accuracy.</p>
@@ -796,21 +796,21 @@ Given <em>q</em>, search the quantile array until we find the adjacent pair <em>
       <td>Qualifying pair</td>
       <td> </td>
       <td> </td>
+      <td>q1</td>
+      <td>q2</td>
       <td> </td>
       <td> </td>
       <td> </td>
-      <td>q1</td>
-      <td>q2</td>
       <td> </td>
     </tr>
     <tr>
       <td>Rank result</td>
       <td> </td>
       <td> </td>
+      <td>.357</td>
       <td> </td>
       <td> </td>
       <td> </td>
-      <td>.786</td>
       <td> </td>
       <td> </td>
     </tr>
diff --git a/output/docs/Tuple/TupleEngagementExample.html b/output/docs/Tuple/TupleEngagementExample.html
index c574a642..276e3fd2 100644
--- a/output/docs/Tuple/TupleEngagementExample.html
+++ b/output/docs/Tuple/TupleEngagementExample.html
@@ -518,7 +518,7 @@
 
 <p>The X-axis is the number of days that a specific customer (identified by some unique ID) visits our site in a 30 day period.</p>
 
-<p>The Y-axis is the number of distinct visitors (customers) that have visited our site Y number of times during the 30 day period.</p>
+<p>The Y-axis is the number of distinct visitors (customers) that have visited our site X number of times during the 30 day period.</p>
 
 <p>Reading this histogram we can see that about 100 distinct visitors visited our site exactly one day out of the 30 day period. About 11 visitors visited our site on 5 different days of the 30 day period. And, it seems that we have one customer that visited our site every day of the 30 day period!  We certainly want to encourage more of these loyal customers.</p>
 

