Posted to commits@datasketches.apache.org by le...@apache.org on 2020/03/22 00:33:50 UTC

[incubator-datasketches-website] branch Restructuring created (now e82010c)

This is an automated email from the ASF dual-hosted git repository.

leerho pushed a change to branch Restructuring
in repository https://gitbox.apache.org/repos/asf/incubator-datasketches-website.git.


      at e82010c  Update Downloads page.

This branch includes the following new commits:

     new e82010c  Update Downloads page.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@datasketches.apache.org
For additional commands, e-mail: commits-help@datasketches.apache.org


[incubator-datasketches-website] 01/01: Update Downloads page.

Posted by le...@apache.org.

leerho pushed a commit to branch Restructuring
in repository https://gitbox.apache.org/repos/asf/incubator-datasketches-website.git

commit e82010c149ef8773be018f12ef3fa0557363abe7
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Sat Mar 21 17:33:03 2020 -0700

    Update Downloads page.
    
    Minor wording changes to Sketch Elements, Sketch Origins and The
    Challenge pages.
---
 docs/Community/Downloads.md |  9 ---------
 docs/SketchElements.md      | 11 +++++++----
 docs/SketchOrigins.md       | 10 ++++------
 docs/TheChallenge.md        |  2 +-
 4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/docs/Community/Downloads.md b/docs/Community/Downloads.md
index 8f9d73f..da15854 100644
--- a/docs/Community/Downloads.md
+++ b/docs/Community/Downloads.md
@@ -31,9 +31,6 @@ It is essential that you verify the integrity of release downloads. See [instruc
 ## Download Java Jar Files
 From [Maven Central](https://search.maven.org/search?q=g:%20org.apache.datasketches).
 
-## Download Shapshot Versions
-Clone or fork the current SNAPSHOT directly from the relevant [DataSketches repository](https://github.com/apache?utf8=%E2%9C%93&q=datasketches).
-
 ## Version Numbers
 Apache DataSketches uses [semantic versioning](https://semver.org/). Version numbers use the form major.minor.incremental and are incremented as follows:
 
@@ -78,12 +75,6 @@ shaded versions of the core jar and memory jar. The shading avoids conflicts wit
 of core Java and Memory that you might have in your system.
 
 
-### SNAPSHOT Jars
-If you want the latest and greatest version of the code, it is certainly OK for you to create your 
-own snapshot jars from a clone or fork. 
-The code is automatically tested using the current test suite, but you might catch the code in
-transition to a new future release. Caveat Emptor.
-
 ### Version History
 Please use GitHub revisions history on the respective component repositories
 
diff --git a/docs/SketchElements.md b/docs/SketchElements.md
index e6388e1..8451865 100644
--- a/docs/SketchElements.md
+++ b/docs/SketchElements.md
@@ -21,10 +21,10 @@ layout: doc_page
 -->
 ## Sketch Elements
 
-Sketches are different from traditional sampling techniques in that sketches process all 
+Sketches are different from traditional sampling techniques in that sketches examine all 
 the elements of a stream, touching each element only once,
-and have some form of randomization that forms the basis of their stochastic nature. 
-This "one-touch" property makes sketches ideally suited for real-time data processing.
+and often have some form of randomization that forms the basis of their stochastic nature. 
+This "one-touch" property makes sketches ideally suited for real-time stream processing.
 
 As an example, the first stage of a 
 <a href="https://en.wikipedia.org/wiki/Count-distinct_problem">count-distinct</a> sketching 
@@ -47,7 +47,7 @@ proven error distribution bounds.
 
 <img class="doc-img-full" src="{{site.docs_img_dir}}/SketchElements.png" alt="SketchElements" />
 
-Sketches are typically
+Sketches are typically<sup>1</sup>
 
 * Small in size. They are typically orders of magnitude smaller than the raw input data stream. 
 Sketches implement *sublinear* algorithms that grow in size much slower than that of the size of
@@ -60,3 +60,6 @@ The sketch only needs to see each item in the stream once.
 be merged without losing accuracy.
 * Approximate. As an example, for unique count sketches the relative error bounds 
 are a function of the configured size of the sketch.
+
+---
+<sup>1</sup> For a more comprehensive definition see [Sketch Criteria]({{site.docs_dir}}/Architecture/SketchCriteria.html)
\ No newline at end of file
diff --git a/docs/SketchOrigins.md b/docs/SketchOrigins.md
index 760d43f..a570341 100644
--- a/docs/SketchOrigins.md
+++ b/docs/SketchOrigins.md
@@ -21,10 +21,8 @@ layout: doc_page
 -->
 ## Sketch Origins
 
-Sketching is a relatively recent development in the theoretical field of 
-<a href="https://en.wikipedia.org/wiki/Streaming_algorithm"><i>Stochastic Streaming Algorithms</i></a><sup>1</sup>, 
-which deals with algorithms that can extract information from a stream of data in a single pass 
-(sometimes called "one-touch" processing) using various randomization techniques. 
+Sketching is a relatively recent development in computer science and in the theoretical literature is often referred to as a class of <a href="https://en.wikipedia.org/wiki/Streaming_algorithm"><i>Streaming Algorithms</i></a><sup>1</sup>, 
+Sketches implement algorithms that can extract information from a stream of data in a single pass, which is also known as "one-touch" processing.  Some sketches can be deterministic, although most sketches are probabilistic in their behavior and take advantage of various randomization techniques. 
 
 Sketching is a synergistic blend of theoretical mathematics, statistics and computer science, 
 refers to a broad range of algorithms, and has experienced a great deal of interest and growth 
@@ -35,8 +33,8 @@ describe these algorithms and associated data structures that implement the theo
 
 <img class="doc-img-full" src="{{site.docs_img_dir}}/SketchOrigins.png" alt="SketchOrigins" />
 
-<a href="https://en.wikipedia.org/wiki/Philippe_Flajolet">Philippe Flajolet</a> 
-is often refered to as the father of sketching with his research in analytic combinatorics and 
+The French mathematician <a href="https://en.wikipedia.org/wiki/Philippe_Flajolet">Philippe Flajolet</a> 
+is often regarded as the father of sketching with his research in analytic combinatorics and 
 analysis of algorithms. 
 His 1985 paper 
 <a href="http://db.cs.berkeley.edu/cs286/papers/flajoletmartin-jcss1985.pdf"> <!-- does not work with https -->
diff --git a/docs/TheChallenge.md b/docs/TheChallenge.md
index 93f313a..9b924b5 100644
--- a/docs/TheChallenge.md
+++ b/docs/TheChallenge.md
@@ -70,7 +70,7 @@ This, of course, assumes that you care about query responsiveness and speed; tha
 [Sketches]({{site.docs_dir}}/SketchOrigins.html), the informal name for these algorithms, offer an excellent solution to these types of queries, and in some cases may be the only solution.
 
 Instead of requiring to keep such enormous data on-hand, sketches have small data structures that are usually kilobytes in size, orders-of-magnitude smaller than required by the exact solutions. 
-Sketches are also streaming algorithms, in that they only need to see each incoming item only once.
+Sketches are also streaming algorithms, in that they only need to see each incoming item once.
 
 ## System Architecture for Sketch Processing of Big Data 
 

