Posted to commits@datasketches.apache.org by le...@apache.org on 2020/07/14 23:12:41 UTC

[incubator-datasketches-website] branch master updated: Update website add indirect references to System Integrations

This is an automated email from the ASF dual-hosted git repository.

leerho pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-datasketches-website.git


The following commit(s) were added to refs/heads/master by this push:
     new d13ead9  Update website add indirect references to System Integrations
d13ead9 is described below

commit d13ead98f953e44bc3951a39225db17e4006a35d
Author: Lee Rhodes <le...@users.noreply.github.com>
AuthorDate: Tue Jul 14 16:12:17 2020 -0700

    Update website add indirect references to System Integrations
---
 _includes/toc.html                                 |  5 +-
 docs/Architecture/Components.md                    |  9 ++--
 docs/Architecture/LargeScale.md                    | 59 ++++++++++++----------
 docs/Architecture/SketchCriteria.md                |  2 +-
 docs/Community/Downloads.md                        |  3 ++
 docs/Community/Transitioning.md                    |  2 +-
 docs/HLL/HllSketchVsDruidHyperLogLogCollector.md   |  2 +-
 .../ApacheDruidIntegration.md}                     |  4 +-
 .../ApacheHiveIntegration.md}                      |  6 ++-
 .../ApachePigIntegration.md}                       |  6 ++-
 .../PostgreSQLIntegration.md}                      |  7 ++-
 src/main/resources/docgen/toc.json                 |  5 +-
 12 files changed, 66 insertions(+), 44 deletions(-)

diff --git a/_includes/toc.html b/_includes/toc.html
index ea27b03..08c98d2 100644
--- a/_includes/toc.html
+++ b/_includes/toc.html
@@ -285,7 +285,10 @@
     <a data-toggle="collapse" class="menu collapsed" href="#collapse_system_integrations">System Integrations</a>
   </p>
   <div class="collapse" id="collapse_system_integrations">
-    <li><a href="{{site.docs_dir}}/DruidIntegration.html">•Using Sketches in Druid</a></li>
+    <li><a href="{{site.docs_dir}}/SystemIntegrations/ApacheDruidIntegration.html">•Using Sketches in ApacheDruid</a></li>
+    <li><a href="{{site.docs_dir}}/SystemIntegrations/ApacheHiveIntegration.html">•Using Sketches in Apache Hive</a></li>
+    <li><a href="{{site.docs_dir}}/SystemIntegrations/ApachePigIntegration.html">•Using Sketches in Apache Pig</a></li>
+    <li><a href="{{site.docs_dir}}/SystemIntegrations/PostgreSQLIntegration.html">•Using Sketches in PostgreSQL</a></li>
   </div>
 
   <p id="community">
diff --git a/docs/Architecture/Components.md b/docs/Architecture/Components.md
index a530441..1d12c6e 100644
--- a/docs/Architecture/Components.md
+++ b/docs/Architecture/Components.md
@@ -37,16 +37,17 @@ If you like what you see give us a **Star** on one of these two sites!
 Apapters integrate the core components into the aggregation APIs of specific data processing systems. Some of these adapters are available as part of the library, other adapters are directly integrated into the target data processing application.
 
 ### Java Adaptors
-* **[Apache Hive](https://github.com/apache/incubator-datasketches-hive)** (Versioned, Apache Released)
+* **[Apache Druid](https://datasketches.apache.org/docs/SystemIntegrations/ApacheDruidIntegration.html)** (Apach Released as part of Druid)
+* **[Apache Hive](https://datasketches.apache.org/docs/SystemIntegrations/ApacheHiveIntegration.html)** (Versioned, Apache Released)
     * [Theta Sketch Example]({{site.docs_dir}}/Theta/ThetaHiveUDFs.html)
     * [Tuple Sketch Example]({{site.docs_dir}}/Tuple/TuplePigUDFs.html)
-* **[Apache Pig](https://github.com/apache/incubator-datasketches-pig)** (Versioned, Apache Released)
+* **[Apache Pig](https://datasketches.apache.org/docs/SystemIntegrations/ApachePigIntegration.html)** (Versioned, Apache Released)
     * [Theta Sketch Example]({{site.docs_dir}}/Theta/ThetaPigUDFs.html)
     * [Tuple Sketch Example]({{site.docs_dir}}/Tuple/TuplePigUDFs.html) 
-* **[Apache Druid](https://github.com/apache/druid/tree/master/extensions-core/datasketches)** (Apach Released as part of Druid)
+
 
 ### C++ Adaptors
-* **[PostgreSQL](https://github.com/apache/incubator-datasketches-postgresql)** (Versioned, Apache Released)
+* **[PostgreSQL](https://datasketches.apache.org/docs/SystemIntegrations/PostgreSQLIntegration.html)** (Versioned, Apache Released)
 This site provides the postgres-specific adaptors that wrap the C++ implementations making
 them available to the PostgreSQL database users. PostgreSQL users should download the PostgreSQL extension from [pgxn.org](https://pgxn.org/dist/datasketches/).  For examples refer to the README on the component site.
 
diff --git a/docs/Architecture/LargeScale.md b/docs/Architecture/LargeScale.md
index cbcb307..2a7133e 100644
--- a/docs/Architecture/LargeScale.md
+++ b/docs/Architecture/LargeScale.md
@@ -21,46 +21,53 @@ layout: doc_page
 -->
 ## Designed for Large-scale Computing Systems
 
-### Minimal Dependencies
+### Easy Integration with Minimal Dependencies
 
-* Can be integrated into virtually any Java-base system environment.
+* [Java Core](https://datasketches.apache.org/docs/Community/Downloads.html)
   
-* The core library (including Memory) has no dependencies outside of the Java JVM at runtime.
+    * The Java core library (including Memory) has no dependencies outside of the Java JVM at runtime allowing simple integration into virtually any Java based system environment.
+    * All of the Java components are Maven Deployable and registered with [The Central Repository](https://search.maven.org/classic/#search%7Cga%7C1%7Cg%3A%22org.apache.datasketches%22)
 
-### Maven Deployable
+* [C++ Core](https://datasketches.apache.org/docs/Community/Downloads.html)
+    * The C++ core is written as all header files allowing easy integration into a wide range of operating system environments. 
 
-* Registered with <a href="https://search.maven.org/#search|ga|1|DataSketches">The Central Repository</a>
+* [Python](https://github.com/apache/incubator-datasketches-cpp/tree/master/python)
+	* The C++ Core is extended using the python binding library [pybind11](https://github.com/pybind/pybind11) enabling high performance operation from Python.
+
+### Cross Language Binary Compatibility
+
+* Sketches serialized from C++ or Python can be interpreted by compatible Java sketches and visa versa. 
 
 ### Speed
 
-* These single-pass, "one-touch" algorithms are <a href="{{site.docs_dir}}/Theta/ThetaUpdateSpeed.html"><i>fast</i></a> to enable real-time processing capability.
+* These single-pass, "one-touch" algorithms are <i>fast ([see example](https://datasketches.apache.org/docs/Theta/ThetaUpdateSpeed.html))</i> to enable real-time processing capability.
   
-* Coupled with the compact binary representations, in many cases the need for costly serialization and deserialization has been eliminated.
+* Sketches can be represented in an updatable or compact form. The compact form is smaller,  immutable and faster to merge.
+
+* Some of the Java sketches have been designed to be instantiated and operated <i>off-heap</i>, whicn eliminates costly serialization and deserialization.
   
-* The sketch data structures are "additive" and embarrassingly parallelizable. The Theta sketches can be merged without losing accuracy.
+* The sketch data structures are "additive" and embarrassingly parallelizable. Sketches can be merged without losing accuracy.
 
-### Integration for Hive, Pig, Druid and Spark
+### Systems Integrations
 
-* <a href="https://github.com/apache/incubator-datasketches-hive">Apache Hive Adaptors</a>.
-  
-* <a href="https://github.com/apache.incubator-datasketches-pig">Apache Pig Adaptors</a>.
-  
-* <a href="https://github.com/apache/druid/tree/master/extensions-core/datasketches">Druid Adaptors</a>.
-  * For documentation see <a href="https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html">Druid Datasketches extension</a>
-  
-* <a href="{{site.docs_dir}}/Theta/ThetaSparkExample.html">Spark Examples</a> 
+* [Druid Integration](https://datasketches.apache.org/docs/SystemIntegrations/ApacheDruidIntegration.html)  
+* [Apache Hive, Apache Pig and PostgreSQL](https://datasketches.apache.org/docs/Community/Downloads.html)
+* [Apache Hive](https://datasketches.apache.org/docs/SystemIntegrations/ApacheHiveIntegration.html)
+* [Apache Pig](https://datasketches.apache.org/docs/SystemIntegrations/ApachePigIntegration.html)
+* [PostgreSQL](https://datasketches.apache.org/docs/SystemIntegrations/PostgreSQLIntegration.html)
+* [Spark Examples](https://datasketches.apache.org/docs/Theta/ThetaSparkExample.html) 
 
-### Specific Theta Sketch Features for Large Data
+### Specific Sketch Features for Large Data
 
 * <b>Hash Seed Handling</b>. Additional protection for managing hash seeds which is 
-particularly important when processing sensitive user identifiers.
+particularly important when processing sensitive user identifiers. Available with Theta Sketches.
 
-* <a href="{{site.docs_dir}}/Theta/ThetaPSampling.html"><b>Sampling</b></a>. Built-in up-front sampling for cases where additional 
-contol is required to limit overall memory consumption when dealing with millions of sketches.
+* <a href="{{site.docs_dir}}/Theta/ThetaPSampling.html"><b>Pre-Sampling</b></a>. Built-in up-front sampling for cases where additional 
+contol is required to limit overall memory consumption when dealing with millions of sketches. Available with Theta Sketches.
 
-* Off-Heap <a href="{{site.docs_dir}}/Memory/MemoryPackage.html"><b>Memory Package</b></a>. 
+* <a href="{{site.docs_dir}}/Memory/MemoryPackage.html"><b>Memory Package</b></a>. 
 Large query systems often require their own heaps outside the JVM in order to better manage garbage collection latencies. 
-The sketches in this package are designed to operate either on-heap or off-heap.
+The Java sketches utilize this powerful package. 
 
 * Built-in <b>Upper-Bound and Lower-Bound estimators</b>. 
 You are never in the dark about how good of an estimate the sketch is providing. 
@@ -70,10 +77,6 @@ confidence level.
 * User configurable trade-offs of accuracy vs. storage space as well as other performance 
 tuning options.
 
-* Additional protection of sensitive data by user configuration of a hash seed that is 
-not stored with the serialized data.
-
 * <b>Small Footprint Per Sketch</b>. The operating and storage footprint for both 
-row and column oriented storage are minimized with 
-<a href="{{site.docs_dir}}/Theta/ThetaSize.html">compact binary representations</a>, which are much smaller 
+row and column oriented storage are minimized with compact binary representations, which are much smaller 
 than the raw input stream and with a well defined upper bound of size.
diff --git a/docs/Architecture/SketchCriteria.md b/docs/Architecture/SketchCriteria.md
index 0fca52d..13feee7 100644
--- a/docs/Architecture/SketchCriteria.md
+++ b/docs/Architecture/SketchCriteria.md
@@ -20,7 +20,7 @@ layout: doc_page
     under the License.
 -->
 
-# Sketch Criteria for Library Inclusion
+# Sketch Criteria for DataSketches Library
 
 There are lots of clever and useful algorithms that are sometimes called "sketches".  However, due to limited resources, in order to be included in the DataSketches library, we had to clearly define what we meant by the term "sketch".  Otherwise, we would end up with a hodge podge of algorithms and have to answer: Why don't we include algorithm X?.
 
diff --git a/docs/Community/Downloads.md b/docs/Community/Downloads.md
index 8203896..7ec7d68 100644
--- a/docs/Community/Downloads.md
+++ b/docs/Community/Downloads.md
@@ -31,6 +31,9 @@ It is essential that you verify the integrity of release downloads. See [instruc
 ## Download Java Jar Files
 From [Maven Central](https://search.maven.org/search?q=g:%20org.apache.datasketches).
 
+## Enabling Python
+* First download the C++ core above, then read the [Python Installation Instructions](https://github.com/apache/incubator-datasketches-cpp/tree/master/python)
+
 ## Download Earlier Versions
 
 * **[ZIP Files](http://archive.apache.org/dist/incubator/datasketches/java/)**
diff --git a/docs/Community/Transitioning.md b/docs/Community/Transitioning.md
index de46f4e..dc07c3b 100644
--- a/docs/Community/Transitioning.md
+++ b/docs/Community/Transitioning.md
@@ -44,7 +44,7 @@ View all of our Apache DataSketches repository components as a [list](https://gi
   * **sketches-hive** moved to [incubator-datasketches-hive](https://github.com/apache/incubator-datasketches-hive) Adapts the Java core to Apache Hive.
   * **sketches-pig** moved to [incubator-datasketches-pig](https://github.com/apache/incubator-datasketches-pig) Adapts the Java core to Apache Pig.
   * **sketches-vector** moved to [incubator-datasketches-vector](https://github.com/apache/incubator-datasketches-vector) Experimental sketches for vector and matrix processing.
-  * [Apache Druid adaptors](https://github.com/apache/druid/tree/master/extensions-core/datasketches)
+  * [Apache Druid adaptors](https://datasketches.apache.org/docs/SystemIntegrations/ApacheDruidIntegration.html)
 
 * C++ / [Python](https://github.com/apache/incubator-datasketches-cpp/tree/master/python) Core
   * **sketches-core-cpp** moved to [incubator-datasketches-cpp](https://github.com/apache/incubator-datasketches-cpp) This is the **core** library that contains all major sketch algorithms written in C++ and Python.
diff --git a/docs/HLL/HllSketchVsDruidHyperLogLogCollector.md b/docs/HLL/HllSketchVsDruidHyperLogLogCollector.md
index ebf9bc6..721c9cb 100644
--- a/docs/HLL/HllSketchVsDruidHyperLogLogCollector.md
+++ b/docs/HLL/HllSketchVsDruidHyperLogLogCollector.md
@@ -89,4 +89,4 @@ The code to reproduce these measurements is available in the <a href="https://gi
 
 ## DataSketches HLL Sketch Druid module
 
-The DataSketches Hll sketch module for Druid is available as a part of the <a href="https://github.com/apache/druid/tree/master/extensions-core/datasketches">druid/extensions-core</a>.
+The DataSketches Hll sketch module for Druid is available as a part of [Druid/Extensions-core](https://datasketches.apache.org/docs/SystemIntegrations/ApacheDruidIntegration.html)
diff --git a/docs/DruidIntegration.md b/docs/SystemIntegrations/ApacheDruidIntegration.md
similarity index 86%
copy from docs/DruidIntegration.md
copy to docs/SystemIntegrations/ApacheDruidIntegration.md
index 445c2c0..b478350 100644
--- a/docs/DruidIntegration.md
+++ b/docs/SystemIntegrations/ApacheDruidIntegration.md
@@ -21,4 +21,6 @@ layout: doc_page
 -->
 ### Druid Integration
 
-See <a href="https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html">Druid DataSketches extension</a>
+* See [Druid DataSketches Extension](https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html)
+
+
diff --git a/docs/DruidIntegration.md b/docs/SystemIntegrations/ApacheHiveIntegration.md
similarity index 81%
copy from docs/DruidIntegration.md
copy to docs/SystemIntegrations/ApacheHiveIntegration.md
index 445c2c0..62caa89 100644
--- a/docs/DruidIntegration.md
+++ b/docs/SystemIntegrations/ApacheHiveIntegration.md
@@ -19,6 +19,8 @@ layout: doc_page
     specific language governing permissions and limitations
     under the License.
 -->
-### Druid Integration
+### Apache Hive Integration
 
-See <a href="https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html">Druid DataSketches extension</a>
+* [Download](https://datasketches.apache.org/docs/Community/Downloads.html)
+
+* Build and Install, See [Hive README](https://github.com/apache/incubator-datasketches-hive)
\ No newline at end of file
diff --git a/docs/DruidIntegration.md b/docs/SystemIntegrations/ApachePigIntegration.md
similarity index 81%
copy from docs/DruidIntegration.md
copy to docs/SystemIntegrations/ApachePigIntegration.md
index 445c2c0..9f55242 100644
--- a/docs/DruidIntegration.md
+++ b/docs/SystemIntegrations/ApachePigIntegration.md
@@ -19,6 +19,8 @@ layout: doc_page
     specific language governing permissions and limitations
     under the License.
 -->
-### Druid Integration
+### Apache Pig Integration
 
-See <a href="https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html">Druid DataSketches extension</a>
+* [Download](https://datasketches.apache.org/docs/Community/Downloads.html)
+
+* Build and Install, See [Pig README](https://github.com/apache/incubator-datasketches-pig)
\ No newline at end of file
diff --git a/docs/DruidIntegration.md b/docs/SystemIntegrations/PostgreSQLIntegration.md
similarity index 80%
rename from docs/DruidIntegration.md
rename to docs/SystemIntegrations/PostgreSQLIntegration.md
index 445c2c0..b8a716f 100644
--- a/docs/DruidIntegration.md
+++ b/docs/SystemIntegrations/PostgreSQLIntegration.md
@@ -19,6 +19,9 @@ layout: doc_page
     specific language governing permissions and limitations
     under the License.
 -->
-### Druid Integration
+### PostgreSQL Integration
+
+* [Download](https://datasketches.apache.org/docs/Community/Downloads.html)
+
+* Install, See [PostgreSQL README](https://github.com/apache/incubator-datasketches-postgresql)
 
-See <a href="https://druid.apache.org/docs/latest/development/extensions-core/datasketches-extension.html">Druid DataSketches extension</a>
diff --git a/src/main/resources/docgen/toc.json b/src/main/resources/docgen/toc.json
index dbd94cd..4a6b9e6 100644
--- a/src/main/resources/docgen/toc.json
+++ b/src/main/resources/docgen/toc.json
@@ -242,7 +242,10 @@
 
     { "class":"Dropdown", "desc" : "System Integrations", "array":
       [
-        {"class":"Doc",  "desc" : "Using Sketches in Druid",                  "dir" : "",          "file": "DruidIntegration" },
+        {"class":"Doc",  "desc" : "Using Sketches in ApacheDruid",            "dir" : "SystemIntegrations", "file": "ApacheDruidIntegration" },
+        {"class":"Doc",  "desc" : "Using Sketches in Apache Hive",            "dir" : "SystemIntegrations", "file": "ApacheHiveIntegration" },
+        {"class":"Doc",  "desc" : "Using Sketches in Apache Pig",             "dir" : "SystemIntegrations", "file": "ApachePigIntegration" },
+        {"class":"Doc",  "desc" : "Using Sketches in PostgreSQL",             "dir" : "SystemIntegrations", "file": "PostgreSQLIntegration" },
       ]
     },
 

