Posted to commits@ignite.apache.org by dm...@apache.org on 2020/02/06 21:11:32 UTC
svn commit: r1873721 - in /ignite/site/branches/ignite-redisign: includes/header.html use-cases/hpc.html

Author: dmagda
Date: Thu Feb  6 21:11:32 2020
New Revision: 1873721

URL: http://svn.apache.org/viewvc?rev=1873721&view=rev
Log:
Created a page for high-performance computing use case

Added:
    ignite/site/branches/ignite-redisign/use-cases/hpc.html
      - copied, changed from r1873579, ignite/site/branches/ignite-redisign/use-cases/spark-acceleration.html
Modified:
    ignite/site/branches/ignite-redisign/includes/header.html

Modified: ignite/site/branches/ignite-redisign/includes/header.html
URL: http://svn.apache.org/viewvc/ignite/site/branches/ignite-redisign/includes/header.html?rev=1873721&r1=1873720&r2=1873721&view=diff
==============================================================================
--- ignite/site/branches/ignite-redisign/includes/header.html (original)
+++ ignite/site/branches/ignite-redisign/includes/header.html Thu Feb  6 21:11:32 2020
@@ -124,14 +124,14 @@
                                         <li class="divider">
 
                                         <li role="presentation" class="submenu-header">Data & Compute Hubs</li>
+                                        <li><a href="/use-cases/hpc.html" aria-label="High-Performance Computing"
+                                               onclick="ga('send', 'event', 'apache_ignite_usecases', 'menu_click', 'massive_parallel_processing');">
+                                            High-Performance Computing</a>
+                                        </li>
                                         <li><a href="#TODO" aria-label="Digital Integration Hub"
                                                onclick="ga('send', 'event', 'apache_ignite_usecases', 'menu_click', 'digital_integration_hub');">
                                             Digital Integration Hub</a>
                                         </li>
-                                        <li><a href="#TODO" aria-label="Compute Engine"
-                                               onclick="ga('send', 'event', 'apache_ignite_usecases', 'menu_click', 'massive_parallel_processing');">
-                                            Compute Engine</a>
-                                        </li>
 
                                         <li class="divider">
 

Copied: ignite/site/branches/ignite-redisign/use-cases/hpc.html (from r1873579, ignite/site/branches/ignite-redisign/use-cases/spark-acceleration.html)
URL: http://svn.apache.org/viewvc/ignite/site/branches/ignite-redisign/use-cases/hpc.html?p2=ignite/site/branches/ignite-redisign/use-cases/hpc.html&p1=ignite/site/branches/ignite-redisign/use-cases/spark-acceleration.html&r1=1873579&r2=1873721&rev=1873721&view=diff
==============================================================================
--- ignite/site/branches/ignite-redisign/use-cases/spark-acceleration.html (original)
+++ ignite/site/branches/ignite-redisign/use-cases/hpc.html Thu Feb  6 21:11:32 2020
@@ -33,15 +33,15 @@ under the License.
 <!DOCTYPE html>
 <html lang="en">
 <head>
-<link rel="canonical" href="https://ignite.apache.org/use-cases/spark-acceleration.html"/>
+    <link rel="canonical" href="https://ignite.apache.org/use-cases/hpc.html"/>
     <meta charset="utf-8">
     <meta name="viewport" content="width=device-width, initial-scale=1.0">
 
     <meta name="description"
-          content="Apache Ignite integrates with Apache Spark to accelerate the performance of Spark applications
-          and APIs by keeping data in a shared in-memory cluster."/>
+          content="Apache Ignite enables high-performance computing by providing APIs for data- and
+           compute-intensive calculations. Turn your commodity hardware or cloud environment into a distributed supercomputer."/>
 
-    <title>Apache Spark Performance Acceleration With Apache Ignite</title>
+    <title>High-Performance Computing With Apache Ignite</title>
 
     <!--#include virtual="/includes/styles.html" -->
 
@@ -53,90 +53,92 @@ under the License.
 
     <main id="main" role="main" class="container">
         <section id="shared-memory-layer" class="page-section">
-            <h1 class="first">Apache Spark Performance Acceleration With Apache Ignite</h1>
+            <h1 class="first">High-Performance Computing With Apache Ignite</h1>
             <div class="col-sm-12 col-md-12 col-xs-12" style="padding:0 0 10px 0;">
                 <div class="col-sm-6 col-md-6 col-xs-12" style="padding-left:0; padding-right:0">
                     <p>
-                        Apache Ignite integrates with Apache Spark to accelerate the performance of Spark applications
-                        and APIs by keeping data in a shared in-memory cluster. Spark users can use Ignite as a data
-                        source in a way similar to Hadoop or a relational database. Just start an Ignite cluster, set
-                        it as a data source for Spark workers, and keep using Spark RDDs or DataFrames APIs or gain
-                        even more speed by running Ignite SQL or compute APIs directly.
+                        High-performance computing (HPC) is the ability to process data and perform complex
+                        calculations at high speeds. Apache Ignite enables HPC by providing APIs for compute- and
+                        data-intensive calculations. The APIs implement the MapReduce paradigm and let you run
+                        arbitrary tasks across the cluster of Ignite nodes.
                     </p>
-
                     <p>
-                        In addition to the performance acceleration of Spark applications, Ignite is used as a shared
-                        in-memory layer by those Spark workers that need to share both data and state.
+                        With Ignite as a high-performance compute cluster, you can turn a group of commodity
+                        machines or a cloud environment into a distributed supercomputer of interconnected Ignite
+                        nodes.
+                    </p>
+                    <p>
+                        Ignite enables speed and scale for HPC scenarios by processing records in memory while
+                        minimizing data shuffling and network utilization.
                     </p>
-
                 </div>
 
                 <div class="col-sm-6 col-md-6 col-xs-12" style="padding-right:0">
-                    <img class="img-responsive" src="/images/spark_integration.png" width="440px" style="float:right;"/>
+                    <img class="img-responsive" src="/images/collocated_processing.png" width="440px"
+                         style="float:right;"/>
                 </div>
 
             </div>
 
+            <div class="page-heading">Co-located Processing</div>
             <p>
-                The performance increase is achievable for several reasons. First, Ignite is designed to store data sets
-                in memory across a cluster of nodes reducing latency of Spark operations that usually need to pull date
-                from disk-based systems. Second, Ignite tries to minimize data shuffling over the network between its
-                store and Spark applications by running certain Spark tasks, produced by RDDs or DataFrames APIs,
-                in-place on Ignite nodes. This optimization helps to reduce the effect of the network latency on
-                performance of Spark calls. Finally, the network impact can be minimized even greatly if native
-                Ignite APIs such as SQL are called from Spark applications directly. By doing that, you will completely
-                eliminate data shuffling between Spark and Ignite as long as Ignite SQL queries are always executed on
-                Ignite nodes returning a much smaller final result set to an application layer.
+                Ignite uses the notion of co-located processing to guide the implementation of HPC workloads in
+                distributed in-memory environments. The primary aim of this type of processing is to increase the
+                performance of complex calculations by running them directly on the Ignite cluster nodes. In that
+                case, the calculations process only the local data sets of each cluster node, thus avoiding record
+                shuffling over the network. The result is minimal network utilization and, depending on the data
+                volume, an order-of-magnitude performance increase.
             </p>
 
-            <div class="page-heading">Ignite Shared RDDs</div>
             <p>
-                Apache Ignite provides an implementation of the Spark RDD which allows any data and state to be shared
-                in memory as RDDs across Spark jobs. The Ignite RDD provides a shared, mutable view of the same data
-                in-memory in Ignite across different Spark jobs, workers, or applications.
+                To exploit co-located processing in practice, you first need to co-locate data by storing related
+                records on the same cluster node. Consider your bank account and the transactions posted to it as an
+                example of related, co-located data. Once you set <code>accountId</code> as the affinity
+                key for the <code>Transactions</code> table, Ignite stores all the transactions with
+                the same <code>accountId</code> on the single cluster node that keeps your account record in
+                the <code>Accounts</code> table.
             </p>
 
             <p>
-                The way an IgniteRDD is implemented is as a view over a distributed Ignite table (aka. cache).
-                It can be deployed with an Ignite node either within the Spark job executing process, on a Spark worker,
-                or in a separate Ignite cluster. It means that depending on the chosen deployment mode the shared
-                state may either exist only during the lifespan of a Spark application (embedded mode), or it may
-                out-survive the Spark application (standalone mode).
+                As soon as data is co-located, Ignite can execute compute- and data-intensive logic on the cluster nodes
+                that store the records required for the calculation. For instance, a payment processing system can send
+                a compute task that verifies previous transactions to the specific Ignite node that stores your account
+                record with all completed transactions, and finish fraud-detection checks locally on that machine.
+                Thus, instead of pulling all the transactions back to the application over the network, the processing
+                system avoids that transfer by running verifications on the nodes that store the actual data.
+                The effect is even more significant when the system needs to process millions of transactions per second,
+                verifying billions of previously completed payments.
             </p>
 
-            <div class="page-heading">Ignite DataFrames</div>
-            <p>
-                The Apache Spark DataFrame API introduced the concept of a schema to describe the data,
-                allowing Spark to manage the schema and organize the data into a tabular format. To put it simply,
-                a DataFrame is a distributed collection of data organized into named columns. It is conceptually
-                equivalent to a table in a relational database and allows Spark to leverage the Catalyst query
-                optimizer to produce much more efficient query execution plans in comparison to RDDs, which are
-                collections of elements partitioned across the nodes of the cluster.
-            </p>
+            <div class="page-heading">Compute APIs</div>
+
             <p>
-                Ignite supports DataFrame APIs letting Spark to write to and read from Ignite through that interface.
-                Even more, Ignite analyses execution plans produced by Spark's Catalyst engine and can execute
-                parts of the plan on Ignite nodes directly, reducing data shuffling. All that will make your SparkSQL
-                more performant.
+                Ignite provides compute APIs (also known as the compute grid in Ignite) for creating and scheduling custom
+                tasks of arbitrary complexity. The APIs implement the MapReduce paradigm and are currently available for
+                Java, C#, and C++.
             </p>
 
             <div class="page-heading">Learn More</div>
             <p>
-                <a href="https://apacheignite-fs.readme.io/docs/installation-deployment" target="docs">
-                    <b>Ignite and Spark Installation and Deployment <i class="fa fa-angle-double-right"></i></b>
+                <a href="/features/collocatedprocessing.html">
+                    <b>Co-located processing <i class="fa fa-angle-double-right"></i></b>
                 </a>
             </p>
             <p>
-                <a href="https://apacheignite-fs.readme.io/docs/ignitecontext-igniterdd" target="docs">
-                    <b>Ignite RDDs in Details <i class="fa fa-angle-double-right"></i></b>
+                <a href="https://apacheignite.readme.io/docs/compute-grid" target="docs">
+                    <b>Compute APIs <i class="fa fa-angle-double-right"></i></b>
                 </a>
             </p>
             <p>
-                <a href="https://apacheignite-fs.readme.io/docs/ignite-data-frame" target="docs">
-                    <b>Ignite DataFrames in Details <i class="fa fa-angle-double-right"></i></b>
+                <a href="/features/machinelearning.html">
+                    <b>Machine and Deep Learning <i class="fa fa-angle-double-right"></i></b>
+                </a>
+            </p>
+            <p>
+                <a href="/arch/memorycentric.html">
+                    <b>Memory-Centric Storage <i class="fa fa-angle-double-right"></i></b>
                 </a>
             </p>
-
         </section>
     </main>