Posted to commits@crail.apache.org by pe...@apache.org on 2018/09/06 11:01:09 UTC

[4/5] incubator-crail git commit: Documentation: Spark-IO

Documentation: Spark-IO

Add Spark-IO documentation: building and configuration.

Signed-off-by: Jonas Pfefferle <pe...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/incubator-crail/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-crail/commit/c3ed9d64
Tree: http://git-wip-us.apache.org/repos/asf/incubator-crail/tree/c3ed9d64
Diff: http://git-wip-us.apache.org/repos/asf/incubator-crail/diff/c3ed9d64

Branch: refs/heads/master
Commit: c3ed9d648a616c44d10e9534d570e61dfffded61
Parents: f1dcb0d
Author: Jonas Pfefferle <pe...@apache.org>
Authored: Wed Aug 15 11:15:36 2018 +0200
Committer: Jonas Pfefferle <pe...@apache.org>
Committed: Thu Sep 6 12:59:41 2018 +0200

----------------------------------------------------------------------
 doc/source/spark.rst | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-crail/blob/c3ed9d64/doc/source/spark.rst
----------------------------------------------------------------------
diff --git a/doc/source/spark.rst b/doc/source/spark.rst
index b999ed8..146a942 100644
--- a/doc/source/spark.rst
+++ b/doc/source/spark.rst
@@ -33,7 +33,47 @@ from the original data in HDFS.
 Spark-IO
 --------
 
+Crail-Spark-IO contains various I/O acceleration plugins for Spark tailored to
+high-performance network and storage hardware (RDMA, NVMef, etc.).
+Spark-IO is not provided with the default Crail deployment but can be
+obtained `here <https://github.com/zrlio/crail-spark-io>`_.
+Spark-IO currently contains two I/O plugins: a shuffle engine and a broadcast module.
+Both plugins inherit all the benefits of Crail such as very high performance
+(throughput and latency) and multi-tiering (e.g., DRAM and flash).
 
+Requirements
+~~~~~~~~~~~~
+
+* Spark >= 2.0
+* Java 8
+* Maven
+* Crail >= 1.0
+
+Building
+~~~~~~~~
+
+To build Crail-Spark-IO, execute the following steps:
+
+1. Obtain a copy of Crail-Spark-IO from `GitHub <https://github.com/zrlio/crail-spark-io>`_
+2. Make sure your local Maven repository contains Crail; if not, build Crail
+   from :ref:`source <Building from source>`
+3. Run: :code:`mvn -DskipTests install`
+
+
+Configure Spark
+~~~~~~~~~~~~~~~
+To configure the Crail shuffle plugin, add the following lines to :code:`spark-defaults.conf`:
+
+.. code-block:: bash
+
+    spark.shuffle.manager           org.apache.spark.shuffle.crail.CrailShuffleManager
+
+    spark.driver.extraClassPath     $CRAIL_HOME/jars/*:<path>/crail-spark-X.Y.jar:.
+    spark.executor.extraClassPath   $CRAIL_HOME/jars/*:<path>/crail-spark-X.Y.jar:.
+
+
+Unfortunately, since Spark 2.0.0 the broadcast implementation is no longer an exchangeable plugin.
+To use the Crail broadcast plugin, it has to be added manually to Spark's :code:`BroadcastManager.scala`.
 
 Crail-TeraSort
 --------------