Posted to commits@beam.apache.org by ie...@apache.org on 2019/03/21 15:26:56 UTC

[beam] 01/01: Correct spark version used by the spark runner in the website

This is an automated email from the ASF dual-hosted git repository.

iemejia pushed a commit to branch iemejia-update-spark-web
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 1a839ef293256ef55abad38929f34b3927bb8cad
Author: Ismaël Mejía <ie...@gmail.com>
AuthorDate: Thu Mar 21 16:26:42 2019 +0100

    Correct spark version used by the spark runner in the website
    
    Also upgrade the related spark documentation to latest
---
 website/src/documentation/runners/spark.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/website/src/documentation/runners/spark.md b/website/src/documentation/runners/spark.md
index fa17b44..49fa7ee 100644
--- a/website/src/documentation/runners/spark.md
+++ b/website/src/documentation/runners/spark.md
@@ -26,8 +26,8 @@ The Spark Runner can execute Spark pipelines just like a native Spark applicatio
 The Spark Runner executes Beam pipelines on top of Apache Spark, providing:
 
 * Batch and streaming (and combined) pipelines.
-* The same fault-tolerance [guarantees](http://spark.apache.org/docs/1.6.3/streaming-programming-guide.html#fault-tolerance-semantics) as provided by RDDs and DStreams.
-* The same [security](http://spark.apache.org/docs/1.6.3/security.html) features Spark provides.
+* The same fault-tolerance [guarantees](http://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics) as provided by RDDs and DStreams.
+* The same [security](http://spark.apache.org/docs/latest/security.html) features Spark provides.
 * Built-in metrics reporting using Spark's metrics system, which reports Beam Aggregators as well.
 * Native support for Beam side-inputs via spark's Broadcast variables.
 
@@ -37,7 +37,7 @@ _**Note:**_ _support for the Beam Model in streaming is currently experimental,
 
 ## Spark Runner prerequisites and setup
 
-The Spark runner currently supports Spark's 1.6 branch, and more specifically any version greater than 1.6.0.
+The Spark runner currently supports Spark's 2.x branch, and more specifically any version greater than 2.2.0.
 
 You can add a dependency on the latest version of the Spark runner by adding to your pom.xml the following:
 ```java
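(The hunk ends at the opening fence, so the dependency snippet itself is not shown in this diff. For reference, a minimal Spark runner dependency entry looks roughly like the following; the version property is a placeholder for the Beam release in use.)

```xml
<!-- Sketch of a Spark runner dependency; beam.version is a placeholder
     to be set to the Beam release in use. -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-spark</artifactId>
  <version>${beam.version}</version>
</dependency>
```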
@@ -166,14 +166,14 @@ When executing your pipeline with the Spark Runner, you should consider the foll
 When submitting a Spark application to cluster, it is common (and recommended) to use the <code>spark-submit</code> script that is provided with the spark installation.
 The <code>PipelineOptions</code> described above are not to replace <code>spark-submit</code>, but to complement it.
 Passing any of the above mentioned options could be done as one of the <code>application-arguments</code>, and setting <code>--master</code> takes precedence.
-For more on how to generally use <code>spark-submit</code> checkout Spark [documentation](http://spark.apache.org/docs/1.6.3/submitting-applications.html#launching-applications-with-spark-submit).
+For more on how to generally use <code>spark-submit</code> checkout Spark [documentation](http://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit).
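(For illustration only, a spark-submit invocation that passes Beam pipeline options as application arguments might look roughly like this; the jar name, main class, master URL, and input/output paths are placeholders, not values from this commit.)

```bash
# Illustrative only: jar, main class, master URL, and paths are placeholders.
# The Beam options after the jar are forwarded to the pipeline as
# application-arguments; --master on spark-submit takes precedence.
spark-submit \
  --class org.example.WordCount \
  --master spark://host:7077 \
  target/word-count-bundled-1.0.jar \
    --runner=SparkRunner \
    --inputFile=/path/to/input.txt \
    --output=/path/to/counts
```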
 
 ### Monitoring your job
 
-You can monitor a running Spark job using the Spark [Web Interfaces](http://spark.apache.org/docs/1.6.3/monitoring.html#web-interfaces). By default, this is available at port `4040` on the driver node. If you run Spark on your local machine that would be `http://localhost:4040`.
-Spark also has a history server to [view after the fact](http://spark.apache.org/docs/1.6.3/monitoring.html#viewing-after-the-fact).
-Metrics are also available via [REST API](http://spark.apache.org/docs/1.6.3/monitoring.html#rest-api).
-Spark provides a [metrics system](http://spark.apache.org/docs/1.6.3/monitoring.html#metrics) that allows reporting Spark metrics to a variety of Sinks. The Spark runner reports user-defined Beam Aggregators using this same metrics system and currently supports <code>GraphiteSink</code> and <code>CSVSink</code>, and providing support for additional Sinks supported by Spark is easy and straight-forward.
+You can monitor a running Spark job using the Spark [Web Interfaces](http://spark.apache.org/docs/latest/monitoring.html#web-interfaces). By default, this is available at port `4040` on the driver node. If you run Spark on your local machine that would be `http://localhost:4040`.
+Spark also has a history server to [view after the fact](http://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact).
+Metrics are also available via [REST API](http://spark.apache.org/docs/latest/monitoring.html#rest-api).
+Spark provides a [metrics system](http://spark.apache.org/docs/latest/monitoring.html#metrics) that allows reporting Spark metrics to a variety of Sinks. The Spark runner reports user-defined Beam Aggregators using this same metrics system and currently supports <code>GraphiteSink</code> and <code>CSVSink</code>, and providing support for additional Sinks supported by Spark is easy and straight-forward.
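(As a minimal sketch of wiring a Graphite sink into Spark's metrics system as mentioned above, a conf/metrics.properties entry might look roughly like the following. This uses Spark's built-in GraphiteSink class; the Beam runner's own sink classes, where used, would be configured the same way. Host, port, and period values are placeholders.)

```properties
# Illustrative Spark metrics configuration; host, port, and period are placeholders.
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
```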
 
 ### Streaming Execution