Posted to commits@beam.apache.org by me...@apache.org on 2018/08/20 21:40:42 UTC

[beam-site] 05/11: Blog post updates based on @iemejia's feedback

This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit c5037a277bc347971635bc04d5d05e65e2acbd68
Author: Julien Phalip <jp...@gmail.com>
AuthorDate: Mon Aug 13 13:17:54 2018 -0400

    Blog post updates based on @iemejia's feedback
---
 src/_posts/2018-08-XX-review-input-streaming-connectors.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
index aa19675..fded813 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -21,7 +21,7 @@ Spark is written in Scala and has a [Java API](https://spark.apache.org/docs/lat
 
 Spark offers two approaches to streaming: [Discretized Streaming](https://spark.apache.org/docs/latest/streaming-programming-guide.html) (or DStreams) and [Structured Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html). DStreams are a basic abstraction that represents a continuous series of [Resilient Distributed Datasets](https://spark.apache.org/docs/latest/rdd-programming-guide.html) (or RDDs). Structured Streaming was introduced more recently  [...]
 
-Spark Structured Streaming supports [file sources](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/DataStreamReader.html) (local filesystems and HDFS-compatible systems like Cloud Storage or S3) and [Kafka](https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html) as streaming inputs. Spark maintains built-in connectors for DStreams aimed at third-party services, such as Kafka or Flume, while other connectors are available through link [...]
+Spark Structured Streaming supports [file sources](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/DataStreamReader.html) (local filesystems and HDFS-compatible systems like Cloud Storage or S3) and [Kafka](https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html) as streaming [inputs](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#input-sources). Spark maintains built-in connectors for DStreams aimed  [...]
 
 Below are the main streaming input connectors available for Beam and Spark DStreams in Java:
 
@@ -49,7 +49,7 @@ Below are the main streaming input connectors for available for Beam and Spark D
   <tr>
    <td>HDFS<br>(Using the <code>hdfs://</code> URI)
    </td>
-   <td><a href="https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html">HadoopFileSystemOptions</a>
+    <td><a href="https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/FileIO.html">FileIO</a> + <a href="https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html">HadoopFileSystemOptions</a>
    </td>
    <td><a href="https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/util/HdfsUtils.html">HdfsUtils</a>
    </td>
@@ -93,7 +93,7 @@ and <a href="https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre
    </td>
    <td><a href="https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html">PubsubIO</a>
    </td>
-   <td><a href="https://github.com/apache/bahir/tree/master/streaming-pubsub">Spark-streaming-pubsub</a> from <a href="http://bahir.apache.org">Apache Bahir</a>
+   <td><a href="https://github.com/apache/bahir/tree/master/streaming-pubsub">spark-streaming-pubsub</a> from <a href="http://bahir.apache.org">Apache Bahir</a>
    </td>
   </tr>
   <tr>
@@ -204,11 +204,11 @@ and <a href="http://spark.apache.org/docs/latest/api/python/pyspark.streaming.ht
 
 ### **Scala**
 
-Since Scala code is interoperable with Java and therefore has native compatibility with Java libraries (and vice versa), you can use the same Java connectors described above in your Scala programs. Apache Beam also has a [Scala SDK](https://github.com/spotify/scio) open-sourced [by Spotify](https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/).
+Since Scala code is interoperable with Java and therefore has native compatibility with Java libraries (and vice versa), you can use the same Java connectors described above in your Scala programs. Apache Beam also has a [Scala API](https://github.com/spotify/scio) open-sourced [by Spotify](https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/).
 
 ### **Go**
 
-A [Go SDK](https://beam.apache.org/documentation/sdks/go/) for Apache Beam is under active development. It is currently experimental and is not recommended for production.
+A [Go SDK](https://beam.apache.org/documentation/sdks/go/) for Apache Beam is under active development. It is currently experimental and is not recommended for production. Spark does not have an official Go SDK.
 
 ### **R**
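
The post's diff touches a paragraph describing DStreams as "a continuous series of RDDs". As a dependency-free illustration of that micro-batching idea (not Spark's actual implementation, which batches by time interval rather than by count), here is a minimal Java sketch that slices a stream of events into fixed-size batches; the class and method names are made up for this example:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class MicroBatchSketch {
    // Slice a source iterator into fixed-size micro-batches, loosely
    // mimicking how DStreams turn a continuous stream into a series of
    // RDDs. Real DStreams cut batches by wall-clock interval, not count.
    static <T> List<List<T>> microBatches(Iterator<T> source, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        while (source.hasNext()) {
            List<T> batch = new ArrayList<>();
            while (batch.size() < batchSize && source.hasNext()) {
                batch.add(source.next());
            }
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = List.of(1, 2, 3, 4, 5, 6, 7);
        // Seven events with a batch size of three yield batches of 3, 3, 1.
        System.out.println(microBatches(events.iterator(), 3));
        // prints [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```

Each inner list stands in for one RDD in the stream; a real connector (KafkaIO in Beam, or spark-streaming-kafka for DStreams) would feed the iterator from an unbounded source instead of a fixed list.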