Posted to commits@bahir.apache.org by lr...@apache.org on 2018/11/30 14:28:51 UTC

[1/4] bahir-website git commit: Update doc script with newly added extensions

Repository: bahir-website
Updated Branches:
  refs/heads/master 14533df08 -> 5b4ed3c1f


Update doc script with newly added extensions


Project: http://git-wip-us.apache.org/repos/asf/bahir-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir-website/commit/0995f1ef
Tree: http://git-wip-us.apache.org/repos/asf/bahir-website/tree/0995f1ef
Diff: http://git-wip-us.apache.org/repos/asf/bahir-website/diff/0995f1ef

Branch: refs/heads/master
Commit: 0995f1ef61f0ad7c10e44bf3fd5beb661c09f3af
Parents: 14533df
Author: Luciano Resende <lr...@apache.org>
Authored: Mon Nov 26 15:22:54 2018 +0100
Committer: Luciano Resende <lr...@apache.org>
Committed: Mon Nov 26 15:22:54 2018 +0100

----------------------------------------------------------------------
 update-doc.sh | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir-website/blob/0995f1ef/update-doc.sh
----------------------------------------------------------------------
diff --git a/update-doc.sh b/update-doc.sh
index 73960ff..cc51fbd 100755
--- a/update-doc.sh
+++ b/update-doc.sh
@@ -85,6 +85,8 @@ FLINK_WEBSITE_DOC_DIR=site/docs/flink/current
 FLINK_REPO_NAME=bahir-flink
 FLINK_BAHIR_SOURCE_DIR=target/$FLINK_REPO_NAME
 
+GIT_REF=${GIT_REF:-master}
+
 function checkout_code {
     # Checkout code
     cd "$BASE_DIR" # make sure we're in the base dir
@@ -156,6 +158,7 @@ function update_spark {
         spark-streaming-akka      streaming-akka            \
         spark-streaming-mqtt      streaming-mqtt            \
         spark-streaming-pubsub    streaming-pubsub          \
+        spark-streaming-pubnub    streaming-pubnub          \
         spark-streaming-twitter   streaming-twitter         \
         spark-streaming-zeromq    streaming-zeromq
 
@@ -174,6 +177,8 @@ function update_flink {
         flink-streaming-activemq  flink-connector-activemq  \
         flink-streaming-akka      flink-connector-akka      \
         flink-streaming-flume     flink-connector-flume     \
+        flink-streaming-influxdb  flink-connector-influxdb  \
+        flink-streaming-kudu      flink-connector-kudu      \
         flink-streaming-netty     flink-connector-netty     \
         flink-streaming-redis     flink-connector-redis
 


[2/4] bahir-website git commit: Add template for new pubnub spark extension

Posted by lr...@apache.org.
Add template for new pubnub spark extension


Project: http://git-wip-us.apache.org/repos/asf/bahir-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir-website/commit/a4e3d3d3
Tree: http://git-wip-us.apache.org/repos/asf/bahir-website/tree/a4e3d3d3
Diff: http://git-wip-us.apache.org/repos/asf/bahir-website/diff/a4e3d3d3

Branch: refs/heads/master
Commit: a4e3d3d356bba45faf8c120be999339470d40178
Parents: 0995f1e
Author: Luciano Resende <lr...@apache.org>
Authored: Mon Nov 26 15:24:36 2018 +0100
Committer: Luciano Resende <lr...@apache.org>
Committed: Mon Nov 26 15:24:36 2018 +0100

----------------------------------------------------------------------
 .../templates/spark-streaming-pubnub.template   | 26 ++++++++++++++++++++
 .../templates/spark-streaming-pubsub.template   |  4 +--
 2 files changed, 28 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir-website/blob/a4e3d3d3/site/docs/spark/templates/spark-streaming-pubnub.template
----------------------------------------------------------------------
diff --git a/site/docs/spark/templates/spark-streaming-pubnub.template b/site/docs/spark/templates/spark-streaming-pubnub.template
new file mode 100644
index 0000000..3c7e86e
--- /dev/null
+++ b/site/docs/spark/templates/spark-streaming-pubnub.template
@@ -0,0 +1,26 @@
+---
+layout: page
+title: Spark Streaming Google Pub-Sub
+description: Spark Streaming Google Pub-Sub
+group: nav-right
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+{% include JB/setup %}

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/a4e3d3d3/site/docs/spark/templates/spark-streaming-pubsub.template
----------------------------------------------------------------------
diff --git a/site/docs/spark/templates/spark-streaming-pubsub.template b/site/docs/spark/templates/spark-streaming-pubsub.template
index 3c7e86e..1e9958e 100644
--- a/site/docs/spark/templates/spark-streaming-pubsub.template
+++ b/site/docs/spark/templates/spark-streaming-pubsub.template
@@ -1,7 +1,7 @@
 ---
 layout: page
-title: Spark Streaming Google Pub-Sub
-description: Spark Streaming Google Pub-Sub
+title: Spark Streaming PubNub
+description: Spark Streaming PubNub
 group: nav-right
 ---
 <!--


[3/4] bahir-website git commit: Update spark extension current documentation

Posted by lr...@apache.org.
Update spark extension current documentation


Project: http://git-wip-us.apache.org/repos/asf/bahir-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir-website/commit/1dc81cfb
Tree: http://git-wip-us.apache.org/repos/asf/bahir-website/tree/1dc81cfb
Diff: http://git-wip-us.apache.org/repos/asf/bahir-website/diff/1dc81cfb

Branch: refs/heads/master
Commit: 1dc81cfbb9d419cd78b94d08b0a40f045b312958
Parents: a4e3d3d
Author: Luciano Resende <lr...@apache.org>
Authored: Tue Nov 27 15:25:57 2018 +0100
Committer: Luciano Resende <lr...@apache.org>
Committed: Tue Nov 27 15:25:57 2018 +0100

----------------------------------------------------------------------
 site/docs/spark/current/spark-sql-cloudant.md   |   4 +-
 .../spark/current/spark-sql-streaming-akka.md   |   2 +-
 .../spark/current/spark-sql-streaming-mqtt.md   |  95 ++++++++++++++---
 .../spark/current/spark-streaming-pubnub.md     | 103 +++++++++++++++++++
 .../spark/current/spark-streaming-pubsub.md     |   4 +-
 .../spark/current/spark-streaming-zeromq.md     |  17 ++-
 6 files changed, 205 insertions(+), 20 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-sql-cloudant.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-sql-cloudant.md b/site/docs/spark/current/spark-sql-cloudant.md
index 2382e50..355f10c 100644
--- a/site/docs/spark/current/spark-sql-cloudant.md
+++ b/site/docs/spark/current/spark-sql-cloudant.md
@@ -57,11 +57,11 @@ The `--packages` argument can also be used with `bin/spark-submit`.
 
 Submit a job in Python:
     
-    spark-submit  --master local[4] --packages org.apache.bahir:spark-sql-cloudant_{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION}}  <path to python script>
+    spark-submit  --master local[4] --packages org.apache.bahir:spark-sql-cloudant__{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION}}  <path to python script>
     
 Submit a job in Scala:
 
-	spark-submit --class "<your class>" --master local[4] --packages org.apache.bahir:spark-sql-cloudant_{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION}} <path to spark-sql-cloudant jar>
+	spark-submit --class "<your class>" --master local[4] --packages org.apache.bahir:spark-sql-cloudant__{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION}} <path to spark-sql-cloudant jar>
 
 This library is compiled for Scala 2.11 only, and intends to support Spark 2.0 onwards.
 

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-sql-streaming-akka.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-sql-streaming-akka.md b/site/docs/spark/current/spark-sql-streaming-akka.md
index 459c3f6..d88fc91 100644
--- a/site/docs/spark/current/spark-sql-streaming-akka.md
+++ b/site/docs/spark/current/spark-sql-streaming-akka.md
@@ -71,7 +71,7 @@ Setting values for option `persistenceDirPath` helps in recovering in case of a
                        
 ## Configuration options.
                        
-This source uses [Akka Actor api](http://doc.akka.io/api/akka/2.4/akka/actor/Actor.html).
+This source uses [Akka Actor api](http://doc.akka.io/api/akka/2.5/akka/actor/Actor.html).
                        
 * `urlOfPublisher` The url of Publisher or Feeder actor that the Receiver actor connects to. Set this as the tcp url of the Publisher or Feeder actor.
 * `persistenceDirPath` By default it is used for storing incoming messages on disk.
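
For reference, a minimal Scala sketch of wiring the two options above into the Akka streaming source; the provider class name and the actor URL shown here are assumptions for illustration, not part of this change:

    // Subscribe to a Publisher/Feeder actor and expose its messages as a streaming DataFrame
    val messages = sqlContext.readStream
      .format("org.apache.bahir.sql.streaming.akka.AkkaStreamSourceProvider")
      .option("urlOfPublisher", "akka.tcp://feederActorSystem@localhost:9999/user/FeederActor")
      .option("persistenceDirPath", "/tmp/akka-persistence")  // enables recovery after a restart
      .load()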

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-sql-streaming-mqtt.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-sql-streaming-mqtt.md b/site/docs/spark/current/spark-sql-streaming-mqtt.md
index 98632df..3317648 100644
--- a/site/docs/spark/current/spark-sql-streaming-mqtt.md
+++ b/site/docs/spark/current/spark-sql-streaming-mqtt.md
@@ -25,7 +25,7 @@ limitations under the License.
 
 {% include JB/setup %}
 
-A library for reading data from MQTT Servers using Spark SQL Streaming ( or Structured streaming.). 
+A library for writing and reading data from MQTT Servers using Spark SQL Streaming (or Structured streaming).
 
 ## Linking
 
@@ -53,16 +53,25 @@ This library is compiled for Scala 2.11 only, and intends to support Spark 2.0 o
 
 ## Examples
 
-A SQL Stream can be created with data streams received through MQTT Server using,
+SQL Stream can be created with data streams received through MQTT Server using:
 
     sqlContext.readStream
         .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
         .option("topic", "mytopic")
         .load("tcp://localhost:1883")
 
-## Enable recovering from failures.
+SQL Stream may also be transferred into MQTT messages using:
 
-Setting values for option `localStorage` and `clientId` helps in recovering in case of a restart, by restoring the state where it left off before the shutdown.
+    sqlContext.writeStream
+        .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSinkProvider")
+        .option("checkpointLocation", "/path/to/localdir")
+        .outputMode("complete")
+        .option("topic", "mytopic")
+        .load("tcp://localhost:1883")
+
+## Source recovering from failures
+
+Setting values for the options `localStorage` and `clientId` helps in recovering in case of a source restart, by restoring the state where it left off before the shutdown.
 
     sqlContext.readStream
         .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
@@ -71,14 +80,14 @@ Setting values for option `localStorage` and `clientId` helps in recovering in c
         .option("clientId", "some-client-id")
         .load("tcp://localhost:1883")
 
-## Configuration options.
+## Configuration options
 
-This source uses [Eclipse Paho Java Client](https://eclipse.org/paho/clients/java/). Client API documentation is located [here](http://www.eclipse.org/paho/files/javadoc/index.html).
+This connector uses [Eclipse Paho Java Client](https://eclipse.org/paho/clients/java/). Client API documentation is located [here](http://www.eclipse.org/paho/files/javadoc/index.html).
 
- * `brokerUrl` A url MqttClient connects to. Set this or `path` as the url of the Mqtt Server. e.g. tcp://localhost:1883.
+ * `brokerUrl` A URL the MqttClient connects to. Set this or `path` as the URL of the MQTT server, e.g. tcp://localhost:1883.
  * `persistence` By default it is used for storing incoming messages on disk. If `memory` is provided as value for this option, then recovery on restart is not supported.
  * `topic` Topic MqttClient subscribes to.
- * `clientId` clientId, this client is assoicated with. Provide the same value to recover a stopped client.
+ * `clientId` The client identifier this client is associated with. Provide the same value to recover a stopped source client. The MQTT sink ignores the client identifier, because a Spark batch can be distributed across multiple workers whereas an MQTT broker does not allow simultaneous connections with the same ID from multiple hosts.
  * `QoS` The maximum quality of service to subscribe each topic at. Messages published at a lower quality of service will be received at the published QoS. Messages published at a higher quality of service will be received using the QoS specified on the subscribe.
  * `username` Sets the user name to use for the connection to Mqtt Server. Do not set it, if server does not need this. Setting it empty will lead to errors.
  * `password` Sets the password to use for the connection.
@@ -86,6 +95,18 @@ This source uses [Eclipse Paho Java Client](https://eclipse.org/paho/clients/jav
 * `connectionTimeout` Sets the connection timeout; a value of 0 is interpreted as wait until the client connects. See `MqttConnectOptions.setConnectionTimeout` for more information.
  * `keepAlive` Same as `MqttConnectOptions.setKeepAliveInterval`.
  * `mqttVersion` Same as `MqttConnectOptions.setMqttVersion`.
+ * `maxInflight` Same as `MqttConnectOptions.setMaxInflight`
+ * `autoReconnect` Same as `MqttConnectOptions.setAutomaticReconnect`
+
+## Environment variables
+
+Custom environment variables that control the MQTT connectivity performed by the sink connector:
+
+ * `spark.mqtt.client.connect.attempts` Number of attempts the sink will make to connect to the MQTT broker before failing.
+ * `spark.mqtt.client.connect.backoff` Delay in milliseconds to wait before retrying the connection to the server.
+ * `spark.mqtt.connection.cache.timeout` The sink connector caches MQTT connections. Idle connections are closed after this many milliseconds.
+ * `spark.mqtt.client.publish.attempts` Number of attempts to publish a message before failing the task.
+ * `spark.mqtt.client.publish.backoff` Delay in milliseconds to wait before retrying the send operation.
 
 ### Scala API
 
@@ -95,7 +116,7 @@ An example, for scala API to count words from incoming message stream.
     val lines = spark.readStream
       .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
       .option("topic", topic)
-      .load(brokerUrl).as[(String, Timestamp)]
+      .load(brokerUrl).selectExpr("CAST(payload AS STRING)").as[String]
 
     // Split the lines into words
     val words = lines.map(_._1).flatMap(_.split(" "))
@@ -111,7 +132,7 @@ An example, for scala API to count words from incoming message stream.
 
     query.awaitTermination()
 
-Please see `MQTTStreamWordCount.scala` for full example.
+Please see `MQTTStreamWordCount.scala` for a full example. Review `MQTTSinkWordCount.scala` if you are interested in publishing data to an MQTT broker.
 
 ### Java API
 
@@ -122,7 +143,8 @@ An example, for Java API to count words from incoming message stream.
             .readStream()
             .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
             .option("topic", topic)
-            .load(brokerUrl).select("value").as(Encoders.STRING());
+            .load(brokerUrl)
+            .selectExpr("CAST(payload AS STRING)").as(Encoders.STRING());
 
     // Split the lines into words
     Dataset<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
@@ -143,5 +165,54 @@ An example, for Java API to count words from incoming message stream.
 
     query.awaitTermination();
 
-Please see `JavaMQTTStreamWordCount.java` for full example.
+Please see `JavaMQTTStreamWordCount.java` for a full example. Review `JavaMQTTSinkWordCount.java` if you are interested in publishing data to an MQTT broker.
+
+## Best Practices
+
+1. Turn MQTT into a more reliable messaging service.
+
+> *MQTT is a machine-to-machine (M2M)/"Internet of Things" connectivity protocol. It was designed as an extremely lightweight publish/subscribe messaging transport.*
+
+The design of MQTT and the purpose it serves go well together, but applications often need stronger reliability guarantees. Since MQTT is not a distributed message queue, it does not offer the highest level of reliability features; redirecting it through a Kafka message queue addresses this. In fact, using a Kafka message queue offers a lot of possibilities, including a single Kafka topic subscribed to several MQTT sources and even a single MQTT stream publishing to multiple Kafka topics. Kafka is a reliable and scalable message queue.
+
+2. Often the message payload is not in the default character encoding, or contains binary data that needs to be parsed with a particular parser. In such cases, the Spark MQTT payload should be processed using an external parser. For example:
+
+ * Scala API example:
+```scala
+    // Create DataFrame representing the stream of binary messages
+    val lines = spark.readStream
+      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
+      .option("topic", topic)
+      .load(brokerUrl).select("payload").as[Array[Byte]].map(externalParser(_))
+```
+
+ * Java API example
+```java
+        // Create DataFrame representing the stream of binary messages
+        Dataset<byte[]> lines = spark
+                .readStream()
+                .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
+                .option("topic", topic)
+                .load(brokerUrl).selectExpr("CAST(payload AS BINARY)").as(Encoders.BINARY());
+
+        // Split the lines into words
+        Dataset<String> words = lines.map(new MapFunction<byte[], String>() {
+            @Override
+            public String call(byte[] bytes) throws Exception {
+                return new String(bytes); // Plug in external parser here.
+            }
+        }, Encoders.STRING()).flatMap(new FlatMapFunction<String, String>() {
+            @Override
+            public Iterator<String> call(String x) {
+                return Arrays.asList(x.split(" ")).iterator();
+            }
+        }, Encoders.STRING());
+
+```
+
+3. What is the solution for a situation where there are a large number of varied MQTT sources, each with different schema and throughput characteristics?
+
+Generally, one would create many streaming pipelines to solve this problem. This either requires a very sophisticated scheduling setup or wastes a lot of resources, as it is not certain which stream is consuming more data.
+
+The general solution is both suboptimal and cumbersome to operate; its many moving parts incur a high maintenance overhead. As an alternative, one can set up a single-topic Kafka-Spark stream, where the message from each of the varied sources carries a unique tag separating it from the other streams. At the processing end, the messages can then be distinguished from one another and the right kind of decoding and processing applied. Similarly, while storing, each message can be identified by its tag.
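
The sink example earlier on this page calls `load()` on a `writeStream`; with the standard Structured Streaming API, `writeStream` is available on a streaming Dataset and the query is started with `start()`. A minimal word-count sketch of the new sink, assuming the provider names shown in the diff above, the broker URL passed as the path, and placeholder topics and checkpoint directory:

    // Read words from one MQTT topic, count them, and publish running counts to another topic
    import spark.implicits._

    val lines = spark.readStream
      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
      .option("topic", "words-in")
      .load("tcp://localhost:1883")
      .selectExpr("CAST(payload AS STRING)")
      .as[String]

    val counts = lines.flatMap(_.split(" ")).groupBy("value").count()

    val query = counts.writeStream
      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSinkProvider")
      .option("checkpointLocation", "/tmp/mqtt-sink-checkpoint")
      .option("topic", "counts-out")
      .outputMode("complete")
      .start("tcp://localhost:1883")

    query.awaitTermination()

The Kafka bridge described in item 3 can likewise be sketched with Spark's built-in Kafka source; the broker address, topic name and tag values below are placeholders:

    import org.apache.spark.sql.functions.col

    // Single Kafka topic carrying tagged messages bridged from many MQTT sources
    val tagged = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "mqtt-bridge")
      .load()
      .selectExpr("CAST(key AS STRING) AS tag", "value AS payload")

    // Route each tag to the decoder that matches its source's schema,
    // e.g. by filtering per tag and applying a source-specific parsing function
    val sensorA = tagged.filter(col("tag") === "sensor-a")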
 

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-streaming-pubnub.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-streaming-pubnub.md b/site/docs/spark/current/spark-streaming-pubnub.md
new file mode 100644
index 0000000..e190934
--- /dev/null
+++ b/site/docs/spark/current/spark-streaming-pubnub.md
@@ -0,0 +1,103 @@
+---
+layout: page
+title: Spark Streaming Google Pub-Sub
+description: Spark Streaming Google Pub-Sub
+group: nav-right
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+{% include JB/setup %}
+# Spark Streaming PubNub Connector
+
+A library for reading data from the real-time messaging infrastructure [PubNub](https://www.pubnub.com/) using Spark Streaming.
+
+## Linking
+
+Using SBT:
+    
+    libraryDependencies += "org.apache.bahir" %% "spark-streaming-pubnub" % "{{site.SPARK_VERSION}}"
+    
+Using Maven:
+    
+    <dependency>
+        <groupId>org.apache.bahir</groupId>
+        <artifactId>spark-streaming-pubnub_{{site.SCALA_BINARY_VERSION}}</artifactId>
+        <version>{{site.SPARK_VERSION}}</version>
+    </dependency>
+
+This library can also be added to Spark jobs launched through `spark-shell` or `spark-submit` by using the `--packages` command line option.
+For example, to include it when starting the spark shell:
+
+    $ bin/spark-shell --packages org.apache.bahir:spark-streaming-pubnub_{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION}}
+
+Unlike using `--jars`, using `--packages` ensures that this library and its dependencies will be added to the classpath.
+The `--packages` argument can also be used with `bin/spark-submit`.
+
+## Examples
+
+The connector leverages the official Java client for the PubNub cloud infrastructure. You can import the `PubNubUtils`
+class and create an input stream by calling `PubNubUtils.createStream()` as shown below. Security and performance related
+features should be set up inside the standard `PNConfiguration` object. We advise configuring the reconnection policy so that
+temporary network outages do not interrupt the processing job. Users may subscribe to multiple channels and channel groups,
+as well as specify a time token to start receiving messages from a given point in time.
+
+For complete code examples, please review the _examples_ directory.
+
+### Scala API
+
+    import com.pubnub.api.PNConfiguration
+    import com.pubnub.api.enums.PNReconnectionPolicy
+    
+    import org.apache.spark.storage.StorageLevel
+    import org.apache.spark.streaming.dstream.ReceiverInputDStream
+    import org.apache.spark.streaming.pubnub.{PubNubUtils, SparkPubNubMessage}
+
+    val config = new PNConfiguration
+    config.setSubscribeKey(subscribeKey)
+    config.setSecure(true)
+    config.setReconnectionPolicy(PNReconnectionPolicy.LINEAR)
+    val channel = "my-channel"
+
+    val pubNubStream: ReceiverInputDStream[SparkPubNubMessage] = PubNubUtils.createStream(
+      ssc, config, Seq(channel), Seq(), None, StorageLevel.MEMORY_AND_DISK_SER_2
+    )
+
+### Java API
+
+    import java.util.Collections;
+    import java.util.HashSet;
+    import java.util.Set;
+
+    import com.pubnub.api.PNConfiguration;
+    import com.pubnub.api.enums.PNReconnectionPolicy;
+
+    import org.apache.spark.storage.StorageLevel;
+    import org.apache.spark.streaming.dstream.ReceiverInputDStream;
+    import org.apache.spark.streaming.pubnub.PubNubUtils;
+    import org.apache.spark.streaming.pubnub.SparkPubNubMessage;
+
+    PNConfiguration config = new PNConfiguration();
+    config.setSubscribeKey(subscribeKey);
+    config.setSecure(true);
+    config.setReconnectionPolicy(PNReconnectionPolicy.LINEAR);
+    Set<String> channels = new HashSet<String>() {{
+        add("my-channel");
+    }};
+
+    ReceiverInputDStream<SparkPubNubMessage> pubNubStream = PubNubUtils.createStream(
+      ssc, config, channels, Collections.EMPTY_SET, null,
+      StorageLevel.MEMORY_AND_DISK_SER_2()
+    );
+
+## Unit Test
+
+Unit tests take advantage of the publicly available _demo_ subscription and publish key, which has a limited request rate.
\ No newline at end of file
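
Building on the Scala example above, a short sketch of consuming the stream; the `getPayload` accessor on `SparkPubNubMessage` is an assumption, so check the class for the exact method names:

    // Print the payload of each received PubNub message
    val payloads = pubNubStream.map(message => message.getPayload)  // accessor name assumed
    payloads.print()

    ssc.start()
    ssc.awaitTermination()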

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-streaming-pubsub.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-streaming-pubsub.md b/site/docs/spark/current/spark-streaming-pubsub.md
index 83f2532..4736aca 100644
--- a/site/docs/spark/current/spark-streaming-pubsub.md
+++ b/site/docs/spark/current/spark-streaming-pubsub.md
@@ -1,7 +1,7 @@
 ---
 layout: page
-title: Spark Streaming Google Pub-Sub
-description: Spark Streaming Google Pub-Sub
+title: Spark Streaming PubNub
+description: Spark Streaming PubNub
 group: nav-right
 ---
 <!--

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/1dc81cfb/site/docs/spark/current/spark-streaming-zeromq.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/spark-streaming-zeromq.md b/site/docs/spark/current/spark-streaming-zeromq.md
index e826b5a..034380a 100644
--- a/site/docs/spark/current/spark-streaming-zeromq.md
+++ b/site/docs/spark/current/spark-streaming-zeromq.md
@@ -24,6 +24,7 @@ limitations under the License.
 -->
 
 {% include JB/setup %}
+# Spark Streaming ZeroMQ Connector
 
 A library for reading data from [ZeroMQ](http://zeromq.org/) using Spark Streaming. 
 
@@ -53,13 +54,23 @@ This library is cross-published for Scala 2.10 and Scala 2.11, so users should r
 
 ## Examples
 
+Review end-to-end examples at [ZeroMQ Examples](https://github.com/apache/bahir/tree/master/streaming-zeromq/examples).
 
 ### Scala API
 
-    val lines = ZeroMQUtils.createStream(ssc, ...)
+    import org.apache.spark.streaming.zeromq.ZeroMQUtils
+
+    val lines = ZeroMQUtils.createTextStream(
+      ssc, "tcp://server:5555", true, Seq("my-topic".getBytes)
+    )
 
 ### Java API
 
-    JavaDStream<String> lines = ZeroMQUtils.createStream(jssc, ...);
+    import java.util.Arrays;
+
+    import org.apache.spark.storage.StorageLevel;
+    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
+    import org.apache.spark.streaming.zeromq.ZeroMQUtils;
 
-See end-to-end examples at [ZeroMQ Examples](https://github.com/apache/bahir/tree/master/streaming-zeromq/examples)
\ No newline at end of file
+    JavaReceiverInputDStream<String> test1 = ZeroMQUtils.createJavaStream(
+        ssc, "tcp://server:5555", true, Arrays.asList("my-topic".getBytes()),
+        StorageLevel.MEMORY_AND_DISK_SER_2()
+    );
\ No newline at end of file
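
Building on the Scala example above, a minimal sketch of processing the received text stream; the word count itself is plain DStream API:

    // Count the words arriving on the subscribed ZeroMQ topic, per batch
    val words = lines.flatMap(_.split(" "))
    val counts = words.map(word => (word, 1L)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()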


[4/4] bahir-website git commit: Update documentation indexes with new/update extensions

Posted by lr...@apache.org.
Update documentation indexes with new/update extensions


Project: http://git-wip-us.apache.org/repos/asf/bahir-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir-website/commit/5b4ed3c1
Tree: http://git-wip-us.apache.org/repos/asf/bahir-website/tree/5b4ed3c1
Diff: http://git-wip-us.apache.org/repos/asf/bahir-website/diff/5b4ed3c1

Branch: refs/heads/master
Commit: 5b4ed3c1f453320d68d5750deffe71b285a00bdb
Parents: 1dc81cf
Author: Luciano Resende <lr...@apache.org>
Authored: Thu Nov 29 15:26:21 2018 +0100
Committer: Luciano Resende <lr...@apache.org>
Committed: Thu Nov 29 15:26:21 2018 +0100

----------------------------------------------------------------------
 site/docs/spark/current/documentation.md | 6 ++++--
 site/index.md                            | 5 +++--
 2 files changed, 7 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir-website/blob/5b4ed3c1/site/docs/spark/current/documentation.md
----------------------------------------------------------------------
diff --git a/site/docs/spark/current/documentation.md b/site/docs/spark/current/documentation.md
index 932b390..b5f9e3b 100644
--- a/site/docs/spark/current/documentation.md
+++ b/site/docs/spark/current/documentation.md
@@ -39,7 +39,7 @@ limitations under the License.
 
 [Akka data source](../spark-sql-streaming-akka)
 
-[MQTT data source](../spark-sql-streaming-mqtt)
+[MQTT data source](../spark-sql-streaming-mqtt) ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"} (new Sink)
 
 <br/>
 
@@ -51,8 +51,10 @@ limitations under the License.
 
 [Google Cloud Pub/Sub connector](../spark-streaming-pubsub)
 
+[Cloud PubNub connector](../spark-streaming-pubnub) ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"}
+
 [MQTT connector](../spark-streaming-mqtt)
 
 [Twitter connector](../spark-streaming-twitter)
 
-[ZeroMQ connector](../spark-streaming-zeromq)
+[ZeroMQ connector](../spark-streaming-zeromq) ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"} (Enhanced Implementation)

http://git-wip-us.apache.org/repos/asf/bahir-website/blob/5b4ed3c1/site/index.md
----------------------------------------------------------------------
diff --git a/site/index.md b/site/index.md
index 92b13cf..09c5d99 100644
--- a/site/index.md
+++ b/site/index.md
@@ -39,9 +39,10 @@ Currently, {{ site.data.project.short_name }} provides extensions for [Apache Sp
  - Spark DStream connector for Apache CouchDB/Cloudant
  - Spark DStream connector for Akka
  - Spark DStream connector for Google Cloud Pub/Sub
- - Spark DStream connector for MQTT
+ - Spark DStream connector for PubNub ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"}
+ - Spark DStream connector for MQTT ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"} (new Sink)
  - Spark DStream connector for Twitter
- - Spark DStream connector for ZeroMQ
+ - Spark DStream connector for ZeroMQ ![](/assets/themes/apache-clean/img/new-black.png){:height="36px" width="36px"} (Enhanced Implementation)
 
 
 ## Apache Flink extensions