Posted to commits@flink.apache.org by rm...@apache.org on 2019/07/05 12:46:18 UTC

[flink] 03/05: [FLINK-9311] [pubsub] Add documentation of pubsub connectors

This is an automated email from the ASF dual-hosted git repository.

rmetzger pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit e75c33f05ccdb7506ffd8a8e0ad778aa32d00226
Author: Richard Deurwaarder <ri...@xeli.eu>
AuthorDate: Mon Aug 20 17:58:28 2018 +0200

    [FLINK-9311] [pubsub] Add documentation of pubsub connectors
---
 docs/dev/connectors/guarantees.md |   5 ++
 docs/dev/connectors/index.md      |   1 +
 docs/dev/connectors/pubsub.md     | 106 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 112 insertions(+)

diff --git a/docs/dev/connectors/guarantees.md b/docs/dev/connectors/guarantees.md
index 732b8b2..56eac12 100644
--- a/docs/dev/connectors/guarantees.md
+++ b/docs/dev/connectors/guarantees.md
@@ -62,6 +62,11 @@ Please read the documentation of each connector to understand the details of the
             <td></td>
         </tr>
         <tr>
+            <td>Google PubSub</td>
+            <td>at least once</td>
+            <td></td>
+        </tr>
+        <tr>
             <td>Collections</td>
             <td>exactly once</td>
             <td></td>
diff --git a/docs/dev/connectors/index.md b/docs/dev/connectors/index.md
index b5405d4..6132014 100644
--- a/docs/dev/connectors/index.md
+++ b/docs/dev/connectors/index.md
@@ -47,6 +47,7 @@ Connectors provide code for interfacing with various third-party systems. Curren
  * [RabbitMQ](rabbitmq.html) (source/sink)
  * [Apache NiFi](nifi.html) (source/sink)
  * [Twitter Streaming API](twitter.html) (source)
+ * [Google PubSub](pubsub.html) (source/sink)
 
 Keep in mind that to use one of these connectors in an application, additional third party
 components are usually required, e.g. servers for the data stores or message queues.
diff --git a/docs/dev/connectors/pubsub.md b/docs/dev/connectors/pubsub.md
new file mode 100644
index 0000000..6eb3c24
--- /dev/null
+++ b/docs/dev/connectors/pubsub.md
@@ -0,0 +1,106 @@
+---
+title: "Google PubSub"
+nav-title: PubSub
+nav-parent_id: connectors
+nav-pos: 7
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This connector provides a Source and Sink that can read from and write to
+[Google PubSub](https://cloud.google.com/pubsub). To use this connector, add the
+following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-pubsub{{ site.scala_version_suffix }}</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary
+distribution. See
+[here]({{site.baseurl}}/dev/linking.html)
+for information about how to package the program with the libraries for
+cluster execution.
+
+#### PubSub Source
+
+The connector provides a Source for reading data from Google PubSub into Apache Flink. PubSub has an at-least-once guarantee, and as such the source also provides at-least-once delivery.
+
+The class `PubSubSource(…)` has a builder to create PubSubSources: `PubSubSource.newBuilder()`.
+
+There are several optional methods to alter how the PubSubSource is created. The bare minimum is to provide a Google project, a PubSub subscription, and a way to deserialize the PubSubMessages.
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
+
+DeserializationSchema<SomeObject> deserializationSchema = (...);
+SourceFunction<SomeObject> pubsubSource = PubSubSource.<SomeObject>newBuilder()
+                                                      .withDeserializationSchema(deserializationSchema)
+                                                      .withProjectSubscriptionName("google-project-name", "pubsub-subscription")
+                                                      .build();
+
+streamExecEnv.addSource(pubsubSource);
+{% endhighlight %}
+</div>
+</div>
+
+#### PubSub Sink
+
+The connector provides a Sink for writing data to PubSub.
+
+The class `PubSubSink(…)` has a builder to create PubSubSinks: `PubSubSink.newBuilder()`.
+
+This builder works in a similar way to the PubSubSource.
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
+
+SerializationSchema<SomeObject> serializationSchema = (...);
+SinkFunction<SomeObject> pubsubSink = PubSubSink.<SomeObject>newBuilder()
+                                                .withSerializationSchema(serializationSchema)
+                                                .withTopicName("pubsub-topic-name")
+                                                .withProjectName("google-project-name")
+                                                .build();
+
+streamExecEnv.addSink(pubsubSink);
+{% endhighlight %}
+</div>
+</div>
+
+#### Google Credentials
+
+Google uses [Credentials](https://cloud.google.com/docs/authentication/production) to authenticate and authorize applications so that they can use Google Cloud resources such as PubSub. Both builders offer several ways to provide these credentials.
+
+By default, the connectors look for the environment variable [GOOGLE_APPLICATION_CREDENTIALS](https://cloud.google.com/docs/authentication/production#obtaining_and_providing_service_account_credentials_manually), which should point to a file containing the credentials.
+
+It is also possible to provide a Credentials object directly, for instance if you read the Credentials yourself from an external system. In this case you can use `PubSubSource.newBuilder().withCredentials(...)`.
+
+#### Integration testing
+
+When running integration tests, you might not want to connect to PubSub directly but instead use a Docker container to read from and write to. This is possible by using `PubSubSource.newBuilder().withHostAndPort("localhost:1234")`.
+{% top %}