Posted to commits@flink.apache.org by ch...@apache.org on 2019/07/15 09:28:00 UTC

[flink] branch release-1.9 updated: [FLINK-13154][docs] Fix broken links

This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch release-1.9
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/release-1.9 by this push:
     new 8310ad9  [FLINK-13154][docs] Fix broken links
8310ad9 is described below

commit 8310ad96cc5285ebacf4ba2fd55fa317ded3d6f5
Author: Yun Tang <my...@live.com>
AuthorDate: Mon Jul 15 17:27:13 2019 +0800

    [FLINK-13154][docs] Fix broken links
---
 docs/dev/connectors/pubsub.zh.md                 | 162 +++++++++++++++++++++++
 docs/dev/table/connect.md                        |   2 +-
 docs/dev/table/connect.zh.md                     |   2 +-
 docs/dev/table/sqlClient.md                      |   2 -
 docs/getting-started/tutorials/local_setup.md    |   4 +-
 docs/getting-started/tutorials/local_setup.zh.md |   4 +-
 6 files changed, 168 insertions(+), 8 deletions(-)

diff --git a/docs/dev/connectors/pubsub.zh.md b/docs/dev/connectors/pubsub.zh.md
new file mode 100644
index 0000000..0ee8187
--- /dev/null
+++ b/docs/dev/connectors/pubsub.zh.md
@@ -0,0 +1,162 @@
+---
+title: "Google Cloud PubSub"
+nav-title: Google Cloud PubSub
+nav-parent_id: connectors
+nav-pos: 7
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This connector provides a Source and Sink that can read from and write to
+[Google Cloud PubSub](https://cloud.google.com/pubsub). To use this connector, add the
+following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-gcp-pubsub{{ site.scala_version_suffix }}</artifactId>
+  <version>{{ site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+<p style="border-radius: 5px; padding: 5px" class="bg-danger">
+<b>Note</b>: This connector has been added to Flink recently. It has not received widespread testing yet.
+</p>
+
+Note that the streaming connectors are currently not part of the binary
+distribution. See
+[here]({{ site.baseurl }}/dev/projectsetup/dependencies.html)
+for information about how to package the program with the libraries for
+cluster execution.
+
+
+
+## Consuming or Producing PubSubMessages
+
+The connector provides a source and a sink for receiving messages from and sending messages to Google PubSub.
+Google PubSub has an `at-least-once` guarantee, and as such the connector delivers the same guarantee.
+
+### PubSub SourceFunction
+
+The class `PubSubSource` has a builder to create PubSubSources: `PubSubSource.newBuilder(...)`.
+
+There are several optional methods to alter how the PubSubSource is created. The bare minimum is to provide a Google project, a PubSub subscription, and a way to deserialize the PubSubMessages.
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
+
+DeserializationSchema<SomeObject> deserializer = (...);
+SourceFunction<SomeObject> pubsubSource = PubSubSource.newBuilder()
+                                                      .withDeserializationSchema(deserializer)
+                                                      .withProjectName("project")
+                                                      .withSubscriptionName("subscription")
+                                                      .build();
+
+streamExecEnv.addSource(pubsubSource);
+{% endhighlight %}
+</div>
+</div>
+
+Currently the source function [pulls](https://cloud.google.com/pubsub/docs/pull) messages from PubSub; [push endpoints](https://cloud.google.com/pubsub/docs/push) are not supported.
+
+### PubSub Sink
+
+The class `PubSubSink` has a builder to create PubSubSinks: `PubSubSink.newBuilder(...)`.
+
+This builder works in a similar way to the PubSubSource builder.
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<SomeObject> dataStream = (...);
+
+SerializationSchema<SomeObject> serializationSchema = (...);
+SinkFunction<SomeObject> pubsubSink = PubSubSink.newBuilder()
+                                                .withSerializationSchema(serializationSchema)
+                                                .withProjectName("project")
+                                                .withTopicName("topic")
+                                                .build();
+
+dataStream.addSink(pubsubSink);
+{% endhighlight %}
+</div>
+</div>
+
+### Google Credentials
+
+Google uses [Credentials](https://cloud.google.com/docs/authentication/production) to authenticate and authorize applications so that they can use Google Cloud Platform resources (such as PubSub).
+
+Both builders allow you to provide these credentials but by default the connectors will look for an environment variable: [GOOGLE_APPLICATION_CREDENTIALS](https://cloud.google.com/docs/authentication/production#obtaining_and_providing_service_account_credentials_manually) which should point to a file containing the credentials.
+
+If you want to provide Credentials manually, for instance if you read the Credentials yourself from an external system, you can use `PubSubSource.newBuilder(...).withCredentials(...)`.
+
+### Integration testing
+
+When running integration tests you might not want to connect to PubSub directly, but instead use a Docker container running the PubSub emulator to read from and write to. (See: [PubSub testing locally](https://cloud.google.com/pubsub/docs/emulator))
+
+The following example shows how you would create a source to read messages from the emulator and a sink to send them back:
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DeserializationSchema<SomeObject> deserializationSchema = (...);
+SourceFunction<SomeObject> pubsubSource = PubSubSource.newBuilder()
+                                                      .withDeserializationSchema(deserializationSchema)
+                                                      .withProjectName("my-fake-project")
+                                                      .withSubscriptionName("subscription")
+                                                      .withPubSubSubscriberFactory(new PubSubSubscriberFactoryForEmulator("localhost:1234", "my-fake-project", "subscription", 10, Duration.ofSeconds(15), 100))
+                                                      .build();
+SerializationSchema<SomeObject> serializationSchema = (...);
+SinkFunction<SomeObject> pubsubSink = PubSubSink.newBuilder()
+                                                .withSerializationSchema(serializationSchema)
+                                                .withProjectName("my-fake-project")
+                                                .withTopicName("topic")
+                                                .withHostAndPortForEmulator(getPubSubHostPort())
+                                                .build();
+
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.addSource(pubsubSource).addSink(pubsubSink);
+{% endhighlight %}
+</div>
+</div>
+
+### At-least-once guarantee
+
+#### SourceFunction
+
+There are several reasons why a message might be sent multiple times, such as failure scenarios on Google PubSub's side.
+
+Another reason is that the acknowledgement deadline has passed. This is the maximum time allowed between receiving a message and acknowledging it. The PubSubSource only acknowledges a message on a successful checkpoint, to guarantee at-least-once delivery. This means that if the time between successful checkpoints is longer than the acknowledgement deadline of your subscription, messages will most likely be processed multiple times.
+
+For this reason it's recommended to use a checkpoint interval that is (much) shorter than the acknowledgement deadline.
+
+See [PubSub](https://cloud.google.com/pubsub/docs/subscriber) for details on how to increase the acknowledgment deadline of your subscription.
+
+Note: The metric `PubSubMessagesProcessedNotAcked` shows how many messages are waiting for the next checkpoint before they will be acknowledged.
+
+#### SinkFunction
+
+The sink function buffers messages that are to be sent to PubSub for a short amount of time for performance reasons. Before each checkpoint this buffer is flushed, and the checkpoint will not succeed unless the messages have been delivered to PubSub.
+
+{% top %}
diff --git a/docs/dev/table/connect.md b/docs/dev/table/connect.md
index 6e6c08a..14d2a3b 100644
--- a/docs/dev/table/connect.md
+++ b/docs/dev/table/connect.md
@@ -656,7 +656,7 @@ connector:
 
 The file system connector itself is included in Flink and does not require an additional dependency. A corresponding format needs to be specified for reading and writing rows from and to a file system.
 
-<span class="label label-danger">Attention</span> Make sure to include [Flink File System specific dependencies]({{ site.baseurl }}/ops/filesystems/index.html).
+<span class="label label-danger">Attention</span> Make sure to include [Flink File System specific dependencies]({{ site.baseurl }}/internals/filesystems.html).
 
 <span class="label label-danger">Attention</span> File system sources and sinks for streaming are only experimental. In the future, we will support actual streaming use cases, i.e., directory monitoring and bucket output.
 
diff --git a/docs/dev/table/connect.zh.md b/docs/dev/table/connect.zh.md
index c8398c2..9aeacd9 100644
--- a/docs/dev/table/connect.zh.md
+++ b/docs/dev/table/connect.zh.md
@@ -656,7 +656,7 @@ connector:
 
 The file system connector itself is included in Flink and does not require an additional dependency. A corresponding format needs to be specified for reading and writing rows from and to a file system.
 
-<span class="label label-danger">Attention</span> Make sure to include [Flink File System specific dependencies]({{ site.baseurl }}/ops/filesystems/index.html).
+<span class="label label-danger">Attention</span> Make sure to include [Flink File System specific dependencies]({{ site.baseurl }}/internals/filesystems.html).
 
 <span class="label label-danger">Attention</span> File system sources and sinks for streaming are only experimental. In the future, we will support actual streaming use cases, i.e., directory monitoring and bucket output.
 
diff --git a/docs/dev/table/sqlClient.md b/docs/dev/table/sqlClient.md
index cd69cce..c1e427a 100644
--- a/docs/dev/table/sqlClient.md
+++ b/docs/dev/table/sqlClient.md
@@ -456,8 +456,6 @@ catalogs:
 
 Currently Flink supports two types of catalog - `FlinkInMemoryCatalog` and `HiveCatalog`.
 
-For more information about catalog, see [Catalogs]({{ site.baseurl }}/dev/table/catalog.html).
-
 Detached SQL Queries
 --------------------
 
diff --git a/docs/getting-started/tutorials/local_setup.md b/docs/getting-started/tutorials/local_setup.md
index 4442b54..b916669 100644
--- a/docs/getting-started/tutorials/local_setup.md
+++ b/docs/getting-started/tutorials/local_setup.md
@@ -30,7 +30,7 @@ Get a Flink example program up and running in a few simple steps.
 
 ## Setup: Download and Start Flink
 
-Flink runs on __Linux, Mac OS X, and Windows__. To be able to run Flink, the only requirement is to have a working __Java 8.x__ installation. Windows users, please take a look at the [Flink on Windows]({{ site.baseurl }}/tutorials/flink_on_windows.html) guide which describes how to run Flink on Windows for local setups.
+Flink runs on __Linux, Mac OS X, and Windows__. To be able to run Flink, the only requirement is to have a working __Java 8.x__ installation. Windows users, please take a look at the [Flink on Windows]({{ site.baseurl }}/getting-started/tutorials/flink_on_windows.html) guide which describes how to run Flink on Windows for local setups.
 
 You can check the correct installation of Java by issuing the following command:
 
@@ -292,6 +292,6 @@ $ ./bin/stop-cluster.sh
 
 ## Next Steps
 
-Check out some more [examples]({{ site.baseurl }}/examples) to get a better feel for Flink's programming APIs. When you are done with that, go ahead and read the [streaming guide]({{ site.baseurl }}/dev/datastream_api.html).
+Check out some more [examples]({{ site.baseurl }}/getting-started/examples) to get a better feel for Flink's programming APIs. When you are done with that, go ahead and read the [streaming guide]({{ site.baseurl }}/dev/datastream_api.html).
 
 {% top %}
diff --git a/docs/getting-started/tutorials/local_setup.zh.md b/docs/getting-started/tutorials/local_setup.zh.md
index 48ffb4c..43cec15 100644
--- a/docs/getting-started/tutorials/local_setup.zh.md
+++ b/docs/getting-started/tutorials/local_setup.zh.md
@@ -30,7 +30,7 @@ Get a Flink example program up and running in a few simple steps.
 
 ## Setup: Download and Start Flink
 
-Flink runs on __Linux, Mac OS X, and Windows__. To be able to run Flink, the only requirement is to have a working __Java 8.x__ installation. Windows users, please take a look at the [Flink on Windows]({{ site.baseurl }}/tutorials/flink_on_windows.html) guide which describes how to run Flink on Windows for local setups.
+Flink runs on __Linux, Mac OS X, and Windows__. To be able to run Flink, the only requirement is to have a working __Java 8.x__ installation. Windows users, please take a look at the [Flink on Windows]({{ site.baseurl }}/getting-started/tutorials/flink_on_windows.html) guide which describes how to run Flink on Windows for local setups.
 
 You can check the correct installation of Java by issuing the following command:
 
@@ -292,6 +292,6 @@ $ ./bin/stop-cluster.sh
 
 ## Next Steps
 
-Check out some more [examples]({{ site.baseurl }}/examples) to get a better feel for Flink's programming APIs. When you are done with that, go ahead and read the [streaming guide]({{ site.baseurl }}/dev/datastream_api.html).
+Check out some more [examples]({{ site.baseurl }}/getting-started/examples) to get a better feel for Flink's programming APIs. When you are done with that, go ahead and read the [streaming guide]({{ site.baseurl }}/dev/datastream_api.html).
 
 {% top %}