Posted to commits@flink.apache.org by tr...@apache.org on 2020/03/04 14:09:54 UTC

[flink] branch release-1.10 updated (438f096 -> 803f9bc)

This is an automated email from the ASF dual-hosted git repository.

trohrmann pushed a change to branch release-1.10
in repository https://gitbox.apache.org/repos/asf/flink.git.


    from 438f096  [hotfix][scaladoc] Add missing parameter in DataStreamConversions's scaladoc
     new 83cbfa5  [FLINK-16131] [docs] Translate /ops/filesystems/s3.zh.md
     new 803f9bc  [hotfix] [docs] Fix typo in /ops/filesystems/s3.md

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/ops/filesystems/s3.md    |  2 +-
 docs/ops/filesystems/s3.zh.md | 90 +++++++++++++++++++------------------------
 2 files changed, 41 insertions(+), 51 deletions(-)

[flink] 01/02: [FLINK-16131] [docs] Translate /ops/filesystems/s3.zh.md

Posted by tr...@apache.org.

trohrmann pushed a commit to branch release-1.10
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 83cbfa5eabf6a832b86e85785b3fd5e53000d6d8
Author: Qingsheng Ren <re...@gmail.com>
AuthorDate: Sat Feb 22 16:19:23 2020 +0800

    [FLINK-16131] [docs] Translate /ops/filesystems/s3.zh.md
    
    This closes #11207.
---
 docs/ops/filesystems/s3.zh.md | 90 +++++++++++++++++++------------------------
 1 file changed, 40 insertions(+), 50 deletions(-)

diff --git a/docs/ops/filesystems/s3.zh.md b/docs/ops/filesystems/s3.zh.md
index 4e371c3..5bfd39d 100644
--- a/docs/ops/filesystems/s3.zh.md
+++ b/docs/ops/filesystems/s3.zh.md
@@ -23,114 +23,104 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3) provides cloud object storage for a variety of use cases. You can use S3 with Flink for **reading** and **writing data** as well in conjunction with the [streaming **state backends**]({{ site.baseurl}}/ops/state/state_backends.html).
+[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3) 提供用于多种场景的云对象存储。S3 可与 Flink 一起使用以读取、写入数据，并可与 [流的 **State backends**]({{ site.baseurl}}/ops/state/state_backends.html) 相结合使用。
 
 * This will be replaced by the TOC
 {:toc}
 
-You can use S3 objects like regular files by specifying paths in the following format:
+通过以下格式指定路径，S3 对象可类似于普通文件使用：
 
 {% highlight plain %}
 s3://<your-bucket>/<endpoint>
 {% endhighlight %}
 
-The endpoint can either be a single file or a directory, for example:
+Endpoint 可以是一个文件或目录，例如：
 
 {% highlight java %}
-// Read from S3 bucket
+// 读取 S3 bucket
 env.readTextFile("s3://<bucket>/<endpoint>");
 
-// Write to S3 bucket
+// 写入 S3 bucket
 stream.writeAsText("s3://<bucket>/<endpoint>");
 
-// Use S3 as FsStatebackend
+// 使用 S3 作为 FsStatebackend
 env.setStateBackend(new FsStateBackend("s3://<your-bucket>/<endpoint>"));
 {% endhighlight %}
 
-Note that these examples are *not* exhaustive and you can use S3 in other places as well, including your [high availability setup](../jobmanager_high_availability.html) or the [RocksDBStateBackend]({{ site.baseurl }}/ops/state/state_backends.html#the-rocksdbstatebackend); everywhere that Flink expects a FileSystem URI.
+注意这些例子并*不详尽*，S3 同样可以用在其他场景，包括 [JobManager 高可用配置](../jobmanager_high_availability.html) 或 [RocksDBStateBackend]({{ site.baseurl }}/zh/ops/state/state_backends.html#the-rocksdbstatebackend)，以及所有 Flink 需要使用文件系统 URI 的位置。
 
-For most use cases, you may use one of our `flink-s3-fs-hadoop` and `flink-s3-fs-presto` S3 filesystem plugins which are self-contained and easy to set up.
-For some cases, however, e.g., for using S3 as YARN's resource storage dir, it may be necessary to set up a specific Hadoop S3 filesystem implementation.
+在大部分使用场景下，可使用 `flink-s3-fs-hadoop` 或 `flink-s3-fs-presto` 两个独立且易于设置的 S3 文件系统插件。然而在某些情况下，例如使用 S3 作为 YARN 的资源存储目录时，可能需要配置 Hadoop S3 文件系统。
 
-### Hadoop/Presto S3 File Systems plugins
+### Hadoop/Presto S3 文件系统插件
 
-{% panel **Note:** You don't have to configure this manually if you are running [Flink on EMR](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html). %}
+{% panel **注意:** 如果您在使用 [Flink on EMR](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html)，您无需手动对此进行配置。 %}
 
-Flink provides two file systems to talk to Amazon S3, `flink-s3-fs-presto` and `flink-s3-fs-hadoop`.
-Both implementations are self-contained with no dependency footprint, so there is no need to add Hadoop to the classpath to use them.
+Flink 提供两种文件系统用来与 S3 交互：`flink-s3-fs-presto` 和 `flink-s3-fs-hadoop`。两种实现都是独立的且没有依赖项，因此使用时无需将 Hadoop 添加至 classpath。
 
-  - `flink-s3-fs-presto`, registered under the scheme *s3://* and *s3p://*, is based on code from the [Presto project](https://prestodb.io/).
-  You can configure it the same way you can [configure the Presto file system](https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration) by placing adding the configurations to your `flink-conf.yaml`. Presto is the recommended file system for checkpointing to S3.
+  - `flink-s3-fs-presto`，通过 *s3://* 和 *s3p://* 两种 scheme 使用，基于 [Presto project](https://prestodb.io/)。
+  可以使用与[配置 Presto 文件系统](https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration)相同的方法进行配置，即将配置添加到 `flink-conf.yaml` 文件中。推荐使用 Presto 文件系统来在 S3 中建立 checkpoint。
 
-  - `flink-s3-fs-hadoop`, registered under *s3://* and *s3a://*, based on code from the [Hadoop Project](https://hadoop.apache.org/).
-  The file system can be [configured exactly like Hadoop's s3a](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A) by placing adding the configurations to your `flink-conf.yaml`. It is the only S3 file system with support for the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html).
+  - `flink-s3-fs-hadoop`，通过 *s3://* 和 *s3a://* 两种 scheme 使用, 基于 [Hadoop Project](https://hadoop.apache.org/)。
+  文件系统可以使用与 [Hadoop S3A 完全相同的配置方法](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A)进行配置，即将配置添加到 `flink-conf.yaml` 文件中。它是唯一一个支持 [StreamingFileSink]({{ site.baseurl}}/zh/dev/connectors/streamfile_sink.html) 的文件系统。
 
-Both `flink-s3-fs-hadoop` and `flink-s3-fs-presto` register default FileSystem
-wrappers for URIs with the *s3://* scheme, `flink-s3-fs-hadoop` also registers
-for *s3a://* and `flink-s3-fs-presto` also registers for *s3p://*, so you can
-use this to use both at the same time.
-For example, the job uses the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html) which only supports Hadoop, but uses Presto for checkpointing.
-In this case, it is advised to explicitly use *s3a://* as a scheme for the sink (Hadoop) and *s3p://* for checkpointing (Presto).
+`flink-s3-fs-hadoop` 和 `flink-s3-fs-presto` 都为 *s3://* scheme 注册了默认的文件系统包装器，`flink-s3-fs-hadoop` 另外注册了 *s3a://*，`flink-s3-fs-presto` 注册了 *s3p://*，因此二者可以同时使用。
+例如某作业使用了 [StreamingFileSink]({{ site.baseurl}}/zh/dev/connectors/streamfile_sink.html)，它仅支持 Hadoop，但建立 checkpoint 使用 Presto。在这种情况下，建议明确地使用 *s3a://* 作为 sink (Hadoop) 的 scheme，checkpoint (Presto) 使用 *s3p://*。
 
-To use `flink-s3-fs-hadoop` or `flink-s3-fs-presto`, copy the respective JAR file from the `opt` directory to the `plugins` directory of your Flink distribution before starting Flink, e.g.
+在启动 Flink 之前，将对应的 JAR 文件从 `opt` 复制到 Flink 发行版的 `plugins` 目录下，以使用 `flink-s3-fs-hadoop` 或 `flink-s3-fs-presto`。
 
 {% highlight bash %}
 mkdir ./plugins/s3-fs-presto
 cp ./opt/flink-s3-fs-presto-{{ site.version }}.jar ./plugins/s3-fs-presto/
 {% endhighlight %}
 
-#### Configure Access Credentials
+#### 配置访问凭据
 
-After setting up the S3 FileSystem wrapper, you need to make sure that Flink is allowed to access your S3 buckets.
+在设置好 S3 文件系统包装器后，您需要确认 Flink 具有访问 S3 Bucket 的权限。
 
-##### Identity and Access Management (IAM) (Recommended)
+##### Identity and Access Management (IAM)（推荐使用）
 
-The recommended way of setting up credentials on AWS is via [Identity and Access Management (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html). You can use IAM features to securely give Flink instances the credentials that they need to access S3 buckets. Details about how to do this are beyond the scope of this documentation. Please refer to the AWS user guide. What you are looking for are [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for [...]
+建议通过 [Identity and Access Management (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) 来配置 AWS 凭据。可使用 IAM 功能为 Flink 实例安全地提供访问 S3 Bucket 所需的凭据。关于配置的细节超出了本文档的范围，请参考 AWS 用户手册中的 [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) 部分。
 
-If you set this up correctly, you can manage access to S3 within AWS and don't need to distribute any access keys to Flink.
+如果配置正确，则可在 AWS 中管理对 S3 的访问，而无需为 Flink 分发任何访问密钥（Access Key）。
 
-##### Access Keys (Discouraged)
+##### 访问密钥（Access Key）（不推荐）
 
-Access to S3 can be granted via your **access and secret key pair**. Please note that this is discouraged since the [introduction of IAM roles](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2).
+可以通过**访问密钥对（access and secret key）**授予 S3 访问权限。请注意，根据 [Introduction of IAM roles](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2)，不推荐使用该方法。
 
-You need to configure both `s3.access-key` and `s3.secret-key`  in Flink's  `flink-conf.yaml`:
+ `s3.access-key` 和 `s3.secret-key` 均需要在 Flink 的 `flink-conf.yaml` 中进行配置：
 
 {% highlight yaml %}
 s3.access-key: your-access-key
 s3.secret-key: your-secret-key
 {% endhighlight %}
 
-## Configure Non-S3 Endpoint
+## 配置非 S3 访问点
 
-The S3 Filesystems also support using S3 compliant object stores such as [IBM's Cloud Object Storage](https://www.ibm.com/cloud/object-storage) and [Minio](https://min.io/).
-To do so, configure your endpoint in `flink-conf.yaml`.
+S3 文件系统还支持兼容 S3 的对象存储服务，如 [IBM's Cloud Object Storage](https://www.ibm.com/cloud/object-storage) 和 [Minio](https://min.io/)。可在 `flink-conf.yaml` 中配置使用的访问点：
 
 {% highlight yaml %}
 s3.endpoint: your-endpoint-hostname
 {% endhighlight %}
 
-## Configure Path Style Access
+## 配置路径样式的访问
 
-Some of the S3 compliant object stores might not have virtual host style addressing enabled by default. In such cases, you will have to provide the property to enable path style access in  in `flink-conf.yaml`.
+某些兼容 S3 的对象存储服务可能没有默认启用虚拟主机样式的寻址。这种情况下需要在 `flink-conf.yaml` 中添加配置以启用路径样式的访问：
 
 {% highlight yaml %}
 s3.path.style.access: true
 {% endhighlight %}
 
-## Entropy injection for S3 file systems
+## S3 文件系统的熵注入
 
-The bundled S3 file systems (`flink-s3-fs-presto` and `flink-s3-fs-hadoop`) support entropy injection. Entropy injection is
-a technique to improve the scalability of AWS S3 buckets through adding some random characters near the beginning of the key.
+内置的 S3 文件系统 (`flink-s3-fs-presto` and `flink-s3-fs-hadoop`) 支持熵注入。熵注入是通过在关键字开头附近添加随机字符，以提高 AWS S3 bucket 可扩展性的技术。
 
-If entropy injection is activated, a configured substring in the path is replaced with random characters. For example, path
-`s3://my-bucket/checkpoints/_entropy_/dashboard-job/` would be replaced by something like `s3://my-bucket/checkpoints/gf36ikvg/dashboard-job/`.
-**This only happens when the file creation passes the option to inject entropy!**
-Otherwise, the file path removes the entropy key substring entirely. See [FileSystem.create(Path, WriteOption)](https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/core/fs/FileSystem.html#create-org.apache.flink.core.fs.Path-org.apache.flink.core.fs.FileSystem.WriteOptions-)
-for details.
+如果熵注入被启用，路径中配置好的字串将会被随机字符所替换。例如路径 `s3://my-bucket/checkpoints/_entropy_/dashboard-job/` 将会被替换成类似于 `s3://my-bucket/checkpoints/gf36ikvg/dashboard-job/` 的路径。
+**这仅在使用熵注入选项创建文件时启用！**
+否则将完全删除文件路径中的 entropy key。更多细节请参见 [FileSystem.create(Path, WriteOption)](https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/core/fs/FileSystem.html#create-org.apache.flink.core.fs.Path-org.apache.flink.core.fs.FileSystem.WriteOptions-)。
 
-{% panel **Note:** The Flink runtime currently passes the option to inject entropy only to checkpoint data files. All other files, including checkpoint metadata and external URI, do not inject entropy to keep checkpoint URIs predictable. %}
+{% panel **注意:** 目前 Flink 运行时仅对 checkpoint 数据文件使用熵注入选项。所有其他文件包括 chekcpoint 元数据与外部 URI 都不使用熵注入，以保证 checkpoint URI 的可预测性。 %}
 
-To enable entropy injection, configure the *entropy key* and the *entropy length* parameters.
+配置 *entropy key* 与 *entropy length* 参数以启用熵注入：
 
 ```
 s3.entropy.key: _entropy_
@@ -138,8 +128,8 @@ s3.entropy.length: 4 (default)
 
 ```
 
-The `s3.entropy.key` defines the string in paths that is replaced by the random characters. Paths that do not contain the entropy key are left unchanged.
-If a file system operation does not pass the *"inject entropy"* write option, the entropy key substring is simply removed.
-The `s3.entropy.length` defines the number of random alphanumeric characters used for entropy.
+`s3.entropy.key` 定义了路径中被随机字符替换掉的字符串。不包含 entropy key 路径将保持不变。
+如果文件系统操作没有经过 *"熵注入"* 写入，entropy key 字串将被直接移除。
+`s3.entropy.length` 定义了用于熵注入的随机字母/数字字符的数量。
 
 {% top %}
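
The entropy-injection behaviour described in the diff above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Flink's implementation: the class `EntropySketch`, the method `resolve`, and the `inject` flag are hypothetical names, only the first occurrence of the key is handled, and the resulting path is not normalized.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper illustrating the documented semantics of
// s3.entropy.key and s3.entropy.length; not part of Flink.
final class EntropySketch {

    private static final String ALPHANUMERIC = "abcdefghijklmnopqrstuvwxyz0123456789";

    /**
     * Replaces the first occurrence of entropyKey in path with `length`
     * random alphanumeric characters when inject is true, and removes the
     * key when inject is false. Paths without the key are returned unchanged.
     */
    static String resolve(String path, String entropyKey, int length, boolean inject) {
        int index = path.indexOf(entropyKey);
        if (index < 0) {
            return path; // no entropy key in the path: leave it as-is
        }
        StringBuilder result = new StringBuilder(path.length() + length);
        result.append(path, 0, index);
        if (inject) {
            ThreadLocalRandom random = ThreadLocalRandom.current();
            for (int i = 0; i < length; i++) {
                result.append(ALPHANUMERIC.charAt(random.nextInt(ALPHANUMERIC.length())));
            }
        }
        // Append everything after the key; with inject == false this simply drops the key.
        result.append(path, index + entropyKey.length(), path.length());
        return result.toString();
    }

    public static void main(String[] args) {
        String path = "s3://my-bucket/checkpoints/_entropy_/dashboard-job/";
        // Checkpoint data files: the key is replaced by random characters.
        System.out.println(resolve(path, "_entropy_", 4, true));
        // Other files (e.g. checkpoint metadata): the key is removed.
        System.out.println(resolve(path, "_entropy_", 4, false));
    }
}
```

Here `"_entropy_"` and `4` stand in for the values of `s3.entropy.key` and `s3.entropy.length` from the configuration snippet in the diff.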

[flink] 02/02: [hotfix] [docs] Fix typo in /ops/filesystems/s3.md

Posted by tr...@apache.org.

trohrmann pushed a commit to branch release-1.10
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 803f9bcf628cdb75901170b58423205ddb368c96
Author: Qingsheng Ren <re...@gmail.com>
AuthorDate: Sat Feb 22 16:20:45 2020 +0800

    [hotfix] [docs] Fix typo in /ops/filesystems/s3.md
---
 docs/ops/filesystems/s3.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ops/filesystems/s3.md b/docs/ops/filesystems/s3.md
index 4e371c3..7c47f1f 100644
--- a/docs/ops/filesystems/s3.md
+++ b/docs/ops/filesystems/s3.md
@@ -111,7 +111,7 @@ s3.endpoint: your-endpoint-hostname
 
 ## Configure Path Style Access
 
-Some of the S3 compliant object stores might not have virtual host style addressing enabled by default. In such cases, you will have to provide the property to enable path style access in  in `flink-conf.yaml`.
+Some of the S3 compliant object stores might not have virtual host style addressing enabled by default. In such cases, you will have to provide the property to enable path style access in `flink-conf.yaml`.
 
 {% highlight yaml %}
 s3.path.style.access: true