Posted to commits@flink.apache.org by ja...@apache.org on 2019/05/05 09:05:04 UTC

[flink] branch master updated: [FLINK-11614][docs-zh] Translate the "Configuring Dependencies" page into Chinese

This is an automated email from the ASF dual-hosted git repository.

jark pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
     new 16d5cd7  [FLINK-11614][docs-zh] Translate the "Configuring Dependencies" page into Chinese
16d5cd7 is described below

commit 16d5cd7d0e19eb3306fa7206508c740068dba51b
Author: yangfei5 <ya...@xiaomi.com>
AuthorDate: Wed Apr 24 10:24:16 2019 +0800

    [FLINK-11614][docs-zh] Translate the "Configuring Dependencies" page into Chinese
    
    This closes #8200
---
 docs/dev/projectsetup/dependencies.zh.md | 156 ++++++++++++-------------------
 1 file changed, 59 insertions(+), 97 deletions(-)

diff --git a/docs/dev/projectsetup/dependencies.zh.md b/docs/dev/projectsetup/dependencies.zh.md
index cd8074f..6bf7fae 100644
--- a/docs/dev/projectsetup/dependencies.zh.md
+++ b/docs/dev/projectsetup/dependencies.zh.md
@@ -1,5 +1,5 @@
 ---
-title: "配置依赖、Connectors、类库"
+title: "配置依赖、连接器、类库"
 nav-parent_id: projectsetup
 nav-pos: 2
 ---
@@ -22,47 +22,35 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Every Flink application depends on a set of Flink libraries. At the bare minimum, the application depends
-on the Flink APIs. Many applications depend in addition on certain connector libraries (like Kafka, Cassandra, etc.).
-When running Flink applications (either in a distributed deployment, or in the IDE for testing), the Flink
-runtime library must be available as well.
+Every Flink application depends on a set of Flink libraries. At the very least, an application depends on the Flink APIs. Many applications additionally depend on connector libraries (such as Kafka, Cassandra, etc.).
+When running a Flink application (whether for testing in an IDE or deployed in a distributed environment), the Flink runtime library must be available as well.
 
+## Flink Core and Application Dependencies
 
-## Flink Core and Application Dependencies
+As with most systems that run user-defined applications, there are two broad categories of dependencies and libraries in Flink:
 
-As with most systems that run user-defined applications, there are two broad categories of dependencies and libraries in Flink:
+  - **Flink Core Dependencies**: Flink itself consists of a set of classes and dependencies needed to run the system, for example coordination, networking, checkpoints, failover, APIs, operators (such as windowing),
+    resource management, etc. These classes and dependencies form the core of Flink's runtime and must be available when a Flink application is started.
 
-  - **Flink Core Dependencies**: Flink itself consists of a set of classes and dependencies that are needed to run the system, for example
-    coordination, networking, checkpoints, failover, APIs, operations (such as windowing), resource management, etc.
-    The set of all these classes and dependencies forms the core of Flink's runtime and must be present when a Flink
-    application is started.
+    These core classes and dependencies are packaged in the `flink-dist` jar. They are part of Flink's `lib` folder and part of the basic Flink container images.
+    Think of these dependencies as similar to Java's core library (`rt.jar`, `charsets.jar`, etc.), which contains classes like `String` and `List`.
+
+    The Flink core dependencies do not contain any connectors or libraries (such as CEP, SQL, ML, etc.), in order to avoid having an excessive number of dependencies and classes on the classpath by default.
+    In fact, we try to keep the core dependencies as slim as possible to keep the default classpath small and to avoid dependency clashes.
 
-    These core classes and dependencies are packaged in the `flink-dist` jar. They are part of Flink's `lib` folder and
-    part of the basic Flink container images. Think of these dependencies as similar to Java's core library (`rt.jar`, `charsets.jar`, etc.),
-    which contains the classes like `String` and `List`.
+  - **User Application Dependencies** are the connectors, formats, or other libraries that a specific user application needs.
 
-    The Flink Core Dependencies do not contain any connectors or libraries (CEP, SQL, ML, etc.) in order to avoid having an excessive
-    number of dependencies and classes in the classpath by default. In fact, we try to keep the core dependencies as slim as possible
-    to keep the default classpath small and avoid dependency clashes.
+    The user application code and its required connector and library dependencies are typically packaged into an *application jar*.
 
-  - The **User Application Dependencies** are all connectors, formats, or libraries that a specific user application needs.
+    The user application dependencies explicitly do not include the Flink DataSet / DataStream APIs and runtime dependencies, because those are already part of the Flink core dependencies.
 
-    The user application is typically packaged into an *application jar*, which contains the application code and the required
-    connector and library dependencies.
+## Setting up a Project: Basic Dependencies
 
-    The user application dependencies explicitly do not include the Flink DataSet / DataStream APIs and runtime dependencies,
-    because those are already part of Flink's Core Dependencies.
+Developing a Flink application requires, at the bare minimum, the API dependencies. Maven users can use the
+[Java Project Template]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html) or the
+[Scala Project Template]({{ site.baseurl }}/zh/dev/projectsetup/scala_api_quickstart.html) to create a program skeleton with these initial dependencies.
 
-
-## Setting up a Project: Basic Dependencies
-
-Every Flink application needs as the bare minimum the API dependencies, to develop against.
-For Maven, you can use the [Java Project Template]({{ site.baseurl }}/dev/projectsetup/java_api_quickstart.html)
-or [Scala Project Template]({{ site.baseurl }}/dev/projectsetup/scala_api_quickstart.html) to create
-a program skeleton with these initial dependencies.
-
-When setting up a project manually, you need to add the following dependencies for the Java/Scala API
-(here presented in Maven syntax, but the same dependencies apply to other build tools (Gradle, SBT, etc.) as well.
+When setting up a project manually, you need to add the following dependencies for the Java or Scala API (shown here in Maven syntax, but the same dependencies also apply to other build tools such as Gradle, SBT, etc.).
 
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
@@ -99,31 +87,23 @@ When setting up a project manually, you need to add the following dependencies f
 </div>
 </div>
 
-**Important:** Please note that all these dependencies have their scope set to *provided*.
-That means that they are needed to compile against, but that they should not be packaged into the
-project's resulting application jar file - these dependencies are Flink Core Dependencies,
-which are already available in any setup.
-
-It is highly recommended to keep the dependencies in scope *provided*. If they are not set to *provided*,
-the best case is that the resulting JAR becomes excessively large, because it also contains all Flink core
-dependencies. The worst case is that the Flink core dependencies that are added to the application's jar file
-clash with some of your own dependency versions (which is normally avoided through inverted classloading).
+**Important:** All of these dependencies should have their scope set to *provided*.
+This means that they are needed to compile against, but that they should not be packaged into the project's resulting application jar file,
+because these dependencies are Flink core dependencies and are already available before the application is started.
 
-**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, the Flink dependencies need
-to be declared in scope *compile* rather than *provided*. Otherwise IntelliJ will not add them to the classpath and
-the in-IDE execution will fail with a `NoClassDefFountError`. To avoid having to declare the
-dependency scope as *compile* (which is not recommended, see above), the above linked Java- and Scala
-project templates use a trick: They add a profile that selectively activates when the application
-is run in IntelliJ and only then promotes the dependencies to scope *compile*, without affecting
-the packaging of the JAR files.
+It is highly recommended to keep these dependencies in scope *provided*. If they are not set to *provided*, the best case is that the resulting jar becomes excessively large, because it then also contains all the Flink core dependencies.
+The worst case is that the Flink core dependencies added to the application's jar file clash with some of your own dependency versions (which is normally avoided through inverted classloading).
 
+**Note on IntelliJ:** To make applications run within IntelliJ IDEA, the Flink core dependencies need to be declared in scope *compile* rather than *provided*.
+Otherwise IntelliJ will not add them to the classpath and in-IDE execution will fail with a `NoClassDefFoundError`.
+To avoid having to declare the dependency scope as *compile* (which is not recommended, see above), the Java and Scala project templates linked above use a trick:
+they add a profile that is activated only when the application runs in IntelliJ, and only then promotes the dependencies to scope *compile*, without affecting the packaging of the application jar.
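+
+As an illustration, such a profile might look like the following minimal sketch. The profile id, the `idea.version` activation property (a property IntelliJ sets when it runs Maven), and the `flink.version`/`scala.binary.version` properties are assumptions modeled on the linked templates, not an exact copy of them:
+{% highlight xml %}
+<!-- Sketch of an IDE-only profile; ids, properties, and artifacts are illustrative. -->
+<profile>
+    <id>add-dependencies-for-IDEA</id>
+    <activation>
+        <property>
+            <!-- Assumed to be set by IntelliJ IDEA when it runs Maven,
+                 so the profile is active only inside the IDE. -->
+            <name>idea.version</name>
+        </property>
+    </activation>
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.flink</groupId>
+            <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
+            <version>${flink.version}</version>
+            <!-- Promoted to compile scope for in-IDE execution only. -->
+            <scope>compile</scope>
+        </dependency>
+    </dependencies>
+</profile>
+{% endhighlight %}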
 
-## Adding Connector and Library Dependencies
+## Adding Connector and Library Dependencies
 
-Most applications need specific connectors or libraries to run, for example a connector to Kafka, Cassandra, etc.
-These connectors are not part of Flink's core dependencies and must hence be added as dependencies to the application
+Most applications need specific connectors or other libraries to run, for example connectors to Kafka, Cassandra, etc. These connectors are not part of Flink's core dependencies and must therefore be added to the application as dependencies.
 
-Below is an example adding the connector for Kafka 0.10 as a dependency (Maven syntax):
+Below is an example of adding the connector for Kafka 0.10 as a dependency (Maven syntax):
 {% highlight xml %}
 <dependency>
     <groupId>org.apache.flink</groupId>
@@ -132,63 +112,46 @@ Below is an example adding the connector for Kafka 0.10 as a dependency (Maven s
 </dependency>
 {% endhighlight %}
 
-We recommend to package the application code and all its required dependencies into one *jar-with-dependencies* which
-we refer to as the *application jar*. The application jar can be submitted to an already running Flink cluster,
-or added to a Flink application container image.
+We recommend packaging the application code and all of its required dependencies into one *jar-with-dependencies*, which we refer to as the *application jar*.
+The application jar can be submitted to an already running Flink cluster, or added to a Flink application container image.
+ 
+Projects created from the [Java Project Template]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html) or the
+[Scala Project Template]({{ site.baseurl }}/zh/dev/projectsetup/scala_api_quickstart.html) are configured to automatically include
+the application dependencies in the application jar when running `mvn clean package`.
+For projects not created from those templates, we recommend adding the Maven Shade Plugin to build the application jar with all required dependencies (the appendix below shows the concrete configuration).
 
-Projects created from the [Java Project Template]({{ site.baseurl }}/dev/projectsetup/java_api_quickstart.html) or
-[Scala Project Template]({{ site.baseurl }}/dev/projectsetup/scala_api_quickstart.html) are configured to automatically include
-the application dependencies into the application jar when running `mvn clean package`. For projects that are
-not set up from those templates, we recommend to add the Maven Shade Plugin (as listed in the Appendix below)
-to build the application jar with all required dependencies.
+**Important:** For Maven (and other build tools) to correctly package the dependencies into the application jar, these application dependencies must be specified in scope *compile* (unlike the core dependencies, which must be specified in scope *provided*).
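+
+For illustration, the two scopes side by side might look like the following sketch. The artifact names and the `1.8.0` version are examples and should be replaced with the Flink and Scala versions of your own setup:
+{% highlight xml %}
+<!-- Core API dependency: scope provided, already available on the cluster classpath. -->
+<dependency>
+    <groupId>org.apache.flink</groupId>
+    <artifactId>flink-streaming-java_2.11</artifactId>
+    <version>1.8.0</version>
+    <scope>provided</scope>
+</dependency>
+
+<!-- Application dependency (a connector): scope compile, packaged into the application jar. -->
+<dependency>
+    <groupId>org.apache.flink</groupId>
+    <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
+    <version>1.8.0</version>
+    <scope>compile</scope>
+</dependency>
+{% endhighlight %}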
 
-**Important:** For Maven (and other build tools) to correctly package the dependencies into the application jar,
-these application dependencies must be specified in scope *compile* (unlike the core dependencies, which
-must be specified in scope *provided*).
+## Scala Versions
 
+Scala versions (2.10, 2.11, 2.12, etc.) are not binary compatible with one another. For that reason, a Flink setup built for Scala 2.11 cannot run an application that uses Scala 2.12.
 
-## Scala Versions
+All Flink dependencies that (transitively) depend on Scala are suffixed with the Scala version they are built for, for example `flink-streaming-scala_2.11`.
 
-Scala versions (2.10, 2.11, 2.12, etc.) are not binary compatible with one another.
-For that reason, Flink for Scala 2.11 cannot be used with an application that uses
-Scala 2.12.
+Developers that only use Java can pick any Scala version; Scala developers need to pick the Scala version that matches their application's Scala version.
 
-All Flink dependencies that (transitively) depend on Scala are suffixed with the
-Scala version that they are built for, for example `flink-streaming-scala_2.11`.
+Please refer to the [build guide]({{ site.baseurl }}/zh/flinkDev/building.html#scala-versions) for details on how to build Flink for a specific Scala version.
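+
+To keep the suffix consistent across all Scala-dependent artifacts, it is common to factor it out into a Maven property, as in the following sketch. The property names and version values are illustrative assumptions, not a fixed convention of the templates:
+{% highlight xml %}
+<properties>
+    <!-- Illustrative values; pick the versions that match your setup. -->
+    <scala.binary.version>2.11</scala.binary.version>
+    <flink.version>1.8.0</flink.version>
+</properties>
+
+<dependency>
+    <groupId>org.apache.flink</groupId>
+    <!-- The Scala suffix is appended to every Scala-dependent artifact id. -->
+    <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
+    <version>${flink.version}</version>
+    <scope>provided</scope>
+</dependency>
+{% endhighlight %}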
 
-Developers that only use Java can pick any Scala version, Scala developers need to
-pick the Scala version that matches their application's Scala version.
+## Hadoop Dependencies
 
-Please refer to the [build guide]({{ site.baseurl }}/flinkDev/building.html#scala-versions)
-for details on how to build Flink for a specific Scala version.
+**General rule: it should never be necessary to add Hadoop dependencies directly to your application.**
+*(The only exception is when using existing Hadoop input/output formats with Flink's Hadoop compatibility wrappers.)*
 
-## Hadoop Dependencies
+If you want to use Hadoop with Flink, you need a Flink setup that includes the Hadoop dependencies, rather than adding Hadoop as an application dependency.
+Please refer to the [Hadoop Setup Guide]({{ site.baseurl }}/zh/ops/deployment/hadoop.html) for details.
 
-**General rule: It should never be necessary to add Hadoop dependencies directly to your application.**
-*(The only exception being when using existing Hadoop input-/output formats with Flink's Hadoop compatibility wrappers)*
+There are two main reasons for this design:
 
-If you want to use Flink with Hadoop, you need to have a Flink setup that includes the Hadoop dependencies, rather than
-adding Hadoop as an application dependency. Please refer to the [Hadoop Setup Guide]({{ site.baseurl }}/ops/deployment/hadoop.html)
-for details.
+  - Some Hadoop interaction may already happen in Flink's core before the user application is started, for example setting up HDFS paths for checkpoints, authenticating via Hadoop's Kerberos tokens, or deploying on YARN.
 
-There are two main reasons for that design:
+  - Flink's inverted classloading approach hides many transitive dependencies of the core dependencies. That applies not only to Flink's own core dependencies, but also to Hadoop's dependencies when they are present in the setup.
+    That way, applications can use different versions of the same dependencies without running into dependency conflicts (which would otherwise cause serious problems, because Hadoop's dependency tree is huge).
 
-  - Some Hadoop interaction happens in Flink's core, possibly before the user application is started, for example
-    setting up HDFS for checkpoints, authenticating via Hadoop's Kerberos tokens, or deployment on YARN.
+If you need Hadoop dependencies during testing or development inside the IDE (for example for HDFS access), please set the scope of these dependencies to *test* or *provided*.
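+
+For example, an HDFS client dependency used only by tests might be declared as in the following sketch. The `hadoop-client` artifact is a common choice for this purpose, and the version shown is an assumption that should match your cluster:
+{% highlight xml %}
+<dependency>
+    <groupId>org.apache.hadoop</groupId>
+    <artifactId>hadoop-client</artifactId>
+    <!-- Illustrative version; use the version of your Hadoop cluster. -->
+    <version>2.8.5</version>
+    <!-- On the test classpath only, never packaged into the application jar. -->
+    <scope>test</scope>
+</dependency>
+{% endhighlight %}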
 
-  - Flink's inverted classloading approach hides many transitive dependencies from the core dependencies. That applies not only
-    to Flink's own core dependencies, but also to Hadoop's dependencies when present in the setup.
-    That way, applications can use different versions of the same dependencies without running into dependency conflicts (and
-    trust us, that's a big deal, because Hadoops dependency tree is huge.)
+## Appendix: Template for Building a Jar with Dependencies
 
-If you need Hadoop dependencies during testing or development inside the IDE (for example for HDFS access), please configure
-these dependencies similar to the scope of the dependencies to *test* or to *provided*.
-
-
-## Appendix: Template for building a Jar with Dependencies
-
-To build an application JAR that contains all dependencies required for declared connectors and libraries,
-you can use the following shade plugin definition:
+You can use the following shade plugin configuration to build an application jar that contains all required dependencies:
 
 {% highlight xml %}
 <build>
@@ -213,8 +176,8 @@ you can use the following shade plugin definition:
                         </artifactSet>
                         <filters>
                             <filter>
-                                <!-- Do not copy the signatures in the META-INF folder.
-                                Otherwise, this might cause SecurityExceptions when using the JAR. -->
+                                <!-- Do not copy the signatures in the META-INF folder.
+                                Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                 <artifact>*:*</artifact>
                                 <excludes>
                                     <exclude>META-INF/*.SF</exclude>
@@ -237,4 +200,3 @@ you can use the following shade plugin definition:
 {% endhighlight %}
 
 {% top %}
-