Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/01/21 15:47:17 UTC

[GitHub] [flink] MartijnVisser commented on a change in pull request #18353: [FLINK-25129][docs]project configuation changes in docs

MartijnVisser commented on a change in pull request #18353:
URL: https://github.com/apache/flink/pull/18353#discussion_r789763661



##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.

Review comment:
       I'm wondering if it's operations or operators. I think the latter might be more correct

##########
File path: docs/content/docs/connectors/table/elasticsearch.md
##########
@@ -40,6 +40,9 @@ Dependencies
 
 {{< sql_download_table "elastic" >}}
 
+The Elastic connector is not currently part of the binary distribution.

Review comment:
       Should be Elasticsearch

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.
+
+These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
+folder, and part of the basic Flink container images. You can think of these dependencies as similar
+to Java's core library (i.e. `rt.jar`, `charsets.jar`), which contains classes like `String` and `List`.
+
+In order to keep the core dependencies as small as possible and avoid dependency clashes, the
+Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
+avoid having an excessive default number of classes and dependencies in the classpath.
+
+## User Application Dependencies
+
+These dependencies include all connectors, formats, or libraries that a specific user application
+needs and explicitly do not include the Flink DataStream APIs and runtime dependencies since those
+are already part of the Flink Core Dependencies.
+
+The user application is typically packaged into an *application jar*, which contains the application
+code and the required connector and library dependencies.
+
+## IDE configuration
+
+The default JVM heap size for Java may be too small for Flink and you have to manually increase it.
+In Eclipse, choose `Run Configurations -> Arguments` and write `-Xmx800m` into the `VM Arguments` box.
+In IntelliJ IDEA, the recommended way to change JVM options is from the `Help | Edit Custom VM Options` menu.
+See [this article](https://intellij-support.jetbrains.com/hc/en-us/articles/206544869-Configuring-JVM-options-and-platform-properties) for details.
+
+**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, it is necessary to tick the
+`Include dependencies with "Provided" scope` box in the run configuration. If this option is not available
+(possibly due to using an older IntelliJ IDEA version), then a workaround is to create a test that
+calls the application's `main()` method.
+
+# Scala Versions
+
+Different Scala versions are not binary compatible with one another. For that reason, Flink for 
+Scala 2.11 cannot be used with an application that uses Scala 2.12. All Flink dependencies that 
+(transitively) depend on Scala are suffixed with the Scala version that they are built for 
+(i.e. `flink-streaming-scala_2.12`).
+
+If you are only using Java, you can use any Scala version. If you are using Scala, you need to pick 
+the Scala version that matches the application's Scala version.
+
+Please refer to the [build guide]({{< ref "docs/flinkDev/building" >}}#scala-versions) for details 
+on how to build Flink for a specific Scala version.
+
+Scala versions after 2.12.8 are not binary compatible with previous 2.12.x versions. This prevents
+the Flink project from upgrading its 2.12.x builds beyond 2.12.8. You can build Flink locally for
+later Scala versions by following the [build guide]({{< ref "docs/flinkDev/building" >}}#scala-versions).
+For this to work, you will need to add `-Djapicmp.skip` to skip binary compatibility checks when building.
+
+See the [Scala 2.12.8 release notes](https://github.com/scala/scala/releases/tag/v2.12.8) for more details.
+The relevant section states:
+
+> The second fix is not binary compatible: the 2.12.8 compiler omits certain
+> methods that are generated by earlier 2.12 compilers. However, we believe
+> that these methods are never used and existing compiled code will continue to
+> work.  See the [pull request
+> description](https://github.com/scala/scala/pull/7469) for more details.
+
+# Distribution
+
+The Flink distribution contains by default the required JARs to execute Flink SQL Jobs in `/lib`, in particular:
+
+-`flink-table-api-java-uber-{{< version >}}.jar` containing all the Java APIs
+-`flink-table-runtime-{{< version >}}.jar` containing the runtime
+-`flink-table-planner-loader-{{< version >}}.jar` containing the query planner
+
+When using formats and connectors with the Scala API, you need to either download and manually include 

Review comment:
       Flink Scala API

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.
+
+These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
+folder, and part of the basic Flink container images. You can think of these dependencies as similar
+to Java's core library (i.e. `rt.jar`, `charsets.jar`), which contains classes like `String` and `List`.
+
+In order to keep the core dependencies as small as possible and avoid dependency clashes, the
+Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
+avoid having an excessive default number of classes and dependencies in the classpath.
+
+## User Application Dependencies
+
+These dependencies include all connectors, formats, or libraries that a specific user application
+needs and explicitly do not include the Flink DataStream APIs and runtime dependencies since those
+are already part of the Flink Core Dependencies.
+
+The user application is typically packaged into an *application jar*, which contains the application
+code and the required connector and library dependencies.
+
+## IDE configuration
+
+The default JVM heap size for Java may be too small for Flink and you have to manually increase it.
+In Eclipse, choose `Run Configurations -> Arguments` and write `-Xmx800m` into the `VM Arguments` box.
+In IntelliJ IDEA, the recommended way to change JVM options is from the `Help | Edit Custom VM Options` menu.
+See [this article](https://intellij-support.jetbrains.com/hc/en-us/articles/206544869-Configuring-JVM-options-and-platform-properties) for details.
+
+**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, it is necessary to tick the
+`Include dependencies with "Provided" scope` box in the run configuration. If this option is not available
+(possibly due to using an older IntelliJ IDEA version), then a workaround is to create a test that
+calls the application's `main()` method.
+
+# Scala Versions
+
+Different Scala versions are not binary compatible with one another. For that reason, Flink for 
+Scala 2.11 cannot be used with an application that uses Scala 2.12. All Flink dependencies that 
+(transitively) depend on Scala are suffixed with the Scala version that they are built for 
+(i.e. `flink-streaming-scala_2.12`).
+
+If you are only using Java, you can use any Scala version. If you are using Scala, you need to pick 
+the Scala version that matches the application's Scala version.
+
+Please refer to the [build guide]({{< ref "docs/flinkDev/building" >}}#scala-versions) for details 
+on how to build Flink for a specific Scala version.
+
+Scala versions after 2.12.8 are not binary compatible with previous 2.12.x versions. This prevents
+the Flink project from upgrading its 2.12.x builds beyond 2.12.8. You can build Flink locally for
+later Scala versions by following the [build guide]({{< ref "docs/flinkDev/building" >}}#scala-versions).
+For this to work, you will need to add `-Djapicmp.skip` to skip binary compatibility checks when building.
+
+See the [Scala 2.12.8 release notes](https://github.com/scala/scala/releases/tag/v2.12.8) for more details.
+The relevant section states:
+
+> The second fix is not binary compatible: the 2.12.8 compiler omits certain
+> methods that are generated by earlier 2.12 compilers. However, we believe
+> that these methods are never used and existing compiled code will continue to
+> work.  See the [pull request
+> description](https://github.com/scala/scala/pull/7469) for more details.
+
+# Distribution
+
+The Flink distribution contains by default the required JARs to execute Flink SQL Jobs in `/lib`, in particular:
+
+-`flink-table-api-java-uber-{{< version >}}.jar` containing all the Java APIs
+-`flink-table-runtime-{{< version >}}.jar` containing the runtime
+-`flink-table-planner-loader-{{< version >}}.jar` containing the query planner
+
+When using formats and connectors with the Scala API, you need to either download and manually include 
+the JARs in the `/lib` folder (recommended), or you need to shade them in the uber JAR of your Flink SQL Jobs.
+
+For more details, check out [Connect to External Systems]({{< ref "docs/connectors/table/overview" >}}).
+
+# Table Planner and Table Planner Loader
+
+Starting from Flink 1.15, the distribution contains two planners:
+
+-`flink-table-planner{{< scala_version >}}-{{< version >}}.jar`, in `/opt`, contains the query planner
+-`flink-table-planner-loader-{{< version >}}.jar`, loaded by default in `/lib`, contains the query planner 
+  hidden behind an isolated classpath (you won't be able to address any `io.apache.flink.table.planner` directly)
+
+The planners contain the same code, but they are packaged differently. In one case, you must use the 
+same Scala version of the JAR. In the other, you do not need to make considerations about Scala, since
+it is hidden inside the JAR.
+
+If you need to access and use the internals of the query planner, you can swap the JARs (copying and
+pasting them in the downloaded distribution). Be aware that you will be constrained to using the Scala 
+version of the Flink distribution that you are using.
+
+**Note:** The two planners cannot co-exist at the same time in the classpath. If you load both of them
+in `/lib` your Table Jobs will fail.
+
+# Hadoop Dependencies
+
+**General rule:** It should not be necessary to add Hadoop dependencies directly to your application.
+The only exception is when you use existing Hadoop input/output formats with [Flink's Hadoop compatibility 
+wrappers](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/dataset/hadoop_compatibility/).
+
+If you want to use Flink with Hadoop, you need to have a Flink setup that includes the Hadoop dependencies, 
+rather than adding Hadoop as an application dependency. In other words, Hadoop must be a dependency 
+of the Flink system itself and not of the user code that contains the application. Flink will use the
+Hadoop dependencies specified by the `HADOOP_CLASSPATH` environment variable, which can be set like this:
+
+```bash
+export HADOOP_CLASSPATH=`hadoop classpath`
+```
+
+There are two main reasons for this design:
+
+- Some Hadoop interaction happens in Flink's core, possibly before the user application is started. 

Review comment:
       interactions

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.
+
+These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
+folder, and part of the basic Flink container images. You can think of these dependencies as similar
+to Java's core library (i.e. `rt.jar`, `charsets.jar`), which contains classes like `String` and `List`.
+
+In order to keep the core dependencies as small as possible and avoid dependency clashes, the
+Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
+avoid having an excessive default number of classes and dependencies in the classpath.
+
+## User Application Dependencies
+
+These dependencies include all connectors, formats, or libraries that a specific user application
+needs and explicitly do not include the Flink DataStream APIs and runtime dependencies since those
+are already part of the Flink Core Dependencies.
+
+The user application is typically packaged into an *application jar*, which contains the application
+code and the required connector and library dependencies.
+
+## IDE configuration
+
+The default JVM heap size for Java may be too small for Flink and you have to manually increase it.
+In Eclipse, choose `Run Configurations -> Arguments` and write `-Xmx800m` into the `VM Arguments` box.
+In IntelliJ IDEA, the recommended way to change JVM options is from the `Help | Edit Custom VM Options` menu.
+See [this article](https://intellij-support.jetbrains.com/hc/en-us/articles/206544869-Configuring-JVM-options-and-platform-properties) for details.
+
+**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, it is necessary to tick the
+`Include dependencies with "Provided" scope` box in the run configuration. If this option is not available
+(possibly due to using an older IntelliJ IDEA version), then a workaround is to create a test that
+calls the application's `main()` method.
+
+# Scala Versions
+
+Different Scala versions are not binary compatible with one another. For that reason, Flink for 
+Scala 2.11 cannot be used with an application that uses Scala 2.12. All Flink dependencies that 

Review comment:
       Flink will no longer support Scala 2.11 as of the next release

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.
+
+These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
+folder, and part of the basic Flink container images. You can think of these dependencies as similar
+to Java's core library (i.e. `rt.jar`, `charsets.jar`), which contains classes like `String` and `List`.
+
+In order to keep the core dependencies as small as possible and avoid dependency clashes, the
+Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
+avoid having an excessive default number of classes and dependencies in the classpath.
+
+## User Application Dependencies
+
+These dependencies include all connectors, formats, or libraries that a specific user application
+needs and explicitly do not include the Flink DataStream APIs and runtime dependencies since those
+are already part of the Flink Core Dependencies.
+
+The user application is typically packaged into an *application jar*, which contains the application
+code and the required connector and library dependencies.
+
+## IDE configuration
+
+The default JVM heap size for Java may be too small for Flink and you have to manually increase it.
+In Eclipse, choose `Run Configurations -> Arguments` and write `-Xmx800m` into the `VM Arguments` box.
+In IntelliJ IDEA, the recommended way to change JVM options is from the `Help | Edit Custom VM Options` menu.
+See [this article](https://intellij-support.jetbrains.com/hc/en-us/articles/206544869-Configuring-JVM-options-and-platform-properties) for details.
+
+**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, it is necessary to tick the
+`Include dependencies with "Provided" scope` box in the run configuration. If this option is not available
+(possibly due to using an older IntelliJ IDEA version), then a workaround is to create a test that
+calls the application's `main()` method.
+
+# Scala Versions
+
+Different Scala versions are not binary compatible with one another. For that reason, Flink for 
+Scala 2.11 cannot be used with an application that uses Scala 2.12. All Flink dependencies that 
+(transitively) depend on Scala are suffixed with the Scala version that they are built for 
+(i.e. `flink-streaming-scala_2.12`).
+
+If you are only using Java, you can use any Scala version. If you are using Scala, you need to pick 

Review comment:
       I would say this should read "only use Flink's Java APIs" and "use Flink's Scala APIs"

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -0,0 +1,203 @@
+---
+title: "Advanced Configuration"
+weight: 10
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Dependencies: Flink Core and User Application
+
+There are two broad categories of dependencies and libraries in Flink, which are explained below.
+
+## Flink Core Dependencies
+
+Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
+and must be present when a Flink application is started. The classes and dependencies needed to run
+the system handle areas such as coordination, networking, checkpointing, failover, APIs,
+operations (such as windowing), resource management, etc.
+
+These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
+folder, and part of the basic Flink container images. You can think of these dependencies as similar
+to Java's core library (i.e. `rt.jar`, `charsets.jar`), which contains classes like `String` and `List`.
+
+In order to keep the core dependencies as small as possible and avoid dependency clashes, the
+Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
+avoid having an excessive default number of classes and dependencies in the classpath.
+
+## User Application Dependencies
+
+These dependencies include all connectors, formats, or libraries that a specific user application
+needs and explicitly do not include the Flink DataStream APIs and runtime dependencies since those

Review comment:
       This probably needs to be both DataStream and Table API




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org