Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/03/08 12:52:41 UTC

[GitHub] [flink] MartijnVisser commented on a change in pull request #18812: [FLINK-25129][docs] Improvements to the table-planner-loader related docs

MartijnVisser commented on a change in pull request #18812:
URL: https://github.com/apache/flink/pull/18812#discussion_r821628545



##########
File path: docs/content/docs/dev/configuration/connector.md
##########
@@ -24,39 +24,42 @@ under the License.
 
 # Connectors and Formats
 
-Flink can read from and write to various external systems via connectors and define the format in 
-which to store the data.
+Flink can read from and write to various external systems via connectors and use the format of your choice
+in order to read/write data from/into records.
 
-The way that information is serialized is represented in the external system and that system needs

Review comment:
       My proposal would be:
   
   ```
   Flink applications can read from and write to various external systems via connectors. 
   It supports multiple formats in order to encode and decode data to match Flink's data structures.
   ```
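To make the proposed wording concrete, here is a minimal Table API sketch (not part of the PR): the `'connector'` option selects the external system, while the `'format'` option selects how records are encoded and decoded. It assumes `flink-connector-kafka` and `flink-json` are on the classpath; the topic and broker names are hypothetical.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class KafkaJsonExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // 'connector' picks the external system, 'format' picks how the
        // records are encoded/decoded (topic and broker names are made up).
        tableEnv.executeSql(
                "CREATE TABLE Orders (" +
                "  order_id STRING," +
                "  amount DOUBLE" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'orders'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'scan.startup.mode' = 'earliest-offset'," +
                "  'format' = 'json'" +
                ")");

        tableEnv.executeSql("SELECT order_id, amount FROM Orders").print();
    }
}
```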

##########
File path: docs/content/docs/dev/configuration/overview.md
##########
@@ -177,19 +177,46 @@ bash -c "$(curl https://flink.apache.org/q/gradle-quickstart.sh)" -- {{< version
 
 ## Which dependencies do you need?
 
-Depending on what you want to achieve, you are going to choose a combination of our available APIs, 
-which will require different dependencies. 
+To start working on a Flink job, you usually need the following dependencies:
+
+* Flink APIs, in order to develop your job
+* [Connectors and formats]({{< ref "docs/dev/configuration/connector" >}}), in order to integrate your job with external systems
+* [Testing utilities]({{< ref "docs/dev/configuration/testing" >}}), in order to test your job
+
+And in addition to these, you might want to add 3rd party dependencies that you need to develop custom functions.
+
+### Flink APIs
+
+Flink offers two major APIs: [DataStream API]({{< ref "docs/dev/datastream/overview" >}}) and [Table API & SQL]({{< ref "docs/dev/table/overview" >}}).
+They can be used separately, or they can be mixed, depending on your use cases:
+
+| APIs you want to use                                                              | Dependency you need to add                          |
+|-----------------------------------------------------------------------------------|-----------------------------------------------------|
+| [DataStream]({{< ref "docs/dev/datastream/overview" >}})                          | `flink-streaming-java`                              |  
+| [DataStream with Scala]({{< ref "docs/dev/datastream/scala_api_extensions" >}})   | `flink-streaming-scala{{< scala_version >}}`        |   
+| [Table API]({{< ref "docs/dev/table/common" >}})                                  | `flink-table-api-java`                              |   
+| [Table API with Scala]({{< ref "docs/dev/table/common" >}})                       | `flink-table-api-scala{{< scala_version >}}`        |
+| [Table API + DataStream]({{< ref "docs/dev/table/data_stream_api" >}})            | `flink-table-api-java-bridge`                       |
+| [Table API + DataStream with Scala]({{< ref "docs/dev/table/data_stream_api" >}}) | `flink-table-api-scala-bridge{{< scala_version >}}` |
+
+Just include them in your build tool script/descriptor, and you can start developing your job!
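As an illustration (not part of the PR): with `flink-streaming-java` from the table above on the compile classpath, a minimal DataStream job already compiles. This is only a hedged sketch with made-up names; actually running it by executing `main()` additionally needs `flink-runtime` on the classpath, as described in the next section.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HelloDataStream {
    public static void main(String[] args) throws Exception {
        // The DataStream API classes come from flink-streaming-java.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("flink", "datastream", "api")
                .filter(word -> word.length() > 3)
                .print();

        env.execute("hello-datastream");
    }
}
```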
+
+## Running and packaging
+
+If you want to run your job by simply executing the main class, you will need `flink-runtime` in your classpath.
+In case of Table API programs, you will also need `flink-table-runtime` and `flink-table-planner-loader`.
 
-Here is a table of artifact/dependency names:
+As a rule of thumb, we **suggest** packaging the application code and all its required dependencies into one fat/uber JAR.

Review comment:
       Thanks @zentol 

##########
File path: docs/content/docs/dev/configuration/testing.md
##########
@@ -26,65 +26,27 @@ under the License.
 
 Flink provides utilities for testing your job that you can add as dependencies.
 
-## DataStream API Test Dependencies
+## DataStream API Testing
 
-You need to add the following dependencies if you want to develop tests for a job built with the 
+You need to add the following dependencies if you want to develop tests for a job built with the
 DataStream API:
 
-{{< tabs "datastream test" >}}
+{{< artifact_tabs flink-test-utils withTestScope >}}
 
-{{< tab "Maven" >}}
-Open the `pom.xml` file in your project directory and add these dependencies in between the dependencies tab.
-{{< artifact flink-test-utils withTestScope >}}
-{{< artifact flink-runtime withTestScope >}}
-{{< /tab >}}
-
-{{< tab "Gradle" >}}
-Open the `build.gradle` file in your project directory and add the following in the dependencies block.
-```gradle
-...
-dependencies {
-    ...  
-    testImplementation "org.apache.flink:flink-test-utils:${flinkVersion}"
-    testImplementation "org.apache.flink:flink-runtime:${flinkVersion}"
-    ...
-}
-...
-```
-**Note:** This assumes that you have created your project using our Gradle build script or quickstart script.
-{{< /tab >}}
-
-{{< /tabs >}}
+Among the various test utilities, this module provides `MiniCluster`, a lightweight configurable Flink cluster runnable in a JUnit test that can directly execute jobs.

Review comment:
       It does appear now in the review, so 👍 
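For reference (not part of the PR), the `MiniCluster` mentioned in the hunk above is typically used through `MiniClusterWithClientResource` from `flink-test-utils`. A hedged JUnit 4 sketch, assuming JUnit 4 and the DataStream API are also on the test classpath:

```java
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;
import org.junit.Test;

public class MyJobTest {

    // Spins up a small embedded Flink cluster once for all tests in this class.
    @ClassRule
    public static final MiniClusterWithClientResource FLINK =
            new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());

    @Test
    public void pipelineRunsOnMiniCluster() throws Exception {
        // The job submitted here executes on the MiniCluster above.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);

        env.fromElements(1, 2, 3, 4)
                .filter(value -> value % 2 == 0)
                .print();

        env.execute();
    }
}
```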

##########
File path: docs/content/docs/dev/configuration/advanced.md
##########
@@ -24,33 +24,28 @@ under the License.
 
 # Advanced Configuration Topics
 
-## Dependencies: Flink Core and User Application
-
-There are two broad categories of dependencies and libraries in Flink, which are explained below.
-
-### Flink Core Dependencies
+## Anatomy of the Flink distribution
 
 Flink itself consists of a set of classes and dependencies that form the core of Flink's runtime
 and must be present when a Flink application is started. The classes and dependencies needed to run
 the system handle areas such as coordination, networking, checkpointing, failover, APIs,
 operators (such as windowing), resource management, etc.
 
-These core classes and dependencies are packaged in the `flink-dist` jar, are part of Flink's `lib`
-folder, and part of the basic Flink container images. You can think of these dependencies as similar
-to Java's core library, which contains classes like `String` and `List`.
+These core classes and dependencies are packaged in the `flink-dist.jar` available in the `/lib`
+folder in the downloaded distribution, and part of the basic Flink container images. 
+You can think of these dependencies as similar to Java's core library, which contains classes like `String` and `List`.
 
 In order to keep the core dependencies as small as possible and avoid dependency clashes, the
 Flink Core Dependencies do not contain any connectors or libraries (i.e. CEP, SQL, ML) in order to
 avoid having an excessive default number of classes and dependencies in the classpath.
 
-### User Application Dependencies
+The `/lib` directory of the Flink distribution additionally contains various JARs including commonly used modules, 
+such as all the required [modules to execute Table jobs](#anatomy-of-table-dependencies) and a set of connectors and formats.

Review comment:
       That's a good point. I think it's good



