Posted to commits@flink.apache.org by wa...@apache.org on 2022/02/16 07:09:20 UTC

[flink] branch master updated: [FLINK-25053][docs] Document how to use the usrlib to load code in the user code class loader

This is an automated email from the ASF dual-hosted git repository.

wangyang0918 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
     new f503fef  [FLINK-25053][docs] Document how to use the usrlib to load code in the user code class loader
f503fef is described below

commit f503fef9e923453a0eee42f0a955897c06462df9
Author: Lijie Wang <wa...@gmail.com>
AuthorDate: Sun Jan 23 20:05:56 2022 +0800

    [FLINK-25053][docs] Document how to use the usrlib to load code in the user code class loader
    
    This closes #18450.
---
 .../resource-providers/native_kubernetes.md        |  8 +++++
 .../resource-providers/standalone/overview.md      |  7 +++++
 .../docs/deployment/resource-providers/yarn.md     | 11 ++++++-
 .../docs/ops/debugging/debugging_classloading.md   | 34 ++++++++++-----------
 .../resource-providers/native_kubernetes.md        |  8 +++++
 .../resource-providers/standalone/overview.md      |  7 +++++
 .../docs/deployment/resource-providers/yarn.md     | 13 ++++++--
 .../docs/ops/debugging/debugging_classloading.md   | 35 +++++++++++-----------
 8 files changed, 85 insertions(+), 38 deletions(-)
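
Note on the documentation changed below: it distinguishes classes loaded from the system classpath (by *AppClassLoader*) from user code loaded dynamically by *FlinkUserCodeClassLoader*. A minimal sketch of how one might check which case applies to a given class, using only standard JDK calls. The class name `ClassLoaderCheck` and the helper `describeLoader` are hypothetical and are not part of Flink or of this commit:

```java
public class ClassLoaderCheck {

    /**
     * Hypothetical helper: describes which kind of class loader loaded the given class.
     * Uses only java.lang.Class#getClassLoader and ClassLoader#getSystemClassLoader.
     */
    public static String describeLoader(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Core JDK classes such as java.lang.String report a null loader.
            return "bootstrap";
        }
        if (loader == ClassLoader.getSystemClassLoader()) {
            // The class came from the Java (system) classpath.
            return "system classpath (" + loader.getClass().getName() + ")";
        }
        // Any other loader, for example one created dynamically for user code.
        return "other loader (" + loader.getClass().getName() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeLoader(String.class));
        System.out.println(describeLoader(ClassLoaderCheck.class));
    }
}
```

Called on one of a job's own classes from inside that job, the helper would be expected to report the system classpath when the user jars are included there, and some other loader when they are loaded dynamically. The exact loader class name printed in the second case is an assumption left open here.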

diff --git a/docs/content.zh/docs/deployment/resource-providers/native_kubernetes.md b/docs/content.zh/docs/deployment/resource-providers/native_kubernetes.md
index 2d950a5..7ee3302 100644
--- a/docs/content.zh/docs/deployment/resource-providers/native_kubernetes.md
+++ b/docs/content.zh/docs/deployment/resource-providers/native_kubernetes.md
@@ -572,4 +572,12 @@ spec:
       emptyDir: { }
 ```
 
+### User jars & Classpath
+
+When deploying Flink natively on Kubernetes, the following JAR files are recognized as user jars and included in the user classpath:
+- Session Mode: The JAR file specified in the startup command.
+- Application Mode: The JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder.
+
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
+
 {{< top >}}
diff --git a/docs/content.zh/docs/deployment/resource-providers/standalone/overview.md b/docs/content.zh/docs/deployment/resource-providers/standalone/overview.md
index d97d64f..6498b2c 100644
--- a/docs/content.zh/docs/deployment/resource-providers/standalone/overview.md
+++ b/docs/content.zh/docs/deployment/resource-providers/standalone/overview.md
@@ -236,6 +236,13 @@ Stopping standalonesession daemon (pid: 7349) on host localhost.
 $ bin/stop-zookeeper-quorum.sh
 Stopping zookeeper daemon (pid: 7101) on host localhost.</pre>
 
+### User jars & Classpath
+
+In Standalone mode, the following JAR files are recognized as user jars and included in the user classpath:
+- Session Mode: The JAR file specified in the startup command.
+- Application Mode: The JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder.
+
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
 
 
 {{< top >}}
diff --git a/docs/content.zh/docs/deployment/resource-providers/yarn.md b/docs/content.zh/docs/deployment/resource-providers/yarn.md
index bb4acbd..7ac23f5 100644
--- a/docs/content.zh/docs/deployment/resource-providers/yarn.md
+++ b/docs/content.zh/docs/deployment/resource-providers/yarn.md
@@ -237,7 +237,14 @@ The configuration parameter for specifying the REST endpoint port is [rest.bind-
 
 ### User jars & Classpath
 
-By default Flink will include the user jars into the system classpath when running a single job. This behavior can be controlled with the [yarn.per-job-cluster.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-per-job-cluster-include-user-jar) parameter.
+**Session Mode**
+
+When deploying Flink in Session Mode on Yarn, only the JAR file specified in the startup command is recognized as a user jar and included in the user classpath.
+
+**Per-Job Mode & Application Mode**
+
+When deploying Flink in Per-Job/Application Mode on Yarn, the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder are recognized as user jars.
+By default, Flink includes the user jars in the system classpath. This behavior can be controlled with the [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar) parameter.
 
 When setting this to `DISABLED` Flink will include the jar in the user classpath instead.
 
@@ -247,4 +254,6 @@ The user-jars position in the classpath can be controlled by setting the paramet
 - `FIRST`: Adds the jar to the beginning of the system classpath.
 - `LAST`: Adds the jar to the end of the system classpath.
 
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
+
 {{< top >}}
diff --git a/docs/content.zh/docs/ops/debugging/debugging_classloading.md b/docs/content.zh/docs/ops/debugging/debugging_classloading.md
index e2c0512..e08c9c0 100644
--- a/docs/content.zh/docs/ops/debugging/debugging_classloading.md
+++ b/docs/content.zh/docs/ops/debugging/debugging_classloading.md
@@ -33,12 +33,12 @@ When running Flink applications, the JVM will load various classes over time.
 These classes can be divided into three groups based on their origin:
 
   - The **Java Classpath**: This is Java's common classpath, and it includes the JDK libraries, and all code
-    in Flink's `/lib` folder (the classes of Apache Flink and some dependencies).
+    in Flink's `/lib` folder (the classes of Apache Flink and some dependencies). They are loaded by *AppClassLoader*.
 
   - The **Flink Plugin Components**: The plugins code in folders under Flink's `/plugins` folder. Flink's plugin mechanism will dynamically load them once during startup.
 
   - The **Dynamic User Code**: These are all classes that are included in the JAR files of dynamically submitted jobs,
-    (via REST, CLI, web UI). They are loaded (and unloaded) dynamically per job.
+    (via REST, CLI, web UI). They are loaded (and unloaded) dynamically by *FlinkUserCodeClassLoader* per job.
 
 As a general rule, whenever you start the Flink processes first and submit jobs later, the job's classes are loaded dynamically.
 If the Flink processes are started together with the job/application, or if the application spawns the Flink components (JobManager, TaskManager, etc.), then all job's classes are in the Java classpath.
@@ -47,10 +47,10 @@ Code in plugin components is loaded dynamically once by a dedicated class loader
 
 In the following are some more details about the different deployment modes:
 
-**Standalone Session**
+**Session Mode (Standalone/Yarn/Kubernetes)**
 
-When starting a Flink cluster as a standalone session, the JobManagers and TaskManagers are started with the Flink framework classes in the
-Java classpath. The classes from all jobs/applications that are submitted against the session (via REST / CLI) are loaded *dynamically*.
+When starting a Flink session cluster (Standalone/Yarn/Kubernetes), the JobManagers and TaskManagers are started with the Flink framework classes in the
+Java classpath. The classes from all jobs/applications that are submitted against the session (via REST / CLI) are loaded dynamically by *FlinkUserCodeClassLoader*.
 
 <!--
 **Docker Containers with Flink-as-a-Library**
@@ -61,22 +61,22 @@ created for an job/application and will contain the job/application's jar files.
 
 -->
 
-**Docker / Kubernetes Sessions**
+**Per-Job Mode (deprecated) (Yarn)**
 
-Docker / Kubernetes setups that start first a set of JobManagers / TaskManagers and then submit jobs/applications via REST or the CLI
-behave like standalone sessions: Flink's code is in the Java classpath, plugin components are loaded dynamically at startup and the job's code is loaded dynamically.
+Currently, only Yarn supports Per-Job mode. By default, a Flink cluster running in Per-Job mode includes the user jars
+(the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder) in the system classpath, where they are loaded by the *AppClassLoader*.
+This behavior can be controlled with the [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar) config option.
+When it is set to `DISABLED`, Flink includes the user jars in the user classpath and loads them dynamically with the *FlinkUserCodeClassLoader*.
+See [Flink on Yarn]({{< ref "docs/deployment/resource-providers/yarn" >}}) for more details.
 
+**Application Mode (Standalone/Yarn/Kubernetes)**
 
-**YARN**
+When running a Standalone/Kubernetes Flink cluster in Application Mode, the user jars (the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder)
+are loaded dynamically by the *FlinkUserCodeClassLoader*.
 
-YARN classloading differs between single job deployments and sessions:
-
-  - When submitting a Flink job/application directly to YARN (via `bin/flink run -m yarn-cluster ...`), dedicated TaskManagers and
-    JobManagers are started for that job. Those JVMs have user code classes in the Java classpath.
-    That means that there is *no dynamic classloading* involved in that case for the job.
-
-  - When starting a YARN session, the JobManagers and TaskManagers are started with the Flink framework classes in the
-    classpath. The classes from all jobs that are submitted against the session are loaded dynamically.
+When running a Yarn Flink cluster in Application Mode, the user jars (the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder)
+are included in the system classpath (the *AppClassLoader*) by default. As in Per-Job mode, setting [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar)
+to `DISABLED` makes Flink include the user jars in the user classpath and load them dynamically with the *FlinkUserCodeClassLoader*.
 
 
 ## Inverted Class Loading and ClassLoader Resolution Order
diff --git a/docs/content/docs/deployment/resource-providers/native_kubernetes.md b/docs/content/docs/deployment/resource-providers/native_kubernetes.md
index a2c3ab0..607f904 100644
--- a/docs/content/docs/deployment/resource-providers/native_kubernetes.md
+++ b/docs/content/docs/deployment/resource-providers/native_kubernetes.md
@@ -576,4 +576,12 @@ spec:
       emptyDir: { }
 ```
 
+### User jars & Classpath
+
+When deploying Flink natively on Kubernetes, the following JAR files are recognized as user jars and included in the user classpath:
+- Session Mode: The JAR file specified in the startup command.
+- Application Mode: The JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder.
+
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
+
 {{< top >}}
diff --git a/docs/content/docs/deployment/resource-providers/standalone/overview.md b/docs/content/docs/deployment/resource-providers/standalone/overview.md
index 756a4a9..f31ecb4 100644
--- a/docs/content/docs/deployment/resource-providers/standalone/overview.md
+++ b/docs/content/docs/deployment/resource-providers/standalone/overview.md
@@ -285,5 +285,12 @@ $ ./bin/stop-zookeeper-quorum.sh
 Stopping zookeeper daemon (pid: 7101) on host localhost.
 ```
 
+### User jars & Classpath
+
+In Standalone mode, the following JAR files are recognized as user jars and included in the user classpath:
+- Session Mode: The JAR file specified in the startup command.
+- Application Mode: The JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder.
+
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
 
 {{< top >}}
diff --git a/docs/content/docs/deployment/resource-providers/yarn.md b/docs/content/docs/deployment/resource-providers/yarn.md
index 0a803bb..a967cb3 100644
--- a/docs/content/docs/deployment/resource-providers/yarn.md
+++ b/docs/content/docs/deployment/resource-providers/yarn.md
@@ -1,4 +1,4 @@
----
+\---
 title: YARN
 weight: 5
 type: docs
@@ -253,7 +253,14 @@ The configuration parameter for specifying the REST endpoint port is [rest.bind-
 
 ### User jars & Classpath
 
-By default Flink will include the user jars into the system classpath when running a single job. This behavior can be controlled with the [yarn.per-job-cluster.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-per-job-cluster-include-user-jar) parameter.
+**Session Mode**
+
+When deploying Flink in Session Mode on Yarn, only the JAR file specified in the startup command is recognized as a user jar and included in the user classpath.
+
+**Per-Job Mode & Application Mode**
+
+When deploying Flink in Per-Job/Application Mode on Yarn, the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder are recognized as user jars.
+By default, Flink includes the user jars in the system classpath. This behavior can be controlled with the [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar) parameter.
 
 When setting this to `DISABLED` Flink will include the jar in the user classpath instead.
 
@@ -263,4 +270,6 @@ The user-jars position in the classpath can be controlled by setting the paramet
 - `FIRST`: Adds the jar to the beginning of the system classpath.
 - `LAST`: Adds the jar to the end of the system classpath.
 
+Please refer to the [Debugging Classloading Docs]({{< ref "docs/ops/debugging/debugging_classloading" >}}#overview-of-classloading-in-flink) for details.
+
 {{< top >}}
diff --git a/docs/content/docs/ops/debugging/debugging_classloading.md b/docs/content/docs/ops/debugging/debugging_classloading.md
index 0bb26b2..7a65bff 100644
--- a/docs/content/docs/ops/debugging/debugging_classloading.md
+++ b/docs/content/docs/ops/debugging/debugging_classloading.md
@@ -33,12 +33,12 @@ When running Flink applications, the JVM will load various classes over time.
 These classes can be divided into three groups based on their origin:
 
   - The **Java Classpath**: This is Java's common classpath, and it includes the JDK libraries, and all code
-    in Flink's `/lib` folder (the classes of Apache Flink and some dependencies).
+    in Flink's `/lib` folder (the classes of Apache Flink and some dependencies). They are loaded by *AppClassLoader*.
 
   - The **Flink Plugin Components**: The plugins code in folders under Flink's `/plugins` folder. Flink's plugin mechanism will dynamically load them once during startup.
 
   - The **Dynamic User Code**: These are all classes that are included in the JAR files of dynamically submitted jobs,
-    (via REST, CLI, web UI). They are loaded (and unloaded) dynamically per job.
+    (via REST, CLI, web UI). They are loaded (and unloaded) dynamically by *FlinkUserCodeClassLoader* per job.
 
 As a general rule, whenever you start the Flink processes first and submit jobs later, the job's classes are loaded dynamically.
 If the Flink processes are started together with the job/application, or if the application spawns the Flink components (JobManager, TaskManager, etc.), then all job's classes are in the Java classpath.
@@ -47,10 +47,10 @@ Code in plugin components is loaded dynamically once by a dedicated class loader
 
 In the following are some more details about the different deployment modes:
 
-**Standalone Session**
+**Session Mode (Standalone/Yarn/Kubernetes)**
 
-When starting a Flink cluster as a standalone session, the JobManagers and TaskManagers are started with the Flink framework classes in the
-Java classpath. The classes from all jobs/applications that are submitted against the session (via REST / CLI) are loaded *dynamically*.
+When starting a Flink session cluster (Standalone/Yarn/Kubernetes), the JobManagers and TaskManagers are started with the Flink framework classes in the
+Java classpath. The classes from all jobs/applications that are submitted against the session (via REST / CLI) are loaded dynamically by *FlinkUserCodeClassLoader*.
 
 <!--
 **Docker Containers with Flink-as-a-Library**
@@ -61,23 +61,22 @@ created for an job/application and will contain the job/application's jar files.
 
 -->
 
-**Docker / Kubernetes Sessions**
+**Per-Job Mode (deprecated) (Yarn)**
 
-Docker / Kubernetes setups that start first a set of JobManagers / TaskManagers and then submit jobs/applications via REST or the CLI
-behave like standalone sessions: Flink's code is in the Java classpath, plugin components are loaded dynamically at startup and the job's code is loaded dynamically.
+Currently, only Yarn supports Per-Job mode. By default, a Flink cluster running in Per-Job mode includes the user jars
+(the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder) in the system classpath, where they are loaded by the *AppClassLoader*.
+This behavior can be controlled with the [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar) config option.
+When it is set to `DISABLED`, Flink includes the user jars in the user classpath and loads them dynamically with the *FlinkUserCodeClassLoader*.
+See [Flink on Yarn]({{< ref "docs/deployment/resource-providers/yarn" >}}) for more details.
 
+**Application Mode (Standalone/Yarn/Kubernetes)**
 
-**YARN**
-
-YARN classloading differs between single job deployments and sessions:
-
-  - When submitting a Flink job/application directly to YARN (via `bin/flink run -m yarn-cluster ...`), dedicated TaskManagers and
-    JobManagers are started for that job. Those JVMs have user code classes in the Java classpath.
-    That means that there is *no dynamic classloading* involved in that case for the job.
-
-  - When starting a YARN session, the JobManagers and TaskManagers are started with the Flink framework classes in the
-    classpath. The classes from all jobs that are submitted against the session are loaded dynamically.
+When running a Standalone/Kubernetes Flink cluster in Application Mode, the user jars (the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder)
+are loaded dynamically by the *FlinkUserCodeClassLoader*.
 
+When running a Yarn Flink cluster in Application Mode, the user jars (the JAR file specified in the startup command and all JAR files in Flink's `usrlib` folder)
+are included in the system classpath (the *AppClassLoader*) by default. As in Per-Job mode, setting [yarn.classpath.include-user-jar]({{< ref "docs/deployment/config" >}}#yarn-classpath-include-user-jar)
+to `DISABLED` makes Flink include the user jars in the user classpath and load them dynamically with the *FlinkUserCodeClassLoader*.
 
 ## Inverted Class Loading and ClassLoader Resolution Order