You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/01/18 12:22:14 UTC

[GitHub] [flink] Myasuka commented on a change in pull request #18083: [FLINK-25402] Introduce working directory for Flink processes

Myasuka commented on a change in pull request #18083:
URL: https://github.com/apache/flink/pull/18083#discussion_r786698096



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/ClusterEntrypointUtils.java
##########
@@ -128,4 +132,80 @@ public static void configureUncaughtExceptionHandler(Configuration config) {
                 new ClusterUncaughtExceptionHandler(
                         config.get(ClusterOptions.UNCAUGHT_EXCEPTION_HANDLING)));
     }
+
+    /**
+     * Creates the working directory for the TaskManager process. This method ensures that the
+     * working directory exists.
+     *
+     * @param configuration to extract the required settings from
+     * @param resourceId identifying the TaskManager process
+     * @return working directory
+     * @throws IOException if the working directory could not be created
+     */
+    public static WorkingDirectory createTaskManagerWorkingDirectory(
+            Configuration configuration, ResourceID resourceId) throws IOException {
+        return WorkingDirectory.create(
+                generateTaskManagerWorkingDirectoryFile(configuration, resourceId));
+    }
+
+    /**
+     * Generates the working directory {@link File} for the TaskManager process. This method does
+     * not ensure that the working directory exists.
+     *
+     * @param configuration to extract the required settings from
+     * @param resourceId identifying the TaskManager process
+     * @return working directory file
+     */
+    @VisibleForTesting
+    public static File generateTaskManagerWorkingDirectoryFile(
+            Configuration configuration, ResourceID resourceId) {
+        return generateWorkingDirectoryFile(
+                configuration,
+                ClusterOptions.TASK_MANAGER_PROCESS_WORKING_DIR_BASE,
+                "tm_" + resourceId);
+    }
+
+    /**
+     * Generates the working directory {@link File} for the JobManager process. This method does not
+     * ensure that the working directory exists.
+     *
+     * @param configuration to extract the required settings from
+     * @param resourceId identifying the JobManager process
+     * @return working directory file
+     */
+    @VisibleForTesting
+    public static File generateJobManagerWorkingDirectoryFile(
+            Configuration configuration, ResourceID resourceId) {
+        return generateWorkingDirectoryFile(
+                configuration,
+                ClusterOptions.JOB_MANAGER_PROCESS_WORKING_DIR_BASE,
+                "jm_" + resourceId);
+    }
+
+    private static File generateWorkingDirectoryFile(
+            Configuration configuration,
+            ConfigOption<String> precedingOption,
+            String workingDirectoryName) {
+        return new File(
+                configuration
+                        .getOptional(precedingOption)
+                        .orElseGet(
+                                () -> configuration.get(ClusterOptions.PROCESS_WORKING_DIR_BASE)),

Review comment:
       After went through current PR, I still have a question that if we always choose the 1st tmp directory as the state access working directory by default. How could we leverage the ability of multi disks in some deployment environment, such as YARN.
   
   How about reshuffle the `defaultDirs` within [BootstrapTools#updateTmpDirectoriesInConfiguration](https://github.com/apache/flink/blob/ed699b6ee6b0539087632b68a444f79b95120d84/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/BootstrapTools.java#L293-L296), so that we can ensure not return the same tmp directory each time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org