Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/07/07 21:58:09 UTC

[GitHub] [incubator-gobblin] sv2000 opened a new pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

sv2000 opened a new pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056


   …Yarn cache location
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1209
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if applicable):
   This PR allows the tmp dir to be set to the container's cache location, overriding the default setting of /tmp. This requires the tmp dir system property to be dynamically configured to the cache location.
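
As a rough illustration of the behavior described here (not the PR's actual code, which is only partially visible in this thread), the sketch below applies a set of overrides as JVM system properties and redirects `java.io.tmpdir` when its value is a sentinel. The class name, the `YARN_CACHE` sentinel string, and the use of `user.dir` as a stand-in for the container's cache location are all assumptions made for this example.

```java
import java.util.Map;
import java.util.Properties;

class TmpDirOverrideSketch {
  static final String JAVA_TMP_DIR_KEY = "java.io.tmpdir";
  // Hypothetical sentinel value; the PR's real constant or enum is not shown in this thread.
  static final String YARN_CACHE_SENTINEL = "YARN_CACHE";

  /** Returns the value that should actually be applied for one override. */
  static String resolveValue(String key, String value) {
    if (JAVA_TMP_DIR_KEY.equals(key) && YARN_CACHE_SENTINEL.equalsIgnoreCase(value)) {
      // Assumption: the process working directory stands in for the container's cache location.
      return System.getProperty("user.dir");
    }
    return value;
  }

  /** Applies every override as a JVM system property. */
  static void applyOverrides(Properties overrides) {
    for (Map.Entry<Object, Object> entry : overrides.entrySet()) {
      String key = entry.getKey().toString();
      System.setProperty(key, resolveValue(key, entry.getValue().toString()));
    }
  }

  public static void main(String[] args) {
    Properties overrides = new Properties();
    overrides.setProperty(JAVA_TMP_DIR_KEY, YARN_CACHE_SENTINEL);
    applyOverrides(overrides);
    // Prints the working directory instead of the JVM's default tmp dir.
    System.out.println(System.getProperty(JAVA_TMP_DIR_KEY));
  }
}
```

Keeping the sentinel check in a small pure function (`resolveValue` here) makes the special case easy to test without touching global JVM state.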
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   Existing unit tests.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#discussion_r451581777



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterUtils.java
##########
@@ -112,6 +116,26 @@ public static Path getJobStateFilePath(boolean usingStateStore, Path appWorkPath
     return jobStateFilePath;
   }
 
+  /**
+   * Return the system properties from the input {@link Config} instance

Review comment:
       1) The method does not return anything.
       2) Please describe the special treatment of the java.io.tmpdir/TMP_DIR.YARN_CACHE config.







[GitHub] [incubator-gobblin] asfgit closed pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056


   





[GitHub] [incubator-gobblin] sv2000 closed pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
sv2000 closed pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056


   





[GitHub] [incubator-gobblin] arjun4084346 commented on pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#issuecomment-657738089


   +1 LGTM





[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#discussion_r453915543



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterUtils.java
##########
@@ -112,6 +116,28 @@ public static Path getJobStateFilePath(boolean usingStateStore, Path appWorkPath
     return jobStateFilePath;
   }
 
+  /**
+   * Set the system properties from the input {@link Config} instance
+   * @param config
+   */
+  public static void setSystemProperties(Config config) {
+    Properties properties = ConfigUtils.configToProperties(ConfigUtils.getConfig(config, GobblinClusterConfigurationKeys.GOBBLIN_CLUSTER_SYSTEM_PROPERTY_PREFIX,
+        ConfigFactory.empty()));
+
+    for (Map.Entry<Object, Object> entry: properties.entrySet()) {
+      if (entry.getKey().toString().equals("java.io.tmpdir")) {

Review comment:
       Addressed.
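
For readers following the hunk above: it reads only the configuration entries under a system-property prefix before applying them. A minimal sketch of that prefix-scoped extraction using plain `java.util.Properties` (rather than the Typesafe `Config` and `ConfigUtils` calls in the diff) could look like this. The class name, the `example.sysProps` prefix, and the property names are made up for illustration and are not taken from Gobblin.

```java
import java.util.Properties;

class PrefixedPropertiesSketch {
  /** Returns the entries found under prefix + ".", with that prefix stripped from each key. */
  static Properties stripPrefix(Properties all, String prefix) {
    String dotted = prefix + ".";
    Properties scoped = new Properties();
    for (String name : all.stringPropertyNames()) {
      if (name.startsWith(dotted)) {
        scoped.setProperty(name.substring(dotted.length()), all.getProperty(name));
      }
    }
    return scoped;
  }

  public static void main(String[] args) {
    Properties appConfig = new Properties();
    // Both keys below are hypothetical.
    appConfig.setProperty("example.sysProps.zk.sessionTimeout", "60000");
    appConfig.setProperty("example.other.setting", "ignored");

    // Only the entry under the prefix survives, keyed as "zk.sessionTimeout".
    Properties sysProps = stripPrefix(appConfig, "example.sysProps");
    System.out.println(sysProps);
  }
}
```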







[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
autumnust commented on a change in pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#discussion_r453911728



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterManager.java
##########
@@ -142,23 +142,25 @@
 
   public GobblinClusterManager(String clusterName, String applicationId, Config sysConfig,
       Optional<Path> appWorkDirOptional) throws Exception {
+    // Set system properties passed in via application config. As an example, Helix uses System#getProperty() for ZK configuration
+    // overrides such as sessionTimeout. In this case, the overrides specified
+    // in the application configuration have to be extracted and set before initializing HelixManager.
+    GobblinClusterUtils.setSystemProperties(sysConfig);
+
+    //Add dynamic config
+    this.config = GobblinClusterUtils.addDynamicConfig(sysConfig);

Review comment:
       Q: Does it matter where `dynamicConfig` is added? Is dynamicConfig used specifically for SSL-related configs?

##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterUtils.java
##########
@@ -112,6 +116,28 @@ public static Path getJobStateFilePath(boolean usingStateStore, Path appWorkPath
     return jobStateFilePath;
   }
 
+  /**
+   * Set the system properties from the input {@link Config} instance
+   * @param config
+   */
+  public static void setSystemProperties(Config config) {
+    Properties properties = ConfigUtils.configToProperties(ConfigUtils.getConfig(config, GobblinClusterConfigurationKeys.GOBBLIN_CLUSTER_SYSTEM_PROPERTY_PREFIX,
+        ConfigFactory.empty()));
+
+    for (Map.Entry<Object, Object> entry: properties.entrySet()) {
+      if (entry.getKey().toString().equals("java.io.tmpdir")) {

Review comment:
       Shall we extract `java.io.tmpdir` into a string constant?







[GitHub] [incubator-gobblin] sv2000 commented on pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
sv2000 commented on pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#issuecomment-655176589


   @autumnust please review.





[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#discussion_r453868092



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterUtils.java
##########
@@ -112,6 +116,26 @@ public static Path getJobStateFilePath(boolean usingStateStore, Path appWorkPath
     return jobStateFilePath;
   }
 
+  /**
+   * Return the system properties from the input {@link Config} instance

Review comment:
       Addressed.







[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #3056: GOBBLIN-1209: Provide an option to configure the java tmp dir to the …

Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #3056:
URL: https://github.com/apache/incubator-gobblin/pull/3056#discussion_r453916541



##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterManager.java
##########
@@ -142,23 +142,25 @@
 
   public GobblinClusterManager(String clusterName, String applicationId, Config sysConfig,
       Optional<Path> appWorkDirOptional) throws Exception {
+    // Set system properties passed in via application config. As an example, Helix uses System#getProperty() for ZK configuration
+    // overrides such as sessionTimeout. In this case, the overrides specified
+    // in the application configuration have to be extracted and set before initializing HelixManager.
+    GobblinClusterUtils.setSystemProperties(sysConfig);
+
+    //Add dynamic config
+    this.config = GobblinClusterUtils.addDynamicConfig(sysConfig);

Review comment:
       Yes. Dynamic config generation should happen after system properties are set, since some of the config generation (such as temp file locations) depends on system properties.
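
The ordering point can be illustrated with a small, self-contained sketch. It assumes a hypothetical `generateDynamicConfig` that derives a scratch path from `java.io.tmpdir`; the class name, the config key, and the example path are invented here and are not Gobblin's.

```java
import java.util.HashMap;
import java.util.Map;

class InitOrderSketch {
  /** Hypothetical stand-in for dynamic config generation: derives a path from a system property. */
  static Map<String, String> generateDynamicConfig() {
    Map<String, String> dynamic = new HashMap<>();
    dynamic.put("sketch.scratch.dir", System.getProperty("java.io.tmpdir") + "/scratch");
    return dynamic;
  }

  public static void main(String[] args) {
    String original = System.getProperty("java.io.tmpdir");

    // Generated before the override: still points at the JVM's default tmp dir.
    Map<String, String> tooEarly = generateDynamicConfig();

    // Made-up cache location, standing in for the value taken from the application config.
    System.setProperty("java.io.tmpdir", "/grid/cache/container_01");

    // Generated after the override: picks up the new location.
    Map<String, String> afterOverride = generateDynamicConfig();

    System.out.println("generated too early:      " + tooEarly.get("sketch.scratch.dir"));
    System.out.println("generated after override: " + afterOverride.get("sketch.scratch.dir"));

    // Restore the original value so the sketch leaves the JVM as it found it.
    System.setProperty("java.io.tmpdir", original);
  }
}
```

A related caveat, as far as the OpenJDK sources indicate: `File.createTempFile` reads `java.io.tmpdir` once and caches the result, so an override applied after the first temp file has been created may not take effect for that API. That is another reason to set system properties as early as possible.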



