You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/27 20:38:48 UTC

[GitHub] [spark] xkrogen opened a new pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

xkrogen opened a new pull request #34120:
URL: https://github.com/apache/spark/pull/34120


   ### What changes were proposed in this pull request?
   Refactor the logic for constructing the user classpath from `yarn.ApplicationMaster` into `yarn.Client` so that it can be leveraged on the executor side as well, instead of having the driver construct it and pass it to the executor via command-line arguments. A new method, `getUserClassPath`, is added to `CoarseGrainedExecutorBackend` which defaults to `Nil` (consistent with the existing behavior where non-YARN resource managers do not configure the user classpath). `YarnCoarseGrainedExecutorBackend` overrides this to construct the user classpath from the existing `APP_JAR` and `SECONDARY_JARS` configs. Within `yarn.Client`, environment variables in the configured paths are resolved before constructing the classpath.
   
   Please note that this is a re-submission of #32810, which was reverted in #34082 due to the issues described in [this comment](https://issues.apache.org/jira/browse/SPARK-35672?focusedCommentId=17419285&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17419285). This PR additionally includes the changes described in #34084 to resolve the issue, though this PR has been enhanced to properly handle escape strings, unlike #34084.
   
   ### Why are the changes needed?
   User-provided JARs are made available to executors using a custom classloader, so they do not appear on the standard Java classpath. Instead, they are passed as a list to the executor which then creates a classloader out of the URLs. Currently in the case of YARN, this list of JARs is crafted by the Driver (in `ExecutorRunnable`), which then passes the information to the executors (`CoarseGrainedExecutorBackend`) by specifying each JAR on the executor command line as `--user-class-path /path/to/myjar.jar`. This can cause extremely long argument lists when there are many JARs, which can cause the OS argument length to be exceeded, typically manifesting as the error message:
   
   > /bin/bash: Argument list too long
   
   A [Google search](https://www.google.com/search?q=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22&oq=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22) indicates that this is not a theoretical problem and afflicts real users, including ours. Passing this list using the configurations instead resolves this issue.
   
   ### Does this PR introduce _any_ user-facing change?
   No, except for fixing the bug, allowing for larger JAR lists to be passed successfully. Configuration of JARs is identical to before. Substitution of environment variables in `spark.jars` or `spark.yarn.config.replacementPath` works as expected.
   
   ### How was this patch tested?
   New unit tests were added in `YarnClusterSuite`. Also, we have been running a similar fix internally for 4 months with great success.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r726474304



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,90 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   * See SPARK-35672 for discussion of why it is necessary to do environment variable substitution.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  // scalastyle:off line.size.limit
+  /**
+   * Replace environment variables in a string according to the same rules as [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO`, `^%FOO^%`, and `%%FOO%%` will be resolved to `$FOO`, `%FOO%`, and `%FOO%`,
+   * respectively, instead of being treated as variable names. Note that Unix variable naming
+   * conventions (alphanumeric plus underscore, case-sensitive, can't start with a digit) are
+   * used for both Unix and Windows, following the convention of Hadoop's [[Shell]]
+   * (see specifically [[Shell.getEnvironmentVariableRegex]]).
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   * @see [[https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html Environment Variables (IEEE Std 1003.1-2017)]]
+   * @see [[https://en.wikibooks.org/wiki/Windows_Batch_Scripting#Quoting_and_escaping Windows Batch Scripting | Quoting and Escaping]]
+   */
+  // scalastyle:on line.size.limit
+  def replaceEnvVars(

Review comment:
       Agreed, thanks for the suggestion!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928558793


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143660/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956742094


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49290/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956587781


   @tgravescs @attilapiros  -- now that the Spark 3.2 release is all wrapped up, can you take another look? I just rebased on latest master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928356797


   **[Test build #143660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143660/testReport)** for PR 34120 at commit [`708a377`](https://github.com/apache/spark/commit/708a3776cc58633f027b2b3b4e07f0ec55afc2e4).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r718770998



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(
+      unresolvedString: String,
+      env: IMap[String, String],
+      isWindows: Boolean = Shell.WINDOWS): String = {
+    val osResolvedString = if (isWindows) {
+      // Environment variable names can contain anything besides = and %
+      // Ref: https://docs.microsoft.com/en-us/windows/win32/procthread/environment-variables
+      val windowsPattern = "(?:%%|%([^=%]+)%)".r
+      windowsPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case "%%" => "%"
+          case _ => env.getOrElse(m.group(1), "")
+        })
+      )
+    } else {
+      // Environment variables are alphanumeric plus underscore, case-sensitive, can't start with
+      // a digit, based on Shell and Utilities volume of IEEE Std 1003.1-2001
+      // Ref: https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
+      val unixPattern = """(?i)(?:\\\\|\\\$|\$([A-Z_][A-Z0-9_]*))|\$\{([A-Z_][A-Z0-9_]*)}""".r
+      unixPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case """\\""" => """\"""
+          case """\$""" => """$"""
+          case _ => m.subgroups.filterNot(_ == null).headOption match {
+            case Some(v) => env.getOrElse(v, "")
+            case None =>
+              // Note that this should never be reached and indicates a bug
+              throw new IllegalStateException(s"No valid capture group was found for match: $m")
+          }
+        })
+      )
+    }
+
+    // {{...}} is a YARN thing and not OS-specific. Follow Unix shell naming conventions
+    val yarnPattern = "(?i)\\{\\{([A-Z_][A-Z0-9_]*)}}".r
+    yarnPattern.replaceAllIn(osResolvedString,
+      m => Regex.quoteReplacement(env.getOrElse(m.group(1), "")))

Review comment:
       Updated, thanks for the suggestion




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959579435


   Thanks @tgravescs , re-triggered the tests.
   
   @attilapiros no problem -- we can target to merge next week if there are no issues from your side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931548716


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48281/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930434147


   **[Test build #143726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143726/testReport)** for PR 34120 at commit [`ddf4549`](https://github.com/apache/spark/commit/ddf454997d7bf1909a4b6f895fcc4061aaf713f3).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930474905






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930474905


   **[Test build #143728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143728/testReport)** for PR 34120 at commit [`dce813c`](https://github.com/apache/spark/commit/dce813c49b41265cb1671ba4daa83e761bc8b4ef).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959574232


   @xkrogen I can only do the review on this Friday/Saturday


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49349/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965889031


   **[Test build #145071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145071/testReport)** for PR 34120 at commit [`97540f8`](https://github.com/apache/spark/commit/97540f8f9bb26ba33a93b075e15365b4553834d6).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956587781


   @tgravescs @attilapiros  -- now that the Spark 3.2 release is all wrapped up, can you take another look? I just rebased on latest master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-969169859


   Thanks @attilapiros ! Would you or @tgravescs be willing to help merge this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928273772


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48171/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931518178


   **[Test build #143770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143770/testReport)** for PR 34120 at commit [`92e296b`](https://github.com/apache/spark/commit/92e296b245f5ff0f446f46f1888ddfdc900a2044).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r718570904



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(
+      unresolvedString: String,
+      env: IMap[String, String],
+      isWindows: Boolean = Shell.WINDOWS): String = {
+    val osResolvedString = if (isWindows) {
+      // Environment variable names can contain anything besides = and %
+      // Ref: https://docs.microsoft.com/en-us/windows/win32/procthread/environment-variables
+      val windowsPattern = "(?:%%|%([^=%]+)%)".r
+      windowsPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case "%%" => "%"
+          case _ => env.getOrElse(m.group(1), "")
+        })
+      )
+    } else {
+      // Environment variables are alphanumeric plus underscore, case-sensitive, can't start with
+      // a digit, based on Shell and Utilities volume of IEEE Std 1003.1-2001
+      // Ref: https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
+      val unixPattern = """(?i)(?:\\\\|\\\$|\$([A-Z_][A-Z0-9_]*))|\$\{([A-Z_][A-Z0-9_]*)}""".r
+      unixPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case """\\""" => """\"""
+          case """\$""" => """$"""
+          case _ => m.subgroups.filterNot(_ == null).headOption match {
+            case Some(v) => env.getOrElse(v, "")
+            case None =>
+              // Note that this should never be reached and indicates a bug
+              throw new IllegalStateException(s"No valid capture group was found for match: $m")
+          }
+        })
+      )
+    }
+
+    // {{...}} is a YARN thing and not OS-specific. Follow Unix shell naming conventions
+    val yarnPattern = "(?i)\\{\\{([A-Z_][A-Z0-9_]*)}}".r
+    yarnPattern.replaceAllIn(osResolvedString,
+      m => Regex.quoteReplacement(env.getOrElse(m.group(1), "")))

Review comment:
       I'd like more details in the comment here. This I assume used to go into the launch script in NM and then yarn would replace {{VAR}} with whatever it was actually.   Have comment about how that used to work would be better so people don't wonder how yarn comes into play

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(

Review comment:
       this is very unfortunate, for us to have to emulate this.  I was looking for other libraries that perhaps do this that we could just use, I saw 
   
   https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringSubstitutor.html
   
   but I think its a basic string replace.
   
   If this was passed in an environment variable does everything get replaced properly? I thought yarn would put it in the launch script and then properly handle replacements. 
   env variables have much longer length and normally you would set the class path environment variable if doing it the more traditional way so assume it would fit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931637146


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143770/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931599050


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48281/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931378210


   overall changes look fine, at this point I think hold off on merging til 3.2 go out assuming you want to try to put in 3.2.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r718570904



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(
+      unresolvedString: String,
+      env: IMap[String, String],
+      isWindows: Boolean = Shell.WINDOWS): String = {
+    val osResolvedString = if (isWindows) {
+      // Environment variable names can contain anything besides = and %
+      // Ref: https://docs.microsoft.com/en-us/windows/win32/procthread/environment-variables
+      val windowsPattern = "(?:%%|%([^=%]+)%)".r
+      windowsPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case "%%" => "%"
+          case _ => env.getOrElse(m.group(1), "")
+        })
+      )
+    } else {
+      // Environment variables are alphanumeric plus underscore, case-sensitive, can't start with
+      // a digit, based on Shell and Utilities volume of IEEE Std 1003.1-2001
+      // Ref: https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
+      val unixPattern = """(?i)(?:\\\\|\\\$|\$([A-Z_][A-Z0-9_]*))|\$\{([A-Z_][A-Z0-9_]*)}""".r
+      unixPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case """\\""" => """\"""
+          case """\$""" => """$"""
+          case _ => m.subgroups.filterNot(_ == null).headOption match {
+            case Some(v) => env.getOrElse(v, "")
+            case None =>
+              // Note that this should never be reached and indicates a bug
+              throw new IllegalStateException(s"No valid capture group was found for match: $m")
+          }
+        })
+      )
+    }
+
+    // {{...}} is a YARN thing and not OS-specific. Follow Unix shell naming conventions
+    val yarnPattern = "(?i)\\{\\{([A-Z_][A-Z0-9_]*)}}".r
+    yarnPattern.replaceAllIn(osResolvedString,
+      m => Regex.quoteReplacement(env.getOrElse(m.group(1), "")))

Review comment:
       I'd like more details in the comment here. This I assume used to go into the launch script in NM and then yarn would replace {{VAR}} with whatever it was actually.   Have comment about how that used to work would be better so people don't wonder how yarn comes into play

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(

Review comment:
       this is very unfortunate, for us to have to emulate this.  I was looking for other libraries that perhaps do this that we could just use, I saw 
   
   https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringSubstitutor.html
   
   but I think its a basic string replace.
   
   If this was passed in an environment variable does everything get replaced properly? I thought yarn would put it in the launch script and then properly handle replacements. 
   env variables have much longer length and normally you would set the class path environment variable if doing it the more traditional way so assume it would fit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928286738






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930476122


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143728/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959809023


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49349/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965791270


   **[Test build #145071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145071/testReport)** for PR 34120 at commit [`97540f8`](https://github.com/apache/spark/commit/97540f8f9bb26ba33a93b075e15365b4553834d6).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965900564


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145071/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965759901


   Thanks a lot for the review @attilapiros , some great enhancements to the test cases resulted. I believe they all should be addressed now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928286738






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928463953


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48173/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940404635


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48574/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940439796


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144096/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940404635


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48574/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930513203


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48237/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928491178


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48173/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806


   **[Test build #144820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://github.com/apache/spark/commit/023963cee1b9506119d8fc6e0bbc916d4b59f211).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959579435


   Thanks @tgravescs , re-triggered the tests.
   
   @attilapiros no problem -- we can target to merge next week if there are no issues from your side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959848367


   **[Test build #144879 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144879/testReport)** for PR 34120 at commit [`e1b0668`](https://github.com/apache/spark/commit/e1b0668ceb138c1450604ca46904418a6b042a42).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r746857642



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +

Review comment:
       Actually, this is exposing a legitimate bug in the older code. The gateway replacement was done in `ExecutorRunnable`, so of course, it only affected executors. But, in cluster mode, the driver _also_ does not run on the gateway and thus should also have the replaced path instead of the gateway path (which is handled in the new codepath).
   
   But thanks for calling this out -- I forgot to include it in the PR description. Updated now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928356797


   **[Test build #143660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143660/testReport)** for PR 34120 at commit [`708a377`](https://github.com/apache/spark/commit/708a3776cc58633f027b2b3b4e07f0ec55afc2e4).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931578149


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48281/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931599050


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48281/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930470763


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48237/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r726185164



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
##########
@@ -583,6 +584,79 @@ class ClientSuite extends SparkFunSuite with Matchers {
     }
   }
 
+  test("SPARK-35672: test Client.getUserClasspathUrls") {
+    val gatewayRootPath = "/local/matching/replace"
+    val replacementRootPath = "/replaced/path"
+    val conf = new SparkConf()
+        .set(SECONDARY_JARS, Seq(
+          s"local:$gatewayRootPath/foo.jar",
+          "local:/local/not/matching/replace/foo.jar",
+          "file:/absolute/file/path/foo.jar",
+          s"$gatewayRootPath/but-not-actually-local/foo.jar",
+          "/absolute/path/foo.jar",
+          "relative/path/foo.jar"
+        ))
+        .set(GATEWAY_ROOT_PATH, gatewayRootPath)
+        .set(REPLACEMENT_ROOT_PATH, replacementRootPath)
+
+    def assertUserClasspathUrls(cluster: Boolean, expectedReplacementPath: String): Unit = {
+      val expectedUrls = Seq(
+        Paths.get(APP_JAR_NAME).toAbsolutePath.toUri.toString,
+        s"file:$expectedReplacementPath/foo.jar",
+        "file:/local/not/matching/replace/foo.jar",
+        "file:/absolute/file/path/foo.jar",
+        // since this path wasn't a local URI, it should never be replaced
+        s"file:$gatewayRootPath/but-not-actually-local/foo.jar",
+        "file:/absolute/path/foo.jar",
+        Paths.get("relative/path/foo.jar").toAbsolutePath.toUri.toString
+      ).map(URI.create(_).toURL).toArray
+      assert(Client.getUserClasspathUrls(conf, cluster) === expectedUrls)
+    }
+    // assert that no replacement happens when cluster = false by expecting the replacement
+    // path to be the same as the original path
+    assertUserClasspathUrls(cluster = false, gatewayRootPath)
+    assertUserClasspathUrls(cluster = true, replacementRootPath)
+  }
+
+  test("SPARK-35672: test replaceEnvVars in Unix mode") {

Review comment:
       If you managed to move the `replaceEnvVars` method into `YarnSparkHadoopUtil` then do not forget to move these two tests to `YarnSparkHadoopUtilSuite`.

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,90 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   * See SPARK-35672 for discussion of why it is necessary to do environment variable substitution.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  // scalastyle:off line.size.limit
+  /**
+   * Replace environment variables in a string according to the same rules as [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO`, `^%FOO^%`, and `%%FOO%%` will be resolved to `$FOO`, `%FOO%`, and `%FOO%`,
+   * respectively, instead of being treated as variable names. Note that Unix variable naming
+   * conventions (alphanumeric plus underscore, case-sensitive, can't start with a digit) are
+   * used for both Unix and Windows, following the convention of Hadoop's [[Shell]]
+   * (see specifically [[Shell.getEnvironmentVariableRegex]]).
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   * @see [[https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html Environment Variables (IEEE Std 1003.1-2017)]]
+   * @see [[https://en.wikibooks.org/wiki/Windows_Batch_Scripting#Quoting_and_escaping Windows Batch Scripting | Quoting and Escaping]]
+   */
+  // scalastyle:on line.size.limit
+  def replaceEnvVars(

Review comment:
       Please try to move this method into the `YarnSparkHadoopUtil`  object where we already deal with some environment variable transformation (build env vars from string).
   
   It has higher chance for finding it there when somebody will need something similar (+ it has no state so an object fits better for this helper method).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928558793


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143660/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940278328


   **[Test build #144096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144096/testReport)** for PR 34120 at commit [`ec146d1`](https://github.com/apache/spark/commit/ec146d19bf2d1ebd12baaa085f40a77f04d6b0a6).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959574232






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49290/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806


   **[Test build #144820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://github.com/apache/spark/commit/023963cee1b9506119d8fc6e0bbc916d4b59f211).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965816272


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49540/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947


   **[Test build #144879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144879/testReport)** for PR 34120 at commit [`e1b0668`](https://github.com/apache/spark/commit/e1b0668ceb138c1450604ca46904418a6b042a42).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49349/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959871359


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144879/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930513181


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48237/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956994798


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144820/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956587781


   @tgravescs @attilapiros  -- now that the Spark 3.2 release is all wrapped up, can you take another look? I just rebased on latest master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959761571


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49349/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r744151083



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +

Review comment:
       I have backported this test (actually the whole suite) to master without your PR changes and this test failed with the following error:
   
   ```
   [info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' and gateway-replacement path *** FAILED *** (13 seconds, 83 milliseconds)
   [info]   "[failure]" did not equal "[ORIGINAL]" (stdout/stderr was not captured) (BaseYarnClusterSuite.scala:229)
   [info]   Analysis:
   [info]   "[failure]" -> "[ORIGINAL]"
   [info]   org.scalatest.exceptions.TestFailedException:
   [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
   [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
   [info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
   [info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
   [info]   at org.apache.spark.deploy.yarn.BaseYarnClusterSuite.checkResult(BaseYarnClusterSuite.scala:229)
   [info]   at org.apache.spark.deploy.yarn.YarnClusterSuite.testWithAddJar(YarnClusterSuite.scala:356)
   ```
   
   Is this just a gap between production and testing? I suspect somewhere in runSpark the extra confs are not transformed into sys envs which is done at production, am I right? 
   Could you please look into this?

##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -286,16 +338,23 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  private def testWithAddJar(clientMode: Boolean): Unit = {
+  private def testWithAddJar(
+      clientMode: Boolean,
+      jarUriScheme: String,
+      jarUrlToPathAndConfs: Option[URL => (String, Map[String, String])] = None,

Review comment:
       I think we can easily simply `jarUrlToPathAndConfs`:
   
   I would pass just pass single single String  the `extraJar` instead of `jarUriScheme` and the caller method would create `originalJar` and make the jar path transformation there. This way `jarUrlToPathAndConfs` could be just a simple `Map[String, String]` (actually we can name this arg to `extraConf` used below) with an empty map default.

##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Put a prefix in front of the original jar URL which causes it to be an invalid path.
+    // Set up the gateway/replacement configs such that if replacement occurs, it is a valid
+    // path again (by removing the prefix). This ensures the replacement is applied.
+    val gatewayPath = "/replaceme/nonexistent/"
+    testWithAddJar(clientMode = false, "local", Some(jarUrl => {
+      (gatewayPath + jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> gatewayPath,
+        REPLACEMENT_ROOT_PATH.key -> ""
+      ))
+    }))
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +

Review comment:
       This test on the other hand was successful on master. 
   And as we can see without the replacement the jarUrl will be correct. 
   Could you please change this to fail when the replacement is skipped for some reason?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959120252


   Looking, can you rekick the tests, it looks unrelated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959579435






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930550123


   **[Test build #143726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143726/testReport)** for PR 34120 at commit [`ddf4549`](https://github.com/apache/spark/commit/ddf454997d7bf1909a4b6f895fcc4061aaf713f3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928261733






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930552742






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931471269


   SGTM, let's wait for the release to wrap up. I will also fix the style issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930476122


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143728/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928286738






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930434147






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940384141


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48574/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940248334


   Just pushed up a new commit moving `replaceEnvVars` from `Client` to `YarnSparkHadoopUtil`. Also did a minor refactor to share a common variable for the name regex pattern.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940439796


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144096/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-970736719


   Many thanks @attilapiros and @tgravescs ! Also thanks to @peter-toth for initially reporting the issue with the original PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965859409


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49540/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965900564


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145071/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49290/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930552744






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959574232


   @xkrogen I can only do the review on this Friday/Saturday


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947


   **[Test build #144879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144879/testReport)** for PR 34120 at commit [`e1b0668`](https://github.com/apache/spark/commit/e1b0668ceb138c1450604ca46904418a6b042a42).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959120252


   Looking, can you rekick the tests, it looks unrelated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959120252


   Looking, can you rekick the tests, it looks unrelated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r745130272



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -286,16 +338,23 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  private def testWithAddJar(clientMode: Boolean): Unit = {
+  private def testWithAddJar(
+      clientMode: Boolean,
+      jarUriScheme: String,
+      jarUrlToPathAndConfs: Option[URL => (String, Map[String, String])] = None,

Review comment:
       Yes, this is much better. An example of trying a little too hard to be DRY. Thanks for the suggestion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928491178


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48173/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930434147


   **[Test build #143726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143726/testReport)** for PR 34120 at commit [`ddf4549`](https://github.com/apache/spark/commit/ddf454997d7bf1909a4b6f895fcc4061aaf713f3).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930474905


   **[Test build #143728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143728/testReport)** for PR 34120 at commit [`dce813c`](https://github.com/apache/spark/commit/dce813c49b41265cb1671ba4daa83e761bc8b4ef).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930476122






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928257442


   cc @mridulm @tgravescs @peter-toth @gengliangwang @HyukjinKwon


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928261733


   **[Test build #143658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143658/testReport)** for PR 34120 at commit [`77b4fc1`](https://github.com/apache/spark/commit/77b4fc11aa5d0d3f0bbb97c8104236c27ef920cd).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956994798


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144820/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806


   **[Test build #144820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://github.com/apache/spark/commit/023963cee1b9506119d8fc6e0bbc916d4b59f211).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r746857642



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +

Review comment:
       Actually, this is exposing a legitimate bug in the older code. The gateway replacement was done in `ExecutorRunnable`, so of course, it only affected executors. But, in cluster mode, the driver _also_ does not run on the gateway and thus should also have the replaced path instead of the gateway path (which is handled in the new codepath).
   
   But thanks for calling this out -- I forgot to include it in the PR description.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r746857642



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +

Review comment:
       Actually, this is exposing a legitimate bug in the older code. The gateway replacement was done in `ExecutorRunnable`, so of course, it only affected executors. But, in cluster mode, the driver _also_ does not run on the gateway and thus should also have the replaced path instead of the gateway path.
   
   But thanks for calling this out -- I forgot to include it in the PR description.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-970484669


   merged to master
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930539322


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48239/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928257442


   cc @mridulm @tgravescs @peter-toth @gengliangwang @HyukjinKwon


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r718765256



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(

Review comment:
       I agree, it's far from ideal.
   
   I've been looking into `StringSubstitutor`, and it works great for the YARN style (`{{ ... }}`) and the Unix curly-braces style (`${...}`). I'm having trouble getting the escapes to work right for the Windows style (`% ... %`) -- I think it doesn't like that the prefix, suffix, and escape characters are all the same. Unfortunately this won't work at all for the Unix style without curly braces (`$VAR_NAME`), because `StringSubstitutor` assumes you always have a suffix. I tried implementing a custom `StringMatcher` to get around this, but it doesn't work, because there's no way to tell `StringSubstitutor` that the current character does mark the end of a variable, _but_ it is not actually a suffix character and should be maintained in the output string.
   
   Given all of this, I think it makes sense to stick with regex for all of them, rather than using `StringSubstitutor` for some and regex for others.
   
   > If this was passed in an environment variable does everything get replaced properly? I thought yarn would put it in the launch script and then properly handle replacements. env variables have much longer length and normally you would set the class path environment variable if doing it the more traditional way so assume it would fit.
   
   I think you're suggesting that, instead of using Spark configs to pass the values, pass them using an environment variable?
   
   AFAIU, environment variables are still subject to size limitations, and actually the command line arguments plus environment variables contribute to the same total limit ([read more here](https://unix.stackexchange.com/questions/110282/cp-max-source-files-number-arguments-for-copy-utility/110301#110301)). You do at least bypass some limitations, notably the maximum length for a _single_ argument, but I don't see a lot of benefit in going down this route. Let me know if I'm missing anything.

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(
+      unresolvedString: String,
+      env: IMap[String, String],
+      isWindows: Boolean = Shell.WINDOWS): String = {
+    val osResolvedString = if (isWindows) {
+      // Environment variable names can contain anything besides = and %
+      // Ref: https://docs.microsoft.com/en-us/windows/win32/procthread/environment-variables
+      val windowsPattern = "(?:%%|%([^=%]+)%)".r
+      windowsPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case "%%" => "%"
+          case _ => env.getOrElse(m.group(1), "")
+        })
+      )
+    } else {
+      // Environment variables are alphanumeric plus underscore, case-sensitive, can't start with
+      // a digit, based on Shell and Utilities volume of IEEE Std 1003.1-2001
+      // Ref: https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
+      val unixPattern = """(?i)(?:\\\\|\\\$|\$([A-Z_][A-Z0-9_]*))|\$\{([A-Z_][A-Z0-9_]*)}""".r
+      unixPattern.replaceAllIn(unresolvedString, m =>
+        Regex.quoteReplacement(m.matched match {
+          case """\\""" => """\"""
+          case """\$""" => """$"""
+          case _ => m.subgroups.filterNot(_ == null).headOption match {
+            case Some(v) => env.getOrElse(v, "")
+            case None =>
+              // Note that this should never be reached and indicates a bug
+              throw new IllegalStateException(s"No valid capture group was found for match: $m")
+          }
+        })
+      )
+    }
+
+    // {{...}} is a YARN thing and not OS-specific. Follow Unix shell naming conventions
+    val yarnPattern = "(?i)\\{\\{([A-Z_][A-Z0-9_]*)}}".r
+    yarnPattern.replaceAllIn(osResolvedString,
+      m => Regex.quoteReplacement(env.getOrElse(m.group(1), "")))

Review comment:
       Updated, thanks for the suggestion




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931630706


   **[Test build #143770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143770/testReport)** for PR 34120 at commit [`92e296b`](https://github.com/apache/spark/commit/92e296b245f5ff0f446f46f1888ddfdc900a2044).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930476122






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930513203


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48237/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928286738






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928550100


   **[Test build #143660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143660/testReport)** for PR 34120 at commit [`708a377`](https://github.com/apache/spark/commit/708a3776cc58633f027b2b3b4e07f0ec55afc2e4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928261733


   **[Test build #143658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143658/testReport)** for PR 34120 at commit [`77b4fc1`](https://github.com/apache/spark/commit/77b4fc11aa5d0d3f0bbb97c8104236c27ef920cd).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806


   **[Test build #144820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://github.com/apache/spark/commit/023963cee1b9506119d8fc6e0bbc916d4b59f211).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947


   **[Test build #144879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144879/testReport)** for PR 34120 at commit [`e1b0668`](https://github.com/apache/spark/commit/e1b0668ceb138c1450604ca46904418a6b042a42).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956831102


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49290/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928270013


   **[Test build #143658 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143658/testReport)** for PR 34120 at commit [`77b4fc1`](https://github.com/apache/spark/commit/77b4fc11aa5d0d3f0bbb97c8104236c27ef920cd).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-956968557


   **[Test build #144820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://github.com/apache/spark/commit/023963cee1b9506119d8fc6e0bbc916d4b59f211).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r746998575



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
##########
@@ -150,12 +152,62 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
     checkResult(finalState, result)
   }
 
-  test("run Spark in yarn-client mode with additional jar") {
-    testWithAddJar(true)
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = true, "local")
   }
 
-  test("run Spark in yarn-cluster mode with additional jar") {
-    testWithAddJar(false)
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local'") {
+    testWithAddJar(clientMode = false, "local")
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Use the original jar URL, but set up the gateway/replacement configs such that if
+    // replacement occurs, things will break. This ensures the replacement doesn't apply to the
+    // driver in 'client' mode. Executors will fail in this case because they still apply the
+    // replacement in client mode.
+    testWithAddJar(clientMode = true, "local", Some(jarUrl => {
+      (jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> Paths.get(jarUrl.toURI).getParent.toString,
+        REPLACEMENT_ROOT_PATH.key -> "/nonexistent/path/"
+      ))
+    }), expectExecutorFailure = true)
+  }
+
+  test("SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' " +
+      "and gateway-replacement path") {
+    // Put a prefix in front of the original jar URL which causes it to be an invalid path.
+    // Set up the gateway/replacement configs such that if replacement occurs, it is a valid
+    // path again (by removing the prefix). This ensures the replacement is applied.
+    val gatewayPath = "/replaceme/nonexistent/"
+    testWithAddJar(clientMode = false, "local", Some(jarUrl => {
+      (gatewayPath + jarUrl.getPath, Map(
+        GATEWAY_ROOT_PATH.key -> gatewayPath,
+        REPLACEMENT_ROOT_PATH.key -> ""
+      ))
+    }))
+  }
+
+  test("SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' " +

Review comment:
       Yes, I originally used this approach because I was having trouble passing environment variables to the spawned job. I finally realized that it was because I was trying to use the `extraEnv` argument on `runSpark`, which sets environment variables for `spark-submit`, not the job itself. I've updated this to (a) fail if gateway string replacement doesn't happen, and (b) actually replace environment variables with a new string, instead of an empty string.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965838898


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49540/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928261733






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-928410804


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48173/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xkrogen commented on a change in pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
xkrogen commented on a change in pull request #34120:
URL: https://github.com/apache/spark/pull/34120#discussion_r718765256



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -1470,6 +1471,84 @@ private object Client extends Logging {
     (mainUri ++ secondaryUris).toArray
   }
 
+  /**
+   * Returns a list of local, absolute file URLs representing the user classpath. Note that this
+   * must be executed on the same host which will access the URLs, as it will resolve relative
+   * paths based on the current working directory, as well as environment variables.
+   *
+   * @param conf Spark configuration.
+   * @param useClusterPath Whether to use the 'cluster' path when resolving paths with the
+   *                       `local` scheme. This should be used when running on the cluster, but
+   *                       not when running on the gateway (i.e. for the driver in `client` mode).
+   * @return Array of local URLs ready to be passed to a [[java.net.URLClassLoader]].
+   */
+  def getUserClasspathUrls(conf: SparkConf, useClusterPath: Boolean): Array[URL] = {
+    Client.getUserClasspath(conf).map { uri =>
+      val inputPath = uri.getPath
+      val replacedFilePath = if (Utils.isLocalUri(uri.toString) && useClusterPath) {
+        Client.getClusterPath(conf, inputPath)
+      } else {
+        // Any other URI schemes should have been resolved by this point
+        assert(uri.getScheme == null || uri.getScheme == "file" || Utils.isLocalUri(uri.toString),
+          "getUserClasspath should only return 'file' or 'local' URIs but found: " + uri)
+        inputPath
+      }
+      val envVarResolvedFilePath = replaceEnvVars(replacedFilePath, sys.env)
+      Paths.get(envVarResolvedFilePath).toAbsolutePath.toUri.toURL
+    }
+  }
+
+  /**
+   * Replace environment variables in a string according to the same rules [[Environment]]:
+   * `$VAR_NAME` for Unix, `%VAR_NAME%` for Windows, and `{{VAR_NAME}}` for all OS.
+   * This support escapes for `$` and `%` characters, e.g.
+   * `\$FOO` and `%%FOO%%` will be resolved to `$FOO` and `%FOO%`, respectively, instead of being
+   * treated as variable names.
+   *
+   * @param unresolvedString The unresolved string which may contain variable references.
+   * @param env The System environment
+   * @param isWindows True iff running in a Windows environment
+   * @return The input string with variables replaced with their values from `env`
+   */
+  def replaceEnvVars(

Review comment:
       I agree, it's far from ideal.
   
   I've been looking into `StringSubstitutor`, and it works great for the YARN style (`{{ ... }}`) and the Unix curly-braces style (`${...}`). I'm having trouble getting the escapes to work right for the Windows style (`% ... %`) -- I think it doesn't like that the prefix, suffix, and escape characters are all the same. Unfortunately this won't work at all for the Unix style without curly braces (`$VAR_NAME`), because `StringSubstitutor` assumes you always have a suffix. I tried implementing a custom `StringMatcher` to get around this, but it doesn't work, because there's no way to tell `StringSubstitutor` that the current character does mark the end of a variable, _but_ it is not actually a suffix character and should be maintained in the output string.
   
   Given all of this, I think it makes sense to stick with regex for all of them, rather than using `StringSubstitutor` for some and regex for others.
   
   > If this was passed in an environment variable does everything get replaced properly? I thought yarn would put it in the launch script and then properly handle replacements. env variables have much longer length and normally you would set the class path environment variable if doing it the more traditional way so assume it would fit.
   
   I think you're suggesting that, instead of using Spark configs to pass the values, pass them using an environment variable?
   
   AFAIU, environment variables are still subject to size limitations, and actually the command line arguments plus environment variables contribute to the same total limit ([read more here](https://unix.stackexchange.com/questions/110282/cp-max-source-files-number-arguments-for-copy-utility/110301#110301)). You do at least bypass some limitations, notably the maximum length for a _single_ argument, but I don't see a lot of benefit in going down this route. Let me know if I'm missing anything.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930509075


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48239/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-930476097


   **[Test build #143728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143728/testReport)** for PR 34120 at commit [`dce813c`](https://github.com/apache/spark/commit/dce813c49b41265cb1671ba4daa83e761bc8b4ef).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940278328


   **[Test build #144096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144096/testReport)** for PR 34120 at commit [`ec146d1`](https://github.com/apache/spark/commit/ec146d19bf2d1ebd12baaa085f40a77f04d6b0a6).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940359282


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48574/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-940429253


   **[Test build #144096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144096/testReport)** for PR 34120 at commit [`ec146d1`](https://github.com/apache/spark/commit/ec146d19bf2d1ebd12baaa085f40a77f04d6b0a6).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959871359


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144879/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959120252


   Looking, can you rekick the tests, it looks unrelated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959684947






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965791270


   **[Test build #145071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145071/testReport)** for PR 34120 at commit [`97540f8`](https://github.com/apache/spark/commit/97540f8f9bb26ba33a93b075e15365b4553834d6).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-965859409


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49540/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-959814217






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931518178


   **[Test build #143770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143770/testReport)** for PR 34120 at commit [`92e296b`](https://github.com/apache/spark/commit/92e296b245f5ff0f446f46f1888ddfdc900a2044).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34120:
URL: https://github.com/apache/spark/pull/34120#issuecomment-931637146


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143770/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros closed pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

Posted by GitBox <gi...@apache.org>.
attilapiros closed pull request #34120:
URL: https://github.com/apache/spark/pull/34120


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org