You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/24 11:06:33 UTC

[GitHub] [spark] martin-g opened a new pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

martin-g opened a new pull request #35556:
URL: https://github.com/apache/spark/pull/35556


   
   ### What changes were proposed in this pull request?
   
   Sort the debug info printed by SparkSubmit when `-verbose` is enabled.
   
   
   ### Why are the changes needed?
   
   This way it is easier to find settings/properties and their values.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. The log output changes - some data is sorted.
   
   ### How was this patch tested?
   
   Old unit tests still pass & manually check the sorted output.
   ```
    Main class:
   org.apache.spark.deploy.k8s.submit.KubernetesClientApplication
   Arguments:
   --main-class
   --primary-java-resource
   local:///opt/spark/examples/jars/spark-examples_2.13-3.3.0-SNAPSHOT.jar
   org.apache.spark.examples.SparkPi
   Spark config:
   (spark.app.name,spark-on-k8s-app)
   (spark.app.submitTime,1645106476125)
   (spark.driver.cores,1)
   (spark.driver.extraJavaOptions,-Dio.netty.tryReflectionSetAccessible=true)
   (spark.driver.memory,2048m)
   (spark.dynamicAllocation.enabled,true)
   (spark.dynamicAllocation.shuffleTracking.enabled,true)
   (spark.executor.cores,2)
   (spark.executor.extraJavaOptions,-Dio.netty.tryReflectionSetAccessible=true)
   (spark.executor.instances,3)
   (spark.executor.memory,2048m)
   (spark.jars,local:///opt/spark/examples/jars/spark-examples_2.13-3.3.0-SNAPSHOT.jar)
   (spark.kubernetes.allocation.batch.delay,1)
   (spark.kubernetes.allocation.batch.size,3)
   (spark.kubernetes.authenticate.driver.serviceAccountName,spark-account-name)
   (spark.kubernetes.driver.container.image,spark/spark:3.3.0-SNAPSHOT-scala_2.13-11-jre-slim)
   (spark.kubernetes.executor.container.image,spark/spark:3.3.0-SNAPSHOT-scala_2.13-11-jre-slim)
   (spark.kubernetes.namespace,spark-on-k8s)
   (spark.master,k8s://https://192.168.49.2:8443)
   (spark.network.timeout,300)
   (spark.submit.deployMode,cluster)
   (spark.submit.pyFiles,)
   Classpath elements:
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
martin-g commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1052563204


   Thank you, @srowen and @HyukjinKwon !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1050516319


   Ah, after enabling, he needs to push an empty commit or rebase to trigger properly. ... I think probably this is a bug we need to fix 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1044846766


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r810740995



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
##########
@@ -906,10 +906,10 @@ private[spark] class SparkSubmit extends Logging {
 
     if (args.verbose) {
       logInfo(s"Main class:\n$childMainClass")
-      logInfo(s"Arguments:\n${childArgs.mkString("\n")}")
+      logInfo(s"Arguments:\n${childArgs.sorted.mkString("\n")}")

Review comment:
       Yeah .. I wouldn't sort it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #35556:
URL: https://github.com/apache/spark/pull/35556


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
martin-g commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r810858267



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
##########
@@ -906,10 +906,10 @@ private[spark] class SparkSubmit extends Logging {
 
     if (args.verbose) {
       logInfo(s"Main class:\n$childMainClass")
-      logInfo(s"Arguments:\n${childArgs.mkString("\n")}")
+      logInfo(s"Arguments:\n${childArgs.sorted.mkString("\n")}")
       // sysProps may contain sensitive information, so redact before printing
-      logInfo(s"Spark config:\n${Utils.redact(sparkConf.getAll.toMap).mkString("\n")}")
-      logInfo(s"Classpath elements:\n${childClasspath.mkString("\n")}")
+      logInfo(s"Spark config:\n${Utils.redact(sparkConf.getAll.toMap).sorted.mkString("\n")}")
+      logInfo(s"Classpath elements:\n${childClasspath.sorted.mkString("\n")}")

Review comment:
       Reverted it!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r810503007



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
##########
@@ -906,10 +906,10 @@ private[spark] class SparkSubmit extends Logging {
 
     if (args.verbose) {
       logInfo(s"Main class:\n$childMainClass")
-      logInfo(s"Arguments:\n${childArgs.mkString("\n")}")
+      logInfo(s"Arguments:\n${childArgs.sorted.mkString("\n")}")

Review comment:
       Ah wait a sec, we can't sort these. Order of args matters; you might have `--flag value`. You can see the issue in your example, where the arg order doesn't reflect what was passed and loses this information

##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
##########
@@ -906,10 +906,10 @@ private[spark] class SparkSubmit extends Logging {
 
     if (args.verbose) {
       logInfo(s"Main class:\n$childMainClass")
-      logInfo(s"Arguments:\n${childArgs.mkString("\n")}")
+      logInfo(s"Arguments:\n${childArgs.sorted.mkString("\n")}")
       // sysProps may contain sensitive information, so redact before printing
-      logInfo(s"Spark config:\n${Utils.redact(sparkConf.getAll.toMap).mkString("\n")}")
-      logInfo(s"Classpath elements:\n${childClasspath.mkString("\n")}")
+      logInfo(s"Spark config:\n${Utils.redact(sparkConf.getAll.toMap).sorted.mkString("\n")}")
+      logInfo(s"Classpath elements:\n${childClasspath.sorted.mkString("\n")}")

Review comment:
       Likewise classpath order matters, so not sure we should do this.
   Spark configs ... yeah order should not matter, so that should be sorted.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
martin-g commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1050622482


   Thank you, @HyukjinKwon !
   Nice trick with the Github suggestions feature! 
   `Allow edits and access to secrets by maintainers` is also enabled, so you could even push an empty commit
   
   I used close/reopen PR to re-trigger the checks but this needs https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request and this event is not enabled at https://github.com/apache/spark/blob/e56f8651f2b7e8890cac791db6c01b96946b27bf/.github/workflows/build_and_test.yml#L22. Do you want me to open a new PR to add this event or it is not there intentionally ?
   The change is to add these two lines - https://github.com/apache/avro/blob/f10fa6b9e30e38ec5cde6e00288f0b31b24bf1de/.github/workflows/test-lang-c.yml#L21-L22


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r814470465



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
##########
@@ -327,7 +327,7 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
     |
     |Spark properties used, including those specified through
     | --conf and those from the properties file $propertiesFile:
-    |${Utils.redact(sparkProperties).mkString("  ", "\n  ", "\n")}
+    |${Utils.redact(sparkProperties).sorted.mkString("  ", "\n  ", "\n")}

Review comment:
       ```suggestion
       |${Utils.redact(sparkProperties).sorted.mkStrng("  ", "\n  ", "\n")}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1048852230


   Hm, I think the test failures may be legitimate here. They are tests related to conf. Can you check the failed tests and see if they need an adjustment after the output order changes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g closed pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
martin-g closed pull request #35556:
URL: https://github.com/apache/spark/pull/35556


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
martin-g commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r810858112



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
##########
@@ -906,10 +906,10 @@ private[spark] class SparkSubmit extends Logging {
 
     if (args.verbose) {
       logInfo(s"Main class:\n$childMainClass")
-      logInfo(s"Arguments:\n${childArgs.mkString("\n")}")
+      logInfo(s"Arguments:\n${childArgs.sorted.mkString("\n")}")

Review comment:
       Good catch! Reverted it!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1049906316


   @HyukjinKwon do you know why it seems like the main tests weren't run here? does he need to enable tests in his branch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35556:
URL: https://github.com/apache/spark/pull/35556#discussion_r814470523



##########
File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
##########
@@ -327,7 +327,7 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
     |
     |Spark properties used, including those specified through
     | --conf and those from the properties file $propertiesFile:
-    |${Utils.redact(sparkProperties).mkString("  ", "\n  ", "\n")}
+    |${Utils.redact(sparkProperties).sorted.mkStrng("  ", "\n  ", "\n")}

Review comment:
       ```suggestion
       |${Utils.redact(sparkProperties).sorted.mkString("  ", "\n  ", "\n")}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon edited a comment on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1050516660


   I pushed (and removed back) some arbitrary changes to trigger the CI 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1050516660


   I pushed some arbitrary changes to trigger the CI 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35556: [SPARK-38242][CORE] Sort the SparkSubmit debug output

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35556:
URL: https://github.com/apache/spark/pull/35556#issuecomment-1052189096


   Merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org