Posted to reviews@spark.apache.org by "vinodkc (via GitHub)" <gi...@apache.org> on 2023/05/11 19:18:43 UTC

[GitHub] [spark] vinodkc opened a new pull request, #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

vinodkc opened a new pull request, #41144:
URL: https://github.com/apache/spark/pull/41144

    
   ### What changes were proposed in this pull request?
   In a heterogeneous cluster, it is hard to debug issues caused by differences in operating system, Java, or Python versions. This information is currently missing from the Spark application log. This PR adds it to the application's INFO log, which will help with troubleshooting and debugging.
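
As a rough illustration of the kind of information described above, the following minimal sketch builds the OS and Java version strings from standard JVM system properties. The object `EnvInfoSketch` and its methods are hypothetical names used only for this example and are not part of Spark; in Spark the resulting strings would be passed to `logInfo`.

```scala
// Hypothetical helper, not part of Spark: formats host details from
// standard JVM system properties.
object EnvInfoSketch {
  // Read a system property, falling back to "unknown" if it is unset.
  private def prop(key: String): String =
    Option(System.getProperty(key)).getOrElse("unknown")

  // OS name, version, and architecture in one line.
  def osInfo: String =
    s"OS info ${prop("os.name")}, ${prop("os.version")}, ${prop("os.arch")}"

  // Java version on its own line.
  def javaInfo: String =
    s"Java version ${prop("java.version")}"
}
```

Calling `println(EnvInfoSketch.osInfo)` and `println(EnvInfoSketch.javaInfo)` on each host would make version differences between nodes visible in the output.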
   
   ### Why are the changes needed?
   To troubleshoot host-specific issues in Spark applications running on a heterogeneous cluster.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually tested


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1193253380


##########
core/src/main/scala/org/apache/spark/SparkContext.scala:
##########
@@ -2114,6 +2114,8 @@ class SparkContext(config: SparkConf) extends Logging {
    * @param exitCode Specified exit code that will passed to scheduler backend in client mode.
    */
   def stop(exitCode: Int): Unit = {
+    logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+      s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   Hmm.. I would expect to see these when we start it (not stop).



[GitHub] [spark] vinodkc commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608062932

   @dongjoon-hyun, to avoid overhead, `logPythonInfo` is controlled by `spark.executor.python.worker.log.details`, which is _false_ by default.
   Can you please test with `--conf spark.executor.python.worker.log.details=true`?


[GitHub] [spark] vinodkc commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1576034499

   > For the Python, the information is printed at every task execution. Could you find a proper place to print that info once, @vinodkc ?
   > 
   > ```
   > 23/06/03 18:29:46 INFO PythonRunner: Python version info: python3(3.11)
   > 23/06/03 18:29:46 INFO PythonRunner: Python version info: python3(3.11)
   > 23/06/03 18:29:46 INFO PythonRunner: Times: total = 38, boot = -8568, init = 8606, finish = 0
   > 23/06/03 18:29:46 INFO PythonRunner: Times: total = 38, boot = -8571, init = 8609, finish = 0
   > 23/06/03 18:29:46 INFO Executor: Finished task 0.0 in stage 18.0 (TID 26). 1393 bytes result sent to driver
   > 23/06/03 18:29:46 INFO Executor: Finished task 1.0 in stage 18.0 (TID 27). 1396 bytes result sent to driver
   > 23/06/03 18:29:50 INFO CoarseGrainedExecutorBackend: Got assigned task 28
   > 23/06/03 18:29:50 INFO CoarseGrainedExecutorBackend: Got assigned task 29
   > 23/06/03 18:29:50 INFO Executor: Running task 0.0 in stage 19.0 (TID 28)
   > 23/06/03 18:29:50 INFO Executor: Running task 1.0 in stage 19.0 (TID 29)
   > 23/06/03 18:29:50 INFO TorrentBroadcast: Started reading broadcast variable 14 with 1 pieces (estimated total size 4.0 MiB)
   > 23/06/03 18:29:50 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes in memory (estimated size 3.7 KiB, free 366.2 MiB)
   > 23/06/03 18:29:50 INFO TorrentBroadcast: Reading broadcast variable 14 took 20 ms
   > 23/06/03 18:29:50 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 5.7 KiB, free 366.2 MiB)
   > 23/06/03 18:29:50 INFO PythonRunner: Python version info: python3(3.11)
   > 23/06/03 18:29:50 INFO PythonRunner: Python version info: python3(3.11)
   > ```
   
   Done


[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214791186


##########
core/src/main/scala/org/apache/spark/SparkContext.scala:
##########
@@ -2114,6 +2114,8 @@ class SparkContext(config: SparkConf) extends Logging {
    * @param exitCode Specified exit code that will passed to scheduler backend in client mode.
    */
   def stop(exitCode: Int): Unit = {
+    logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+      s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   Thanks for the review comment. I have addressed it.



[GitHub] [spark] mridulm commented on pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1549107922

   These are part of the Spark environment tab, right? Why do we need to log them?


[GitHub] [spark] dongjoon-hyun commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608125882

   Thank you for your patience! Merged to master for Apache Spark 3.5.0.
   
   Thank you, @vinodkc , @HyukjinKwon , @mridulm .


[GitHub] [spark] mridulm commented on a diff in pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1225416018


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -704,7 +704,13 @@ private[spark] object PythonRunner {
   // already running worker monitor threads for worker and task attempts ID pairs
   val runningMonitorThreads = ConcurrentHashMap.newKeySet[(Socket, Long)]()
 
+  private var printPythonInfo = true
+
   def apply(func: PythonFunction): PythonRunner = {
+    if (printPythonInfo) {
+      printPythonInfo = false
+      PythonUtils.logPythonInfo(func.pythonExec)
+    }

Review Comment:
   Dongjoon has more context around whether this is the right place.
   One comment - if it is indeed the only place to put this, then `printPythonInfo` is being used in an MT-unsafe way.
   Make it an `AtomicBoolean`
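
The suggestion above can be sketched as follows. This is a minimal illustration under assumptions, not the code that was merged: `LogOnceSketch` and `logOnce` are hypothetical names, and the `action` parameter stands in for the call to `PythonUtils.logPythonInfo`.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the guard discussed above; not Spark's actual code.
object LogOnceSketch {
  private val alreadyLogged = new AtomicBoolean(false)

  // Runs `action` only for the first caller. compareAndSet flips the flag
  // atomically, so two concurrent callers cannot both observe `false`.
  // Returns true if this call ran the action.
  def logOnce(action: => Unit): Boolean = {
    if (alreadyLogged.compareAndSet(false, true)) {
      action
      true
    } else {
      false
    }
  }
}
```

With a plain `var`, two threads could both read `true` before either writes `false`, so the message could be logged twice; `compareAndSet` closes that window.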



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214831239


##########
core/src/main/scala/org/apache/spark/executor/Executor.scala:
##########
@@ -71,6 +71,8 @@ private[spark] class Executor(
   extends Logging {
 
   logInfo(s"Starting executor ID $executorId on host $executorHostname")
+  logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+    s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   ditto. Newline for Java version.



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1216028837


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -106,6 +106,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   protected val envVars: java.util.Map[String, String] = funcs.head.funcs.head.envVars
   protected val pythonExec: String = funcs.head.funcs.head.pythonExec
   protected val pythonVer: String = funcs.head.funcs.head.pythonVer
+  logInfo(s"Python version info: $pythonExec($pythonVer)")

Review Comment:
   This is printed multiple times, isn't it?



[GitHub] [spark] dongjoon-hyun commented on pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1574301599

   Thank you for the updates.


[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1215599942


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -106,7 +106,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   protected val envVars: java.util.Map[String, String] = funcs.head.funcs.head.envVars
   protected val pythonExec: String = funcs.head.funcs.head.pythonExec
   protected val pythonVer: String = funcs.head.funcs.head.pythonVer
-

Review Comment:
   Done



[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1218708566


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -704,7 +704,13 @@ private[spark] object PythonRunner {
   // already running worker monitor threads for worker and task attempts ID pairs
   val runningMonitorThreads = ConcurrentHashMap.newKeySet[(Socket, Long)]()
 
+  private var printPythonInfo = true
+
   def apply(func: PythonFunction): PythonRunner = {
+    if (printPythonInfo) {
+      printPythonInfo = false
+      PythonUtils.logPythonInfo(func.pythonExec)
+    }

Review Comment:
   Yes, this is the first place I could find in the executor where we get the expected Python program name from the driver.



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1218656537


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -704,7 +704,13 @@ private[spark] object PythonRunner {
   // already running worker monitor threads for worker and task attempts ID pairs
   val runningMonitorThreads = ConcurrentHashMap.newKeySet[(Socket, Long)]()
 
+  private var printPythonInfo = true
+
   def apply(func: PythonFunction): PythonRunner = {
+    if (printPythonInfo) {
+      printPythonInfo = false
+      PythonUtils.logPythonInfo(func.pythonExec)
+    }

Review Comment:
   Is this the only way? Checking the `if` condition at every `apply` invocation looks like too much.
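
One possible alternative to an explicit flag, sketched here under assumptions rather than taken from the PR, is a Scala `lazy val`, whose initializer runs at most once and is synchronized by the compiler. All names below are hypothetical. Note that a `lazy val` access still performs an initialization check internally, so this hides the check rather than removing it, and the initializer cannot take a per-call argument such as `func.pythonExec`.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch; not Spark's actual code.
object LazyLogSketch {
  // Counts how many times the initializer ran (for demonstration only).
  val initCount = new AtomicInteger(0)

  // The body runs on first access only; later accesses return the cached value.
  lazy val pythonInfoLogged: Boolean = {
    initCount.incrementAndGet()
    println("Python info would be logged here")
    true
  }

  // Stand-in for a factory method that triggers the one-time logging.
  def apply(): Boolean = pythonInfoLogged
}
```

Calling `LazyLogSketch.apply()` repeatedly prints the message once and leaves `initCount` at 1.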



[GitHub] [spark] dongjoon-hyun commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608065727

   I tried already, but it doesn't work~


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214830522


##########
core/src/main/scala/org/apache/spark/SparkContext.scala:
##########
@@ -194,6 +194,8 @@ class SparkContext(config: SparkConf) extends Logging {
 
   // log out Spark Version in Spark driver log
   logInfo(s"Running Spark version $SPARK_VERSION")
+  logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+    s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   Maybe, new line for `Java version`?



[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214891310


##########
core/src/main/scala/org/apache/spark/SparkContext.scala:
##########
@@ -194,6 +194,8 @@ class SparkContext(config: SparkConf) extends Logging {
 
   // log out Spark Version in Spark driver log
   logInfo(s"Running Spark version $SPARK_VERSION")
+  logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+    s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   Do you mean, like this?
   ```
     logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
       s"${System.getProperty("os.arch")} " +
       s"Java version ${System.getProperty("java.version")}")
   ```
   OR like this?
   ```
     logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
       s"${System.getProperty("os.arch")}")
     logInfo(s"Java version ${System.getProperty("java.version")}")
   ```



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214831077


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -106,7 +106,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   protected val envVars: java.util.Map[String, String] = funcs.head.funcs.head.envVars
   protected val pythonExec: String = funcs.head.funcs.head.pythonExec
   protected val pythonVer: String = funcs.head.funcs.head.pythonVer
-

Review Comment:
   Shall we keep the empty line to split the variable definition and `logInfo`?



[GitHub] [spark] dongjoon-hyun closed pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log
URL: https://github.com/apache/spark/pull/41144


[GitHub] [spark] dongjoon-hyun commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608067476

   I tested two cases, but cannot find the Python package logs.
   - Spark Standalone Cluster (one master and one executor) with `spark-defaults.conf`.
   - A single JVM `pyspark`


[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add operating system ,Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1214891310


##########
core/src/main/scala/org/apache/spark/SparkContext.scala:
##########
@@ -194,6 +194,8 @@ class SparkContext(config: SparkConf) extends Logging {
 
   // log out Spark Version in Spark driver log
   logInfo(s"Running Spark version $SPARK_VERSION")
+  logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
+    s"${System.getProperty("os.arch")}, Java version ${System.getProperty("java.version")}")

Review Comment:
   Do you mean like this?
   ```
   logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
     s"${System.getProperty("os.arch")} " +
     s"Java version ${System.getProperty("java.version")}")
   ```
   Or like this?
   ```
   logInfo(s"OS info ${System.getProperty("os.name")}, ${System.getProperty("os.version")}, " +
     s"${System.getProperty("os.arch")}")
   logInfo(s"Java version ${System.getProperty("java.version")}")
   ```
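
   For comparison, the system properties read in both options could be gathered in one small helper, so that choosing between one log line and two becomes a one-line change. This is only a sketch under assumptions: `EnvInfo`, `osInfo`, and `javaInfo` are hypothetical names that are not part of this PR, and `println` stands in for `logInfo`.

   ```scala
   // Hypothetical helper, not part of the PR: collects the JVM system
   // properties that the diff above reads inline.
   object EnvInfo {
     // Falls back to "unknown" when a property is not set.
     private def prop(key: String): String =
       Option(System.getProperty(key)).getOrElse("unknown")

     // Formats as "<os.name>, <os.version>, <os.arch>".
     def osInfo: String =
       Seq("os.name", "os.version", "os.arch").map(prop).mkString(", ")

     def javaInfo: String = prop("java.version")

     def main(args: Array[String]): Unit = {
       // println stands in for Spark's logInfo in this sketch.
       println(s"OS info $osInfo")
       println(s"Java version $javaInfo")
     }
   }
   ```

   With such a helper, the first option would be a single call that concatenates `osInfo` and `javaInfo`, and the second option would be the two separate calls shown in `main`.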





[GitHub] [spark] vinodkc commented on a diff in pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on code in PR #41144:
URL: https://github.com/apache/spark/pull/41144#discussion_r1217472619


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -106,6 +106,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   protected val envVars: java.util.Map[String, String] = funcs.head.funcs.head.envVars
   protected val pythonExec: String = funcs.head.funcs.head.pythonExec
   protected val pythonVer: String = funcs.head.funcs.head.pythonVer
+  logInfo(s"Python version info: $pythonExec($pythonVer)")

Review Comment:
   Updated the code to print the Python version and package details only once per executor.
   It now also logs the list of installed packages.
   A new property, `spark.executor.python.worker.log.details`, has been added to enable this Python info logging.
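
   One way to get "only once per executor" behind a flag is an atomic guard, roughly as sketched below. This is not the code in the PR: `OnceLogger` and `OnceLoggerDemo` are hypothetical names, the flag is hard-coded rather than read from the Spark configuration, the sample message values are made up, and `println` stands in for `logInfo`.

   ```scala
   import java.util.concurrent.atomic.AtomicBoolean

   // Hypothetical sketch, not the PR's implementation.
   final class OnceLogger(enabled: Boolean) {
     private val alreadyLogged = new AtomicBoolean(false)

     // Emits the message at most once per instance.
     // Returns true only for the call that actually logged.
     def logOnce(message: => String)(log: String => Unit): Boolean =
       // compareAndSet flips false -> true for exactly one caller,
       // even when several task threads race.
       if (enabled && alreadyLogged.compareAndSet(false, true)) {
         log(message)
         true
       } else {
         false
       }
   }

   object OnceLoggerDemo {
     def main(args: Array[String]): Unit = {
       // In Spark the flag would come from the property mentioned above;
       // it is hard-coded here for illustration.
       val logger = new OnceLogger(enabled = true)
       // Only the first call prints; the sample values are illustrative.
       logger.logOnce("Python version info: python3(3.10)")(println)
       logger.logOnce("Python version info: python3(3.10)")(println)
     }
   }
   ```

   To cover a whole executor, a single `OnceLogger` instance would need to live in a JVM-wide singleton, since each executor runs in its own JVM.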
   





[GitHub] [spark] vinodkc commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "vinodkc (via GitHub)" <gi...@apache.org>.
vinodkc commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608090108

   @dongjoon-hyun, could you please share the pyspark command you used for testing?
   
   This is the command I used for local mode:
   `./bin/pyspark  --jars --master local  --conf spark.executor.python.worker.log.details=true  --conf spark.log.level=INFO`
   
   Then I searched the console output for `Python version` or `List of Python packages`.
   




[GitHub] [spark] dongjoon-hyun commented on pull request #41144: [SPARK-43470][CORE] Add OS, Java, Python version information to application log

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41144:
URL: https://github.com/apache/spark/pull/41144#issuecomment-1608093307

   Thanks. Let me try once more and share the result.

