You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "jzhuge (via GitHub)" <gi...@apache.org> on 2023/02/27 19:22:24 UTC

[GitHub] [spark] jzhuge opened a new pull request, #40199: [SPARK-42596][CORE] OMP_NUM_THREADS not set to number of executor cor…

jzhuge opened a new pull request, #40199:
URL: https://github.com/apache/spark/pull/40199

   ### What changes were proposed in this pull request?
   
   Revert changes in PythonRunner from SPARK-41188.
   
   ### Why are the changes needed?
   
   SPARK-41188 stopped setting OMP_NUM_THREADS to number of executor cores by default when running Python UDF on YARN.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Manual testing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jzhuge commented on a diff in pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "jzhuge (via GitHub)" <gi...@apache.org>.
jzhuge commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119501186


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   Will create an PR once this is merged



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jzhuge commented on a diff in pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "jzhuge (via GitHub)" <gi...@apache.org>.
jzhuge commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119492302


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   Thanks for the clarification! Make sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default
URL: https://github.com/apache/spark/pull/40199


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40199: [SPARK-42596][CORE] OMP_NUM_THREADS not set to number of executor cor…

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119453996


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   We should get this from `spark.task.cpus` or something equivalent from resource profile so the current tasks can uses the number of threads allocated for the number of cores to use (`spark.task.cpus`).
   
   Since this is not enabled by default, I am fine restoring this but would be great if we can track and handle them. Otherwise SPARK-41188 persists with resource profile enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119488775


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   To be clear, I am fine without fixing this now, and just merge this PR. I know it will need non-trivial change to address this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jzhuge commented on a diff in pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "jzhuge (via GitHub)" <gi...@apache.org>.
jzhuge commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119500644


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   Created https://issues.apache.org/jira/browse/SPARK-42613



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40199:
URL: https://github.com/apache/spark/pull/40199#issuecomment-1447340266

   @jzhuge let's fix the PR description. This technically does not report https://github.com/apache/spark/pull/38699 but fixes a mistake that removed the code (that author and reviewers thought it's a duplicate).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jzhuge commented on a diff in pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "jzhuge (via GitHub)" <gi...@apache.org>.
jzhuge commented on code in PR #40199:
URL: https://github.com/apache/spark/pull/40199#discussion_r1119500644


##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -135,6 +135,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
     val execCoresProp = Option(context.getLocalProperty(EXECUTOR_CORES_LOCAL_PROPERTY))
     val memoryMb = Option(context.getLocalProperty(PYSPARK_MEMORY_LOCAL_PROPERTY)).map(_.toLong)
     val localdir = env.blockManager.diskBlockManager.localDirs.map(f => f.getPath()).mkString(",")
+    // if OMP_NUM_THREADS is not explicitly set, override it with the number of cores
+    if (conf.getOption("spark.executorEnv.OMP_NUM_THREADS").isEmpty) {
+      // SPARK-28843: limit the OpenMP thread pool to the number of cores assigned to this executor
+      // this avoids high memory consumption with pandas/numpy because of a large OpenMP thread pool
+      // see https://github.com/numpy/numpy/issues/10455
+      execCoresProp.foreach(envVars.put("OMP_NUM_THREADS", _))

Review Comment:
   Created SPARK-42613 to follow up



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40199:
URL: https://github.com/apache/spark/pull/40199#issuecomment-1447455996

   nit but mind fixing up the PR description? want to merge it now :-)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jzhuge commented on pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "jzhuge (via GitHub)" <gi...@apache.org>.
jzhuge commented on PR #40199:
URL: https://github.com/apache/spark/pull/40199#issuecomment-1447461080

   Updated the PR description


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #40199: [SPARK-42596][CORE] OMP_NUM_THREADS not set to number of executor cor…

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #40199:
URL: https://github.com/apache/spark/pull/40199#issuecomment-1447225712

   cc @WeichenXu123 , @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40199: [SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40199:
URL: https://github.com/apache/spark/pull/40199#issuecomment-1447467586

   Merged to master, branch-3.4, branch-3.3, and branch-3.2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org