You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/23 04:18:31 UTC
[GitHub] [spark] Yikun opened a new pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Yikun opened a new pull request #34993:
URL: https://github.com/apache/spark/pull/34993
### What changes were proposed in this pull request?
Replace `os.environ` direct access with `_find_spark_home` in pyspark testing utils.
### Why are the changes needed?
When running PySpark unit test in WSL with PyCharm, all test raise the `KeyError: 'SPARK_HOME'`
```
Launching unittests with arguments python -m unittest test_rdd.RDDTests.test_range in /home/yikun/spark/python/pyspark/testsTraceback (most recent call last):
File "/mnt/d/Program Files/JetBrains/PyCharm 2021.1.3/plugins/python/helpers/pycharm/_jb_unittest_runner.py", line 35, in <module>
sys.exit(main(argv=args, module=None, testRunner=unittestpy.TeamcityTestRunner, buffer=not JB_DISABLE_BUFFERING))
File "/usr/lib/python3.8/unittest/main.py", line 100, in __init__
self.parseArgs(argv)
File "/usr/lib/python3.8/unittest/main.py", line 147, in parseArgs
self.createTests()
File "/usr/lib/python3.8/unittest/main.py", line 158, in createTests
self.test = self.testLoader.loadTestsFromNames(self.testNames,
File "/usr/lib/python3.8/unittest/loader.py", line 220, in loadTestsFromNames
suites = [self.loadTestsFromName(name, module) for name in names]
File "/usr/lib/python3.8/unittest/loader.py", line 220, in <listcomp>
suites = [self.loadTestsFromName(name, module) for name in names]
File "/usr/lib/python3.8/unittest/loader.py", line 154, in loadTestsFromName
module = __import__(module_name)
File "/home/yikun/spark/python/pyspark/tests/test_rdd.py", line 37, in <module>
from pyspark.testing.utils import ReusedPySparkTestCase, SPARK_HOME, QuietTest
File "/home/yikun/spark/python/pyspark/testing/utils.py", line 47, in <module>
SPARK_HOME = os.environ["SPARK_HOME"]#_find_spark_home()
File "/usr/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'SPARK_HOME'
```
We should use `_find_spark_home` to get SPARK_HOME rather than access `os.environ`.
### Does this PR introduce _any_ user-facing change?
No, test only
### How was this patch tested?
- UT passed.
- Run test in Win WSL.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774350553
##########
File path: python/pyspark/testing/utils.py
##########
@@ -152,7 +153,7 @@ def close(self):
def search_jar(project_relative_path, sbt_jar_name_prefix, mvn_jar_name_prefix):
# Note that 'sbt_jar_name_prefix' and 'mvn_jar_name_prefix' are used since the prefix can
# vary for SBT or Maven specifically. See also SPARK-26856
- project_full_path = os.path.join(os.environ["SPARK_HOME"], project_relative_path)
+ project_full_path = os.path.join(_find_spark_home(), project_relative_path)
Review comment:
I think we can just use `SPARK_HOME`. SPARK_HOME should be set before importing this in our testing script anyway.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000050229
**[Test build #146500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146500/testReport)** for PR 34993 at commit [`2b1040f`](https://github.com/apache/spark/commit/2b1040fecf10cc8697c36b8c50fc62c474c84d4f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon closed pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34993:
URL: https://github.com/apache/spark/pull/34993
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774372541
##########
File path: python/pyspark/sql/utils.py
##########
@@ -245,12 +246,7 @@ def require_test_compiled() -> None:
import os
import glob
- try:
- spark_home = os.environ["SPARK_HOME"]
- except KeyError:
- raise RuntimeError("SPARK_HOME is not defined in environment")
-
- test_class_path = os.path.join(spark_home, "sql", "core", "target", "*", "test-classes")
+ test_class_path = os.path.join(_find_spark_home(), "sql", "core", "target", "*", "test-classes")
Review comment:
It works, but I assumed `python/pyspark/testing/utils.py` should be imported py in tests/* rather than utils (even this function is only used for tests), it perhaps bring [some redundant test import and even circle dependecy](https://github.com/apache/spark/blob/6fb417e21c98283c40713698c58a73e68ea9614f/python/pyspark/testing/sqlutils.py#L30-L58) to sql utils, might bring some performance regression when someone use sql utils in non-test code, so I didn't change this in here. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000059358
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50976/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774333683
##########
File path: python/pyspark/testing/utils.py
##########
@@ -22,6 +22,7 @@
import unittest
from time import time, sleep
+from pyspark.find_spark_home import _find_spark_home
from pyspark import SparkContext, SparkConf
Review comment:
Actually it says:
> imports should be sorted lexicographically, ignoring case, according to each module’s full package path (the `path` in `from path import ...`)
Since `pyspark.` is after `pyspark` , I think placing `from pyspark import SparkContext, SparkConf` up seems correct :-).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000085510
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50976/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000110567
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146505/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774350685
##########
File path: python/pyspark/tests/test_appsubmit.py
##########
@@ -23,13 +23,15 @@
import unittest
import zipfile
+from pyspark.find_spark_home import _find_spark_home
+
class SparkSubmitTests(unittest.TestCase):
def setUp(self):
self.programDir = tempfile.mkdtemp()
tmp_dir = tempfile.gettempdir()
self.sparkSubmit = [
- os.path.join(os.environ.get("SPARK_HOME"), "bin", "spark-submit"),
+ os.path.join(_find_spark_home(), "bin", "spark-submit"),
Review comment:
Here too. Let's just import `SPARK_HOME`. I think it's fine.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000042280
**[Test build #146500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146500/testReport)** for PR 34993 at commit [`2b1040f`](https://github.com/apache/spark/commit/2b1040fecf10cc8697c36b8c50fc62c474c84d4f).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000132241
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50981/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000110567
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146505/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000096203
**[Test build #146505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146505/testReport)** for PR 34993 at commit [`90648bc`](https://github.com/apache/spark/commit/90648bc43a62656742f65bb3f657b8e154147b3c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000141782
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50981/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000085461
**[Test build #146505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146505/testReport)** for PR 34993 at commit [`90648bc`](https://github.com/apache/spark/commit/90648bc43a62656742f65bb3f657b8e154147b3c).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774372541
##########
File path: python/pyspark/sql/utils.py
##########
@@ -245,12 +246,7 @@ def require_test_compiled() -> None:
import os
import glob
- try:
- spark_home = os.environ["SPARK_HOME"]
- except KeyError:
- raise RuntimeError("SPARK_HOME is not defined in environment")
-
- test_class_path = os.path.join(spark_home, "sql", "core", "target", "*", "test-classes")
+ test_class_path = os.path.join(_find_spark_home(), "sql", "core", "target", "*", "test-classes")
Review comment:
It works, but I assumed `python/pyspark/testing/utils.py` should be imported in tests/* rather than utils (even this function is only used for tests), it perhaps bring [some redundant test import and even circle dependecy](https://github.com/apache/spark/blob/6fb417e21c98283c40713698c58a73e68ea9614f/python/pyspark/testing/sqlutils.py#L30-L58) to sql utils, might bring some performance regression when someone use sql utils in non-test code, so I didn't change this in here. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000104776
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50981/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774313424
##########
File path: python/pyspark/testing/utils.py
##########
@@ -22,6 +22,7 @@
import unittest
from time import time, sleep
+from pyspark.find_spark_home import _find_spark_home
from pyspark import SparkContext, SparkConf
Review comment:
```suggestion
from pyspark import SparkContext, SparkConf
from pyspark.find_spark_home import _find_spark_home
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000192240
Merged to master.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000060766
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146500/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774357011
##########
File path: python/pyspark/sql/utils.py
##########
@@ -245,12 +246,7 @@ def require_test_compiled() -> None:
import os
import glob
- try:
- spark_home = os.environ["SPARK_HOME"]
- except KeyError:
- raise RuntimeError("SPARK_HOME is not defined in environment")
-
- test_class_path = os.path.join(spark_home, "sql", "core", "target", "*", "test-classes")
+ test_class_path = os.path.join(_find_spark_home(), "sql", "core", "target", "*", "test-classes")
Review comment:
Can we use `pyspark.testing.utils.SPARK_HOME` here too?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774372541
##########
File path: python/pyspark/sql/utils.py
##########
@@ -245,12 +246,7 @@ def require_test_compiled() -> None:
import os
import glob
- try:
- spark_home = os.environ["SPARK_HOME"]
- except KeyError:
- raise RuntimeError("SPARK_HOME is not defined in environment")
-
- test_class_path = os.path.join(spark_home, "sql", "core", "target", "*", "test-classes")
+ test_class_path = os.path.join(_find_spark_home(), "sql", "core", "target", "*", "test-classes")
Review comment:
It works, but I assumed `python/pyspark/testing/utils.py` should be imported py in tests/* rather than utils (even this function is only used for tests), it perhaps bring [some redundant test import and even circle dependecy](https://github.com/apache/spark/blob/6fb417e21c98283c40713698c58a73e68ea9614f/python/pyspark/testing/sqlutils.py#L30-L58) to sql utils, might bring some performance regression when someone use sql utils in non-test code, so I didn't change this in here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000060766
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146500/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000141782
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50981/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000042280
**[Test build #146500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146500/testReport)** for PR 34993 at commit [`2b1040f`](https://github.com/apache/spark/commit/2b1040fecf10cc8697c36b8c50fc62c474c84d4f).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000085510
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50976/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000085461
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #34993:
URL: https://github.com/apache/spark/pull/34993#issuecomment-1000059071
> Maybe let's fix https://github.com/apache/spark/blob/master/python/pyspark/sql/utils.py#L248-L251 too.
Sure, will fix in next commit after confirm https://github.com/apache/spark/pull/34993#discussion_r774325742
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774338593
##########
File path: python/pyspark/testing/utils.py
##########
@@ -22,6 +22,7 @@
import unittest
from time import time, sleep
+from pyspark.find_spark_home import _find_spark_home
from pyspark import SparkContext, SparkConf
Review comment:
oops, I didn't catch the ".", thanks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774464655
##########
File path: python/pyspark/sql/utils.py
##########
@@ -245,12 +246,7 @@ def require_test_compiled() -> None:
import os
import glob
- try:
- spark_home = os.environ["SPARK_HOME"]
- except KeyError:
- raise RuntimeError("SPARK_HOME is not defined in environment")
-
- test_class_path = os.path.join(spark_home, "sql", "core", "target", "*", "test-classes")
+ test_class_path = os.path.join(_find_spark_home(), "sql", "core", "target", "*", "test-classes")
Review comment:
oh okay, thats fine.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] Yikun commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774325742
##########
File path: python/pyspark/testing/utils.py
##########
@@ -22,6 +22,7 @@
import unittest
from time import time, sleep
+from pyspark.find_spark_home import _find_spark_home
from pyspark import SparkContext, SparkConf
Review comment:
I'm okay with this change, but I'm not sure why we need this change?
Actually, I deliberately put this addtion to the front 😂, because [1]: `imports should be sorted lexicographically`. And flake8 also has a plugin `import-order-style` (not enable in Spark) helps to check it.
`pysparkfind_spark_home_find_spark_home` should before `pysparkSparkContextSparkConf`.
[1] https://google.github.io/styleguide/pyguide.html?showone=Imports_formatting#313-imports-formatting
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774332398
##########
File path: python/pyspark/testing/utils.py
##########
@@ -22,6 +22,7 @@
import unittest
from time import time, sleep
+from pyspark.find_spark_home import _find_spark_home
from pyspark import SparkContext, SparkConf
Review comment:
oh okay
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34993: [SPARK-37721][PYTHON] Fix SPARK_HOME key error when running PySpark test
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34993:
URL: https://github.com/apache/spark/pull/34993#discussion_r774350553
##########
File path: python/pyspark/testing/utils.py
##########
@@ -152,7 +153,7 @@ def close(self):
def search_jar(project_relative_path, sbt_jar_name_prefix, mvn_jar_name_prefix):
# Note that 'sbt_jar_name_prefix' and 'mvn_jar_name_prefix' are used since the prefix can
# vary for SBT or Maven specifically. See also SPARK-26856
- project_full_path = os.path.join(os.environ["SPARK_HOME"], project_relative_path)
+ project_full_path = os.path.join(_find_spark_home(), project_relative_path)
Review comment:
I think we can just use `pyspark.testing.utils.SPARK_HOME`. SPARK_HOME should be set before importing this in our testing script anyway.
##########
File path: python/pyspark/tests/test_appsubmit.py
##########
@@ -23,13 +23,15 @@
import unittest
import zipfile
+from pyspark.find_spark_home import _find_spark_home
+
class SparkSubmitTests(unittest.TestCase):
def setUp(self):
self.programDir = tempfile.mkdtemp()
tmp_dir = tempfile.gettempdir()
self.sparkSubmit = [
- os.path.join(os.environ.get("SPARK_HOME"), "bin", "spark-submit"),
+ os.path.join(_find_spark_home(), "bin", "spark-submit"),
Review comment:
Here too. Let's just import `pyspark.testing.utils.SPARK_HOME`. I think it's fine.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org