You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/27 08:34:56 UTC

[GitHub] [spark] stczwd opened a new pull request #28048: Remove useless conf set in pyspark context

stczwd opened a new pull request #28048: Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048
 
 
    ### What changes were proposed in this pull request?
   In Pyspark/context.py, we extract the configuration of the "spark.executorEnv" prefix from conf and put it in env. But in PythonWorkerFactory, we have already obtained these environment variables from ProgressBuilder, thus we do not need the extra copy.
   In addition, in some special environment deployments, user-configured environment variables such as "spark.executorEnv.Path=/usr/lib:$PATH" will be automatically parsed as actual machine confs. If we still directly copy and overwrite it, the configuration will be a wrong config.
   
   ### Does this PR introduce any user-facing change?
   No
   
   
   ### How was this patch tested?
   run suite tests

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399461868
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
-            if k.startswith("spark.executorEnv."):
-                varName = k[len("spark.executorEnv."):]
-                self.environment[varName] = v
-
 
 Review comment:
   Do you think we can have a test case for your claim?
   > If we still directly copy and overwrite it, the configuration will be a wrong config.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-604881890
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201769
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25218/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399929319
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   > Can you describe which issue you faced? It seems not an issue in the actual enviornment.
   
   We did meet some problems in actual environment. When the user submits some basic environment configuration, such as LD_LIBRARY_PATH, we will automatically load the current or related environment before JVM starts on yarn or k8s.
   For example, We have already put hadoop libraries in cluster and set `LD_LIBRARY_PATH='/usr/native_lib'`.  User use these configuration to load hadoop libraries and python libraries.
   ```
   spark.executorEnv.PYTHONHOME ./python
   spark.executorEnv.LD_LIBRARY_PATH $ PYTHONHOME / lib / python2.7 / site-packages: $ LD_LIBRARY_PATH
   ```
   As pyspark always overwrites the jvm environment, the environment, passed into python worker, will be invalidated. LD_LIBRARY_PATH will be configured as `$PYTHONHOME/lib/python2.7/site-packages:$ LD_LIBRARY_PATH`, without any configuration taking effect.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399957870
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   > As pyspark always overwrites the jvm environment, the environment, passed into python worker, will be invalidated.
   
   Are you saying you're relying on an environment that was not set by `spark.executorEnv.*`? The correct way to set the environment variables is to use `spark.executorEnv.*` explicitly. It is correct for `spark.executorEnv.*` to overwrite the environment variables.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-604881890
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399965407
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   > Are you saying you're relying on an environment that was not set by `spark.executorEnv.*`? 
   
   No. We are just trying to be compatible with the user's environment variable parameters in spark.executorEnv. *.
   As for example in the above:
   ```bash
   # these two has same results
   spark.executorEnv.LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH;
   spark.executorEnv.LD_LIBRARY_PATH =./python/lib:/usr/native_lib;
   # this is also supported if user doesn't need $LD_LIBRARY_PATH
   spark.executorEnv.LD_LIBRARY_PATH =./python/lib
   ```
   We do not prevent users from overriding environment variables, we just try to meet users' needs for using cluster environment variables or their environment variables, like linux.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399774901
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   It seems like this is used to fix pyspark's tests, not really needed in actual environment. And this part of code is not to synchronize the configuration of the JVM, but to Ignore it.
   We did meet some problems here. The code here would overwrite the environment variables passed by the JVM, it is terrible especially when we optimized the environment variables in the JVM.
   I will see if there is any way to solve the test failure in [615fb64](https://github.com/apache/spark/commit/615fb649d66b13371927a051d249433d746c5f19). Even if there is no way, we should add the judgment and won't use it in actual environment.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605236342
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120511/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605199160
 
 
   ok to test

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605236342
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120511/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-604882443
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on issue #28048: Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #28048: Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-604889564
 
 
   cc @cloud-fan @dongjoon-hyun 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605559683
 
 
   Thanks for cc'ing me @dongjoon-hyun.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r400663732
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   > with this code, the environment configuration in JVM and python worker won't be same,
   
   I don't see a clear explanation about it. The code here literally want to keep the JVM and python worker env configs to be the same.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605236272
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606412176
 
 
   > Sure, you can reopen this when you have a valid use case and a unit test case, @stczwd .
   
   Thanks, @dongjoon-hyun @HyukjinKwon @cloud-fan 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606339935
 
 
   I still think this is really needed. Maybe reopen this again if other also need this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606338842
 
 
   I still think this is really needed. Maybe reopen this again.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201305
 
 
   **[Test build #120511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120511/testReport)** for PR 28048 at commit [`3fe6f9b`](https://github.com/apache/spark/commit/3fe6f9b7967d26e2f103843d80664fb9a2f93ccc).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399777410
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   Also please fill in "Does this PR introduce any user-facing change?" as well. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399746815
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   Assuming from https://github.com/apache/spark/commit/615fb649d66b13371927a051d249433d746c5f19, seems like this codes are to make sure `self.environment` are synced with the actual `spark.executorEnv` in JVM side. This isn't needed but seems right to keep it matched. I wouldn't remove this unless there's an issue caused by this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399972619
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   In linux, we can use export LD_LIBRARY_PATH to inherit or overwrite environment. We want to support this in spark as well.
   ```bash
   # we got same result in linux
   export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH;
   export LD_LIBRARY_PATH=./python/lib:/usr/native_lib;
   # we can overwrite environment in linux like this
   export LD_LIBRARY_PATH=./python/lib
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201769
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25218/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399934757
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   It will be great to add a test to demonstrate this problem.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201305
 
 
   **[Test build #120511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120511/testReport)** for PR 28048 at commit [`3fe6f9b`](https://github.com/apache/spark/commit/3fe6f9b7967d26e2f103843d80664fb9a2f93ccc).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605199329
 
 
   cc @HyukjinKwon 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605236272
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399777193
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   Can you describe which issue you faced? It seems not an issue in the actual enviornment.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r400585844
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
 
 Review comment:
   The mainly change is in yarn and k8s, not in spark, we should adapt to it.
   What I claim here is that, with this code, the environment configuration in JVM and python worker  won't be same, and the configure passed into python worker is awful.
   You can try `--conf spark.executorEnv.LD_LIBRARY_PATH=$PYTHONHOME/lib/python2.7/site-packages:$ LD_LIBRARY_PATH`, and find what I mean.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606385501
 
 
   Sure, you can reopen this when you have a valid use case and a unit test case, @stczwd .

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-604882443
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-606338842
 
 
   I still think this is really needed. Maybe reopen this again.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201764
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399461868
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
         self.appName = self._conf.get("spark.app.name")
         self.sparkHome = self._conf.get("spark.home", None)
 
-        for (k, v) in self._conf.getAll():
-            if k.startswith("spark.executorEnv."):
-                varName = k[len("spark.executorEnv."):]
-                self.environment[varName] = v
-
 
 Review comment:
   Can we have a test case for your claim?
   > If we still directly copy and overwrite it, the configuration will be a wrong config.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605232957
 
 
   **[Test build #120511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120511/testReport)** for PR 28048 at commit [`3fe6f9b`](https://github.com/apache/spark/commit/3fe6f9b7967d26e2f103843d80664fb9a2f93ccc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28048: [SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#issuecomment-605201764
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org