Posted to issues@spark.apache.org by "jingxiong zhong (Jira)" <ji...@apache.org> on 2021/12/21 17:53:00 UTC

[jira] [Comment Edited] (SPARK-26404) set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s client-cluster mode.

    [ https://issues.apache.org/jira/browse/SPARK-26404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17463384#comment-17463384 ] 

jingxiong zhong edited comment on SPARK-26404 at 12/21/21, 5:52 PM:
--------------------------------------------------------------------

@gollum999 (Tim Sanders), hello, I have a question: how do I add my Python dependencies to a Spark job? I submit as follows:
{code:sh}
spark-submit \
--archives s3a://path/python3.6.9.tgz#python3.6.9 \
--conf "spark.pyspark.driver.python=python3.6.9/bin/python3" \
--conf "spark.pyspark.python=python3.6.9/bin/python3" \
--name "piroottest" \
./examples/src/main/python/pi.py 10
{code}
This doesn't run my job successfully; it throws the following error:

{code:sh}
Traceback (most recent call last):
  File "/tmp/spark-63b77184-6e89-4121-bc32-6a1b793e0c85/pi.py", line 21, in <module>
    from pyspark.sql import SparkSession
  File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 121, in <module>
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/__init__.py", line 42, in <module>
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 27, in <module>
    async def _ag():
  File "/opt/spark/work-dir/python3.6.9/lib/python3.6/ctypes/__init__.py", line 7, in <module>
    from _ctypes import Union, Structure, Array
ImportError: libffi.so.6: cannot open shared object file: No such file or directory
{code}
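
From the traceback, the unpacked python3.6.9 environment was built on a host that has libffi.so.6, while the Spark container image does not ship that library. A minimal sketch of how one might verify and fix this (the image name is a placeholder and a Debian-based base image is assumed; use the matching package manager and libffi package for your base image):
{code:sh}
# Inside the container: confirm the _ctypes extension can't resolve libffi
ldd python3.6.9/lib/python3.6/lib-dynload/_ctypes.*.so | grep libffi
# expected output when broken: libffi.so.6 => not found

# One fix: extend the Spark image so the shared library is present
cat > Dockerfile <<'EOF'
FROM my-registry/spark-py:2.4.0
USER root
RUN apt-get update && apt-get install -y --no-install-recommends libffi6 \
 && rm -rf /var/lib/apt/lists/*
EOF
docker build -t my-registry/spark-py:2.4.0-libffi .
{code}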

Or is there another way to add Python dependencies?
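
For what it's worth, one pattern that sidesteps host-library mismatches is the one described in Spark's Python packaging docs: build the environment with venv-pack (or conda-pack) on an OS that matches the container, then point both interpreters at the unpacked alias. A sketch, assuming that setup (the archive name and the "environment" alias are placeholders of mine):
{code:sh}
# Build and pack a virtualenv on a machine whose OS/libc matches the image
python3 -m venv pyspark_venv
source pyspark_venv/bin/activate
pip install venv-pack            # plus whatever packages your job needs
venv-pack -o pyspark_venv.tar.gz

# Submit: Spark unpacks the archive under the '#environment' alias
spark-submit \
  --archives pyspark_venv.tar.gz#environment \
  --conf "spark.pyspark.driver.python=environment/bin/python" \
  --conf "spark.pyspark.python=environment/bin/python" \
  ./examples/src/main/python/pi.py 10
{code}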



> set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s client-cluster mode.
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-26404
>                 URL: https://issues.apache.org/jira/browse/SPARK-26404
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Dongqing  Liu
>            Priority: Major
>
> Neither
>    conf.set("spark.executorEnv.PYSPARK_PYTHON", "/opt/pythonenvs/bin/python")
> nor 
>   conf.set("spark.pyspark.python", "/opt/pythonenvs/bin/python") 
> works. 
> Looks like the executor always picks python from PATH.
>  
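
If the executor really does fall back to whatever python is first on PATH, one stopgap until these configs are honored would be to bake the desired interpreter into the executor image so the PATH lookup resolves to it. A rough sketch only (the registry and image tag are placeholders; /opt/pythonenvs/bin/python comes from the description above):
{code:sh}
# Workaround sketch: symlink the desired interpreter ahead of the default
# (/usr/local/bin precedes /usr/bin on a typical default PATH)
cat > Dockerfile <<'EOF'
FROM my-registry/spark-py:2.4.0
RUN ln -sf /opt/pythonenvs/bin/python /usr/local/bin/python
EOF
docker build -t my-registry/spark-py:2.4.0-pyenv .
{code}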



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
