Posted to dev@spark.apache.org by javacaoyu <ja...@163.com> on 2022/09/20 09:57:32 UTC

Reply:

Try:


import os

# Both must be set before the SparkContext is created.
os.environ['PYSPARK_PYTHON'] = "python path"
os.environ['SPARK_HOME'] = "SPARK path"
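
A fuller minimal sketch, assuming pyspark was pip-installed into the same interpreter that runs the script. Pointing PYSPARK_PYTHON at sys.executable makes the workers use that interpreter, which is a common fix for the "Python worker failed to connect back" error when Spark picks up a different Python from the PATH:

import os
import sys

# Make the workers use the same interpreter as the driver;
# this must be set before the SparkContext is created.
os.environ['PYSPARK_PYTHON'] = sys.executable

from pyspark import SparkContext

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take:{a.take(2)}")
sc.stop()

With a pip-installed pyspark, SPARK_HOME usually does not need to be set by hand, because the package ships with its own Spark distribution.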
On 2022-09-20 17:51, yogita bhardwaj <yo...@iktara.ai> wrote:

I have installed pyspark using pip.
I'm getting an error while running the following code:
from pyspark import SparkContext

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take:{a.take(2)}")

py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DESKTOP-DR2QC97.mshome.net executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
                at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:189)
                at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:109)
                at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:124)
                at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:164)
                at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)

Can anyone please help me resolve this issue?