Posted to dev@spark.apache.org by yogita bhardwaj <yo...@iktara.ai> on 2022/09/20 22:20:21 UTC

I have installed pyspark using pip, and I'm getting an error when I run the following code:
from pyspark import SparkContext

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take:{a.take(2)}")

py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DESKTOP-DR2QC97.mshome.net executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
                at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:189)
                at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:109)
                at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:124)
                at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:164)
                at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
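
From what I have read, this error can happen when the Spark workers cannot locate the Python interpreter. Would pointing Spark at my interpreter explicitly be the right direction? A minimal sketch of what I mean, assuming PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are the relevant settings here:

import os
import sys

from pyspark import SparkContext

# Assumption: pointing the workers and the driver at the same interpreter
# (sys.executable) via the standard PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON
# environment variables, set before the SparkContext is created.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take:{a.take(2)}")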

Can anyone please help me resolve this issue?