Posted to reviews@spark.apache.org by swaapnika-guntaka <gi...@git.apache.org> on 2017/10/11 17:57:26 UTC

[GitHub] spark issue #30: SPARK-1004. PySpark on YARN

Github user swaapnika-guntaka commented on the issue:

    https://github.com/apache/spark/pull/30
  
    I see a Java EOFException when I run a packaged Python jar (built with JDK 8) on Spark 2.2.
    I'm trying to run it with the command below.
    `time bash -x $SPARK_HOME/bin/spark-submit --driver-class-path .:<pathtojars>:</spark/python/lib> -v $PYTHONPATH/<packaged.jar> >& run.log`
    ```
    Recent failure: Lost task 3.3 in stage 0.0 (TID 36, 10.15.163.25, executor 0): java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:166)
        at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
        at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:395)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    ```
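
    As a sanity check (a minimal sketch, not taken from the failing job; the app name, partition count, and filename below are arbitrary), the snippet after this paragraph exercises the same PythonWorkerFactory/daemon startup path the stack trace points at, without involving the packaged jar. If it fails with the same EOFException, the problem is likely in the Python worker setup on the executors rather than in the jar itself.
    ```
    # Minimal sketch (not part of the original report): a bare PySpark job that
    # forces each executor to start a Python worker, i.e. the code path that
    # fails in PythonWorkerFactory.startDaemon above.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("python-worker-sanity-check")  # app name is arbitrary
    sc = SparkContext(conf=conf)

    # parallelize + map + count is enough to spin up Python workers on the executors
    print(sc.parallelize(range(100), 4).map(lambda x: x * x).count())

    sc.stop()
    ```
    Submitted with something like `spark-submit sanity_check.py` (the filename is hypothetical).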


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org