You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "cocoatomo (JIRA)" <ji...@apache.org> on 2014/10/03 02:04:33 UTC

[jira] [Created] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number

cocoatomo created SPARK-3772:
--------------------------------

             Summary: RDD operation on IPython REPL failed with an illegal port number
                 Key: SPARK-3772
                 URL: https://issues.apache.org/jira/browse/SPARK-3772
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.2.0
         Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
            Reporter: cocoatomo


To reproduce this issue, we should execute following commands.

{quote}
$ PYSPARK_PYTHON=ipython ./bin/pyspark
...
In [1]: file = sc.textFile('README.md')
In [2]: file.first()
...
14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334
14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) with 1 output partitions (allowLocal=true)
14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:334)
14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44), which has no missing parents
14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with curMem=57388, maxMem=278019440
14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 265.1 MB)
14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44)
14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1207 bytes)
14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: port out of range:1027423549
	at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
	at java.net.Socket.<init>(Socket.java:244)
	at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
	at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
	at org.apache.spark.scheduler.Task.run(Task.scala:56)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:744)
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org