Posted to user@spark.apache.org by Nicholas Chammas <ni...@gmail.com> on 2014/03/20 06:12:49 UTC

PySpark worker fails with IOError Broken Pipe

So I have the pyspark shell open, and after some idle time I sometimes get
this:

>>> PySpark worker failed with exception:
> Traceback (most recent call last):
>   File "/root/spark/python/pyspark/worker.py", line 77, in main
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/root/spark/python/pyspark/serializers.py", line 182, in dump_stream
>     self.serializer.dump_stream(self._batched(iterator), stream)
>   File "/root/spark/python/pyspark/serializers.py", line 118, in dump_stream
>     self._write_with_length(obj, stream)
>   File "/root/spark/python/pyspark/serializers.py", line 130, in _write_with_length
>     stream.write(serialized)
> IOError: [Errno 32] Broken pipe
> Traceback (most recent call last):
>   File "/root/spark/python/pyspark/daemon.py", line 117, in launch_worker
>     worker(listen_sock)
>   File "/root/spark/python/pyspark/daemon.py", line 107, in worker
>     outfile.flush()
> IOError: [Errno 32] Broken pipe
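
If I'm reading the trace right, errno 32 is EPIPE: the worker's write fails
because the other end of the pipe/socket (the JVM side, I assume) has
already closed. Here's a minimal sketch, not Spark code, that reproduces
the same error class; that the JVM actually closes its end during idle time
is just my guess:

import errno
import os

# Open a pipe, then close the read end, standing in for the other side
# dropping its end of the worker connection.
read_fd, write_fd = os.pipe()
os.close(read_fd)

try:
    # CPython ignores SIGPIPE at startup, so the failed write raises an
    # exception instead of killing the process with a signal.
    os.write(write_fd, b"serialized partition bytes")
except OSError as e:  # surfaces as IOError [Errno 32] on Python 2
    assert e.errno == errno.EPIPE
    print("Broken pipe, same errno as the trace above")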


The shell is still alive and I can continue to do work.

Is this anything to worry about or fix?

Nick





Re: PySpark worker fails with IOError Broken Pipe

Posted by Nicholas Chammas <ni...@gmail.com>.
I'm using Spark 0.9.0 on EC2, deployed via spark-ec2.

The few times it's happened to me so far, the shell will just be idle for a
few minutes and then, BAM, I get that error, but the shell still seems to
work.

If I find a pattern to the issue I will report it here.
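
If it helps anyone poke at this in the meantime, here's roughly how I'd try
to reproduce it (the job, the idle duration, and even whether idling matters
are guesses on my part, not a confirmed trigger):

import time

# In a pyspark shell on the EC2 cluster; sc is the shell's SparkContext.
rdd = sc.parallelize(range(100))
print(rdd.count())   # first job runs fine

time.sleep(15 * 60)  # leave the shell idle for a while

print(rdd.count())   # watch stderr for the broken-pipe trace around here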


On Thu, Mar 20, 2014 at 8:10 AM, Jim Blomo <ji...@gmail.com> wrote:

> I think I've encountered the same problem and filed
> https://spark-project.atlassian.net/plugins/servlet/mobile#issue/SPARK-1284
>
> For me it hung the worker, though. Can you add reproduction steps and the
> version you're running?
> On Mar 19, 2014 10:13 PM, "Nicholas Chammas" <ni...@gmail.com>
> wrote:
>
>> [...]
>

Re: PySpark worker fails with IOError Broken Pipe

Posted by Jim Blomo <ji...@gmail.com>.
I think I've encountered the same problem and filed
https://spark-project.atlassian.net/plugins/servlet/mobile#issue/SPARK-1284

For me it hung the worker, though. Can you add reproduction steps and the
version you're running?
On Mar 19, 2014 10:13 PM, "Nicholas Chammas" <ni...@gmail.com>
wrote:

> [...]