Posted to user@spark.apache.org by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/07/22 05:35:22 UTC

new error for me

Does anyone know what this error means:
14/07/21 23:07:22 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
14/07/21 23:07:22 INFO TaskSetManager: Starting task 3.0:0 as TID 1620 on executor 27: r104u05.oculus.local (PROCESS_LOCAL)
14/07/21 23:07:22 INFO TaskSetManager: Serialized task 3.0:0 as 8620 bytes in 1 ms
14/07/21 23:07:36 INFO BlockManagerInfo: Added taskresult_1620 in memory on r104u05.oculus.local:50795 (size: 64.9 MB, free: 18.3 GB)
14/07/21 23:07:36 INFO SendingConnection: Initiating connection to [r104u05.oculus.local/192.168.0.105:50795]
14/07/21 23:07:57 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@1d86a150
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:265)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:115)
14/07/21 23:07:57 WARN SendingConnection: Error finishing connection to r104u05.oculus.local/192.168.0.105:50795
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.spark.network.SendingConnection.finishConnect(Connection.scala:318)
    at org.apache.spark.network.ConnectionManager$$anon$7.run(ConnectionManager.scala:202)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
14/07/21 23:07:57 INFO ConnectionManager: Handling connection error on connection to ConnectionManagerId(r104u05.oculus.local,50795)
14/07/21 23:07:57 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(r104u05.oculus.local,50795)
14/07/21 23:07:57 INFO ConnectionManager: Notifying org.apache.spark.network.ConnectionManager$MessageStatus@13ad274d
14/07/21 23:07:57 INFO ConnectionManager: Handling connection error on connection to ConnectionManagerId(r104u05.oculus.local,50795)
14/07/21 23:07:57 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(r104u05.oculus.local,50795)
14/07/21 23:07:57 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(r104u05.oculus.local,50795)
14/07/21 23:07:57 WARN TaskSetManager: Lost TID 1620 (task 3.0:0)
14/07/21 23:07:57 WARN TaskSetManager: Lost result for TID 1620 on host r104u05.oculus.local

I've never seen this one before, and now it's coming up consistently.

Thanks,
         -Nathan

Re: new error for me

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
I used to hit this when running on a single-node machine with too much
memory allocated to the executor. My machine had 28 GB of memory and I had
allocated 26 GB to the executor; dropping the allocation from 26 GB to
20 GB solved the issue. If you are seeing an executor lost exception, you
can try reducing the executor memory.
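
For illustration only, here is a rough sketch of the arithmetic behind that
suggestion. The helper names and the 8 GB headroom default are hypothetical
(the headroom simply mirrors the 28 GB machine / 20 GB executor numbers
above); spark.executor.memory is the standard Spark property, which can also
be set with the --executor-memory option of spark-submit.

```python
# Hypothetical helpers: the names and the 8 GB default headroom are
# illustrative assumptions, not values recommended by Spark itself.

def executor_memory_setting(total_gb: int, headroom_gb: int = 8) -> str:
    """Return a spark.executor.memory value that leaves headroom_gb
    of the machine's memory free for the OS and other processes."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    if headroom_gb < 0 or headroom_gb >= total_gb:
        raise ValueError("headroom_gb must be in [0, total_gb)")
    return f"{total_gb - headroom_gb}g"


def spark_defaults_line(total_gb: int, headroom_gb: int = 8) -> str:
    """Format the setting as a line for conf/spark-defaults.conf."""
    return f"spark.executor.memory {executor_memory_setting(total_gb, headroom_gb)}"


if __name__ == "__main__":
    # The 28 GB single-node machine described above.
    print(spark_defaults_line(28))
```

How much headroom is enough depends on what else runs on the node, so treat
the default as a starting point to tune rather than a rule.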

Thanks
Best Regards

On Fri, Oct 3, 2014 at 7:53 AM, jamborta <ja...@gmail.com> wrote:

> Have you found a solution to this problem? (or at least a cause)
>
> thanks,
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/new-error-for-me-tp10378p15655.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>

Re: new error for me

Posted by jamborta <ja...@gmail.com>.
Have you found a solution to this problem? (or at least a cause)

thanks,


Re: new error for me

Posted by phoenix bai <mi...@gmail.com>.
I am currently facing the same problem. The error snapshot is below:

14-07-24 19:15:30 WARN [pool-3-thread-1] SendingConnection: Error finishing connection to r64b22034.tt.net/10.148.129.84:47525
java.net.ConnectException: Connection timed out
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
	at org.apache.spark.network.SendingConnection.finishConnect(Connection.scala:318)
	at org.apache.spark.network.ConnectionManager$$anon$7.run(ConnectionManager.scala:203)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
14-07-24 19:15:30 INFO [pool-3-thread-1] ConnectionManager: Handling connection error on connection to ConnectionManagerId(r64b22034.tt.net,47525)
14-07-24 19:15:30 INFO [pool-3-thread-1] ConnectionManager: Removing SendingConnection to ConnectionManagerId(r64b22034.tt.net,47525)
14-07-24 19:15:30 INFO [pool-3-thread-1] ConnectionManager: Notifying org.apache.spark.network.ConnectionManager$MessageStatus@1704ebb


Could anyone help shed some light on this?


thanks



