Posted to user@spark.apache.org by Alexey Romanchuk <al...@gmail.com> on 2014/12/01 09:37:27 UTC

akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down

Hello spark users!

I found lots of strange messages in the driver log. Here is one of them:

2014-12-01 11:54:23,849 [sparkDriver-akka.actor.default-dispatcher-25]
ERROR
akka.remote.EndpointWriter[akka://sparkDriver/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkExecutor%40data1.hadoop%3A17372-5/endpointWriter]
- AssociationError [akka.tcp://sparkDriver@10.54.87.173:55034] <-
[akka.tcp://sparkExecutor@data1.hadoop:17372]: Error [Shut down address:
akka.tcp://sparkExecutor@data1.hadoop:17372] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@data1.hadoop:17372
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]

I get this message twice for every worker: first for driverPropsFetcher
and then for sparkExecutor. It looks like Spark shuts down the remote Akka
system incorrectly, or there is a race condition in which the driver sends
data to a worker whose actor system is already shutting down.
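
To check the "twice per worker" pattern across a whole log, the entries can be tallied by remote actor system and host. The following is only a rough sketch: the regular expression is inferred from the single entry pasted above rather than from any documented Akka log format, and the names ASSOCIATION_RE, count_remote_systems and SAMPLE_LOG are made up for illustration.

```python
import re
from collections import Counter

# "AssociationError [local address] <- [remote address]".
# \s+ tolerates the line wrapping seen in the pasted entry.
ASSOCIATION_RE = re.compile(
    r"AssociationError\s+\[akka\.tcp://[^\]]+\]\s+<-\s+"
    r"\[akka\.tcp://(?P<system>[^@\]]+)@(?P<host>[^:\]]+):(?P<port>\d+)\]"
)


def count_remote_systems(log_text):
    """Count AssociationError entries per (remote actor system, remote host)."""
    counts = Counter()
    for match in ASSOCIATION_RE.finditer(log_text):
        counts[(match.group("system"), match.group("host"))] += 1
    return counts


# Excerpt of the entry pasted above; a real log would contain many of these.
SAMPLE_LOG = """
- AssociationError [akka.tcp://sparkDriver@10.54.87.173:55034] <-
[akka.tcp://sparkExecutor@data1.hadoop:17372]: Error [Shut down address:
akka.tcp://sparkExecutor@data1.hadoop:17372] [
"""

if __name__ == "__main__":
    for (system, host), n in sorted(count_remote_systems(SAMPLE_LOG).items()):
        print(f"{system}@{host}: {n}")
```

Grouping by system name and host, rather than by the full address, keeps the driverPropsFetcher and sparkExecutor entries for the same machine next to each other even though their ports differ.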

Apart from this message, everything works fine. But it is logged at ERROR
level, so it shows up in my "ERROR only" log.
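
Until the cause is known, one way to keep the "ERROR only" log readable is to separate these entries from the rest when reviewing it. The sketch below assumes the message really is harmless shutdown noise, which is only what this thread suggests, and assumes the caller has already split the log into one string per record. SHUTDOWN_MARKER, is_shutdown_association and split_errors are hypothetical names, and the second sample record is invented for illustration.

```python
import re

# Marker text copied from the log entry above.
SHUTDOWN_MARKER = (
    "The remote system terminated the association because it is shutting down"
)


def is_shutdown_association(record):
    """Return True if a log record carries the shutdown-association message."""
    # Collapse whitespace so a record wrapped across lines still matches.
    normalized = re.sub(r"\s+", " ", record)
    return SHUTDOWN_MARKER in normalized


def split_errors(records):
    """Split records into (shutdown_noise, everything_else), keeping order."""
    noise, other = [], []
    for record in records:
        (noise if is_shutdown_association(record) else other).append(record)
    return noise, other


if __name__ == "__main__":
    records = [
        # Tail of the entry pasted above, wrapped as in the mail.
        "akka.remote.ShutDownAssociation: Shut down address:\n"
        "akka.tcp://sparkExecutor@data1.hadoop:17372\n"
        "Caused by: akka.remote.transport.Transport$InvalidAssociationException: The\n"
        "remote system terminated the association because it is shutting down.",
        # Hypothetical unrelated error, invented for this example.
        "ERROR com.example.MyJob - something else failed",
    ]
    noise, other = split_errors(records)
    print(len(noise), "shutdown entries;", len(other), "other errors")
```

This only filters after the fact; it does not change what Spark or Akka writes to the log.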

Do you have any idea whether this is a configuration issue, a bug in Spark
or Akka, or something else?

Thanks!

Re: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down

Posted by Lan <nd...@gmail.com>.
Hi Alexey and Daniel,

I'm using Spark 1.2.0 and still getting the same error, as described below.

Do you have any news on this? I'd really appreciate a response!

"a Spark cluster of 1 master VM SparkV1 and 1 worker VM SparkV4 (the error
is the same if I have 2 workers). They are connected without a problem now.
But when I submit a job (as in
https://spark.apache.org/docs/latest/quick-start.html) at the master: 

>spark-submit --master spark://SparkV1:7077 examples/src/main/python/pi.py 

it seems to run ok and returns "Pi is roughly...", but the worker has the
following Error: 

15/02/07 15:22:33 ERROR EndpointWriter: AssociationError
[akka.tcp://sparkWorker@SparkV4:47986] <-
[akka.tcp://sparkExecutor@SparkV4:46630]: Error [Shut down address:
akka.tcp://sparkExecutor@SparkV4:46630] [ 
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@SparkV4:46630 
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down. 
] 

More about the setup: each VM has only 4GB RAM, running Ubuntu, using
spark-1.2.0, built for Hadoop 2.6.0 or 2.4.0. "




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/akka-remote-transport-Transport-InvalidAssociationException-The-remote-system-terminated-the-associan-tp20071p21607.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down

Posted by maxdml <ma...@gmail.com>.
Same here with Spark 1.4.0 and Hadoop 2.5.2.

The workload still completes, though.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/akka-remote-transport-Transport-InvalidAssociationException-The-remote-system-terminated-the-associan-tp20071p23713.html


Re: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down

Posted by Daniel Darabos <da...@lynxanalytics.com>.
Hi, Alexey,
I'm getting the same error on startup with Spark 1.1.0. Fortunately,
everything works fine.

The error is mentioned in the logs in
https://issues.apache.org/jira/browse/SPARK-4498, so maybe it will also be
fixed in Spark 1.2.0 and 1.1.2. I have no insight into it unfortunately.

On Tue, Dec 2, 2014 at 1:38 PM, Alexey Romanchuk <alexey.romanchuk@gmail.com
> wrote:

> Any ideas? Anyone got the same error?

Re: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down

Posted by Alexey Romanchuk <al...@gmail.com>.
Any ideas? Anyone got the same error?
