Posted to user@spark.apache.org by "Liu, Raymond" <ra...@intel.com> on 2013/11/01 08:14:46 UTC
Executor could not connect to Driver?
Hi
I am encountering an issue where the executor actor cannot connect to the driver actor, and I have not been able to figure out why.
The driver actor is listening on port 35838:
root@sr434:~# netstat -lpv
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address   Foreign Address   State    PID/Program name
tcp        0      0 *:50075         *:*               LISTEN   18242/java
tcp        0      0 *:50020         *:*               LISTEN   18242/java
tcp        0      0 *:ssh           *:*               LISTEN   1325/sshd
tcp        0      0 *:50010         *:*               LISTEN   18242/java
tcp6       0      0 sr434:35838     [::]:*            LISTEN   9420/java
tcp6       0      0 [::]:40390      [::]:*            LISTEN   9420/java
tcp6       0      0 [::]:4040       [::]:*            LISTEN   9420/java
tcp6       0      0 [::]:8040       [::]:*            LISTEN   28324/java
tcp6       0      0 [::]:60712      [::]:*            LISTEN   28324/java
tcp6       0      0 [::]:8042       [::]:*            LISTEN   28324/java
tcp6       0      0 [::]:34028      [::]:*            LISTEN   9420/java
tcp6       0      0 [::]:ssh        [::]:*            LISTEN   1325/sshd
tcp6       0      0 [::]:45528      [::]:*            LISTEN   9420/java
tcp6       0      0 [::]:13562      [::]:*            LISTEN   28324/java
Meanwhile, the executor reports the errors below:
13/11/01 13:16:43 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka://spark@sr434:35838/user/CoarseGrainedScheduler
13/11/01 13:16:43 ERROR executor.CoarseGrainedExecutorBackend: Driver terminated or disconnected! Shutting down.
Any idea?
Best Regards,
Raymond Liu
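
Editor's note: the netstat output above shows the driver socket bound to the name sr434 rather than to the wildcard address, so one thing worth ruling out is whether the executor host can resolve that name and open a TCP connection to the port. The following is a minimal sketch of such a check, not a Spark tool; the function name `can_connect` and the 3-second timeout are assumptions made for illustration. A successful TCP connect only shows the port is reachable and does not prove that the Akka handshake would succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection to host:port.

    Returns (True, None) on success, or (False, reason) where reason
    names the failure (DNS resolution error, refusal, timeout, ...).
    """
    try:
        # create_connection resolves the host name and then connects,
        # so a name-resolution failure is reported here as well.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:  # covers socket.gaierror and timeouts too
        return False, f"{type(exc).__name__}: {exc}"
```

Run from the executor host, `can_connect("sr434", 35838)` would report whether the address from the log line is reachable from that machine, and the returned reason helps separate a name-resolution problem from a refused or filtered connection.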
RE: Executor could not connect to Driver?
Posted by "Liu, Raymond" <ra...@intel.com>.
Thanks. My case does not seem to be caused by GC: CPU usage is pretty low, and both YGC and FGC seem to behave quite normally. Hmm, weird.
Best Regards,
Raymond Liu
From: Aaron Davidson [mailto:ilikerps@gmail.com]
Sent: Saturday, November 02, 2013 12:07 AM
To: user@spark.incubator.apache.org
Subject: Re: Executor could not connect to Driver?
I've seen this happen before due to the driver doing long GCs when the driver machine was heavily memory-constrained. For this particular issue, simply freeing up memory used by other applications fixed the problem.
Re: Executor could not connect to Driver?
Posted by Aaron Davidson <il...@gmail.com>.
I've seen this happen before due to the driver doing long GCs when the
driver machine was heavily memory-constrained. For this particular issue,
simply freeing up memory used by other applications fixed the problem.
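
Editor's note: one way to check the long-GC hypothesis is to sample the driver JVM with `jstat -gcutil <pid> 1000` and look for intervals in which the accumulated GC time (the GCT column) jumps. The sketch below parses such output by header name, so it should not depend on the exact column set of a given JVM version; the function names and the threshold are hypothetical. Note that GCT is total collection time, which for concurrent collectors is not the same as pause time, so treat a hit as a hint rather than proof.

```python
def _to_float(token):
    """Convert a jstat field to float; non-numeric fields such as '-' become None."""
    try:
        return float(token)
    except ValueError:
        return None


def parse_gcutil(text):
    """Parse `jstat -gcutil` output into a list of dicts keyed by column name.

    Assumes the first non-empty line is the header and later lines are samples.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, (_to_float(tok) for tok in row))) for row in rows]


def long_gc_intervals(samples, threshold_s):
    """Return (sample_index, gc_seconds) for each sample whose GCT grew by
    more than threshold_s seconds compared with the previous sample."""
    hits = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1].get("GCT"), samples[i].get("GCT")
        if prev is None or cur is None:
            continue  # skip samples where GCT could not be read
        delta = cur - prev
        if delta > threshold_s:
            hits.append((i, round(delta, 3)))
    return hits
```

With one sample per second, `long_gc_intervals(parse_gcutil(output), 1.0)` would flag any second in which at least one further second of GC time was accumulated, which is the kind of stall that could plausibly make the driver look disconnected.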