Posted to user@spark.apache.org by Jürgen Thomann <ju...@linfre.de> on 2019/03/04 12:49:19 UTC

Timeout between driver and application master (Thrift Server)

Hi,

I'm using the Spark Thrift Server, and after some time the driver and 
application master shut down because of timeouts. There is a firewall 
between them, and the connection appears to carry no traffic while idle. 
Is there a way to configure TCP keepalive for the connection, or some 
other way to keep the firewall from dropping it?
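
One OS-level approach, sketched here as an assumption (I haven't verified it 
against this exact setup): Linux exposes system-wide TCP keepalive tuning via 
sysctl, so that idle-but-alive connections send probes before a firewall's 
idle timeout expires. On CentOS 7 that would look roughly like:

```shell
# Send the first keepalive probe after 2 minutes of idle time
# (the default is 7200 s, usually far longer than a firewall idle timeout).
sysctl -w net.ipv4.tcp_keepalive_time=120
# Then probe every 30 s, giving up after 5 unanswered probes.
sysctl -w net.ipv4.tcp_keepalive_intvl=30
sysctl -w net.ipv4.tcp_keepalive_probes=5

# To persist across reboots, put the same keys in /etc/sysctl.d/:
cat <<'EOF' > /etc/sysctl.d/99-tcp-keepalive.conf
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 5
EOF
```

Caveat: these sysctls only affect sockets that enable SO_KEEPALIVE; whether 
Spark's transport layer sets that option on the driver/AM connection is 
something I haven't confirmed.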

Environment:
CentOS 7, HDP 2.6.5 with Spark 2.3.0

The error on the driver is "ERROR YarnClientSchedulerBackend: Yarn application 
has already exited with state finished", and a bit later there are some 
exceptions with ClosedChannelException.

The application master has the following message:
WARN TransportChannelHandler: Exception in connection from <driver Host>
java.io.IOException: Connection timed out
... Stacktrace omitted
The messages appear at the same time (same second; sadly there are no 
milliseconds in the logs).

Thanks,
Jürgen



---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Timeout between driver and application master (Thrift Server)

Posted by "tibi.bronto" <ti...@bronto.com>.
Hi Jürgen,

Did you ever find a way to resolve this issue?

Looking at the implementation of the application master, it seems that there
is no heartbeat/keepalive mechanism for the communication between the driver
and AM, so when something closes the connection for inactivity, the AM shuts
down:
https://github.com/apache/spark/blob/branch-2.3/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L807
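
As a Spark-side mitigation, a sketch only (I haven't verified that these 
settings govern the driver/AM channel specifically in 2.3.0): raising Spark's 
generic network timeout in spark-defaults.conf keeps Spark itself from giving 
up on a quiet connection, though it does nothing to keep the firewall's 
connection state alive:

```properties
# spark-defaults.conf
# Assumption: the generic network timeout also applies to the
# driver <-> AM link; it may not, since that link appears to lack
# its own heartbeat (see the ApplicationMaster source linked above).
spark.network.timeout              600s
# Driver <-> executor heartbeats (not driver <-> AM), shown only to
# note that a comparable knob for the AM link does not seem to exist.
spark.executor.heartbeatInterval   60s
```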


