Posted to user@spark.apache.org by Jürgen Thomann <ju...@linfre.de> on 2019/03/04 12:49:19 UTC
Timeout between driver and application master (Thrift Server)
Hi,
I'm using the Spark Thrift Server, and after some time the driver and
application master shut down because of timeouts. There is a firewall
between them, and it appears no traffic crosses it while the connection is
idle. Is there a way to configure TCP keepalive for the connection, or some
other way to keep the firewall from dropping it?
Environment:
CentOS 7, HDP 2.6.5 with Spark 2.3.0
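One OS-level workaround worth noting here: if a stateful firewall expires idle flows, the kernel's TCP keepalive timers on CentOS 7 can be lowered so probes go out before the firewall's idle timeout. The values below are illustrative assumptions (they presume the firewall's idle timeout is comfortably above 5 minutes), and kernel keepalives only apply to sockets that have SO_KEEPALIVE enabled in the first place:

```shell
# Send the first keepalive probe after 5 minutes of idle time instead of
# the Linux default of 2 hours (7200 s). Illustrative values - tune them
# to stay below your firewall's actual idle timeout.
sysctl -w net.ipv4.tcp_keepalive_time=300
sysctl -w net.ipv4.tcp_keepalive_intvl=60
sysctl -w net.ipv4.tcp_keepalive_probes=5

# Persist the settings across reboots (file name is arbitrary):
cat <<'EOF' > /etc/sysctl.d/99-tcp-keepalive.conf
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 5
EOF
```

These sysctls are process-wide defaults for the probe timing only; they change nothing for a connection unless the application (or the JVM on its behalf) has turned keepalive on for that socket.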
The error on the driver is "ERROR YarnClientSchedulerBackend: Yarn application
has already exited with state finished", and a bit later there are some
exceptions with ClosedChannelException.
The application master has the following message:
WARN TransportChannelHandler: Exception in connection from <driver Host>
java.io.IOException: Connection timed out
... Stacktrace omitted
The messages are at the same time (same second, sadly no milliseconds in the
logs).
Thanks,
Jürgen
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: Timeout between driver and application master (Thrift Server)
Posted by "tibi.bronto" <ti...@bronto.com>.
Hi Jürgen,
Did you ever find a way to resolve this issue?
Looking at the implementation of the application master, it seems that there
is no heartbeat/keepalive mechanism for the communication between the driver
and AM, so when something closes the connection for inactivity, the AM shuts
down:
https://github.com/apache/spark/blob/branch-2.3/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L807
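For illustration of what such a keepalive would look like at the socket level: the JVM exposes per-socket keepalive via java.net.Socket#setKeepAlive (Netty, which Spark's RPC layer uses, has the analogous ChannelOption.SO_KEEPALIVE). This is a minimal standalone sketch, not Spark's actual RPC code; Spark configures its Netty channels internally, so applying this would require a change in Spark itself:

```java
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveDemo {
    public static void main(String[] args) throws Exception {
        // Open a loopback listener so the client socket has something
        // real to connect to.
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
            // SO_KEEPALIVE asks the kernel to send periodic probes on an
            // idle connection (timing governed by the tcp_keepalive_*
            // sysctls), which keeps stateful firewalls from silently
            // expiring the flow.
            client.setKeepAlive(true);
            System.out.println("keepAlive=" + client.getKeepAlive());
        }
    }
}
```

Without something like this on the driver/AM connection, only application-level traffic resets the firewall's idle timer, which matches the symptom above.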