Posted to issues@spark.apache.org by "Michael Allman (JIRA)" <ji...@apache.org> on 2017/05/25 17:10:04 UTC

[jira] [Commented] (SPARK-20843) Cannot gracefully kill drivers which take longer than 10 seconds to die

    [ https://issues.apache.org/jira/browse/SPARK-20843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025036#comment-16025036 ] 

Michael Allman commented on SPARK-20843:
----------------------------------------

[~rxin] I'd like to bump this to "Critical". This is a disruptive, potentially dangerous change for Spark Streaming apps (among others). We could not tolerate this behavior in our production environment, and it caught us off guard during our production migration to Spark 2.1.

I think this timeout should be configurable per app (as a driver config param), but I couldn't find a way to do that. In our case, we modified our source build to set the timeout to `Int.MaxValue`, which effectively reverts the change. So the best PR I can offer at this point is a revert.
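
A minimal sketch, in Java, of what a configurable grace period could look like. This is not Spark's actual code: the class name `GracefulKill`, the method `terminate`, and the `graceMillis` parameter are hypothetical, and treating a negative value as "never force-kill" is an assumption of this sketch. Only `Process.destroy`, `Process.waitFor`, and `Process.destroyForcibly` are real JDK calls (the last two require Java 8+).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of Spark.
final class GracefulKill {
    private GracefulKill() {}

    /**
     * Asks the process to exit and force-kills it only after the grace period.
     *
     * @param graceMillis how long to wait for a graceful exit; a negative
     *                    value (an assumption of this sketch) waits
     *                    indefinitely and never force-kills
     * @return true if the process exited without being force-killed
     */
    static boolean terminate(Process process, long graceMillis)
            throws InterruptedException {
        // Polite request; on Unix-like systems this is typically SIGTERM.
        process.destroy();

        if (graceMillis < 0) {
            // Never escalate: block until the process exits on its own.
            process.waitFor();
            return true;
        }

        // Timed wait returns true if the process exited within the grace period.
        if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
            return true;
        }

        // Grace period expired: escalate (typically SIGKILL) and reap the process.
        process.destroyForcibly();
        process.waitFor();
        return false;
    }
}
```

With something along these lines, a streaming app that needs minutes to drain could pass a grace period of several minutes (or a negative value to disable the forced kill), while apps that are happy with 10 seconds keep that as their value.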

My other concern is that this behavior depends on the version of the underlying JDK: it does not manifest on Java 7 but does on Java 8+, where `Process.destroyForcibly` is available. IMO, users who upgrade their Java runtime should not have to expect this kind of change in their Spark apps' behavior.

Thank you.

> Cannot gracefully kill drivers which take longer than 10 seconds to die
> -----------------------------------------------------------------------
>
>                 Key: SPARK-20843
>                 URL: https://issues.apache.org/jira/browse/SPARK-20843
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Michael Allman
>              Labels: regression
>
> Commit https://github.com/apache/spark/commit/1c9a386c6b6812a3931f3fb0004249894a01f657 changed the behavior of driver process termination. Whereas before `Process.destroyForcibly` was never called, now it is called (on Java VMs supporting that API) if the driver process does not die within 10 seconds.
> This prevents apps which take longer than 10 seconds to shut down gracefully from doing so. For example, streaming apps with a large batch duration (say, 30 seconds+) can take minutes to shut down.


