Posted to reviews@spark.apache.org by zsxwing <gi...@git.apache.org> on 2017/05/26 21:22:09 UTC

[GitHub] spark pull request #18126: [SPARK-20843][Core]Add a config to set driver ter...

GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/18126

    [SPARK-20843][Core]Add a config to set driver terminate timeout

    ## What changes were proposed in this pull request?
    
    Add a worker configuration to set how long to wait before force-killing a driver.
     
    ## How was this patch tested?
    
    Jenkins

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-20843

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18126.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18126
    
----
commit ca2c9c53040373e82ff350c1b3c77c1512926cec
Author: Shixiong Zhu <sh...@databricks.com>
Date:   2017-05-26T21:20:08Z

    Add a config to set driver terminate timeout

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    **[Test build #77438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77438/testReport)** for PR 18126 at commit [`ca2c9c5`](https://github.com/apache/spark/commit/ca2c9c53040373e82ff350c1b3c77c1512926cec).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    **[Test build #77438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77438/testReport)** for PR 18126 at commit [`ca2c9c5`](https://github.com/apache/spark/commit/ca2c9c53040373e82ff350c1b3c77c1512926cec).




[GitHub] spark pull request #18126: [SPARK-20843][Core]Add a config to set driver ter...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18126#discussion_r118803788
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
    @@ -57,7 +57,8 @@ private[deploy] class DriverRunner(
       @volatile private[worker] var finalException: Option[Exception] = None
     
       // Timeout to wait for when trying to terminate a driver.
    -  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
    +  private val DRIVER_TERMINATE_TIMEOUT_MS =
    +    conf.getTimeAsMs("spark.worker.driverTerminateTimeout", "10s")
    --- End diff --
    
    Just wondering whether it would help to add something to the property name to make clear that this applies only to a driver run in cluster deploy mode?  Although it is prefixed with `worker`, so maybe that is good enough.
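
The diff above replaces a hard-coded 10-second constant with a lookup that falls back to `"10s"` when the key is unset. The following is only a rough sketch of that lookup-with-default pattern, not Spark's implementation: a plain `Map` stands in for the worker's `SparkConf`, the `timeAsMs` helper is hypothetical, and Scala's own `Duration` parser is used in place of Spark's time-string parsing (the two do not accept exactly the same set of suffixes).

```scala
import scala.concurrent.duration.Duration

object TimeoutConfigSketch {
  // The key comes from the diff; everything else here is illustrative.
  val Key = "spark.worker.driverTerminateTimeout"

  // Hypothetical helper: look up a time string, fall back to the default
  // when the key is absent, and convert the result to milliseconds.
  def timeAsMs(conf: Map[String, String], key: String, default: String): Long =
    Duration(conf.getOrElse(key, default)).toMillis

  def main(args: Array[String]): Unit = {
    val unset = Map.empty[String, String]
    val overridden = Map(Key -> "30s")
    println(timeAsMs(unset, Key, "10s"))      // falls back to the default
    println(timeAsMs(overridden, Key, "10s")) // uses the configured value
  }
}
```

With no entry in the map the helper yields the old 10-second value, which is the compatibility point discussed in this thread.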




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    Ah, I thought the original change was in 2.2 only. Looks good then.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    cc @vanzin @BryanCutler 




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77438/
    Test PASSed.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    No, I couldn't think of anything else without it being too long-winded.  I
    agree that the `worker` prefix gives enough meaning, plus whoever uses
    this should already know the context that it's intended for.
    
    On Fri, May 26, 2017 at 4:37 PM, Shixiong Zhu <no...@github.com>
    wrote:
    
    > *@zsxwing* commented on this pull request.
    > ------------------------------
    >
    > In core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala
    > <https://github.com/apache/spark/pull/18126#discussion_r118805346>:
    >
    > > @@ -57,7 +57,8 @@ private[deploy] class DriverRunner(
    >    @volatile private[worker] var finalException: Option[Exception] = None
    >
    >    // Timeout to wait for when trying to terminate a driver.
    > -  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
    > +  private val DRIVER_TERMINATE_TIMEOUT_MS =
    > +    conf.getTimeAsMs("spark.worker.driverTerminateTimeout", "10s")
    >
    > spark.worker means this is only for Spark workers, so I think it should
    > be obvious. Do you have a better config name?





[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    This is the behavior in 2.1.0. If we change the default value to `Long.MaxValue`, it would surprise users again :(.
    
    I'm inclined to keep it as it is in 2.1.0.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    Thanks! Merging to master and 2.2.




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    Looks ok.
    
    I wonder if keeping the old behavior by default wouldn't be better, to avoid surprising users who upgrade and run into this.




[GitHub] spark pull request #18126: [SPARK-20843][Core]Add a config to set driver ter...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18126#discussion_r118805346
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
    @@ -57,7 +57,8 @@ private[deploy] class DriverRunner(
       @volatile private[worker] var finalException: Option[Exception] = None
     
       // Timeout to wait for when trying to terminate a driver.
    -  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
    +  private val DRIVER_TERMINATE_TIMEOUT_MS =
    +    conf.getTimeAsMs("spark.worker.driverTerminateTimeout", "10s")
    --- End diff --
    
    `spark.worker` means this is only for Spark workers, so I think it should be obvious. Do you have a better config name?




[GitHub] spark issue #18126: [SPARK-20843][Core]Add a config to set driver terminate ...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/18126
  
    > 10s is pretty short for a driver timeout
    
    This is usually not a problem. If a worker is trying to kill a driver, it often means the driver is unhealthy or is being killed intentionally by the user. 10 seconds is usually enough for shutdown hooks to clean up resources, such as deleting temp files. Given that relying on shutdown hooks to persist data is not common, and we have a configuration for special cases, I think it's fine.
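
The grace period described here can be illustrated with a minimal sketch, assuming a Unix-like host where a `sleep` binary is available. It is not taken from `DriverRunner`; the `terminate` helper and the object name are hypothetical, and only standard `java.lang.Process` calls are used.

```scala
import java.util.concurrent.TimeUnit

object DriverTerminateSketch {
  // Hypothetical helper: request a normal shutdown, wait up to timeoutMs so
  // that shutdown hooks can run, then force-kill if the process is still alive.
  // Returns Some(exitCode) if the process ended within the grace period,
  // None if it had to be killed forcibly.
  def terminate(process: Process, timeoutMs: Long): Option[Int] = {
    process.destroy() // polite request; the process may still run its hooks
    if (process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
      Some(process.exitValue())
    } else {
      process.destroyForcibly() // grace period expired
      process.waitFor()         // block until the process is really gone
      None
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumption: `sleep` stands in for a driver process.
    val child = new ProcessBuilder("sleep", "60").start()
    val result = terminate(child, 10 * 1000L)
    println(s"ended within the grace period: ${result.isDefined}")
  }
}
```

A longer timeout only matters for a process whose shutdown hooks do slow work; one that exits promptly on the first request never reaches the forced kill.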




[GitHub] spark pull request #18126: [SPARK-20843][Core]Add a config to set driver ter...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18126

