Posted to issues@livy.apache.org by "Björn Lohrmann (JIRA)" <ji...@apache.org> on 2018/11/13 22:23:00 UTC
[jira] [Updated] (LIVY-533) Spark jobs submitted via programmatic API cannot always be canceled
[ https://issues.apache.org/jira/browse/LIVY-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Björn Lohrmann updated LIVY-533:
--------------------------------
Description:
Running stages of Spark jobs submitted via Livy's programmatic API cannot always be cancelled successfully.
The current implementation of JobWrapper.cancel() only interrupts the worker thread on the Spark driver (via Future.cancel(true)):
[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]
This does not always cancel all activity in Spark, e.g. long-running stages may remain unaffected.
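The limitation can be illustrated without Spark. The following is a minimal sketch, not Livy code: the class name InterruptOnlyCancelDemo and the "stage" thread are hypothetical stand-ins, with the extra thread playing the role of work that Spark schedules outside the interrupted worker thread. It only shows that Future.cancel(true) interrupts the thread that runs the task and nothing else.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptOnlyCancelDemo {

    /**
     * Returns true if the stand-in "stage" thread is still running after the
     * task that started it was cancelled with Future.cancel(true).
     */
    static boolean stageSurvivesCancel() {
        ExecutorService driverPool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean stopStage = new AtomicBoolean(false);

        // Hypothetical stand-in for a long-running stage: it runs on its own
        // thread and only stops when explicitly told to.
        Thread stage = new Thread(() -> {
            started.countDown();
            while (!stopStage.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        stage.setDaemon(true);

        // Stand-in for the worker thread on the driver: it launches the
        // stage and blocks until the stage finishes.
        Future<?> job = driverPool.submit(() -> {
            stage.start();
            try {
                stage.join();
            } catch (InterruptedException e) {
                // Future.cancel(true) lands here; the stage thread is untouched.
            }
        });

        try {
            started.await();
            job.cancel(true); // interrupts only the worker thread
            driverPool.shutdown();
            driverPool.awaitTermination(5, TimeUnit.SECONDS);

            boolean survived = stage.isAlive();

            // Clean up the stand-in stage so the demo does not leak a thread.
            stopStage.set(true);
            stage.join(5000);
            return survived;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the worker thread exits after the interrupt while the stand-in stage keeps running until it is stopped explicitly, which mirrors the behaviour described above.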
The Spark way of cancelling jobs seems to be SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session already uses:
[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]
I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to interrupting the worker thread running on the driver:
[https://github.com/apache/incubator-livy/pull/128]
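A rough sketch of the combined approach, under stated assumptions, could look like the following. This is not the code from the PR. JobGroups is a hypothetical stand-in interface that mirrors the two context methods (as exposed on JavaSparkContext: setJobGroup(String, String, boolean) and cancelJobGroup(String)) so that the sketch compiles without Spark on the classpath, and GroupAwareJob is a hypothetical name. Spark keeps the job group as a thread-local property, so the sketch sets it from inside the worker thread rather than from the thread that submits the job.

```java
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the two SparkContext methods used below. */
interface JobGroups {
    void setJobGroup(String groupId, String description, boolean interruptOnCancel);
    void cancelJobGroup(String groupId);
}

/** Hypothetical wrapper: tags its work with a job group and cancels both ways. */
class GroupAwareJob<T> {
    private final JobGroups sc;
    private final Callable<T> body;
    private final String groupId = UUID.randomUUID().toString();
    private volatile Future<T> future;

    GroupAwareJob(JobGroups sc, Callable<T> body) {
        this.sc = sc;
        this.body = body;
    }

    void submit(ExecutorService pool) {
        future = pool.submit(() -> {
            // The job group is thread-local in Spark, so it is set on the
            // worker thread that will trigger the Spark actions.
            sc.setJobGroup(groupId, "job " + groupId, true);
            return body.call();
        });
    }

    boolean cancel() {
        // First ask the context to cancel every job tagged with this group...
        sc.cancelJobGroup(groupId);
        // ...then interrupt the worker thread, as the existing code does.
        Future<T> f = future;
        return f != null && f.cancel(true);
    }

    String groupId() {
        return groupId;
    }
}
```

With a real context, an adapter that forwards the two JobGroups methods to the context would be passed in. Using a fresh group ID per job is a design choice of this sketch, made so that cancelling one job does not affect others that share the same context.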
was:
Running stages of Spark jobs submitted via Livy's programmatic API cannot always be cancelled successfully.
The current implementation of JobWrapper.cancel() only interrupts the worker thread on the Spark driver (via Future.cancel(true)):
[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]
This does not always cancel all activity in Spark, e.g. long-running stages may remain unaffected.
The Spark way of cancelling jobs seems to be SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session already uses:
[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]
I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to interrupting the worker thread running on the driver.
> Spark jobs submitted via programmatic API cannot always be canceled
> ---------------------------------------------------------------------
>
> Key: LIVY-533
> URL: https://issues.apache.org/jira/browse/LIVY-533
> Project: Livy
> Issue Type: Bug
> Components: RSC
> Affects Versions: 0.5.0
> Reporter: Björn Lohrmann
> Priority: Major
> Labels: pull-request-available
>
> Running stages of Spark jobs submitted via Livy's programmatic API cannot always be cancelled successfully.
> The current implementation of JobWrapper.cancel() only interrupts the worker thread on the Spark driver (via Future.cancel(true)):
> [https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]
> This does not always cancel all activity in Spark, e.g. long-running stages may remain unaffected.
> The Spark way of cancelling jobs seems to be SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session already uses:
> [https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]
> I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to interrupting the worker thread running on the driver:
> [https://github.com/apache/incubator-livy/pull/128]
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)