Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/02/03 16:37:52 UTC

[jira] [Commented] (FLINK-5699) Cancel with savepoint fails with a NPE if savepoint target directory not set

    [ https://issues.apache.org/jira/browse/FLINK-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851681#comment-15851681 ] 

ASF GitHub Bot commented on FLINK-5699:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/3263

    [FLINK-5699] [savepoints] Check target dir when cancelling with savepoint

    When cancelling a job with a savepoint and no savepoint directory is configured, triggering the savepoint fails with an NPE. This is then returned to the user as the root cause.
    
    Instead of simply forwarding the argument (which is possibly null), we check it for null and return an IllegalStateException with a meaningful message.
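
The check described above might look roughly like the following. This is a minimal sketch, not the code from the pull request: the class name, the `resolveTargetDirectory` helper, the fallback to a default directory, and the message text are all assumptions made for illustration.

```java
public class CancelWithSavepointSketch {

    /**
     * Hypothetical helper: picks the savepoint target directory for a
     * cancel-with-savepoint request. Both arguments may be null.
     */
    static String resolveTargetDirectory(String requestedDir, String defaultDir) {
        // Prefer the directory given with the request, else the configured default.
        String target = requestedDir != null ? requestedDir : defaultDir;

        // Check for null here instead of forwarding a possibly-null value and
        // letting a later checkNotNull surface as an NPE root cause.
        if (target == null) {
            throw new IllegalStateException(
                "No savepoint target directory was given and no default is configured. "
                    + "Specify a directory when cancelling or configure a default.");
        }
        return target;
    }

    public static void main(String[] args) {
        // A configured default is used when the request carries no directory.
        System.out.println(resolveTargetDirectory(null, "file:///tmp/savepoints"));

        // With neither value set, the caller sees a descriptive exception.
        try {
            resolveTargetDirectory(null, null);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The point of the design is that the failure is raised where the missing configuration is detected, so the message returned to the user names the actual problem rather than an internal precondition.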


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 5699-cancel_with_savepoint_directory

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3263.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3263
    
----
commit 5d06fd1c9b3a9ac1b404618b5bc843596e89e0ba
Author: Ufuk Celebi <uc...@apache.org>
Date:   2017-02-03T16:28:27Z

    [FLINK-5699] [savepoints] Check target dir when cancelling with savepoint
    
    Problem: when cancelling a job with a savepoint and no savepoint directory
    is configured, triggering the savepoint fails with an NPE. This is then
    returned to the user as the root cause.
    
    Solution: Instead of simply forwarding the argument (which is possibly
    null), we check it for null and return an IllegalStateException with
    a meaningful message.

----


> Cancel with savepoint fails with a NPE if savepoint target directory not set
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-5699
>                 URL: https://issues.apache.org/jira/browse/FLINK-5699
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Till Rohrmann
>            Assignee: Ufuk Celebi
>            Priority: Minor
>
> When a job is canceled with a savepoint and no savepoint directory has been configured, the command fails with the following exception:
> {code}
> java.lang.Exception: Canceling the job with ID 663f9769f0f3565b8ebc2acf0091431a failed.
> 	at org.apache.flink.client.CliFrontend.cancel(CliFrontend.java:633)
> 	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1082)
> 	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
> 	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
> 	at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> 	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> Caused by: java.lang.Exception: Failed to cancel job 663f9769f0f3565b8ebc2acf0091431a with savepoint.
> 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:634)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> 	at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> 	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> 	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> 	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> 	at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
> 	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:118)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.NullPointerException: Savepoint target directory
> 	at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:75)
> 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerSavepoint(CheckpointCoordinator.java:296)
> 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:598)
> 	... 22 more
> {code}
> I think we could return a more meaningful exception than the NPE to the user.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)