You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Robert Metzger (JIRA)" <ji...@apache.org> on 2015/11/25 14:30:11 UTC

[jira] [Created] (FLINK-3078) JobManager does not shutdown when checkpointed jobs are running

Robert Metzger created FLINK-3078:
-------------------------------------

             Summary: JobManager does not shutdown when checkpointed jobs are running
                 Key: FLINK-3078
                 URL: https://issues.apache.org/jira/browse/FLINK-3078
             Project: Flink
          Issue Type: Bug
          Components: JobManager
    Affects Versions: 0.10.0, 0.10.1
            Reporter: Robert Metzger


While testing the 0.10.1 release, I found that the JobManager does not shutdown when I'm stopping it while a streaming job is running.

It seems that the checkpoint coordinator and the execution graph are still logging, even though the JobManager actor system and other services have been shut down.

This is a log file of an affected JobManager: https://gist.github.com/rmetzger/a1532c18eb7081977cee

{code}
11:58:04,406 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Flat Map -> Sink: Unnamed (10/10) (c6544ca6d88e2d1acdec5c838d5fce06) switched from CANCELING to FAILED
11:58:04,406 DEBUG org.apache.flink.runtime.executiongraph.ExecutionGraph        - Kafka Consumer Topology switched from FAILING to RESTARTING.
11:58:04,407 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Delaying retry of job execution for 100000 ms ...
11:58:04,417 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:44904
11:58:04,421 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
11:58:04,422 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
11:58:04,446 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
11:58:04,473 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web root dir /tmp/flink-web-2039bed3-d9f9-4950-83ab-6fb70f7fc302
11:58:04,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 66 @ 1448452684590
11:58:04,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:05,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 67 @ 1448452685091
11:58:05,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:05,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 68 @ 1448452685590
11:58:05,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:06,090 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 69 @ 1448452686090
11:58:06,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:06,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 70 @ 1448452686590
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)