You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "jing lining (JIRA)" <ji...@apache.org> on 2017/02/20 03:44:44 UTC

[jira] [Created] (FLINK-5844) jobmanager was killed when disk less 10% and restart fail

jing lining created FLINK-5844:
----------------------------------

             Summary: jobmanager was killed when disk less 10% and restart fail
                 Key: FLINK-5844
                 URL: https://issues.apache.org/jira/browse/FLINK-5844
             Project: Flink
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.1.3
            Reporter: jing lining


JobManager was killed

log is
{quote}
2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner             - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard root cache directory /tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache                       - Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web dashboard jar upload directory /tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:54513
End of LogType:jobmanager.log
{quote}

then yarn restart new node but always fail

log

{quote}
2017-02-19 03:20:44,244 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with id 1b45608e30808183913eeffbb4d855da
	at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:105)
	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)