Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/02/18 22:31:18 UTC

[jira] [Commented] (FLINK-3446) Back pressure tracker throws NPE for archived jobs

    [ https://issues.apache.org/jira/browse/FLINK-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153144#comment-15153144 ] 

ASF GitHub Bot commented on FLINK-3446:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/1673

    [FLINK-3446] [runtime-web] Don't trigger back pressure sample for archived job

    Triggering a back pressure sample after the job has been archived led to an NPE being thrown, as reported by @rmetzger. The fix is pretty straightforward; waiting for Travis...

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 3446-bp_log

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1673
    
----
commit 936297283a2aeae48466ff0740c5d390d1268b79
Author: Ufuk Celebi <uc...@apache.org>
Date:   2016-02-18T21:29:33Z

    [FLINK-3446] [runtime-web] Don't trigger back pressure sample for archived job

----


> Back pressure tracker throws NPE for archived jobs 
> ---------------------------------------------------
>
>                 Key: FLINK-3446
>                 URL: https://issues.apache.org/jira/browse/FLINK-3446
>             Project: Flink
>          Issue Type: Bug
>          Components: Webfrontend
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>
> After a job is archived, the execution context of the {{ExecutionGraph}} is {{null}}. If a new back pressure sample is triggered concurrently, this can result in a {{NullPointerException}} in the web frontend. This does not break anything, but it pollutes the log.
> {code}
> 2016-02-18 20:17:49,071 ERROR org.apache.flink.runtime.webmonitor.BackPressureStatsTracker  - Failed to gather stack trace sample.
> java.lang.RuntimeException: Discarded
> 	at org.apache.flink.runtime.webmonitor.StackTraceSampleCoordinator$PendingStackTraceSample.discard(StackTraceSampleCoordinator.java:394)
> 	at org.apache.flink.runtime.webmonitor.StackTraceSampleCoordinator$1.run(StackTraceSampleCoordinator.java:181)
> 	at java.util.TimerThread.mainLoop(Timer.java:555)
> 	at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.lang.RuntimeException: Time out
> 	... 3 more
> 2016-02-18 20:17:50,088 WARN  org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler     - Error while handling request
> java.lang.NullPointerException
> 	at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:268)
> 	at org.apache.flink.runtime.webmonitor.BackPressureStatsTracker.triggerStackTraceSample(BackPressureStatsTracker.java:170)
> 	at org.apache.flink.runtime.webmonitor.handlers.JobVertexBackPressureHandler.handleRequest(JobVertexBackPressureHandler.java:98)
> 	at org.apache.flink.runtime.webmonitor.handlers.AbstractJobVertexRequestHandler.handleRequest(AbstractJobVertexRequestHandler.java:58)
> 	at org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:61)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:135)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:112)
> 	at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.channelRead0(RuntimeMonitorHandler.java:60)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
> 	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
> 	at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:104)
> 	at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> 	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
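
To illustrate the failure mode described in the issue, here is a minimal sketch of a null guard, not the actual patch from the pull request. All names in it (BackPressureGuardSketch, GraphStub, triggerSample) are hypothetical, and a plain java.util.concurrent.Executor stands in for the Scala execution context on which the real code registers its callback.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

public class BackPressureGuardSketch {

    /** Hypothetical stand-in for a job graph whose execution context is cleared on archiving. */
    static final class GraphStub {
        private volatile Executor executionContext;

        GraphStub(Executor executionContext) {
            this.executionContext = executionContext;
        }

        /** Mimics archiving: the execution context becomes null. */
        void archive() {
            executionContext = null;
        }

        Executor getExecutionContext() {
            return executionContext;
        }
    }

    /**
     * Triggers a (simulated) sample only if the graph still has an execution context.
     * Returns false, without throwing, when the graph has been archived.
     */
    static boolean triggerSample(GraphStub graph, Runnable onComplete) {
        // Read the field once so a concurrent archive() cannot null it between check and use.
        Executor context = graph.getExecutionContext();
        if (context == null) {
            return false; // archived job: skip the sample instead of failing with an NPE
        }
        context.execute(onComplete);
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        GraphStub graph = new GraphStub(Runnable::run); // runs callbacks on the calling thread

        System.out.println("running job sampled: " + triggerSample(graph, completed::incrementAndGet));
        graph.archive();
        System.out.println("archived job sampled: " + triggerSample(graph, completed::incrementAndGet));
        System.out.println("callbacks run: " + completed.get());
    }
}
```

Reading the context into a local variable before the check matters in this sketch because archiving may happen on another thread between the null check and the use.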



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)