Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2018/11/14 12:00:00 UTC

[jira] [Commented] (FLINK-10852) Decremented number of unfinished producers below 0. This is most likely a bug in the execution state/intermediate result partition management

    [ https://issues.apache.org/jira/browse/FLINK-10852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686417#comment-16686417 ] 

Stephan Ewen commented on FLINK-10852:
--------------------------------------

{{jobmanager.execution.failover-strategy}} is very much a "work in progress" internal feature at the moment.

It does not currently work correctly with the DataSet API in general; iterations are only one affected part.

So this failure is expected: the feature is unfinished.

I created [FLINK-10880] to handle this.

> Decremented number of unfinished producers below 0. This is most likely a bug in the execution state/intermediate result partition management
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10852
>                 URL: https://issues.apache.org/jira/browse/FLINK-10852
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.7.0
>            Reporter: ouyangzhe
>            Priority: Major
>             Fix For: 1.8.0
>
>
>  
> {panel:title=Jobs that use the DataSet iteration operator with jobmanager.execution.failover-strategy: region hang in the FAILING state on failover and throw the following exception.}
> java.lang.IllegalStateException: Decremented number of unfinished producers below 0. This is most likely a bug in the execution state/intermediate result partition management.
>     at org.apache.flink.runtime.executiongraph.IntermediateResultPartition.markFinished(IntermediateResultPartition.java:103)
>     at org.apache.flink.runtime.executiongraph.ExecutionVertex.finishAllBlockingPartitions(ExecutionVertex.java:707)
>     at org.apache.flink.runtime.executiongraph.Execution.markFinished(Execution.java:939)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:1568)
>     at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:542)
>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
>     at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {panel}
>  
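For readers unfamiliar with the kind of check behind this exception, here is a minimal sketch of a counter guard that fails when it is decremented once too often. It is an illustrative model only: the class UnfinishedProducerSketch and its field are hypothetical names, not Flink's actual IntermediateResultPartition code, and the duplicated "finished" report in main is just one assumed way such a counter could go negative, not a diagnosis of this issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative model only; not Flink's actual implementation. */
public class UnfinishedProducerSketch {

    // Number of producer subtasks that have not yet finished (hypothetical field).
    private final AtomicInteger unfinishedProducers;

    public UnfinishedProducerSketch(int numProducers) {
        this.unfinishedProducers = new AtomicInteger(numProducers);
    }

    /**
     * Records that one producer finished.
     * Returns true when the last producer has finished.
     * Throws if more producers report "finished" than were registered.
     */
    public boolean markFinished() {
        int remaining = unfinishedProducers.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException(
                "Decremented number of unfinished producers below 0.");
        }
        return remaining == 0;
    }

    public static void main(String[] args) {
        UnfinishedProducerSketch partition = new UnfinishedProducerSketch(2);
        partition.markFinished(); // first producer finishes
        partition.markFinished(); // second producer finishes; counter is now 0
        try {
            // Assumed scenario: an extra "finished" report arrives
            // although the counter was never reset.
            partition.markFinished();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The guard itself is cheap; the point of the sketch is that the exception signals inconsistent bookkeeping elsewhere, which matches the wording "most likely a bug in the execution state/intermediate result partition management" in the message.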


