You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Benjamin Gonzalez (Jira)" <ji...@apache.org> on 2021/05/10 14:52:00 UTC

[jira] [Updated] (BEAM-10955) Flink Java Runner test flake: Could not find Flink job

     [ https://issues.apache.org/jira/browse/BEAM-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Gonzalez updated BEAM-10955:
-------------------------------------
    Description: 
Opening to track how frequent this is. Observed on

org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy

in https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/17610/
{noformat}
java.util.concurrent.ExecutionException: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (beaf622cf41bf97650ea5b26bbcc3db8)
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
	at org.apache.beam.runners.flink.FlinkSavepointTest.takeSavepointAndCancelJob(FlinkSavepointTest.java:255)
	at org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:174)
	at org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy(FlinkSavepointTest.java:146)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (beaf622cf41bf97650ea5b26bbcc3db8)
	at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGateway(Dispatcher.java:894)
	at org.apache.flink.runtime.dispatcher.Dispatcher.performOperationOnJobMasterGateway(Dispatcher.java:907)
	at org.apache.flink.runtime.dispatcher.Dispatcher.triggerSavepoint(Dispatcher.java:699)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
	at akka.actor.ActorCell.invoke(ActorCell.scala:561)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
	at akka.dispatch.Mailbox.run(Mailbox.scala:225)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Salida estandardShutting SDK harness down.
Salida de error[Test worker] INFO org.apache.beam.runners.flink.FlinkSavepointTest - Savepoints will be written to file:///tmp/junit8140846753684197576
[Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) required for local execution is not set, setting it to its default value 1.7976931348623157E308
[Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The con
...[truncated 529884 chars]...
C service.
[flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
[flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
[flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:41763
[flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
{noformat}

  was:
Opening to track how frequent this is. Observed on

org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy

in https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/13716

{noformat}
java.util.concurrent.ExecutionException: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (6eddfd5820521f0898920a538c4a82dd)
Stacktrace
java.util.concurrent.ExecutionException: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (6eddfd5820521f0898920a538c4a82dd)
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
	at org.apache.beam.runners.flink.FlinkSavepointTest.takeSavepointAndCancelJob(FlinkSavepointTest.java:253)
	at org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:172)
	at org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy(FlinkSavepointTest.java:144)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (6eddfd5820521f0898920a538c4a82dd)
	at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGatewayFuture(Dispatcher.java:803)
	at org.apache.flink.runtime.dispatcher.Dispatcher.triggerSavepoint(Dispatcher.java:582)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:274)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:189)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:147)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Standard Output
Shutting SDK harness down.
Shutting SDK harness down.
Standard Error
[Test worker] INFO org.apache.beam.runners.flink.FlinkSavepointTest - Savepoints will be written to file:///tmp/junit5733722483577894112
[Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) required for local execution is not set, setting it to its default value 1.7976931348623157E308
[Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The con
...[truncated 527576 chars]...
flink-metrics-2] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
[flink-metrics-2] INFO org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
[flink-metrics-2] INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
[flink-metrics-2] INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:46101
[flink-metrics-2] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
{noformat}



> Flink Java Runner test flake: Could not find Flink job 
> -------------------------------------------------------
>
>                 Key: BEAM-10955
>                 URL: https://issues.apache.org/jira/browse/BEAM-10955
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Valentyn Tymofieiev
>            Priority: P1
>              Labels: currently-failing, flake
>
> Opening to track how frequent this is. Observed on
> org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy
> in https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/17610/
> {noformat}
> java.util.concurrent.ExecutionException: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (beaf622cf41bf97650ea5b26bbcc3db8)
> 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 	at org.apache.beam.runners.flink.FlinkSavepointTest.takeSavepointAndCancelJob(FlinkSavepointTest.java:255)
> 	at org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:174)
> 	at org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestoreLegacy(FlinkSavepointTest.java:146)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (beaf622cf41bf97650ea5b26bbcc3db8)
> 	at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGateway(Dispatcher.java:894)
> 	at org.apache.flink.runtime.dispatcher.Dispatcher.performOperationOnJobMasterGateway(Dispatcher.java:907)
> 	at org.apache.flink.runtime.dispatcher.Dispatcher.triggerSavepoint(Dispatcher.java:699)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305)
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
> 	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> 	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> 	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> 	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> 	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> 	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> 	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> 	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> 	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> 	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> 	at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> 	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Salida estandardShutting SDK harness down.
> Salida de error[Test worker] INFO org.apache.beam.runners.flink.FlinkSavepointTest - Savepoints will be written to file:///tmp/junit8140846753684197576
> [Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) required for local execution is not set, setting it to its default value 1.7976931348623157E308
> [Test worker] INFO org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils - The con
> ...[truncated 529884 chars]...
> C service.
> [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
> [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
> [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:41763
> [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)