You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/02/05 17:15:04 UTC

[jira] [Commented] (BEAM-11500) Flink ValidatesRunner tests on Go SDK timing out on GRPC Read.

    [ https://issues.apache.org/jira/browse/BEAM-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279823#comment-17279823 ] 

Beam JIRA Bot commented on BEAM-11500:
--------------------------------------

This issue is assigned but has not received an update in 30 days so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> Flink ValidatesRunner tests on Go SDK timing out on GRPC Read.
> --------------------------------------------------------------
>
>                 Key: BEAM-11500
>                 URL: https://issues.apache.org/jira/browse/BEAM-11500
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-go
>            Reporter: Daniel Oliveira
>            Assignee: Daniel Oliveira
>            Priority: P2
>              Labels: stale-assigned
>
> Configuration:
> * Go SDK Pipelines
> * Flink 1.10 Job Server (Java)
> * Java Test Expansion Service
> Fails on TestXLang_Combine and TestXLang_CombineGlobally.
> The two tests are cross-language tests, but there's a decent chance the problem isn't directly cross-language related since only two of the cross-language tests are affected and others run fine.
> Error stacktrace:
> {noformat}
> 2020/12/17 19:12:29  (): java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>         at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
>         at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
>         at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:864)
>         at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:199)
>         at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:118)
>         at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:85)
>         at org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
>         at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>         at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
>         at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:835)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>         at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
>         at org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:175)
>         at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
>         at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>         at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>         at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:874)
>         at akka.dispatch.OnComplete.internal(Future.scala:264)
>         at akka.dispatch.OnComplete.internal(Future.scala:261)
>         at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
>         at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
>         at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>         at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
>         at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
>         at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
>         at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
>         at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
>         at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
>         at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
>         at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
>         at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>         at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
>         at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
>         at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>         at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>         at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>         at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
>         at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>         at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
>         at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
>         at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
>         at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
>         at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:186)
>         at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:180)
>         at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:496)
>         at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
>         at jdk.internal.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
>         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
>         at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>         at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>         at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>         at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>         at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>         at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>         at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>         at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>         at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>         at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>         ... 4 more
> Caused by: java.lang.RuntimeException: No client connected within timeout
>         at org.apache.beam.runners.fnexecution.data.GrpcDataService.send(GrpcDataService.java:192)
>         at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor.newBundle(SdkHarnessClient.java:287)
>         at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor.newBundle(SdkHarnessClient.java:197)
>         at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.getBundle(DefaultJobBundleFactory.java:519)
>         at org.apache.beam.runners.fnexecution.control.StageBundleFactory.getBundle(StageBundleFactory.java:60)
>         at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:261)
>         at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
>         at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
>         at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
>         at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
>         at java.base/java.lang.Thread.run(Thread.java:835)
> Caused by: java.util.concurrent.TimeoutException: Waited 3 minutes for org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.SettableFuture@3082f8b1[status=PENDING]
>         at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:471)
>         at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:90)
>         at org.apache.beam.runners.fnexecution.data.GrpcDataService.send(GrpcDataService.java:187)
>         ... 11 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)