You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 19:06:21 UTC

[GitHub] [beam] damccorm opened a new issue, #20723: Flink ValidatesRunner tests on Go SDK timing out on GRPC Read.

damccorm opened a new issue, #20723:
URL: https://github.com/apache/beam/issues/20723

   Configuration:
   * Go SDK Pipelines
   * Flink 1.10 Job Server (Java)
   * Java Test Expansion Service
   
   Fails on TestXLang_Combine and TestXLang_CombineGlobally.
   
   The two tests are cross-language tests, but there's a decent chance the problem isn't directly cross-language related since only two of the cross-language tests are affected and others run fine.
   
   Error stacktrace:
   
   ```
   
   2020/12/17 19:12:29  (): java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobExecutionException:
   Job execution failed.
           at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
   
          at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
         
    at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:864)
           at
   org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:199)
   
          at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:118)
   
          at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:85)
        
     at org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
         
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
   
          at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
   
          at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
   
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   
          at java.base/java.lang.Thread.run(Thread.java:835)
   Caused by: org.apache.flink.runtime.client.JobExecutionException:
   Job execution failed.
           at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
   
          at org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:175)
   
          at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
   
          at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
   
          at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
    
         at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:874)
         
    at akka.dispatch.OnComplete.internal(Future.scala:264)
           at akka.dispatch.OnComplete.internal(Future.scala:261)
   
          at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
           at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
   
          at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
           at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
   
          at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
           at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
   
          at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
           at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
   
          at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
   
          at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
           at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
   
          at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
           at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
   
          at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
   
          at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
   
          at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
   
          at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
           at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
   
          at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
           at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
   
          at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
           at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   
          at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
           at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
   Caused
   by: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
   
          at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
   
          at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
   
          at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
   
          at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:186)
   
          at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:180)
   
          at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:496)
   
          at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
   
          at jdk.internal.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   
          at java.base/java.lang.reflect.Method.invoke(Method.java:566)
           at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
   
          at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
    
         at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
   
          at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
       
      at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
           at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
   
          at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
           at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
   
          at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
           at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
   
          at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
           at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
   
          at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
           at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
   
          at akka.actor.ActorCell.invoke(ActorCell.scala:561)
           at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
   
          at akka.dispatch.Mailbox.run(Mailbox.scala:225)
           at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
   
          ... 4 more
   Caused by: java.lang.RuntimeException: No client connected within timeout
        
     at org.apache.beam.runners.fnexecution.data.GrpcDataService.send(GrpcDataService.java:192)
        
     at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor.newBundle(SdkHarnessClient.java:287)
   
          at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$BundleProcessor.newBundle(SdkHarnessClient.java:197)
   
          at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.getBundle(DefaultJobBundleFactory.java:519)
   
          at org.apache.beam.runners.fnexecution.control.StageBundleFactory.getBundle(StageBundleFactory.java:60)
   
          at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:261)
   
          at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
    
         at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
           at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
   
          at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
           at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
   
          at java.base/java.lang.Thread.run(Thread.java:835)
   Caused by: java.util.concurrent.TimeoutException:
   Waited 3 minutes for org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.SettableFuture@3082f8b1[status=PENDING]
   
          at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:471)
   
          at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:90)
   
          at org.apache.beam.runners.fnexecution.data.GrpcDataService.send(GrpcDataService.java:187)
   
          ... 11 more
   
   ```
   
   
   
   Imported from Jira [BEAM-11500](https://issues.apache.org/jira/browse/BEAM-11500). Original Jira may contain additional context.
   Reported by: danoliveira.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org