You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 15:27:57 UTC

[GitHub] [beam] damccorm opened a new issue, #20104: Running GroupByKeyLoadTest on Portable Flink fails

damccorm opened a new issue, #20104:
URL: https://github.com/apache/beam/issues/20104

   When running a GBK Load test using Java harness image and JobServer image generated from master, the load test fails with a cryptic exception:
   ```
   
   Exception in thread "main" java.lang.RuntimeException: Invalid job state: FAILED.
   11:45:31 	at org.apache.beam.sdk.loadtests.JobFailure.handleFailure(JobFailure.java:55)
   11:45:31
   	at org.apache.beam.sdk.loadtests.LoadTest.run(LoadTest.java:106)
   11:45:31 	at org.apache.beam.sdk.loadtests.CombineLoadTest.run(CombineLoadTest.java:66)
   11:45:31
   	at org.apache.beam.sdk.loadtests.CombineLoadTest.main(CombineLoadTest.java:169)
   
   ```
   
    
   
   After some investigation, I found a stacktrace of the error:
   ```
   
   org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: call already cancelledorg.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException:
   CANCELLED: call already cancelled at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
   at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
   at org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98) at org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.flush(BeamFnDataSizeBasedBufferingOutboundObserver.java:90)
   at org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.accept(BeamFnDataSizeBasedBufferingOutboundObserver.java:102)
   at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.processElements(FlinkExecutableStageFunction.java:278)
   at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:201)
   at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103) at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
   at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at java.lang.Thread.run(Thread.java:748)
   Suppressed: org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: CANCELLED: call already
   cancelled at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status.asRuntimeException(Status.java:524)
   at org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:339)
   at org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:98) at org.apache.beam.sdk.fn.data.BeamFnDataSizeBasedBufferingOutboundObserver.close(BeamFnDataSizeBasedBufferingOutboundObserver.java:84)
   at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:298)
   at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:202)
   at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:202)
   ... 6 more Suppressed: java.lang.IllegalStateException: Processing bundle failed, TODO: [BEAM-3962]
   abort bundle. at org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:320)
   ... 8 more
   
   ```
   
   It seems that the core issue is an IllegalStateException thrown from SdkHarnessClient.java:320, related to BEAM-3962.
   
    It is important to note that the stacktrace above comes from the Flink cluster, not from the Gradle job that was executed.
   
   The link to Jenkins job is here: [https://builds.apache.org/job/beam_LoadTests_Java_GBK_Flink_Batch_PR/28/console](https://builds.apache.org/job/beam_LoadTests_Java_GBK_Flink_Batch_PR/28/console)
   
   Imported from Jira [BEAM-8980](https://issues.apache.org/jira/browse/BEAM-8980). Original Jira may contain additional context.
   Reported by: mwalenia.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org