You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/04/26 05:02:11 UTC

[GitHub] [beam] hengfengli commented on pull request #17461: [BEAM-14365]: fix the negative throughput issue

hengfengli commented on PR #17461:
URL: https://github.com/apache/beam/pull/17461#issuecomment-1109346515

   > Is it clear where the negative throughput value is coming from? and setting it to 0 is the right thing to do when it is negative?
   
   @y1chi No, we are still not sure where the negative number comes from since it only happened after running over 2 days. We got the following error: 
   
   "java.lang.IllegalArgumentException: Expected size >= 0 but received -36590.86666666666. at org.apache.beam.sdk.transforms.reflect.ByteBuddyDoFnInvokerFactory$DefaultGetSize.validateSize(ByteBuddyDoFnInvokerFactory.java:438) at org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn$DoFnInvoker.invokeGetSize(Unknown Source) at org.apache.beam.fn.harness.FnApiDoFnRunner.calculateRestrictionSize(FnApiDoFnRunner.java:1182) at org.apache.beam.fn.harness.FnApiDoFnRunner.trySplitForElementAndRestriction(FnApiDoFnRunner.java:1625) at org.apache.beam.fn.harness.FnApiDoFnRunner.access$1800(FnApiDoFnRunner.java:142) at org.apache.beam.fn.harness.FnApiDoFnRunner$SplittableFnDataReceiver.trySplit(FnApiDoFnRunner.java:1113) at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$SplittingMetricTrackingFnDataReceiver.trySplit(PCollectionConsumerRegistry.java:342) at org.apache.beam.fn.harness.BeamFnDataReadRunner.trySplit(BeamFnDataReadRunner.java:259) at org.
 apache.beam.fn.harness.control.ProcessBundleHandler.trySplit(ProcessBundleHandler.java:688) at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:151) at org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:116) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833) " 
   
   Basically, it comes from `getSize()` in `ReadChangeStreamPartitionDoFn`: 
   
   https://github.com/apache/beam/blob/07f30d221e4b285b23b74c3509d77b62388b7bb4/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dofn/ReadChangeStreamPartitionDoFn.java#L159-L172
   
   The remainingWork in Progress cannot be negative ([code](https://github.com/apache/beam/blob/07f30d221e4b285b23b74c3509d77b62388b7bb4/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.java#L187-L192)). So the only possibility is from the `throughput`. 
   
   Setting it to 0 is the current effective way to fix the issue. Also, the throughput is not a critical part and does not impact the correctness of the connector. When the throughput is negative, setting it to 0 is acceptable. 
   
   In addition, I have been running a dataflow job with printing more logs to find out where the negative number actually comes from. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org