Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/06/14 17:26:14 UTC

[GitHub] [beam] MiguelAnzoWizeline commented on a change in pull request #14811: [BEAM-11996] spannerio splittable

MiguelAnzoWizeline commented on a change in pull request #14811:
URL: https://github.com/apache/beam/pull/14811#discussion_r651139190



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/BatchSpannerRead.java
##########
@@ -73,18 +74,16 @@ public static BatchSpannerRead create(
         .apply(
             "Generate Partitions",
             ParDo.of(new GeneratePartitionsFn(getSpannerConfig(), txView)).withSideInputs(txView))
-        .apply("Shuffle partitions", Reshuffle.<Partition>viaRandomKey())
         .apply(
             "Read from Partitions",
             ParDo.of(new ReadFromPartitionFn(getSpannerConfig(), txView)).withSideInputs(txView));
   }
 
   @VisibleForTesting
-  static class GeneratePartitionsFn extends DoFn<ReadOperation, Partition> {
+  static class GeneratePartitionsFn extends DoFn<ReadOperation, List<Partition>> {

Review comment:
       Hello @boyuanzz, I have been working on this part, but I have run into a problem with the approach. I changed the code to a single DoFn and removed `GeneratePartitionsFn` as you suggested, so my DoFn now looks like this: `private static class ReadFromPartitionFn extends DoFn<ReadOperation, Struct>`. However, I'm having trouble getting `@GetInitialRestriction` and `@SplitRestriction` right. To know the size of the partition list we are going to process, I need the side input (`c.sideInput(txView);`) in order to process the ReadOperation. Is there a way to access the side input in `@GetInitialRestriction` or `@SplitRestriction`, so that I can get the list size and the number of partitions to process? Or was the approach you were suggesting something different? Thanks in advance.
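
       For illustration only, the restriction described above can be sketched as a half-open range over indices into the partition list. This is plain Java with no Beam dependency: `IndexRange`, `initial`, and `split` are hypothetical names, not Beam APIs, and the sketch assumes the partition count is already known, which is exactly the value that currently requires the side input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a restriction over partition indices; not a Beam class.
final class IndexRange {
  final int from; // inclusive index of the first partition in this range
  final int to;   // exclusive index one past the last partition in this range

  IndexRange(int from, int to) {
    if (from < 0 || to < from) {
      throw new IllegalArgumentException("invalid range [" + from + ", " + to + ")");
    }
    this.from = from;
    this.to = to;
  }

  int size() {
    return to - from;
  }

  // What an initial restriction would cover: every partition in the list.
  // Assumption: partitionCount has already been obtained somehow.
  static IndexRange initial(int partitionCount) {
    return new IndexRange(0, partitionCount);
  }

  // What a split step would produce: at most desiredSplits contiguous,
  // non-empty sub-ranges of near-equal size that together cover this range.
  List<IndexRange> split(int desiredSplits) {
    if (desiredSplits < 1) {
      throw new IllegalArgumentException("desiredSplits must be >= 1");
    }
    List<IndexRange> out = new ArrayList<>();
    int n = size();
    int parts = Math.min(desiredSplits, n); // never emit empty sub-ranges
    int start = from;
    for (int i = 0; i < parts; i++) {
      // The first (n % parts) sub-ranges take one extra element.
      int len = n / parts + (i < n % parts ? 1 : 0);
      out.add(new IndexRange(start, start + len));
      start += len;
    }
    return out;
  }
}
```

       With something of this shape, each sub-range would identify which partitions a given call has to read, so the open question is only where the partition count can come from.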



