Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/04/28 06:08:49 UTC

[GitHub] [beam] dmvk commented on a change in pull request #11530: [BEAM-9824] Do not ignore chained Reshuffles on flink batch runner.

dmvk commented on a change in pull request #11530:
URL: https://github.com/apache/beam/pull/11530#discussion_r416351768



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkBatchTransformTranslators.java
##########
@@ -330,7 +331,12 @@ public void translateNode(
               outputType,
               FlinkIdentityFunction.of(),
               getCurrentTransformName(context));
-      context.setOutputDataSet(context.getOutput(transform), retypedDataSet.rebalance());
+      final Configuration partitionOptions = new Configuration();
+      partitionOptions.setString(
+          Optimizer.HINT_SHIP_STRATEGY, Optimizer.HINT_SHIP_STRATEGY_REPARTITION);

Review comment:
       Yes, it should be.
   
   `SHIP_REPARTITION` gets interpreted as `PARTITION_RANDOM`:
   
   ```java
   			} else if (shipStrategy.equalsIgnoreCase(Optimizer.HINT_SHIP_STRATEGY_REPARTITION)) {
   				preSet = ShipStrategyType.PARTITION_RANDOM;
   ```
   
   which should have exactly the same implementation as `PARTITION_FORCED_REBALANCE`.
   
   From `org.apache.flink.runtime.operators.shipping.OutputEmitter#selectChannel`:
   
   ```java
   	@Override
   	public final int selectChannel(SerializationDelegate<T> record) {
   		switch (strategy) {
   		case FORWARD:
   			return forward();
   		case PARTITION_RANDOM:
   		case PARTITION_FORCED_REBALANCE:
   			return robin(numberOfChannels);
   		case PARTITION_HASH:
   			return hashPartitionDefault(record.getInstance(), numberOfChannels);
   ```
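   
   To make the equivalence concrete, here is a minimal, self-contained sketch of the fall-through above. It is a simplified stand-in, not Flink's `OutputEmitter`: the class `ChannelSelectorSketch`, its `Strategy` enum, the helper `channelsFor`, and the choice to start the round-robin at channel 0 are all assumptions made for illustration.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   /** Simplified stand-in for the quoted switch; NOT Flink's OutputEmitter. */
   final class ChannelSelectorSketch {

     /** Hypothetical subset of strategies, named after those in the quoted code. */
     enum Strategy { FORWARD, PARTITION_RANDOM, PARTITION_FORCED_REBALANCE }

     private final Strategy strategy;
     private int nextChannel = 0; // assumption: round-robin starts at channel 0

     ChannelSelectorSketch(Strategy strategy) {
       this.strategy = strategy;
     }

     /** Picks the output channel for the next record. */
     int selectChannel(int numberOfChannels) {
       switch (strategy) {
         case FORWARD:
           return 0; // assumption: forward keeps records on a single channel
         case PARTITION_RANDOM:
         case PARTITION_FORCED_REBALANCE:
           // Both strategies share the same round-robin branch.
           return roundRobin(numberOfChannels);
         default:
           throw new UnsupportedOperationException("Unsupported strategy: " + strategy);
       }
     }

     /** Cycles through channels 0..numberOfChannels-1. */
     private int roundRobin(int numberOfChannels) {
       int channel = nextChannel % numberOfChannels;
       nextChannel = channel + 1;
       return channel;
     }

     /** Returns the channels chosen for the given number of consecutive records. */
     static List<Integer> channelsFor(Strategy strategy, int records, int numberOfChannels) {
       ChannelSelectorSketch selector = new ChannelSelectorSketch(strategy);
       List<Integer> channels = new ArrayList<>();
       for (int i = 0; i < records; i++) {
         channels.add(selector.selectChannel(numberOfChannels));
       }
       return channels;
     }
   }
   ```
   
   Under these assumptions, `channelsFor(Strategy.PARTITION_RANDOM, 6, 3)` and `channelsFor(Strategy.PARTITION_FORCED_REBALANCE, 6, 3)` produce the same channel sequence, which is the property the hint relies on.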



