Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/03 03:31:26 UTC

[GitHub] [hudi] cshuo commented on a diff in pull request #6325: [MINOR] Improve flink dummySink's parallelism

cshuo commented on code in PR #6325:
URL: https://github.com/apache/hudi/pull/6325#discussion_r962096207


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/Pipelines.java:
##########
@@ -432,7 +432,9 @@ public static DataStreamSink<Object> clean(Configuration conf, DataStream<Object
   }
 
   public static DataStreamSink<Object> dummySink(DataStream<Object> dataStream) {
-    return dataStream.addSink(Pipelines.DummySink.INSTANCE).name("dummy");
+    return dataStream.addSink(Pipelines.DummySink.INSTANCE)
+        .setParallelism(1)
+        .name("dummy");

Review Comment:
   Sorry to jump in, but I think it would be more appropriate to set the parallelism of the dummy sink to the value of `FlinkOptions.WRITE_TASKS`, so that the dummy sink can be chained with hoodie_write_task. That would reduce resource cost in some cases, e.g., when slot sharing is disabled.
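
To illustrate the resource argument, here is a minimal sketch in plain Java. It is a toy model, not Flink or Hudi code: the class `ChainingSketch` and both of its methods are hypothetical names, and the model assumes only that two adjacent operators can be chained when their parallelism matches (Flink imposes further conditions, such as forward partitioning) and that, with slot sharing disabled, every subtask of every unchained task occupies its own slot.

```java
/** Toy model of the chaining argument; this is NOT Flink's or Hudi's API. */
final class ChainingSketch {

    /**
     * Simplified chaining rule (assumption): adjacent operators can be
     * chained only if they run with the same parallelism.
     */
    static boolean canChain(int upstreamParallelism, int downstreamParallelism) {
        return upstreamParallelism == downstreamParallelism;
    }

    /**
     * Slots needed for the write task plus the dummy sink when slot sharing
     * is disabled (assumption: one slot per subtask of each unchained task).
     */
    static int slotsWithoutSlotSharing(int writeTasks, int sinkParallelism) {
        if (canChain(writeTasks, sinkParallelism)) {
            // Writer and sink run in the same chained task.
            return writeTasks;
        }
        // Writer and sink are separate tasks, each with its own slots.
        return writeTasks + sinkParallelism;
    }

    public static void main(String[] args) {
        int writeTasks = 4; // stands in for the FlinkOptions.WRITE_TASKS value

        // Sink parallelism fixed at 1, as in this diff: no chaining.
        System.out.println(slotsWithoutSlotSharing(writeTasks, 1)); // prints 5

        // Sink parallelism equal to the write tasks: chained with the writer.
        System.out.println(slotsWithoutSlotSharing(writeTasks, writeTasks)); // prints 4
    }
}
```

Under these assumptions, matching the sink's parallelism to the write tasks saves the extra slot that a standalone sink task with parallelism 1 would need.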


