Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/11/06 15:49:19 UTC

[GitHub] [flink] wanglijie95 commented on a diff in pull request #21111: [FLINK-29664][runtime] Collect subpartition sizes of blocking result partitions

wanglijie95 commented on code in PR #21111:
URL: https://github.com/apache/flink/pull/21111#discussion_r1014852386


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptivebatch/AdaptiveBatchScheduler.java:
##########
@@ -149,13 +159,45 @@ protected void startSchedulingInternal() {
         super.startSchedulingInternal();
     }
 
+    public boolean updateTaskExecutionState(TaskExecutionState taskExecutionState) {
+        updateResultPartitionBytesMetrics(taskExecutionState.getIOMetrics());

Review Comment:
   > Looks to me it's not right because the data produced by an execution vertex can change if 
   
   You are right. It looks like we have to record the bytes for each subpartition individually (without aggregation), so that the bytes can be reset when a partition is reset.
   However, once all partitions are finished, we can convert the per-subpartition bytes to an aggregated value (to reduce space usage). This is safe because the distribution of source splits does not affect the distribution of data consumed by downstream tasks of ALL_TO_ALL edges (hashing or rebalancing; we do not consider rare cases such as custom partitioners here).
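   A minimal sketch of the idea above (class and method names are illustrative, not Flink's actual API): keep per-subpartition byte counts while partitions may still be reset, then collapse them to a single aggregated value once all producers have finished.

```java
import java.util.Arrays;

// Hypothetical sketch: record per-subpartition bytes so they can be reset
// if the producing partition restarts, then aggregate once it is safe.
class SubpartitionBytesTracker {
    private long[] subpartitionBytes;   // per-subpartition, resettable
    private long aggregatedBytes;       // valid only after aggregate()
    private boolean aggregated = false;

    SubpartitionBytesTracker(int numSubpartitions) {
        this.subpartitionBytes = new long[numSubpartitions];
    }

    void record(int subpartition, long bytes) {
        if (aggregated) throw new IllegalStateException("already aggregated");
        subpartitionBytes[subpartition] = bytes;
    }

    // Called when the producing partition is reset (e.g. on task restart).
    void reset() {
        if (aggregated) throw new IllegalStateException("already aggregated");
        Arrays.fill(subpartitionBytes, 0L);
    }

    // Once all partitions are finished, keep only the sum to save space.
    // Safe for ALL_TO_ALL edges, where downstream data distribution does
    // not depend on the per-subpartition split.
    void aggregate() {
        aggregatedBytes = Arrays.stream(subpartitionBytes).sum();
        subpartitionBytes = null; // free the per-subpartition array
        aggregated = true;
    }

    long totalBytes() {
        return aggregated
                ? aggregatedBytes
                : Arrays.stream(subpartitionBytes).sum();
    }
}
```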
   
   > Because `taskExecutionState.getExecutionState() == ExecutionState.FINISHED` does not mean the task truly transitions to FINISHED at JM side.
   
   Yes, we should check whether the state update was successful.
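   As a rough illustration of that check (a simplified stand-in, not the real scheduler code): the metrics should only be recorded after the state transition has actually been accepted on the JM side, since a FINISHED report from the TM can still be rejected.

```java
// Hypothetical sketch: applyUpdate stands in for the scheduler's
// updateTaskExecutionState; the boolean models whether the JobMaster
// accepted the transition.
enum ReportedState { RUNNING, FINISHED, FAILED }

class TaskStateUpdateSketch {
    long recordedBytes = 0;

    boolean applyUpdate(ReportedState reported, boolean acceptedByJobMaster, long ioBytes) {
        if (!acceptedByJobMaster) {
            return false; // transition rejected: do NOT record metrics
        }
        if (reported == ReportedState.FINISHED) {
            recordedBytes += ioBytes; // record only after a successful transition
        }
        return true;
    }
}
```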
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org