Posted to dev@flink.apache.org by "Lijie Wang (Jira)" <ji...@apache.org> on 2023/02/15 07:06:00 UTC

[jira] [Created] (FLINK-31079) Release Testing: Verify FLINK-29663 Further improvements of adaptive batch scheduler

Lijie Wang created FLINK-31079:
----------------------------------

             Summary: Release Testing: Verify FLINK-29663 Further improvements of adaptive batch scheduler
                 Key: FLINK-31079
                 URL: https://issues.apache.org/jira/browse/FLINK-31079
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
            Reporter: Lijie Wang
             Fix For: 1.17.0


This task aims to verify FLINK-29663, which improves the adaptive batch scheduler.

Before FLINK-29663, the adaptive batch scheduler distributed subpartitions according to their number, so that different downstream subtasks consumed roughly the same number of subpartitions. This leads to imbalanced loads across downstream subtasks when the subpartitions contain different amounts of data.

To solve this problem, FLINK-29663 lets the adaptive batch scheduler distribute subpartitions according to the amount of data they contain, so that different downstream subtasks consume roughly the same amount of data. Note that this currently only takes effect for All-To-All edges.
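
The difference between the two strategies can be illustrated with a small standalone sketch. This is not Flink's implementation: the function names, the greedy grouping rule, and the byte sizes are all hypothetical, and it assumes each downstream subtask reads a consecutive range of subpartitions and that there are at least as many subpartitions as subtasks.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) range of subpartition indexes


def split_by_count(sizes: List[int], parallelism: int) -> List[Range]:
    """Old behaviour (sketch): each subtask gets roughly the same NUMBER of subpartitions."""
    n = len(sizes)
    return [(i * n // parallelism, (i + 1) * n // parallelism) for i in range(parallelism)]


def split_by_bytes(sizes: List[int], parallelism: int) -> List[Range]:
    """New behaviour (sketch): close a range once it reaches the per-subtask byte target.

    Hypothetical greedy rule, assuming len(sizes) >= parallelism.
    """
    target = sum(sizes) / parallelism
    ranges: List[Range] = []
    start, acc = 0, 0
    for i, size in enumerate(sizes):
        acc += size
        ranges_still_needed = parallelism - len(ranges) - 1
        subpartitions_left = len(sizes) - (i + 1)
        # Close the range when it is full, or when every remaining subpartition
        # is needed to keep the remaining ranges non-empty.
        if ranges_still_needed > 0 and (acc >= target or subpartitions_left == ranges_still_needed):
            ranges.append((start, i + 1))
            start, acc = i + 1, 0
    ranges.append((start, len(sizes)))  # last subtask takes whatever is left
    return ranges


def loads(sizes: List[int], ranges: List[Range]) -> List[int]:
    """Bytes consumed by each downstream subtask."""
    return [sum(sizes[a:b]) for a, b in ranges]


# Hypothetical skew: the first subpartition is ten times larger than the others.
sizes = [100, 10, 10, 10, 10, 10, 10, 10]
by_count = split_by_count(sizes, 2)
by_bytes = split_by_bytes(sizes, 2)
print(loads(sizes, by_count))  # [130, 40]
print(loads(sizes, by_bytes))  # [100, 70]
```

With the count-based split both subtasks read four subpartitions but one handles more than three times the data of the other; the size-based split narrows that gap, although a single oversized subpartition still bounds how even the result can be.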

The documentation of the adaptive batch scheduler can be found [here|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#adaptive-batch-scheduler].

One can verify it by intentionally creating data skew on All-To-All edges.
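
As a rough idea of what such a skewed input might look like, the sketch below simulates how a single hot key concentrates records in one subpartition of a hash shuffle. It is only an illustration: `key % num_subpartitions` is a stand-in for the real partitioning logic, and the key distribution, record counts, and function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def subpartition_record_counts(keys: Iterable[int], num_subpartitions: int) -> List[int]:
    """Count the records landing in each subpartition under a toy hash partitioner."""
    counts = Counter(key % num_subpartitions for key in keys)
    return [counts.get(i, 0) for i in range(num_subpartitions)]


# Hypothetical test data: 90% of the records share the hot key 0,
# the remaining 10% are spread evenly over keys 1..1000.
keys = [0] * 9000 + list(range(1, 1001))
counts = subpartition_record_counts(keys, 8)
print(counts)  # [9125, 125, 125, 125, 125, 125, 125, 125]
```

With input shaped like this, the downstream subtask that reads the hot subpartition should be assigned fewer subpartitions than the others if the size-based distribution is in effect.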



--
This message was sent by Atlassian Jira
(v8.20.10#820010)