You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Aeden Jameson <ae...@gmail.com> on 2021/03/18 16:40:55 UTC
Understanding Max Parallelism
I'm trying to get my head around the impact of setting max parallelism.
* Does max parallelism primarily serve as a reservation for future
increases to parallelism? The reservation being the ability to restore
from checkpoints and savepoints after increases to parallelism.
* Does it serve as a runtime suggestion for how many instances of an
operator the job could spin up? Or is it just a reservation like I
asked above?
* It also appears to impact the distribution of key groups among
subtasks from what I've read and seen from testing. Is that
understanding correct?
* What are the other important implications?
Thank you,
Aeden
Re: Understanding Max Parallelism
Posted by Dawid Wysakowicz <dw...@apache.org>.
Hi Aeden,
The maxParallelism option defines the number of key groups that will be
created within the keyed state and thus define the maximum parallelism
that a Flink keyed job can scale up to as each key group must be
atomically assigned to a single task. You can read more on how the
rescaling works in this blogpost[1].
Following up on your other questions it is mainly a reservation as of
now, but it will definitely be a cap in case of a reactive/auto scaling
because of the above.
Best,
Dawid
[1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html
On 18/03/2021 17:40, Aeden Jameson wrote:
> I'm trying to get my head around the impact of setting max parallelism.
>
> * Does max parallelism primarily serve as a reservation for future
> increases to parallelism? The reservation being the ability to restore
> from checkpoints and savepoints after increases to parallelism.
>
> * Does it serve as a runtime suggestion for how many instances of an
> operator the job could spin up? Or is it just a reservation like I
> asked above?
>
> * It also appears to impact the distribution of key groups among
> subtasks from what I've read and seen from testing. Is that
> understanding correct?
>
> * What are the other important implications?
>
>
> Thank you,
> Aeden