Posted to user@flink.apache.org by Aeden Jameson <ae...@gmail.com> on 2021/03/18 16:40:55 UTC

Understanding Max Parallelism

I'm trying to get my head around the impact of setting max parallelism.

* Does max parallelism primarily serve as a reservation for future
increases to parallelism? The reservation being the ability to restore
from checkpoints and savepoints after increases to parallelism.

* Does it serve as a runtime suggestion for how many instances of an
operator the job could spin up? Or is it just a reservation like I
asked above?

* It also appears to impact the distribution of key groups among
subtasks from what I've read and seen from testing. Is that
understanding correct?

* What are the other important implications?


Thank you,
Aeden

Re: Understanding Max Parallelism

Posted by Dawid Wysakowicz <dw...@apache.org>.
Hi Aeden,

The maxParallelism option defines the number of key groups created for
keyed state. Because each key group must be assigned atomically to a
single task, it also defines the maximum parallelism that a keyed Flink
job can scale up to. You can read more about how rescaling works in this
blog post [1].
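
To make the key group idea concrete, here is a minimal sketch of how keys
could map to key groups and key groups to subtasks. It is an illustration
under assumptions, not Flink's actual code: Python's built-in hash() stands
in for the hash function Flink applies to the key, and the function names
are hypothetical. The range-style formula
key_group * parallelism // max_parallelism is assumed for the second step.

```python
from collections import Counter


def key_group_for(key, max_parallelism):
    # Stand-in hash for illustration only; Flink uses its own hash function.
    return hash(key) % max_parallelism


def subtask_for_key_group(key_group, parallelism, max_parallelism):
    # Each key group lands on exactly one subtask (assumed range-style formula).
    return key_group * parallelism // max_parallelism


def key_groups_per_subtask(parallelism, max_parallelism):
    # Count how many key groups each subtask index would own.
    counts = Counter(
        subtask_for_key_group(kg, parallelism, max_parallelism)
        for kg in range(max_parallelism)
    )
    return [counts.get(i, 0) for i in range(parallelism)]


if __name__ == "__main__":
    print(key_groups_per_subtask(4, 128))  # [32, 32, 32, 32]
    print(key_groups_per_subtask(6, 128))  # [22, 21, 21, 22, 21, 21]
```

When the parallelism does not divide the number of key groups evenly, some
subtasks own one more key group than others, which is one way max
parallelism can influence how keyed state is distributed.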

To follow up on your other questions: as of now it is mainly a
reservation, but because of the above it will definitely act as a cap
for reactive or auto scaling.
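
Why the number of key groups acts as a cap can be sketched as follows. This
is a hypothetical illustration, again assuming the range-style assignment
key_group * parallelism // max_parallelism. It does not model any validation
that Flink itself performs on these settings.

```python
def subtasks_with_state(parallelism, max_parallelism):
    # Subtask indices that would own at least one key group.
    return sorted(
        {kg * parallelism // max_parallelism for kg in range(max_parallelism)}
    )


if __name__ == "__main__":
    print(subtasks_with_state(4, 4))  # [0, 1, 2, 3]
    print(subtasks_with_state(8, 4))  # [0, 2, 4, 6]
```

With only 4 key groups, at most 4 subtasks can ever hold keyed state, so
scaling past that number would add subtasks with nothing to process.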

Best,

Dawid

[1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html
