You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Akshay Iyangar <ai...@godaddy.com> on 2019/09/26 20:20:18 UTC

** Help need w.r.t parallelism settings in flink **

Hi
So we are running a beam pipeline that uses flink as its execution engine. We are currently on flink1.8
So per the flink documentation I see that there is an option that allows u to set

Parallelism
and
maxParallelism.

We actually want to set both so that we can dynamically scale the pipeline if there is back pressure on it.

I want to clarify a few doubts I had –


  1.  Does the maxParallelism only work for stateful operators or does it work for all the operators?


  1.  Also is the setting either or ? Like we can only set parallelism or maxParallelism? Or is it that we can set both?


  1.  Because, Currently I have the parallelism at 8 and maxparallelism at 32 but the UI only shows me that it is using parallelism of 8. It would be great if someone helps me understand the exact behavior of these parameters.

Thanks
Akshay Iyangar


Re: ** Help need w.r.t parallelism settings in flink **

Posted by Zhu Zhu <re...@gmail.com>.
Hi Akshay,

For your questions,
1. One main purpose of maxParallelism is to decide the count of
keyGroup. keyGroup is the bucket for keys when doing keyBy partitioning.
So a larger maxParallelism indicates a finer granularity for key
distribution. No matter it's a stateful operator or not.

2. You can only set the parallelism and Flink can automatically decide the
maxParallelism for it.
But it is recommended to set the maxParallelism to a fixed proper value.

3. The parallelism is the actually parallelism used.
maxParallelism is the upper bound limit of parallelism when you tries to
change the parallelism via manually rescaling.

Thanks,
Zhu Zhu

Akshay Iyangar <ai...@godaddy.com> 于2019年9月27日周五 上午4:20写道:

> Hi
>
> So we are running a beam pipeline that uses flink as its execution engine.
> We are currently on flink1.8
>
> So per the flink documentation I see that there is an option that allows u
> to set
>
>
>
> *Parallelism *
>
> and
>
> *maxParallelism*.
>
>
>
> We actually want to set both so that we can dynamically scale the pipeline
> if there is back pressure on it.
>
>
>
> I want to clarify a few doubts I had –
>
>
>
>    1. Does the maxParallelism only work for stateful operators or does it
>    work for all the operators?
>
>
>
>    1. Also is the setting either or ? Like we can only set parallelism or
>    maxParallelism? Or is it that we can set both?
>
>
>
>    1. Because, Currently I have the parallelism at 8 and maxparallelism
>    at 32 but the UI only shows me that it is using parallelism of 8. It would
>    be great if someone helps me understand the exact behavior of these
>    parameters.
>
>
>
> Thanks
>
> Akshay Iyangar
>
>
>