Posted to dev@flink.apache.org by Prasanna kumar <pr...@gmail.com> on 2020/07/06 17:01:36 UTC

Flink Parallelism for various type of transformation

Hi,

I am using t2.medium machines for the task manager nodes. Each has 2 CPUs and
4 GB of memory.

But the task manager screen shows that there are 4 slots.

Generally we should match the number of slots to the number of cores.

Our pipeline is Source -> Simple Transform -> Sink.

What happens when we have more slots than cores in the following scenarios?

1) The transform only changes the JSON format.

2) The transform calls another server (HTTP request).

Thanks,
Prasanna.

Re: Flink Parallelism for various type of transformation

Posted by Arvid Heise <ar...@ververica.com>.
Hi Prasanna,

Overcommitting cores was actually a recommended technique a while ago to
counterbalance time spent waiting on I/O, so it's not bad per se.

However, with slot sharing each core is already doing the work for source,
transform, and sink, so overcommitting is not necessary. I'd go with
slots = cores, and I rather strongly suggest switching to async I/O to
perform the external transformation. [1]

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html
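
To illustrate why async I/O helps in scenario 2, here is a minimal plain-Java sketch that does not use Flink at all. The class name, the callRemote helper, and the 50 ms latency are hypothetical stand-ins for the HTTP request. It contrasts handling one request at a time with keeping several requests in flight, which is roughly what Flink's AsyncDataStream.orderedWait / unorderedWait (described in [1]) arrange inside a single operator. A thread pool is used here only to keep the sketch self-contained; a real job would more likely use a non-blocking HTTP client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncEnrichmentSketch {

    // Hypothetical stand-in for the HTTP call: sleeps to mimic network latency,
    // then returns a transformed record.
    static String callRemote(String record) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return record.toUpperCase();
    }

    // Blocking style: the caller waits for each response before sending the
    // next request, so the core sits idle during every round trip.
    static List<String> enrichBlocking(List<String> records) {
        List<String> out = new ArrayList<>();
        for (String r : records) {
            out.add(callRemote(r));
        }
        return out;
    }

    // Async style: up to `capacity` requests are in flight at once.
    // Results are collected in input order.
    static List<String> enrichAsync(List<String> records, int capacity) {
        ExecutorService pool = Executors.newFixedThreadPool(capacity);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String r : records) {
                futures.add(CompletableFuture.supplyAsync(() -> callRemote(r), pool));
            }
            List<String> out = new ArrayList<>();
            for (CompletableFuture<String> f : futures) {
                out.add(f.join());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            records.add("record-" + i);
        }

        long t0 = System.nanoTime();
        enrichBlocking(records);
        long t1 = System.nanoTime();
        enrichAsync(records, 10);
        long t2 = System.nanoTime();

        System.out.println("blocking ms: " + (t1 - t0) / 1_000_000);
        System.out.println("async ms:    " + (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same records in the same order. Only the waiting is overlapped, which is why adding extra slots to hide I/O latency becomes unnecessary once the transform itself is asynchronous.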
