Posted to dev@flink.apache.org by Zain Haider Nemati <za...@retailo.co> on 2022/05/20 08:51:14 UTC

Flatmap node at 100%

Hi,
I'm seeing this behaviour in my Flink job. What can I do to remove this
bottleneck?

[image: image.png]

Re: Flatmap node at 100%

Posted by JasonLee <17...@163.com>.
Hi


Your picture is not visible here. Could you upload it to an image hosting service and share the link? Could you also send the memory and stack information?


Best
JasonLee


---- Replied Message ----
| From | Zain Haider Nemati<za...@retailo.co> |
| Date | 05/22/2022 13:41 |
| To | yuxia<lu...@alumni.sjtu.edu.cn> |
| Cc | User<us...@flink.apache.org>, dev<de...@flink.apache.org> |
| Subject | Re: Flatmap node at 100% |
Hi Yuxia,
I did increase the parallelism to 16, but that is causing memory overflow issues. Task manager heap memory collapses at a certain point after the job has been running.
I'm attaching the metrics. The flatmap parses JSON records and converts them to comma-separated strings. Could you suggest how to optimize it?

On Fri, May 20, 2022 at 2:39 PM yuxia <lu...@alumni.sjtu.edu.cn> wrote:

Hi, I think you can increase the parallelism of the flatmap operator. For a SQL job, you can refer to the doc [1] to set the parallelism. For a DataStream job, you can set the parallelism in your code.

Also, if possible, you can try to optimize your code in the flatmap node.

[1]: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-resource-default-parallelism


Best regards,
Yuxia


From: "Zain Haider Nemati" <za...@retailo.co>
To: "User" <us...@flink.apache.org>, "dev" <de...@flink.apache.org>
Sent: Friday, May 20, 2022, 4:51:14 PM
Subject: Flatmap node at 100%

Hi,

I'm seeing this behaviour in my Flink job. What can I do to remove this bottleneck?




Re: Flatmap node at 100%

Posted by Zain Haider Nemati <za...@retailo.co>.
Hi Yuxia,
I did increase the parallelism to 16, but that is causing memory overflow
issues. Task manager heap memory collapses at a certain point after the
job has been running.
I'm attaching the metrics. The flatmap parses JSON records and converts
them to comma-separated strings. Could you suggest how to optimize it?

[image: image.png]

On Fri, May 20, 2022 at 2:39 PM yuxia <lu...@alumni.sjtu.edu.cn> wrote:

> Hi, I think you can increase the parallelism of the flatmap operator. For
> a SQL job, you can refer to the doc [1] to set the parallelism. For a
> DataStream job, you can set the parallelism in your code.
>
> Also, if possible, you can try to optimize your code in the flatmap node.
>
> [1]:
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-resource-default-parallelism
>
> Best regards,
> Yuxia
>
> ------------------------------
> *From: *"Zain Haider Nemati" <za...@retailo.co>
> *To: *"User" <us...@flink.apache.org>, "dev" <de...@flink.apache.org>
> *Sent: *Friday, May 20, 2022, 4:51:14 PM
> *Subject: *Flatmap node at 100%
>
> Hi,
> I'm seeing this behaviour in my Flink job. What can I do to remove this
> bottleneck?
>
> [image: image.png]
>
>

Re: Flatmap node at 100%

Posted by yuxia <lu...@alumni.sjtu.edu.cn>.
Hi, I think you can increase the parallelism of the flatmap operator. For a SQL job, you can refer to the doc [1] to set the parallelism. For a DataStream job, you can set the parallelism in your code.

Also, if possible, you can try to optimize your code in the flatmap node.

[1]: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-resource-default-parallelism

Best regards, 
Yuxia 


From: "Zain Haider Nemati" <za...@retailo.co>
To: "User" <us...@flink.apache.org>, "dev" <de...@flink.apache.org>
Sent: Friday, May 20, 2022, 4:51:14 PM
Subject: Flatmap node at 100%

Hi,
I'm seeing this behaviour in my Flink job. What can I do to remove this bottleneck?
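
The thread never shows the flatmap code itself, so the following is only a rough sketch of the kind of per-record work described above (parse a JSON record, emit comma-separated strings), written in plain Python with the standard library. The field names in FIELDS and the function name json_to_csv_lines are hypothetical and not taken from the original job. The sketch parses each record once, emits lines lazily instead of collecting them into a list, and reuses one buffer per call, which may help keep per-record allocations low.

```python
import csv
import io
import json

# Hypothetical field order; the thread does not show the real schema.
FIELDS = ("order_id", "sku", "quantity")


def json_to_csv_lines(raw, fields=FIELDS):
    """Yield one comma-separated line per JSON object found in `raw`.

    `raw` may hold a single JSON object or a JSON array of objects.
    Malformed input yields nothing instead of raising, so one bad
    record does not fail the whole task.
    """
    try:
        payload = json.loads(raw)  # parse each record exactly once
    except (TypeError, ValueError):
        return
    items = payload if isinstance(payload, list) else [payload]

    # One buffer and one writer per call, reused for every item.
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="")
    for item in items:
        if not isinstance(item, dict):
            continue  # skip anything that is not a JSON object
        buf.seek(0)
        buf.truncate(0)
        # Missing fields come back as None, which csv writes as an empty cell.
        writer.writerow([item.get(name) for name in fields])
        yield buf.getvalue()
```

Because the function is a generator, the caller can forward each line downstream as soon as it is produced rather than holding a whole batch in memory. Whether this is the actual bottleneck in the job discussed here is not something the thread establishes; profiling the real flatmap would be needed to confirm it.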


