Posted to issues@flink.apache.org by "Gyula Fora (Jira)" <ji...@apache.org> on 2024/03/19 07:42:00 UTC
[jira] [Commented] (FLINK-34726) Flink Kubernetes Operator has some room for optimizing performance.
[ https://issues.apache.org/jira/browse/FLINK-34726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828210#comment-17828210 ]
Gyula Fora commented on FLINK-34726:
------------------------------------
Thanks for the detailed analysis [~Fei Feng]. You are completely right that we don't optimise the REST client usage and that may add a significant overhead. We have done a similar optimisation in the past for config access/generation by using the FlinkResourceContext class.
We could probably move the REST client generation logic there instead of hiding it completely under the FlinkService. This will, however, be a bigger change, as it will also affect the methods of the FlinkService interface.
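To make the idea concrete, here is a minimal sketch of a context that creates a client lazily and reuses it for the rest of one reconcile pass. All names here (DemoRestClient, DemoResourceContext, getRestClient) are hypothetical stand-ins and not the operator's actual classes or signatures; the sketch also assumes a context is only used by one reconcile thread at a time.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a REST client that is expensive to create. */
final class DemoRestClient implements AutoCloseable {
    private boolean closed;

    String listJobs() {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        return "jobs";
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical per-reconcile context that owns at most one client. */
final class DemoResourceContext implements AutoCloseable {
    private final Supplier<DemoRestClient> clientFactory;
    private DemoRestClient client; // created on first use only

    DemoResourceContext(Supplier<DemoRestClient> clientFactory) {
        this.clientFactory = clientFactory;
    }

    /** Returns the cached client, creating it on the first call. */
    DemoRestClient getRestClient() {
        if (client == null) {
            client = clientFactory.get();
        }
        return client;
    }

    /** Closes the client once, when the reconcile pass is over. */
    @Override
    public void close() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}
```

With something like this, the service methods would take the client from the context rather than building their own, which is why the FlinkService interface would have to change.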
It sounds a bit strange that getSecondaryResource is so expensive, as that should be served from a cache. We should look into why it's expensive in the first place, because passing the FlinkDeployment objects around will make the code a bit more complicated, but I guess that could also be hidden under the FlinkSessionJobContext.
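As a rough illustration of hiding the lookup under the context, the sketch below memoizes the result of the first lookup so that later calls within the same reconcile loop do not repeat it. DemoSessionJobContext and getSessionCluster are hypothetical names, the Supplier merely stands in for the getSecondaryResource call, and the sketch assumes a fresh context per reconcile loop so a cached value is never reused across loops.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical context that resolves the session cluster at most once. */
final class DemoSessionJobContext<T> {
    private final Supplier<Optional<T>> lookup; // stand-in for getSecondaryResource
    private boolean resolved;
    private Optional<T> cached = Optional.empty();

    DemoSessionJobContext(Supplier<Optional<T>> lookup) {
        this.lookup = lookup;
    }

    /** Runs the lookup on the first call and returns the stored result afterwards. */
    Optional<T> getSessionCluster() {
        if (!resolved) {
            cached = lookup.get();
            resolved = true;
        }
        return cached;
    }
}
```

An empty result is remembered as well, so a missing session cluster does not trigger repeated lookups within the same loop.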
> Flink Kubernetes Operator has some room for optimizing performance.
> -------------------------------------------------------------------
>
> Key: FLINK-34726
> URL: https://issues.apache.org/jira/browse/FLINK-34726
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.5.0, kubernetes-operator-1.6.0, kubernetes-operator-1.7.0
> Reporter: Fei Feng
> Priority: Major
> Attachments: operator_no_submit_no_kill.flamegraph.html
>
>
> When there is a huge number of FlinkDeployment and FlinkSessionJob resources in a Kubernetes cluster, there is a significant delay between an event being submitted to the reconcile thread pool and the event being processed.
> This is our test: we gave the operator ample resources (CPU: 10 cores, memory: 20g, reconcile thread pool size: 200) and first deployed 10000 jobs (one FlinkDeployment and one FlinkSessionJob per job), then ran submit/delete job tests. We found that:
> 1. It takes about 2 min from creating new FlinkDeployment and FlinkSessionJob CRs in k8s until the Flink job is submitted to the JobManager.
> 2. It takes about 1 min from deleting a FlinkDeployment and FlinkSessionJob CR until the Flink job and session cluster are cleared.
>
> I used async-profiler to capture a flamegraph while a huge number of FlinkDeployment and FlinkSessionJob resources were present. I found two obvious areas for optimization:
> 1. For FlinkDeployment: in the observe step, we call AbstractFlinkService.getClusterInfo/listJobs/getTaskManagerInfo. Every time we call one of these methods we need to create a RestClusterClient, send the requests, and close it. I think we should reuse the RestClusterClient as much as possible to avoid frequently creating objects and so reduce GC pressure.
> 2. For FlinkSessionJob (this issue is more obvious): in the whole reconcile loop, we call getSecondaryResource 5 times to get the FlinkDeployment resource info. Based on my current understanding of the Flink Operator, I think we do not need to call it 5 times in a single reconcile loop; calling it once is enough. If so, we could save 30% CPU usage (every getSecondaryResource call costs 6% CPU usage).
> [^operator_no_submit_no_kill.flamegraph.html]
> I hope we can discuss solutions to address this problem together. I'm very willing to optimize and resolve this issue.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)