Posted to issues@flink.apache.org by "Fei Feng (Jira)" <ji...@apache.org> on 2023/04/28 05:21:00 UTC
[jira] [Comment Edited] (FLINK-24947) Support host network for native K8s integration

    [ https://issues.apache.org/jira/browse/FLINK-24947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717462#comment-17717462 ] 

Fei Feng edited comment on FLINK-24947 at 4/28/23 5:20 AM:
-----------------------------------------------------------

Hi [Yang Wang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=wangyang0918], I am studying how Flink communicates on K8s and have a question. I don't quite understand this comment of yours:

"{+}I believe the TaskManager pods could not be reused with host network enabled when JobManager failover. Because we are using the headless service for internal service.{+} "

 

I don't understand this reasoning. As I see it, when a Flink cluster is deployed with host network enabled and in non-HA mode, the TaskManager communicates with the JobManager via the internal service. If the JobManager crashes and a new one is started, the TaskManager should be able to retrieve the new address and port from the internal service and then reconnect to the new JobManager, regardless of whether the internal service is of ClusterIP type or headless type, because the service name does not change; only the endpoint address changes. So why do you think the reason is "using the headless service"?
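
To make the assumption behind my question concrete, here is a minimal sketch of the behavior I have in mind. It is not Flink code: `ServiceRegistry`, `TaskManagerClient`, the service name, and the addresses are all hypothetical stand-ins, and the registry only imitates a stable service name whose endpoint changes after a JobManager restart.

```python
# Hypothetical stand-in for cluster DNS / service discovery (not Flink code).
class ServiceRegistry:
    def __init__(self):
        self._endpoints = {}

    def publish(self, service_name, host, port):
        """Point the stable service name at the current JobManager endpoint."""
        self._endpoints[service_name] = (host, port)

    def resolve(self, service_name):
        """Return the endpoint currently registered under the service name."""
        return self._endpoints[service_name]


class TaskManagerClient:
    """Keeps only the service name and re-resolves it on every (re)connect."""

    def __init__(self, registry, service_name):
        self._registry = registry
        self._service_name = service_name
        self.connected_to = None

    def connect(self):
        self.connected_to = self._registry.resolve(self._service_name)
        return self.connected_to


registry = ServiceRegistry()
registry.publish("my-flink-cluster", "10.0.0.5", 6123)  # assumed initial endpoint
tm = TaskManagerClient(registry, "my-flink-cluster")
first = tm.connect()

# The JobManager crashes; its replacement comes up with another address and port.
registry.publish("my-flink-cluster", "10.0.0.9", 34567)
second = tm.connect()
print(first, "->", second)
```

The client never caches the resolved address, so in this sketch the reconnect succeeds simply because the name stays the same. Whether the real TaskManager can do the equivalent with host network enabled is exactly what I am asking.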

 

My current understanding is that you may have meant: Flink is able to reuse the TaskManager pods, but we don't make use of that, either because it is not necessary or because the standalone HA service should not update the leader address. I'm not sure whether my understanding is correct; could you please help clarify?


was (Author: fei feng):
Hi [Yang Wang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=wangyang0918], I am studying how Flink communicates on K8s and have a question. I don't quite understand this comment of yours:

"{+}I believe the TaskManager pods could not be reused with host network enabled when JobManager failover. Because we are using the headless service for internal service.{+} "

 

I don't understand this reasoning. As I see it, when a Flink cluster is deployed with host network enabled and in non-HA mode, the TaskManager communicates with the JobManager via the internal service. If the JobManager crashes and a new one is started, the TaskManager should be able to retrieve the new address and port from the internal service, regardless of whether the internal service is of ClusterIP type or headless type, because the service name does not change; only the endpoint address changes. So why do you think the reason is "using the headless service"?

 

My current understanding is that you may have meant: Flink is able to reuse the TaskManager pods, but we don't make use of that, either because it is not necessary or because the standalone HA service should not update the leader address. I'm not sure whether my understanding is correct; could you please help clarify?

> Support host network for native K8s integration
> -----------------------------------------------
>
>                 Key: FLINK-24947
>                 URL: https://issues.apache.org/jira/browse/FLINK-24947
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / Kubernetes
>            Reporter: liuzhuo
>            Assignee: liuzhuo
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
>
> For Flink on K8s, the choice of CNI plug-in matters for performance. We usually deal with two kinds of environments: managed and unmanaged.
>   Managed: Cloud vendors usually provide very efficient CNI plug-ins, so we don’t need to worry about network performance.
>   Unmanaged: On self-built K8s clusters, the CNI plug-in is usually a matter of choice, e.g. Flannel or Calico, but such software-based networking usually loses some performance or requires some additional network strategies.
> For an unmanaged environment, if we still want the best network performance, should we support the *hostNetwork* mode?
> Use the host network to achieve the best performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)