Posted to issues@flink.apache.org by "Canbin Zheng (Jira)" <ji...@apache.org> on 2020/05/11 03:49:00 UTC
[jira] [Comment Edited] (FLINK-17598) Implement FileSystemHAServices for native K8s setups

    [ https://issues.apache.org/jira/browse/FLINK-17598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104040#comment-17104040 ] 

Canbin Zheng edited comment on FLINK-17598 at 5/11/20, 3:48 AM:
----------------------------------------------------------------

[~fly_in_gis] Not exactly. A *StatefulSet* can ensure at-most-one semantics for each individual pod. A *StatefulSet* gives strong guarantees about the existence of its pods. Each pod is uniquely identified by an ordinal index, and the pods are guaranteed to be spun up in ascending order of their index and taken down in descending order.
Here is one of the most important features of a *StatefulSet* regarding failover: it guarantees that there will never be more than one instance of a given pod at any time, which is different from a Deployment.
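
To make the ordering and identity guarantees above concrete, here is a minimal Python sketch. It is a toy model, not Kubernetes client code: the StatefulSet name {{flink-jobmanager}} and the helper names are assumptions made for illustration, and the ordering shown is the default ordered behaviour described above.

```python
from typing import List


def pod_names(statefulset_name: str, replicas: int) -> List[str]:
    """Stable pod identities of the form <name>-<ordinal>, ordinals 0..replicas-1."""
    return [f"{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


def startup_order(statefulset_name: str, replicas: int) -> List[str]:
    """Pods are started in ascending order of their ordinal index."""
    return pod_names(statefulset_name, replicas)


def shutdown_order(statefulset_name: str, replicas: int) -> List[str]:
    """Pods are taken down in descending order of their ordinal index."""
    return list(reversed(pod_names(statefulset_name, replicas)))


class IdentityRegistry:
    """Toy model of at-most-one: an identity cannot be started twice concurrently."""

    def __init__(self) -> None:
        self._running = set()

    def start(self, name: str) -> None:
        # A replacement pod may only start after the old one with the same
        # identity has been stopped.
        if name in self._running:
            raise RuntimeError(f"{name} is already running")
        self._running.add(name)

    def stop(self, name: str) -> None:
        self._running.discard(name)
```

With a single replica, the only identity is {{flink-jobmanager-0}}, so at most one JobManager pod exists at a time in this model.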


was (Author: felixzheng):
[~fly_in_gis] Not exactly. The *StatefulSet* could ensure at-most-one semantics for each individual pod. *Statefulsets* have strong guarantees regarding the existence of the pods. Pods are uniquely identified with an ordinal index. These pods are guaranteed to be spun up in ascending order of their index and taken down in descending order. 
Here is one of the most important features of *StatefulSet* regarding failover:  *StatefulSets* have a guarantee that there will never be more than 1 instance of a pod at any given time, which is different from a deployment.

> Implement FileSystemHAServices for native K8s setups
> ----------------------------------------------------
>
>                 Key: FLINK-17598
>                 URL: https://issues.apache.org/jira/browse/FLINK-17598
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / Kubernetes, Runtime / Coordination
>            Reporter: Canbin Zheng
>            Priority: Major
>
> At the moment we use ZooKeeper as a distributed coordinator for implementing JobManager high availability services. But in cloud-native environments, more and more users prefer *Kubernetes* as the underlying scheduler backend and *object storage* as the storage medium, and neither of these services requires a ZooKeeper deployment.
> As a result, in K8s setups, people have to deploy and maintain their own ZooKeeper clusters just to solve the JobManager SPOF problem. This ticket proposes a simplified FileSystem HA implementation with leader election removed, which saves the effort of deploying ZooKeeper.
> To achieve this, we plan to:
> # Introduce a {{FileSystemHaServices}} which implements the {{HighAvailabilityServices}}.
> # Replace Deployment with StatefulSet to ensure *at most one* semantics, preventing potential concurrent access to the underlying FileSystem.
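
The idea in the plan above (HA metadata kept on a file system, with no leader election because at most one JobManager runs) can be sketched roughly as follows. This is an illustrative Python toy under stated assumptions, not Flink's {{HighAvailabilityServices}} interface: the class name {{FileSystemHaStore}}, its methods, and the key {{job-graph}} are all hypothetical, and a local directory stands in for the shared file system.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


class FileSystemHaStore:
    """Hypothetical HA metadata store backed by a directory.

    Assumes the deployment guarantees at most one JobManager, so the single
    running instance always treats itself as the leader.
    """

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def has_leadership(self) -> bool:
        # No election: leadership is implied by the at-most-one guarantee.
        return True

    def put(self, key: str, data: bytes) -> None:
        # Write to a temporary file first, then rename it into place, so a
        # crash mid-write never leaves a partially written entry behind.
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp_path, self.root / key)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise

    def get(self, key: str) -> Optional[bytes]:
        path = self.root / key
        return path.read_bytes() if path.exists() else None
```

A restarted JobManager would open the same directory and read back what its predecessor wrote. The write-then-rename step relies on local file system semantics and may not carry over to object stores.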



--
This message was sent by Atlassian Jira
(v8.3.4#803005)