Posted to issues@flink.apache.org by "Aitozi (Jira)" <ji...@apache.org> on 2022/05/04 06:02:00 UTC

[jira] [Commented] (FLINK-25865) Support to set restart policy of TaskManager pod for native K8s integration

    [ https://issues.apache.org/jira/browse/FLINK-25865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531499#comment-17531499 ] 

Aitozi commented on FLINK-25865:
--------------------------------

Hi [~wangyang0918], are you working on this now? If not, I would like to work on it.

> Support to set restart policy of TaskManager pod for native K8s integration
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-25865
>                 URL: https://issues.apache.org/jira/browse/FLINK-25865
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Yang Wang
>            Priority: Major
>
> After FLIP-201, Flink's TaskManagers will be able to restart without losing their local state, so it is reasonable to make the restart policy[1] of the TaskManager pod configurable.
> The current restart policy is {{Never}}: Flink always deletes a failed TaskManager pod directly and creates a new one instead. Making the policy configurable could decrease the recovery time after a TaskManager failure.
>  
> Please note that the working directory needs to be located in an emptyDir[2] volume, which is retained across container restarts.
>  
> [1]. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy
> [2]. https://kubernetes.io/docs/concepts/storage/volumes/#emptydir



--
This message was sent by Atlassian Jira
(v8.20.7#820007)