Posted to dev@flink.apache.org by "Yang Wang (Jira)" <ji...@apache.org> on 2020/04/14 05:05:00 UTC
[jira] [Created] (FLINK-17127) Make pod creating retry interval configurable
Yang Wang created FLINK-17127:
---------------------------------
Summary: Make pod creating retry interval configurable
Key: FLINK-17127
URL: https://issues.apache.org/jira/browse/FLINK-17127
Project: Flink
Issue Type: New Feature
Components: Deployment / Kubernetes
Reporter: Yang Wang
This follows up on the discussion in this PR [1].
In the current implementation, {{POD_CREATION_RETRY_INTERVAL}} is fixed at 3 seconds, so when creating a TaskManager pod fails, a retry is scheduled after a 3s delay. This works for most cases. However, there is still a risk that too many retries from different Flink clusters will flood the Kubernetes API server. We should therefore add configurable initial and maximum settings for the retry interval, similar to {{NETWORK_REQUEST_BACKOFF_INITIAL/NETWORK_REQUEST_BACKOFF_MAX}}.
[1]. https://github.com/apache/flink/pull/11427#discussion_r406318451
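To illustrate the proposal, here is a minimal sketch of a retry delay that starts at an initial value and is capped at a maximum. The class name {{PodCreationRetryBackoff}}, the method {{delayFor}}, and the doubling strategy are all assumptions made for illustration. The issue only asks for an initial and a max setting, and this is not Flink's actual implementation.

```java
import java.time.Duration;

/** Hypothetical sketch of a capped exponential backoff; not Flink's actual code. */
final class PodCreationRetryBackoff {
    private final long initialMillis;
    private final long maxMillis;

    PodCreationRetryBackoff(Duration initial, Duration max) {
        if (initial.isNegative() || initial.isZero() || max.compareTo(initial) < 0) {
            throw new IllegalArgumentException("require 0 < initial <= max");
        }
        this.initialMillis = initial.toMillis();
        this.maxMillis = max.toMillis();
    }

    /**
     * Delay before the given retry attempt (0-based).
     * Assumed strategy: initial * 2^attempt, capped at max.
     */
    Duration delayFor(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        long delay = initialMillis;
        // Stop doubling once the cap is reached; comparing against max / 2
        // avoids overflow when doubling.
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return Duration.ofMillis(Math.min(delay, maxMillis));
    }
}
```

With an initial interval of 3s (the current fixed value) and an assumed maximum of 60s, successive retries would wait 3s, 6s, 12s, 24s, 48s, and then 60s for every later attempt. A fixed interval remains expressible by setting the initial and max values equal.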
--
This message was sent by Atlassian Jira
(v8.3.4#803005)