You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/27 14:40:51 UTC

[GitHub] [airflow] hterik opened a new pull request, #26710: Add config to control Kubernetes Client retry behaviour

hterik opened a new pull request, #26710:
URL: https://github.com/apache/airflow/pull/26710

   Occasionally, a request to the Kubernetes API might fail due to temporary network glitches. By default, such requests are retried 3 times, without any delay between.
   On the final failure, the entire scheduler crashes.
   
   This configuration allows the urllib retry behaviour to be adjusted, mainly to allow some backoff in between each retry, giving the network time to recover before the final attempt.
   
   Fixes #24748


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dinedal commented on pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by "dinedal (via GitHub)" <gi...@apache.org>.
dinedal commented on PR #26710:
URL: https://github.com/apache/airflow/pull/26710#issuecomment-1583047467

   Hi, this is a problem for us in production still, can we make forward progress here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] closed pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #26710: Add config to control Kubernetes Client retry behaviour
URL: https://github.com/apache/airflow/pull/26710


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #26710:
URL: https://github.com/apache/airflow/pull/26710#issuecomment-1368130455

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hterik commented on a diff in pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by GitBox <gi...@apache.org>.
hterik commented on code in PR #26710:
URL: https://github.com/apache/airflow/pull/26710#discussion_r981342716


##########
airflow/kubernetes/kube_client.py:
##########
@@ -19,25 +19,18 @@
 
 import logging
 
+import urllib3
+
 from airflow.configuration import conf
 
 log = logging.getLogger(__name__)
 
 try:
     from kubernetes import client, config
-    from kubernetes.client import Configuration
+    from kubernetes.client import ApiClient, Configuration
     from kubernetes.client.rest import ApiException
 
     has_kubernetes = True
-
-    def _disable_verify_ssl() -> None:
-        if hasattr(Configuration, 'get_default_copy'):
-            configuration = Configuration.get_default_copy()
-        else:
-            configuration = Configuration()
-        configuration.verify_ssl = False
-        Configuration.set_default(configuration)

Review Comment:
   Is it ok to remove this `set_default` and only rely on every other code path going through `get_kube_client`?
   
   I see in `pod_generator` and `TaskInstance` creates `ApiClient` in many places, but only uses it for offline operations.
   
   It's also created by `hooks.kubernetes`, haven't looked into what that does yet.
   
   -----
   
   Maybe it's safer to keep it this way and incorporate the new config using this same method?



##########
airflow/kubernetes/kube_client.py:
##########
@@ -104,16 +97,25 @@ def get_kube_client(
     if conf.getboolean('kubernetes', 'enable_tcp_keepalive'):
         _enable_tcp_keepalive()
 
+    client_config = Configuration.get_default_copy()
+
+    retryparams = conf.getjson('kubernetes', 'client_retry_configuration_kwargs', fallback={})
+    if retryparams != {}:
+        client_config.retries = urllib3.util.Retry(**retryparams)

Review Comment:
   Is this level of configuration granularity good? Or is it enough to only expose the backoff and number?
   I could even go as far as saying some kind of backoff should be enabled by default, without configuration. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hterik commented on pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by GitBox <gi...@apache.org>.
hterik commented on PR #26710:
URL: https://github.com/apache/airflow/pull/26710#issuecomment-1313158050

   stale ping


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #26710: Add config to control Kubernetes Client retry behaviour

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #26710:
URL: https://github.com/apache/airflow/pull/26710#issuecomment-1312279239

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add config to control Kubernetes Client retry behaviour [airflow]

Posted by "hterik (via GitHub)" <gi...@apache.org>.
hterik commented on PR #26710:
URL: https://github.com/apache/airflow/pull/26710#issuecomment-1748064603

   > Hi, this is a problem for us in production still, can we make forward progress here?
   
   It was fixed in https://github.com/apache/airflow/pull/29809 instead


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org