Posted to commits@airflow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/07/18 22:03:00 UTC

[jira] [Commented] (AIRFLOW-4730) Startup-timeout for launching pods on k8s executor/operator should be configurable

    [ https://issues.apache.org/jira/browse/AIRFLOW-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888367#comment-16888367 ] 

ASF GitHub Bot commented on AIRFLOW-4730:
-----------------------------------------

leahecole commented on pull request #5608: [AIRFLOW-4730]: WIP: Add a failing test to test_pod_launcher
URL: https://github.com/apache/airflow/pull/5608
 
 
   WIP - DO NOT MERGE, SEEKING FEEDBACK
   
   This PR is a super duper work in progress, hence the lack of squashed commits and the terrible variable names. As part of a live coding session at OSCON where we demonstrated pair programming and TDD, @holdenk and I looked at issue [4730](https://issues.apache.org/jira/browse/AIRFLOW-4730) and wrote a failing test that captures what we think a potential solution should do. 
   
   Before we (or someone else) implement a solution, we'd love feedback on:
   
   * Whether this is how we should be testing this
   * More helpful variable names
   * Any other changes you may notice
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Startup-timeout for launching pods on k8s executor/operator should be configurable
> ----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4730
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4730
>             Project: Apache Airflow
>          Issue Type: Task
>          Components: executors
>    Affects Versions: 1.10.3
>            Reporter: Daniel Imberman
>            Priority: Minor
>              Labels: beginner, kubernetes, starter
>         Attachments: Screen Shot 2019-06-04 at 9.04.31 AM.png
>
>
> !Screen Shot 2019-06-04 at 9.04.31 AM.png!
> Currently, users who set affinities for their DAGs get failures when k8s fails to schedule a pod due to a lack of available nodes.
> This appears to be because the k8s executor uses run_pod_async, which attempts the launch once and then fails on any failure from the API. We could probably add logic to read the API section for affinity failures.
> https://github.com/apache/airflow/blob/05c06b0f6669f677495328c68c2bd05f6d0e69db/airflow/kubernetes/pod_launcher.py#L59
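As a rough illustration of the behaviour the issue title asks for, the sketch below polls a pending pod until a caller-supplied startup timeout expires, rather than giving up after a single attempt. It is not Airflow's pod_launcher code: the names wait_for_pod_start, PodStartupTimeout, is_pending, startup_timeout, and poll_interval are all hypothetical, and the clock and sleep functions are injectable only so that the logic can be tested without real waiting.

```python
import time
from typing import Callable


class PodStartupTimeout(Exception):
    """Hypothetical error: the pod was still pending when the timeout expired."""


def wait_for_pod_start(
    is_pending: Callable[[], bool],
    startup_timeout: float = 120.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until the pod leaves the pending state or the timeout expires.

    is_pending is an assumed callback that returns True while the pod has not
    been scheduled yet (for example, because no node satisfies its affinity).
    Returns the number of seconds spent waiting.
    """
    start = clock()
    while is_pending():
        elapsed = clock() - start
        if elapsed >= startup_timeout:
            raise PodStartupTimeout(
                f"pod still pending after {elapsed:.0f}s "
                f"(startup_timeout={startup_timeout:.0f}s)"
            )
        # Wait before asking the API again instead of failing immediately.
        sleep(poll_interval)
    return clock() - start
```

With a configurable startup_timeout, a user whose pods wait on scarce nodes could raise the limit, while the default keeps a bounded wait for everyone else.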


