Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/30 19:06:29 UTC

[GitHub] [airflow] andormarkus commented on pull request #16571: Implemented Basic EKS Integration

andormarkus commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-890095899


   Hi @ferruzzi,
   
   I would like to describe a specific problem we are running into when we create EKS managed node groups with boto3.
   
   We have 3 environments in 3 separate AWS accounts. By AWS design, AZs are randomly assigned per account. If an instance type is available in AZ 1 on account A, it might not be available in AZ 1 on account B. We are running into this issue:
   ```bash
   Your requested instance type (m5ad.4xlarge) is not supported in your requested Availability Zone (eu-central-1c). Please retry your request by not specifying an Availability Zone or choosing eu-central-1a, eu-central-1b.
   ```
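
One way to see which zones actually offer the instance type in a given account is to query EC2 before creating the node group. This is only a rough sketch: the function name `zones_offering_instance_type` is hypothetical, and `ec2_client` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2", region_name="eu-central-1")`) created with that account's credentials.

```python
def zones_offering_instance_type(ec2_client, instance_type):
    """Return the sorted AZ names (in the client's region) that offer instance_type.

    ec2_client is assumed to be a boto3 EC2 client; the helper name is hypothetical.
    """
    zones = set()
    request = {
        "LocationType": "availability-zone",
        "Filters": [{"Name": "instance-type", "Values": [instance_type]}],
    }
    while True:
        response = ec2_client.describe_instance_type_offerings(**request)
        for offering in response["InstanceTypeOfferings"]:
            zones.add(offering["Location"])
        # Follow pagination until EC2 stops returning a token.
        token = response.get("NextToken")
        if not token:
            break
        request["NextToken"] = token
    return sorted(zones)
```

The result could then be used to pick only subnets in supported zones for each account, instead of relying on the same AZ names everywhere.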
   
   In this case, node group creation fails with a `CREATE_FAILED` status, but the failed node group still exists on EKS. When the Airflow retry runs, the second attempt fails with the following error:
   ```bash
   botocore.errorfactory.ResourceInUseException: An error occurred (ResourceInUseException) when calling the CreateNodegroup operation: NodeGroup already exists with name [my_node] and cluster name [my_cluster]
   ```
   
   It would be great if this integration had an option so that, when a create job fails with `CREATE_FAILED`, it deletes the failed node group.
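
The requested behaviour could look roughly like the sketch below. It is not code from this PR: the function name `create_nodegroup_with_cleanup` is hypothetical, and `eks_client` is assumed to be a boto3 EKS client (for example `boto3.client("eks")`). It only removes a leftover node group when its status is `CREATE_FAILED`, and refuses to touch one in any other state.

```python
def create_nodegroup_with_cleanup(eks_client, cluster_name, nodegroup_name, **create_kwargs):
    """Create a node group, first deleting a leftover one stuck in CREATE_FAILED.

    eks_client is assumed to be a boto3 EKS client; create_kwargs are passed
    through unchanged to create_nodegroup (subnets, nodeRole, instanceTypes, ...).
    """
    try:
        existing = eks_client.describe_nodegroup(
            clusterName=cluster_name, nodegroupName=nodegroup_name
        )["nodegroup"]
    except eks_client.exceptions.ResourceNotFoundException:
        existing = None  # Nothing left over from an earlier attempt.

    if existing is not None:
        if existing["status"] != "CREATE_FAILED":
            # Do not delete a node group that may be healthy or still in progress.
            raise RuntimeError(
                f"Node group {nodegroup_name} already exists with status {existing['status']}"
            )
        eks_client.delete_nodegroup(
            clusterName=cluster_name, nodegroupName=nodegroup_name
        )
        # Deletion is asynchronous, so wait until the name is free again.
        eks_client.get_waiter("nodegroup_deleted").wait(
            clusterName=cluster_name, nodegroupName=nodegroup_name
        )

    return eks_client.create_nodegroup(
        clusterName=cluster_name, nodegroupName=nodegroup_name, **create_kwargs
    )
```

With something like this in the create path, a retry would no longer hit `ResourceInUseException` because of the node group left behind by the failed first attempt.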
    
   I hope this was clear; if not, feel free to ask any questions.
   
   Thanks,
   Andor
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org