Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/21 18:56:56 UTC

[GitHub] [airflow] ferruzzi opened a new pull request #16571: Implemented Basic EKS Integration

ferruzzi opened a new pull request #16571:
URL: https://github.com/apache/airflow/pull/16571


   Added hooks and operators for initial EKS support.
   
   Includes hooks and operators for the following EKS endpoints, along with tests and documentation:
    - Create Cluster
    - Create Nodegroup
    - Delete Cluster
    - Delete Nodegroup
    - Describe Cluster
    - Describe Nodegroup
    - List Clusters
    - List Nodegroups
    
    Increases the minimum required version of the `moto` package from 2.0.8 to 2.0.10.
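   
    As a rough illustration of the new surface area, a task could drive these hooks like so (a sketch only; the role ARN and VPC config values are placeholders, not working resources):
   
        from airflow.providers.amazon.aws.hooks.eks import EKSHook

        hook = EKSHook(aws_conn_id='aws_default')
        hook.create_cluster(
            name='demo-cluster',
            roleArn='arn:aws:iam::123456789012:role/demo-eks-role',  # placeholder
            resourcesVpcConfig={'subnetIds': ['subnet-12345', 'subnet-67890']},  # placeholder
        )
        print(hook.list_clusters())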
   
   closes: #8544
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272197



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "

Review comment:
       AWS CLI dependency removed in https://github.com/apache/airflow/pull/16571/commits/a6ee2c369227212e12b8d5320c91fd63a6dc769d







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669060626



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
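
       For readers following along: the linked commit moves the file handling toward a context manager so the temporary kubeconfig is cleaned up automatically. A minimal sketch of that pattern (illustrative, not the exact commit):

           import tempfile
           from contextlib import contextmanager

           @contextmanager
           def generate_config_file(config_text: str):
               # The temp file only lives as long as the with-block that
               # consumes it, avoiding the delete=False cleanup problem.
               with tempfile.NamedTemporaryFile(mode='w', prefix='kube_config_') as config_file:
                   config_file.write(config_text)
                   config_file.flush()
                   yield config_file.name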


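       Side note: `_get_bearer_token` is not shown in this hunk. The standard EKS recipe it refers to presigns an STS `GetCallerIdentity` call and base64-encodes the URL with a `k8s-aws-v1.` prefix, roughly as below (a sketch assuming a boto3 `Session`; the real implementation may differ in details):

           import base64

           from botocore.signers import RequestSigner

           def _get_bearer_token(session, cluster_id: str, aws_region: str) -> str:
               # Presign an STS GetCallerIdentity request carrying the
               # cluster name in the x-k8s-aws-id header.
               signer = RequestSigner(
                   session.client('sts').meta.service_model.service_id,
                   aws_region, 'sts', 'v4', session.get_credentials(), session.events,
               )
               request_params = {
                   'method': 'GET',
                   'url': f'https://sts.{aws_region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
                   'body': {},
                   'headers': {'x-k8s-aws-id': cluster_id},
                   'context': {},
               }
               signed_url = signer.generate_presigned_url(
                   request_params, region_name=aws_region, expires_in=60, operation_name=''
               )
               return 'k8s-aws-v1.' + base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8').rstrip('=')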





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664171804



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       Moved into the hook module in a coming revision.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r686265077



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,420 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from contextlib import contextmanager
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import yaml
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.create_cluster(
+            name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+        )
+
+        self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+        # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+        # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+        # The 'shared' value allows more than one resource to use the subnet.
+        tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+        if "tags" in kwargs:
+            tags = {**tags, **kwargs["tags"]}
+            kwargs.pop("tags")
+
+        response = eks_client.create_nodegroup(
+            clusterName=clusterName,
+            nodegroupName=nodegroupName,
+            subnets=subnets,
+            nodeRole=nodeRole,
+            tags=tags,
+            **kwargs,
+        )
+
+        self.log.info(
+            "Created a managed nodegroup named %s in cluster %s",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_cluster(name=name)
+
+        self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+
+        self.log.info(
+            "Deleted nodegroup named %s from cluster %s.",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:

Review comment:
       Found a few other places where I think I misused `Optional`; I'll be dropping a commit in here with those corrections later today or tomorrow.
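
       (For context, `Optional[X]` means "may also be `None`", not "has a default value", so a flag that never accepts `None` should be a plain `bool`:)

           from typing import Optional

           # Misleading: advertises that callers may pass verbose=None.
           def describe(name: str, verbose: Optional[bool] = False) -> dict: ...

           # Intended: a plain bool with a default value.
           def describe(name: str, verbose: bool = False) -> dict: ...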







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660446997



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "

Review comment:
       Your test doesn't check for v2 specifically -- the pip-installable v1 of awscli would also pass this check.
   
   Do we _need_ v2 here?
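   
       (If v2 really were required, `which` alone cannot distinguish the major versions; the check would have to inspect the `aws --version` banner, along these lines:)
   
           import shutil
           import subprocess
   
           def aws_cli_v2_installed() -> bool:
               # v1 prints its version banner to stderr, v2 to stdout, so check both.
               if shutil.which('aws') is None:
                   return False
               result = subprocess.run(['aws', '--version'], capture_output=True, text=True, check=False)
               return (result.stdout + result.stderr).startswith('aws-cli/2.')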
   
   







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660456184



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       Same here.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272107



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       AWS CLI dependency removed in https://github.com/apache/airflow/pull/16571/commits/a6ee2c369227212e12b8d5320c91fd63a6dc769d




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656034542



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return:  A JSON serialized string of the API call results.

Review comment:
       And a follow-up question: why does `list_clusters` not follow this pattern? (It returns a dict.)
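
       (To make the mismatch concrete: the docstring above promises a JSON-serialized string, while `list_clusters` hands back the parsed structure unchanged:)

           import json

           response = {'clusters': ['alpha', 'beta']}  # what list_clusters effectively returns
           serialized = json.dumps(response)           # what this docstring describes
           assert isinstance(response, dict) and isinstance(serialized, str)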







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r678510136



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If something is preventing the cluster from activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       It's in the 15-20 minute range, in my experience.
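
       (An aside for anyone reworking this loop: botocore also ships EKS waiters that encapsulate the same polling, assuming the pinned botocore version includes them:)

           # Equivalent wait using boto3's built-in EKS waiter.
           waiter = eks_hook.conn.get_waiter('cluster_active')
           waiter.wait(
               name=self.cluster_name,
               WaiterConfig={'Delay': 15, 'MaxAttempts': 100},  # 15s x 100 = the same 25-minute budget
           )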







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668388339



##########
File path: airflow/providers/amazon/aws/example_dags/example_eks_pod_operation.py
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from datetime import datetime
+
+from airflow.models.dag import DAG
+from airflow.providers.amazon.aws.operators.eks import EKSPodOperator
+from airflow.utils.dates import days_ago
+
+####
+# NOTE: This example requires an existing EKS Cluster with a compute backend.
+# see: example_eks_create_cluster_with_nodegroup.py
+####
+
+CLUSTER_NAME = 'existing-cluster-with-nodegroup-ready-for-pod'
+BUCKET_SUFFIX = datetime.now().strftime("-%Y%b%d-%H%M").lower()

Review comment:
       This is already dropped in a coming revision: a previous comment asked that sample DAGs not generate external artifacts, so this DAG now just performs `cmds=["sh", "-c", "ls"]`, as sketched below.
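
       Roughly, the revised task then reduces to something like this (parameter names assumed from this PR's `EKSPodOperator`, which wraps `KubernetesPodOperator`):

           run_pod = EKSPodOperator(
               task_id='run_pod',
               cluster_name=CLUSTER_NAME,
               image='amazon/aws-cli:latest',  # placeholder image
               cmds=['sh', '-c', 'ls'],
           )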







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253140



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+The Airflow integration with Amazon Elastic Kubernetes Service (EKS) provides operators to create
+and interact with EKS clusters and their compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided to showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example DAG ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+---------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster_with_nodegroup.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and its underlying
+Amazon EKS nodegroup in one task.  ``EKSDescribeClusterOperator`` and
+``EKSDescribeNodegroupOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached

Review comment:
       ```suggestion
     `ec2.amazon.aws.com` must be in the Trusted Relationships
     `eks.amazonaws.com` must be added to the Trusted Relationships
     `AmazonEC2ContainerRegistryReadOnly` IAM Policy must be attached
     `AmazonEKSClusterPolicy` IAM Policy must be attached
     `AmazonEKSWorkerNodePolicy` IAM Policy must be attached
   ```
   To avoid spelling errors.







[GitHub] [airflow] potiuk closed pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #16571:
URL: https://github.com/apache/airflow/pull/16571


   





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660137693



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       All fields are created in the ctor. We should use an opt-out approach.







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656027992



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return:  A JSON serialized string of the API call results.

Review comment:
       Why return it as a string, not the Python object/dict? 
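
   To make the trade-off concrete, a small illustration (hypothetical response data) of what each choice costs the caller:
   ```python
   import json

   # A boto3 EKS response is already a Python dict:
   response_dict = {"cluster": {"name": "demo", "status": "ACTIVE"}}

   # If the hook JSON-serializes before returning, callers must deserialize again:
   response_str = json.dumps(response_dict)
   status = json.loads(response_str)["cluster"]["status"]

   # Returning the dict directly avoids the round trip:
   status = response_dict["cluster"]["status"]
   ```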

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,646 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation
+    for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+        with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.eks_hook = EKSHook(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        self.eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while self.eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            self.eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.eks_hook = EKSHook(**kwargs)

Review comment:
       Please don't create hooks in Operator constructors -- we generally try to avoid it, as it _can_ end up creating network requests during DAG parse time, which we want to avoid.
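
   One common way to honour this (a sketch, not the PR's eventual implementation) is to create the hook lazily, e.g. via a cached property, so nothing network-facing happens at parse time:
   ```python
   from functools import cached_property

   from airflow.models import BaseOperator
   from airflow.providers.amazon.aws.hooks.eks import EKSHook


   class ExampleEKSOperator(BaseOperator):
       """Illustrative only: the hook is built when the task runs, not when the DAG is parsed."""

       def __init__(self, cluster_name: str, aws_conn_id: str = "aws_default", **kwargs) -> None:
           super().__init__(**kwargs)
           self.cluster_name = cluster_name
           self.aws_conn_id = aws_conn_id

       @cached_property
       def hook(self) -> EKSHook:
           # Instantiated lazily, only when a task actually executes.
           return EKSHook(aws_conn_id=self.aws_conn_id)

       def execute(self, context):
           return self.hook.describe_cluster(name=self.cluster_name)
   ```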

##########
File path: airflow/providers/amazon/aws/hooks/base_aws.py
##########
@@ -347,6 +347,7 @@ def __init__(
         client_type: Optional[str] = None,
         resource_type: Optional[str] = None,
         config: Optional[Config] = None,
+        **kwargs,

Review comment:
       If we don't _do_ anything with these kwargs why do we need to accept them here?

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:

Review comment:
       Interesting conundrum here. The Python naming style says these args should be `role_arn` and `resources_vpc_config` -- but _maybe_ it makes sense to mirror the Boto/AWS API names here.
   
   What do others think, and what do we do in other AWS hooks?
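
   For comparison, a snake_case signature can keep PEP 8 naming at the hook boundary and translate to the boto3 names only at the call site (a sketch of that approach, not what the PR does):
   ```python
   from typing import Dict


   class EksHookSketch:
       """Hypothetical hook fragment; ``self.conn`` stands in for the boto3 EKS client."""

       def create_cluster(self, name: str, role_arn: str, resources_vpc_config: Dict, **kwargs) -> Dict:
           # PEP 8 names on the outside; boto3 camelCase appears only in the call itself.
           return self.conn.create_cluster(
               name=name,
               roleArn=role_arn,
               resourcesVpcConfig=resources_vpc_config,
               **kwargs,
           )
   ```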







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-877533314


   I still have to sort out the docs changes that @mik-laj asked for, but I think this batch of commits should address most/all of the other issues.
   
   I misunderstood the contributing guidelines and thought you wanted PRs to be squashed before submission. I've been advised otherwise, so I'll not do that in the future.  





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660163361



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Then I am missing what makes it ok for GKE users to have to install the gcloud cli tool but not for EKS users to have to install the aws cli tool.
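
   For context, the generated kubeconfig delegates authentication to kubectl's exec credential plugin, which is why some CLI must be present on the worker. Roughly, the user entry built from the ``cli_args`` above would look like this (illustrative values; the API version shown is an assumption):
   ```python
   import yaml

   cluster_name, region = "demo-cluster", "us-east-1"  # hypothetical values

   user_entry = {
       "name": "aws",
       "user": {
           "exec": {
               "apiVersion": "client.authentication.k8s.io/v1alpha1",
               "command": "aws",  # the AWS CLI the check above requires
               "args": ["--region", region, "eks", "get-token", "--cluster-name", cluster_name],
           }
       },
   }
   print(yaml.dump(user_entry))
   ```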







[GitHub] [airflow] ferruzzi closed pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi closed pull request #16571:
URL: https://github.com/apache/airflow/pull/16571


   





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659255470



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Here is a real-world example for usage gcloud access token to access kubernetes cluster:
   https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/using_gke_with_terraform#using-the-kubernetes-and-helm-providers







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670141477



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,452 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from contextlib import contextmanager
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+    @contextmanager
+    def generate_config_file(
+        self,
+        eks_cluster_name: str,
+        pod_namespace: str,
+        pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+        pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    ) -> str:
+        """
+        Writes the kubeconfig file given an EKS Cluster.
+
+        :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type eks_cluster_name: str
+        :param pod_namespace: The namespace to run within kubernetes.
+        :type pod_namespace: str
+        :param pod_username: The username under which to execute the pod.
+        :type pod_username: str
+        :param pod_context: The name of the context access parameters to use.
+        :type pod_context: str

Review comment:
       Nothing here has to do with a filename; it is a randomly generated name.  Those values are passed into the kubeconfig file.
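
   Usage-wise, the context-manager form quoted above would let callers scope the temporary kubeconfig to a block, something like this (a sketch assuming the signature shown; the yielded value is the generated file's path):
   ```python
   import os

   from airflow.providers.amazon.aws.hooks.eks import EKSHook

   hook = EKSHook(aws_conn_id="aws_default")  # connection id is illustrative

   with hook.generate_config_file(
       eks_cluster_name="demo-cluster", pod_namespace="default"
   ) as config_path:
       # Point Kubernetes tooling at the file only while it exists.
       os.environ["KUBECONFIG"] = config_path
   ```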







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r672566540



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Removing in a coming revision.

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/3d8cc9379ab9e7179f837f45bacda64fa6f2a44e







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660873123



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()

Review comment:
       ACK, I'll correct this in the next revision.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660126371



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates am Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Clusters in your AWS account with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(
+            maxResults=self.maxResults, nextToken=self.nextToken, verbose=self.verbose
+        )
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Nodegroups associated with the specified EKS Cluster with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param kube_config_file_path: Path to save the generated kube_config file to.
+    :type kube_config_file_path: str
+    :param in_cluster: If True, look for config inside the cluster; if False look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :param aws_profile: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # A default path will be used if none is provided.
+        kube_config_file_path: Optional[str] = os.environ.get(KUBE_CONFIG_ENV_VAR, DEFAULT_KUBE_CONFIG_PATH),

Review comment:
       As it is written, the credential file will be overwritten each pod, yes.  I will look into alternatives.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660455299



##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       ```suggestion
       'moto~=2.0,>=2.0.10',
   ```
   
   `~=2.0.10`  is the same as `>=2.0.10,<2.1` which is more restrictive than we want I think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272550



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       we should use the Hook to create a session because this will provide a uniform way to manage credential, which will allow this operator to be used in any environment i.e.  on-premiss, on GCP, on Cloud Composer and other.  Now, rhis operator can only be used in AWS environment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670140426



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name

Review comment:
       Removed both in upcoming revision.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669904718



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), token=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()
+        response = dict(cluster=cluster_name, nodegroups=nodegroup_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_nodegroup.return_value = EMPTY_NODEGROUP
+        mock_list_nodegroups.return_value = response
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        assert mock_describe_nodegroup.call_count == len(nodegroup_names)
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_no_nodegroups_exist(self, mock_describe_nodegroup, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list(), token="")
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_describe_nodegroup.assert_not_called()
+
+
+class TestEKSDescribeClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_cluster_operator = EKSDescribeClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_describe_cluster(self, mock_describe_cluster):
+        mock_describe_cluster.return_value = DESCRIBE_CLUSTER_RESULT
+
+        self.describe_cluster_operator.execute({})
+
+        mock_describe_cluster.assert_called_once_with(name=self.cluster_name, verbose=False)
+
+
+class TestEKSDescribeNodegroupOperator(unittest.TestCase):
+    def setUp(self):
+        self.cluster_name = random_names()
+        self.nodegroup_name = random_names()
+
+        self.describe_nodegroup_operator = EKSDescribeNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_describe_nodegroup(self, mock_describe_nodegroup):
+        mock_describe_nodegroup.return_value = DESCRIBE_NODEGROUP_RESULT
+
+        self.describe_nodegroup_operator.execute({})
+
+        mock_describe_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name, verbose=False
+        )
+
+
+class TestEKSListClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_names: List[str] = NAME_LIST
+
+        self.list_clusters_operator = EKSListClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    def test_list_clusters(self, mock_list_clusters):
+        mock_list_clusters.return_value = self.cluster_names
+
+        self.list_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+
+
+class TestEKSListNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Is it needed? Why can't you use a constant value?
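   
   A minimal sketch of the constant-based alternative (``CLUSTER_NAME`` here is a hypothetical name, not taken from the PR):
   
   ```python
   import unittest
   
   # Hypothetical module-level constant replacing the random_names() helper.
   CLUSTER_NAME: str = "test-cluster"
   
   
   class TestEKSListNodegroupsOperator(unittest.TestCase):
       def setUp(self) -> None:
           self.cluster_name: str = CLUSTER_NAME
   ```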




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667282634



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")

Review comment:
       We should accept a conn_id to ensure unified credential management.
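   
   A minimal sketch of that, assuming the function grows an ``aws_conn_id`` parameter and resolves its client through the existing ``EKSHook`` (the parameter name and body are assumptions, not the final API):
   
   ```python
   from typing import Dict, Optional
   
   from airflow.providers.amazon.aws.hooks.eks import EKSHook
   
   def generate_config_file(
       eks_cluster_name: str,
       eks_namespace_name: str,
       pod_name: str,
       aws_conn_id: Optional[str] = None,  # assumed parameter name
       aws_region: Optional[str] = None,
   ) -> str:
       # Resolve credentials via the Airflow connection instead of a raw boto3.Session.
       eks_client = EKSHook(aws_conn_id=aws_conn_id, region_name=aws_region).conn
       cluster: Dict = eks_client.describe_cluster(name=eks_cluster_name)
       ...
   ```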




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661849276



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Early tests with the proposed get_bearer_token (https://github.com/apache/airflow/pull/16571#discussion_r659252727) are promising.  I'm going to play with some edge cases and see how it holds up, but I like the direction it's taking.
   
   In general, are we happy with the security practices used there?  Any concerns we need to pay attention to or address with that approach?
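   
   For reference, a minimal sketch of that token flow using only ``botocore`` primitives, along the lines of the aws-cli ``get-token`` customization (the helper name, the 60-second expiry, and the URL handling are assumptions that would need hardening):
   
   ```python
   import base64
   
   import boto3
   from botocore.signers import RequestSigner
   
   def get_bearer_token(session: boto3.Session, cluster_name: str, region: str) -> str:
       """Build an EKS auth token from a presigned STS GetCallerIdentity URL."""
       service_id = session.client("sts").meta.service_model.service_id
       signer = RequestSigner(
           service_id, region, "sts", "v4", session.get_credentials(), session.events
       )
       request = {
           "method": "GET",
           "url": f"https://sts.{region}.amazonaws.com/"
           "?Action=GetCallerIdentity&Version=2011-06-15",
           "body": {},
           "headers": {"x-k8s-aws-id": cluster_name},
           "context": {},
       }
       signed_url = signer.generate_presigned_url(
           request, region_name=region, expires_in=60, operation_name=""
       )
       # EKS expects the presigned URL base64-encoded with padding stripped.
       token = base64.urlsafe_b64encode(signed_url.encode("utf-8")).decode("utf-8")
       return "k8s-aws-v1." + token.rstrip("=")
   ```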




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670163837



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name
+SUBNET_IDS: List[str] = ["subnet-12345ab", "subnet-67890cd"]
+TASK_ID = os.environ.get("TASK_ID", "test-eks-operator")
+
+
+AMI_TYPE_KEY: str = "amiType"
+AMI_TYPE_VALUE: str = "AL2_x86_64"
+
+CLIENT_REQUEST_TOKEN_KEY: str = "clientRequestToken"
+CLIENT_REQUEST_TOKEN_VALUE: str = "test_request_token"
+
+DISK_SIZE_KEY: str = "diskSize"
+DISK_SIZE_VALUE: int = 30
+
+ENCRYPTION_CONFIG_KEY: str = "encryptionConfig"
+ENCRYPTION_CONFIG_VALUE: List[Dict] = [{"resources": ["secrets"], "provider": {"keyArn": "arn:of:the:key"}}]
+
+INSTANCE_TYPES_KEY: str = "instanceTypes"
+INSTANCE_TYPES_VALUE: List[str] = ["t3.medium"]
+
+KUBERNETES_NETWORK_CONFIG_KEY: str = "kubernetesNetworkConfig"
+KUBERNETES_NETWORK_CONFIG_VALUE: Dict = {"serviceIpv4Cidr": "172.20.0.0/16"}
+
+LABELS_KEY: str = "labels"

Review comment:
       Collapsed the key/value and tuple declarations in the coming revision.
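   
   For example, a sketch of the collapsed form (values taken from this file):
   
   ```python
   from typing import List, Tuple
   
   # Each (key, value) pair collapses two module-level constants into one.
   AMI_TYPE: Tuple[str, str] = ("amiType", "AL2_x86_64")
   DISK_SIZE: Tuple[str, int] = ("diskSize", 30)
   INSTANCE_TYPES: Tuple[str, List[str]] = ("instanceTypes", ["t3.medium"])
   ```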




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253383



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       ```suggestion
    - ``example_eks_create_cluster.py``
    - ``example_eks_create_cluster_with_nodegroup.py``
    - ``example_eks_create_nodegroup.py``
    - ``example_eks_pod_operator.py``
   ```
   To avoid spell-check errors
   
   We do not try to teach specific examples in these guides, so we do not have to focus on them. On the other hand, when a user is interested, each example has a link that takes them to the source code.
   

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       All fields should be created in the ctor. We should use an opt-out approach.

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       How do we pass the credentials to the AWS CLI when Airflow is installed on an alternative cloud provider or on-premise? In the case of ``GKEStartPodOperator``, we use [the ``GoogleBaseHook.provide_authorized_gcloud`` method](https://github.com/apache/airflow/blob/2625007c8aeca9ed98dea361ba13c2622482d71f/airflow/providers/google/common/hooks/base_google.py#L483) to pass credentials from Airflow to the Cloud SDK.
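   
   An AWS analogue could look roughly like this (the name ``provide_aws_credentials`` and its behaviour are assumptions, not an existing Airflow API):
   
   ```python
   import os
   from contextlib import contextmanager
   
   @contextmanager
   def provide_aws_credentials(session):
       """Temporarily expose a boto3 session's credentials via the standard AWS env vars."""
       creds = session.get_credentials().get_frozen_credentials()
       keys = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")
       saved = {key: os.environ.get(key) for key in keys}
       try:
           os.environ["AWS_ACCESS_KEY_ID"] = creds.access_key
           os.environ["AWS_SECRET_ACCESS_KEY"] = creds.secret_key
           if creds.token:
               os.environ["AWS_SESSION_TOKEN"] = creds.token
           yield  # any subprocess (e.g. the AWS CLI) now sees these credentials
       finally:
           for key, value in saved.items():
               if value is None:
                   os.environ.pop(key, None)
               else:
                   os.environ[key] = value
   ```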

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       When defining a DAG, the user should define all tasks. Here, however, the number of tasks is not known until the first page is downloaded, so they cannot be defined up front in Airflow.
   
   In the case of Google, we've always fetched all items from all pages. This way, the user could access all elements and did not have to define dynamic DAGs.
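   
   A sketch of the fetch-everything approach with boto3's built-in paginator (assuming an ``eks_client`` is already in scope):
   
   ```python
   # Collect every cluster name across all pages before any tasks fan out,
   # so the DAG shape never depends on a pagination token.
   paginator = eks_client.get_paginator("list_clusters")
   all_clusters = []
   for page in paginator.paginate():
       all_clusters.extend(page["clusters"])
   ```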

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       >  the lesser evil was assuming that someone using an AWS service would be able to install an AWS tool.
   
   For me, this is very problematic for three reasons:
   - Passing credentials from Airflow to the AWS CLI.
   - The AWS CLI is a native application whose latest version cannot be installed via pip, so in many cases it will be very difficult to use.
   - The official Docker image doesn't have the AWS CLI preinstalled.

##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       Yes. It ignores any code literal or code block.

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       This is important because you probably used the default credentials from the CLI in Breeze, as you were logged in there before.
   
   In a production environment, these credentials will be provided by Airflow, e.g. via the Secrets backend, so the AWS CLI must be able to share the same credentials. Some Airflow users have many AWS credentials for different accounts, and we should be able to use them correctly in this operator as well. The user should not be aware in any way that a CLI is being used to get the token.
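   
   Combined with a context manager like the ``provide_aws_credentials`` sketch earlier in this thread, the CLI invocation could stay oblivious to where the credentials came from (still an assumption, not an existing API):
   
   ```python
   import subprocess
   
   # The hook/session supplies the credentials; the CLI just inherits them.
   with provide_aws_credentials(session):
       subprocess.run(
           ["aws", "eks", "get-token", "--cluster-name", eks_cluster_name],
           check=True,
       )
   ```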

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       For more inspiration, you can also look at [aws-cli/awscli/customizations/eks/get_token.py](https://github.com/aws/aws-cli/blob/develop/awscli/customizations/eks/get_token.py), [kubergrunt eks token](https://github.com/gruntwork-io/kubergrunt), and [kubernetes-sigs/aws-iam-authenticator](https://github.com/kubernetes-sigs/aws-iam-authenticator).
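   
   At their core, all three do the same trick: presign an STS `GetCallerIdentity` request that carries an extra `x-k8s-aws-id` header naming the cluster, then base64url-encode the presigned URL. A hedged sketch of that scheme (the helper name is mine, not from any of those projects):
   
   ```python
   import base64
   
   from botocore.signers import RequestSigner
   
   STS_TOKEN_EXPIRES_IN = 60
   
   
   def get_eks_bearer_token(session, cluster_name: str, region: str) -> str:
       """Build a Kubernetes bearer token from a presigned STS GetCallerIdentity URL."""
       service_id = session.client("sts").meta.service_model.service_id
       signer = RequestSigner(
           service_id,
           region,
           "sts",
           "v4",
           session.get_credentials(),
           session.events,
       )
       request_params = {
           "method": "GET",
           "url": f"https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
           "body": {},
           "headers": {"x-k8s-aws-id": cluster_name},
           "context": {},
       }
       signed_url = signer.generate_presigned_url(
           request_params,
           region_name=region,
           expires_in=STS_TOKEN_EXPIRES_IN,
           operation_name="",
       )
       # The EKS API server expects "k8s-aws-v1." + base64url(presigned URL), unpadded.
       return "k8s-aws-v1." + base64.urlsafe_b64encode(signed_url.encode("utf-8")).decode("utf-8").rstrip("=")
   ```
   
   This appears to be the same approach the `_get_bearer_token` helper added later in this PR takes, via `botocore.signers.RequestSigner`.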

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       No, `gcloud` is not preinstalled. We have a guide that explains how to add `gcloud` to a Docker image. See: http://airflow.apache.org/docs/docker-stack/recipes.html#google-cloud-sdk-installation

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       The KubernetesEngine operator requires an update. It was added when I didn't know a better way to authenticate; it should use a Google access token for authentication: https://github.com/apache/airflow/pull/16571#discussion_r659255470
   
   If you want, you can keep a dependency on the AWS CLI (though I'd prefer not to), but you definitely should pass credentials from Airflow to the AWS CLI, and that is the main issue. Right now, this operator cannot be used on Cloud Composer, GKE, Astronomer, or many other environments that do not have default AWS credentials configured for the AWS CLI.
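   
   A hedged sketch of the change I mean, replacing the `boto3.Session(region_name=..., profile_name=...)` call with a session resolved through the Airflow connection (`aws_conn_id` is the usual hook parameter; the helper name is illustrative):
   
   ```python
   from typing import Optional
   
   from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
   
   
   def get_eks_session(aws_conn_id: str = "aws_default", region_name: Optional[str] = None):
       """Return a boto3 session whose credentials come from the Airflow connection."""
       hook = AwsBaseHook(aws_conn_id=aws_conn_id, client_type="eks")
       return hook.get_session(region_name=region_name)
   
   
   # generate_config_file would then do:
   #     session = get_eks_session(aws_conn_id, aws_region)
   #     eks_client = session.client("eks")
   # instead of boto3.Session(region_name=aws_region, profile_name=aws_profile).
   ```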

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       The best way to integrate would be for the token generation code to also use the Airflow hook, because then we would have the greatest certainty that all credentials are passed from Airflow correctly. ;-)
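   
   Roughly, and assuming a token helper along the lines sketched earlier in this thread, the wiring could look like this (illustrative values throughout):
   
   ```python
   from airflow.providers.amazon.aws.hooks.eks import EKSHook
   
   # The hook resolves credentials from the Airflow connection, so no local
   # AWS profile or preconfigured CLI is needed on the worker.
   hook = EKSHook(aws_conn_id="aws_default", region_name="us-east-1")
   session = hook.get_session(region_name="us-east-1")
   token = get_eks_bearer_token(session, cluster_name="my-cluster", region="us-east-1")
   ```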

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       > that is asking for a problem, later IMHO.
   
   I agree that this could bring unexpected problems in the future, but requiring users to install the AWS CLI, and integrating Airflow with the AWS CLI itself, can be very difficult in edge cases too. We do not have a perfect solution here. The authorization method rarely changes compared to the product API, so I believe this approach will be stable enough.
   
   Especially since it is described in the AWS documentation.
   > Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through [the AWS IAM authenticator for Kubernetes](https://github.com/kubernetes-sigs/aws-iam-authenticator). 
   
   
   or
   
   > This can be used as an alternative to the aws-iam-authenticator.
   
   https://docs.aws.amazon.com/cli/latest/reference/eks/get-token.html
   
   In the aws-iam-authenticator documentation, we have the code snippet I cited above.
   https://github.com/kubernetes-sigs/aws-iam-authenticator#api-authorization-from-outside-a-cluster

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       > that is asking for a problem, later IMHO.
   
   I agree that this could bring unexpected problems in the future, but requiring users to install the AWS CLI, and integrating Airflow with the AWS CLI itself, can be very difficult in edge cases too. We do not have a perfect solution here. The authorization method rarely changes compared to the product API, so I believe this approach will be stable enough.
   
   Especially since it is described in the AWS documentation.
   > Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through [the AWS IAM authenticator for Kubernetes](https://github.com/kubernetes-sigs/aws-iam-authenticator). 
   
   https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html
   
   or
   
   > This can be used as an alternative to the aws-iam-authenticator.
   
   https://docs.aws.amazon.com/cli/latest/reference/eks/get-token.html
   
   In the aws-iam-authenticator documentation, we have the code snippet I cited above.
   https://github.com/kubernetes-sigs/aws-iam-authenticator#api-authorization-from-outside-a-cluster

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       If you are concerned about maintaining this code, why not add it to the AWS SDK for Python? It could be helpful for other people as well. Have you thought about creating an internal ticket on this topic?

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       If you are concerned about maintaining this code, why not add it to the AWS SDK for Python? It could be helpful for other people as well. Have you thought about creating an internal ticket on this topic? We may not solve this problem now, but we may hope to improve it in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669060547



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:
+        config_file.write(config_text)
+
+    return config_file.name

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
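   
   For completeness, a minimal usage sketch of the result, assuming the `kubernetes` Python client is installed (all names below are placeholders):
   
   ```python
   from kubernetes import client, config
   
   # generate_config_file returns the path of the temporary kubeconfig it wrote.
   config_file = generate_config_file(
       eks_cluster_name="my-cluster",
       eks_namespace_name="default",
       pod_name="my-pod",
       aws_profile=None,
       aws_region="us-east-1",
   )
   config.load_kube_config(config_file=config_file)
   pods = client.CoreV1Api().list_namespaced_pod(namespace="default")
   ```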




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660183381



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       If you are concerned about maintaining this code, why not add it to the AWS SDK for Python? It could be helpful for other people as well. Have you thought about creating an internal ticket on this topic? We may not solve this problem now, but we may hope to improve it in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670886339



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`

Review comment:
       Should be addressed with https://github.com/apache/airflow/pull/16571/commits/5743f3469c8257948b9b73e2a7b2bdafb2a1d832




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272550



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file for a given EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       We should use the Hook to create the session, because that provides a uniform way to manage credentials and would allow this operator to be used in any environment, e.g. on-premises, on GCP, on Cloud Composer, and others.  As written, this operator can only be used in an AWS environment, because it supports only the [Default Credential Provider Chain](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html).
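
       A minimal sketch of that change, assuming the session is obtained through ``EKSHook``
       (``get_session`` comes from ``AwsBaseHook``; the ``conn_id`` variable here is illustrative):

           # Hypothetical rework: let the Airflow hook build the boto3 session so that
           # credentials come from the configured aws_conn_id rather than only the
           # Default Credential Provider Chain.
           eks_hook = EKSHook(aws_conn_id=conn_id, region_name=aws_region)
           session = eks_hook.get_session(region_name=aws_region)
           eks_client = session.client("eks")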




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668319594



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file for a given EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.

Review comment:
       If the file is being deleted as you requested, then this isn't really relevant anymore.  I'll change this to just use the default NamedTemporaryFile naming convention instead of overwriting that.
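
       For reference, a sketch of that approach, assuming the kubeconfig contents have already
       been assembled into a dict named ``config_data``:

           import tempfile

           import yaml

           # Let NamedTemporaryFile pick a unique filename; only the prefix is fixed.
           with tempfile.NamedTemporaryFile(
               prefix=KUBE_CONFIG_FILE_PREFIX,  # 'kube_config_', as defined in the hook
               mode="w",
               suffix=".yaml",
               delete=False,
           ) as config_file:
               config_file.write(yaml.dump(config_data))
           # config_file.name now holds the generated path to hand to the Kubernetes client.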




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664171916



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       Renamed in an upcoming revision.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659255181



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+# Note: the path components must not start with '/' or os.path.join discards HOME.
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       In the case of Google, we should eventually be using normal Google tokens for this as well.
   
   ``hook.get_credentials().token`` should be enough to generate a valid access token, but this is not clearly documented in the public docs.
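   
   On the AWS side, the CLI dependency can be avoided entirely by presigning an STS
   ``GetCallerIdentity`` call with ``botocore``'s ``RequestSigner`` (which the later revision of
   the hook quoted above already imports).  A sketch of the standard recipe, with illustrative
   names:
   
       import base64
   
       from botocore.signers import RequestSigner
   
       def fetch_eks_token(session, cluster_name: str, region: str) -> str:
           """Generate a Kubernetes bearer token for EKS without shelling out to the AWS CLI."""
           sts_client = session.client("sts", region_name=region)
           service_id = sts_client.meta.service_model.service_id
           signer = RequestSigner(
               service_id, region, "sts", "v4", session.get_credentials(), session.events
           )
           presigned_url = signer.generate_presigned_url(
               request_dict={
                   "method": "GET",
                   "url": f"https://sts.{region}.amazonaws.com/"
                          "?Action=GetCallerIdentity&Version=2011-06-15",
                   "body": {},
                   "headers": {"x-k8s-aws-id": cluster_name},
                   "context": {},
               },
               region_name=region,
               expires_in=60,  # matches STS_TOKEN_EXPIRES_IN in the hook
               operation_name="",
           )
           # Kubernetes expects the presigned URL base64-encoded, unpadded, with this prefix.
           return "k8s-aws-v1." + base64.urlsafe_b64encode(
               presigned_url.encode("utf-8")
           ).decode("utf-8").rstrip("=")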




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656508469



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return:  A JSON serialized string of the API call results.

Review comment:
       🤣 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670780446



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5

##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name
+SUBNET_IDS: List[str] = ["subnet-12345ab", "subnet-67890cd"]
+TASK_ID = os.environ.get("TASK_ID", "test-eks-operator")
+
+
+AMI_TYPE_KEY: str = "amiType"
+AMI_TYPE_VALUE: str = "AL2_x86_64"
+
+CLIENT_REQUEST_TOKEN_KEY: str = "clientRequestToken"
+CLIENT_REQUEST_TOKEN_VALUE: str = "test_request_token"
+
+DISK_SIZE_KEY: str = "diskSize"
+DISK_SIZE_VALUE: int = 30
+
+ENCRYPTION_CONFIG_KEY: str = "encryptionConfig"
+ENCRYPTION_CONFIG_VALUE: List[Dict] = [{"resources": ["secrets"], "provider": {"keyArn": "arn:of:the:key"}}]
+
+INSTANCE_TYPES_KEY: str = "instanceTypes"
+INSTANCE_TYPES_VALUE: List[str] = ["t3.medium"]
+
+KUBERNETES_NETWORK_CONFIG_KEY: str = "kubernetesNetworkConfig"
+KUBERNETES_NETWORK_CONFIG_VALUE: Dict = {"serviceIpv4Cidr": "172.20.0.0/16"}
+
+LABELS_KEY: str = "labels"

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()  # parentheses required; the bare attribute asserts nothing
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), token=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()  # parentheses required; the bare attribute asserts nothing
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), token=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()
+        response = dict(cluster=cluster_name, nodegroups=nodegroup_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_nodegroup.return_value = EMPTY_NODEGROUP
+        mock_list_nodegroups.return_value = response
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        assert mock_describe_nodegroup.call_count == len(nodegroup_names)
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_no_nodegroups_exist(self, mock_describe_nodegroup, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list(), token="")
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_describe_nodegroup.assert_not_called()
+
+
+class TestEKSDescribeClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_cluster_operator = EKSDescribeClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_describe_cluster(self, mock_describe_cluster):
+        mock_describe_cluster.return_value = DESCRIBE_CLUSTER_RESULT
+
+        self.describe_cluster_operator.execute({})
+
+        mock_describe_cluster.assert_called_once_with(name=self.cluster_name, verbose=False)
+
+
+class TestEKSDescribeNodegroupOperator(unittest.TestCase):
+    def setUp(self):
+        self.cluster_name = random_names()
+        self.nodegroup_name = random_names()
+
+        self.describe_nodegroup_operator = EKSDescribeNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_describe_nodegroup(self, mock_describe_nodegroup):
+        mock_describe_nodegroup.return_value = DESCRIBE_NODEGROUP_RESULT
+
+        self.describe_nodegroup_operator.execute({})
+
+        mock_describe_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name, verbose=False
+        )
+
+
+class TestEKSListClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_names: List[str] = NAME_LIST
+
+        self.list_clusters_operator = EKSListClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    def test_list_clusters(self, mock_list_clusters):
+        mock_list_clusters.return_value = self.cluster_names
+
+        self.list_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+
+
+class TestEKSListNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r678375084



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster from activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.cluster_name,
+                nodegroupName=self.nodegroup_name,
+                subnets=self.resources_vpc_config.get('subnetIds'),
+                nodeRole=self.nodegroup_role_arn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "nodegroup_subnets",
+        "nodegroup_role_arn",
+        "nodegroup_name",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.nodegroup_subnets = nodegroup_subnets
+        self.nodegroup_role_arn = nodegroup_role_arn
+        self.nodegroup_name = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.cluster_name,
+            nodegroupName=self.nodegroup_name,
+            subnets=self.nodegroup_subnets,
+            nodeRole=self.nodegroup_role_arn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.cluster_name).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.cluster_name, nodegroupName=group)
+

Review comment:
       Yeah, you are probably right that having it behind a `force=True` flag would be better.
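
   For illustration, a minimal sketch of what that opt-in might look like (the flag name, its default, and the surrounding structure are assumptions here, not the merged code):

   ```python
   from airflow.models import BaseOperator
   from airflow.providers.amazon.aws.hooks.eks import EKSHook


   class EKSDeleteClusterOperator(BaseOperator):
       def __init__(self, cluster_name: str, force_delete_compute: bool = False, **kwargs) -> None:
           super().__init__(**kwargs)
           self.cluster_name = cluster_name
           self.force_delete_compute = force_delete_compute

       def execute(self, context):
           eks_hook = EKSHook()
           nodegroups = eks_hook.list_nodegroups(clusterName=self.cluster_name).get('nodegroups', [])
           if nodegroups and not self.force_delete_compute:
               # Without the explicit opt-in, refuse to tear down attached compute.
               raise RuntimeError(
                   "Cluster has attached nodegroups; pass force_delete_compute=True to delete them."
               )
           for group in nodegroups:
               eks_hook.delete_nodegroup(clusterName=self.cluster_name, nodegroupName=group)
           eks_hook.delete_cluster(name=self.cluster_name)
   ```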







[GitHub] [airflow] potiuk commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-882793231


   Would you please rebase to the latest main, @ferruzzi?





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660181057



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       > that is asking for a problem, later IMHO.
   
   I agree that this could bring unexpected problems in the future, but requiring users to install the AWS CLI, and integrating Airflow with the AWS CLI itself, can be very difficult in edge cases too. We do not have a perfect solution here. The authorization method rarely changes compared to the product API, so I believe it will be a stable enough approach.
   
   Especially since it is described in the AWS documentation.
   > Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through [the AWS IAM authenticator for Kubernetes](https://github.com/kubernetes-sigs/aws-iam-authenticator). 
   
   https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html
   
   or
   
   > This can be used as an alternative to the aws-iam-authenticator.
   
   https://docs.aws.amazon.com/cli/latest/reference/eks/get-token.html
   
   In the aws-iam-authenticator documentation, we have the code snippet I cited above.
   https://github.com/kubernetes-sigs/aws-iam-authenticator#api-authorization-from-outside-a-cluster
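
   For reference, the authentication approach described there amounts to presigning an STS ``GetCallerIdentity`` call and wrapping it as a bearer token. A minimal sketch, adapted from the aws-iam-authenticator README rather than taken from this PR:

   ```python
   import base64

   import boto3
   from botocore.signers import RequestSigner


   def get_eks_bearer_token(cluster_name: str, region: str) -> str:
       """Presign STS GetCallerIdentity and encode it the way Kubernetes expects."""
       session = boto3.session.Session()
       client = session.client('sts', region_name=region)
       signer = RequestSigner(
           client.meta.service_model.service_id,
           region,
           'sts',
           'v4',
           session.get_credentials(),
           session.events,
       )
       params = {
           'method': 'GET',
           'url': f'https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
           'body': {},
           'headers': {'x-k8s-aws-id': cluster_name},
           'context': {},
       }
       signed_url = signer.generate_presigned_url(
           params, region_name=region, expires_in=60, operation_name=''
       )
       # The token is the presigned URL, base64url-encoded without padding,
       # behind the 'k8s-aws-v1.' prefix.
       return 'k8s-aws-v1.' + base64.urlsafe_b64encode(
           signed_url.encode('utf-8')
       ).decode('utf-8').rstrip('=')
   ```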







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660440183



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:

Review comment:
       I don't think this will render correctly -- you'll have to express this in prose instead.
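
   For example, the conditional requirement could be folded into the individual parameter descriptions instead (illustrative wording only, not the merged text):

   ```
   :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
       Required only when ``compute`` is set to ``nodegroup``.
   :type nodegroup_name: str
   ```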







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660447167



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)

Review comment:
       ```suggestion
       ```







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253799



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       How do we pass credentials to the AWS CLI when Airflow is installed at another cloud provider or on-premises? In the case of ``GKEStartPodOperator``, we use [the ``GoogleBaseHook.provide_authorized_gcloud`` method](https://github.com/apache/airflow/blob/2625007c8aeca9ed98dea361ba13c2622482d71f/airflow/providers/google/common/hooks/base_google.py#L483) to pass credentials from Airflow to the Cloud SDK.
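
   One way this could work on the AWS side is a context manager that temporarily exports the hook's credentials through the standard AWS environment variables, which the CLI picks up automatically. A rough sketch, assuming the hook exposes a boto3 session via ``get_session()`` (this helper is hypothetical, not part of the PR):

   ```python
   import os
   from contextlib import contextmanager


   @contextmanager
   def provide_aws_cli_credentials(hook):
       """Expose the hook's frozen credentials to subprocesses such as the AWS CLI."""
       creds = hook.get_session().get_credentials().get_frozen_credentials()
       keys = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_SESSION_TOKEN')
       saved = {key: os.environ.get(key) for key in keys}
       try:
           os.environ['AWS_ACCESS_KEY_ID'] = creds.access_key
           os.environ['AWS_SECRET_ACCESS_KEY'] = creds.secret_key
           if creds.token:
               os.environ['AWS_SESSION_TOKEN'] = creds.token
           yield
       finally:
           # Restore whatever was there before, removing keys we introduced.
           for key, value in saved.items():
               if value is None:
                   os.environ.pop(key, None)
               else:
                   os.environ[key] = value
   ```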







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659254410



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from boto3 import Session
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+REGION = Session().region_name

Review comment:
       Does this have the effect of making a request to the metadata server when loading the file? It seems to me that this value can also be overridden by the connection configuration. https://github.com/apache/airflow/blob/2625007c8aeca9ed98dea361ba13c2622482d71f/airflow/providers/amazon/aws/hooks/base_aws.py#L83
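
   A sketch of the lazy alternative, resolving the region only when the sensor actually runs (names here are illustrative):

   ```python
   from typing import Optional

   from airflow.sensors.base import BaseSensorOperator


   class EKSClusterStateSensor(BaseSensorOperator):  # illustrative name
       def __init__(self, *, cluster_name: str, region: Optional[str] = None, **kwargs) -> None:
           super().__init__(**kwargs)
           self.cluster_name = cluster_name
           # No boto3 Session() call at import time: leaving region as None lets
           # the hook (and any Airflow connection config) resolve it in poke().
           self.region = region
   ```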







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660433624



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       Generally the hook should handle the pagination/return a paginator (see the Athena and Glue hooks) and then the operator just iterates over it.
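
   A minimal sketch of that pattern, assuming the hook exposes the boto3 EKS client as ``self.conn``:

   ```python
   def list_clusters(self):
       """Yield every cluster name, letting boto3 drive the pagination."""
       paginator = self.conn.get_paginator('list_clusters')
       for page in paginator.paginate():
           yield from page['clusters']
   ```

   The operator can then simply iterate (for example ``for name in eks_hook.list_clusters():``) with no nextToken bookkeeping of its own.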







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660987975



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):

Review comment:
       See below







[GitHub] [airflow] uranusjr commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660758563



##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       Or `moto>=2.0.10,<3` which may be straightforward to understand.
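       For illustration, the two spellings admit different version ranges (a quick check with the `packaging` library; illustrative only, not part of the PR):

       ```python
       from packaging.specifiers import SpecifierSet

       compatible = SpecifierSet("~=2.0.10")   # PEP 440 compatible release: >=2.0.10, ==2.0.*
       explicit = SpecifierSet(">=2.0.10,<3")  # any release from 2.0.10 up to (not including) 3.0

       print("2.0.11" in compatible, "2.0.11" in explicit)  # True True
       print("2.1.0" in compatible, "2.1.0" in explicit)    # False True
       ```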







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670886300



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided which showcase these operators in action:
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+---------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.

Review comment:
       Should be addressed with https://github.com/apache/airflow/pull/16571/commits/5743f3469c8257948b9b73e2a7b2bdafb2a1d832







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661031604



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the Amazon EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the kubeconfig context to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "

Review comment:
       Per Kamil's comments this may be moot.   I'll either correct or remove this depending on what comes of that convo.







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-866095317


   Hi Ash, thanks for the quick review.  I'll hit your comments inline, but as far as this one:
   
   > Can you explain why you had to change the label-when-reviewed action in this PR?
   
   I actually never manually changed that file, but it kept getting changed on me.  I thought those changes kept coming from upstream.  If that isn't the case, and it sounds like it wasn't since you are asking, I'll pull that file from the commit.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272168



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from boto3 import Session
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+REGION = Session().region_name

Review comment:
       This should be addressed in https://github.com/apache/airflow/pull/16571/commits/5726815e0e07376449d3938a15f31fc4cbca162a
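       For context, `Session().region_name` resolves the region once at module import on every worker. A minimal sketch of the deferred pattern (assumed from the linked commit, not quoted from it): the default becomes `None` and boto3 resolves the region when the hook is actually built.

       ```python
       from typing import Optional

       from airflow.providers.amazon.aws.hooks.eks import EKSHook

       def build_eks_hook(aws_conn_id: str = "aws_default", region: Optional[str] = None) -> EKSHook:
           # With region=None, boto3 falls back to the Airflow connection,
           # environment variables, or instance metadata at call time,
           # instead of pinning a region at import time.
           return EKSHook(aws_conn_id=aws_conn_id, region_name=region)
       ```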







[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-878659314


   > It turns out that writing the EKSPodOperator operator correctly is not trivial. What do you think about contributing it as a separate PR?
   
   I'm not against doing this in multiple PRs.  My primary concern is that without the EKSPodOperator, the other Operators and Hooks are not all that useful.  I'm not a huge fan of adding a chunk of "useless" code on the promise that "it will make sense later", but if you are good with it in this case (and presumably you are or you wouldn't have suggested it) then I can live with it.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664171804



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       Moved into the operator class in a coming revision.







[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-866095317


   Hi Ash, thanks for the quick look.  I'll hit your comments inline, but as far as this one:
   
   > Can you explain why you had to change the label-when-reviewed action in this PR?
   
   I actually never manually changed that file.  It kept getting flagged as a changed file but git would never show me the diff or allow me to roll it back so I thought those changes kept coming from upstream.  If that isn't the case, and it sounds like it wasn't since you are asking, I'll pull that file from the commit.
   
   [EDIT]
   Did some digging and it looks like that submodule got tied to......... something.
   
   ```
   ferruzzi:~/workplace/airflow (eksP0)
   $ cat .gitmodules
   [submodule ".github/actions/get-workflow-origin"]
   	path = .github/actions/get-workflow-origin
   	url = https://github.com/potiuk/get-workflow-origin
   [submodule ".github/actions/checks-action"]
   	path = .github/actions/checks-action
   	url = https://github.com/LouisBrunner/checks-action
   [submodule ".github/actions/configure-aws-credentials"]
   	path = .github/actions/configure-aws-credentials
   	url = https://github.com/aws-actions/configure-aws-credentials
   [submodule ".github/actions/codecov-action"]
   	path = .github/actions/codecov-action
   	url = https://github.com/codecov/codecov-action
   [submodule ".github/actions/github-push-action"]
   	path = .github/actions/github-push-action
   	url = https://github.com/ad-m/github-push-action
   [submodule ".github/actions/label-when-approved-action"]
   	path = .github/actions/label-when-approved-action
   	url = https://github.com/TobKed/label-when-approved-action
    ```
   
   I pulled that submodule and it looks like that may have been the issue.
   
   ```
   Updating 4c5190f..0058d00
   Fast-forward
    .github/workflows/test.yml         |    1 +
    .pre-commit-config.yaml            |    2 +-
    README.md                          |   21 +-
    action.yml                         |    5 +-
    dist/index.js                      |   37 ++-
    package-lock.json                  | 3248 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
    package.json                       |    3 +-
    scripts/transpile_if_needed.sh     |   19 ++
    src/main.ts                        |   26 +-
    transpilation_state/main.ts.md5sum |    1 +
    tsconfig.json                      |    2 +-
    11 files changed, 2725 insertions(+), 640 deletions(-)
    create mode 100755 scripts/transpile_if_needed.sh
    create mode 100644 transpilation_state/main.ts.md5sum
    ```
   





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661853127



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       Yes, it is only used in the EKSPodOperator.  I kept it in a separate utils file because it felt cleaner to me, and the Hooks file was/is strictly one API endpoint, one method.  I'm happy to adjust if you feel the EKSHook class is a better home.







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-878659314


   I'm not against doing this in multiple PRs.  My primary concern is that without the EKSPodOperator, the other Operators and Hooks are not all that useful.  I'm not a huge fan of adding a chunk of "useless" code on the promise that "it will make sense later", but if you are good with it in this case (and presumably you are or you wouldn't have suggested it) then I can live with it.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660987623



##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       Included in the next revision.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r677891630



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster for activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.cluster_name,
+                nodegroupName=self.nodegroup_name,
+                subnets=self.resources_vpc_config.get('subnetIds'),
+                nodeRole=self.nodegroup_role_arn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "nodegroup_subnets",
+        "nodegroup_role_arn",
+        "nodegroup_name",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.nodegroup_subnets = nodegroup_subnets
+        self.nodegroup_role_arn = nodegroup_role_arn
+        self.nodegroup_name = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.cluster_name,
+            nodegroupName=self.nodegroup_name,
+            subnets=self.nodegroup_subnets,
+            nodeRole=self.nodegroup_role_arn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.cluster_name).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.cluster_name, nodegroupName=group)
+

Review comment:
       @mik-laj @ashb @o-nikolas  - I added this as a convenience but now I'm having second thoughts and want your opinion.  By default, the API will not allow you to delete a Cluster which has attached Nodegroups.  I added this block to check for nodegroups and delete them for you rather than throwing an exception, but I now wonder if I should put that behind a `force delete` flag rather than circumvent the system behavior by default?
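       A minimal sketch of what the flag could look like (the `force_delete` name and error message are hypothetical; waiter loop omitted for brevity):

       ```python
       from airflow.providers.amazon.aws.hooks.eks import EKSHook

       def delete_cluster(cluster_name, aws_conn_id="aws_default", region=None, force_delete=False):
           """Delete an EKS cluster, optionally removing attached nodegroups first."""
           eks_hook = EKSHook(aws_conn_id=aws_conn_id, region_name=region)

           nodegroups = eks_hook.list_nodegroups(clusterName=cluster_name).get('nodegroups', [])
           if nodegroups and not force_delete:
               # Mirror the API's own refusal instead of silently deleting compute.
               raise RuntimeError("Cluster has attached nodegroups; pass force_delete=True.")
           for group in nodegroups:
               eks_hook.delete_nodegroup(clusterName=cluster_name, nodegroupName=group)

           return eks_hook.delete_cluster(name=cluster_name)
       ```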







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660858512



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       I like logging my exception messages.  You don't find them useful?
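       For reference, the idiom under discussion as a standalone sketch (names are illustrative). One related nit: a bare `raise` re-raises the active exception with its traceback intact, so `raise e` is not needed inside the handler.

       ```python
       import logging

       from botocore.exceptions import ClientError

       log = logging.getLogger(__name__)

       def create_cluster(eks_client, **kwargs):
           try:
               return eks_client.create_cluster(**kwargs)
           except ClientError as e:
               # Surface the service-side message in the task log...
               log.error(e.response["Error"]["Message"])
               raise  # ...then re-raise without losing the original traceback.
       ```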







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870067649


   > I think we should think a little more about authorization to a cluster in EKSPodOperator operator.
   
   In what way?  Are you talking specifically about getting rid of the AWS CLI tool as you mentioned above, or did you have something else in mind?   If that is what you are thinking, then please see the reply to your comment above and let's come up with something.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664099854



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from boto3 import Session
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+REGION = Session().region_name

Review comment:
       Corrected in coming revision.







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660444860



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates am Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       Same here actually -- I can't think of a case where having this appear in a DAG is actually useful.
   
   Did you have a use case/workflow in mind when adding these?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667276015



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,709 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_POD_USERNAME,
+    EKSHook,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster for activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.clusterName)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates am Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+        :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+        :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, aws_conn_id: Optional[str] = CONN_ID, region: Optional[str] = None, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups'):
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used.  If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(verbose=self.verbose)
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(clusterName=self.clusterName, verbose=self.verbose)
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists all Amazon EKS Clusters in your AWS account.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(verbose=self.verbose)
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(clusterName=self.clusterName, verbose=self.verbose)
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param in_cluster: If True, look for config inside the cluster; if False look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :param aws_profile: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # Setting in_cluster to False tells the pod that the config
+        # file is stored locally in the worker and not in the cluster.
+        in_cluster: Optional[bool] = False,
+        namespace: Optional[str] = DEFAULT_NAMESPACE_NAME,
+        pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+        pod_name: Optional[str] = DEFAULT_POD_NAME,
+        pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+        aws_profile: Optional[str] = None,
+        region: Optional[str] = None,

Review comment:
       We should read these parameters from connection to provide unified user experience. No other operator supports such a session configuration parameter as an operator parameter.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-877545775


   It turns out that writing the EKSPodOperator operator correctly is not trivial. What do you think about contributing it as a separate PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664171541



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates am Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       If nobody has any strong opinions on the matter, I'll leave them in and make the other requested changes to them.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r677804559



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster from activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       @ashb - Working on removing the redundant catch/log/except blocks. How do you feel about this one? Should I drop this log as well, or is that reasonable?  (here and a very similar use below, on/around L300)
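
       For illustration, a minimal sketch of the simplification under discussion -- letting the ClientError propagate instead of catching, logging, and re-raising it. This is illustrative only, not the PR's final code; delete_cluster stands in for any boto3 call in the hook:

           # Redundant: the error is logged here and logged again by Airflow
           # when the task fails.
           try:
               response = eks_client.delete_cluster(name=name)
           except ClientError as e:
               self.log.error(e.response["Error"]["Message"])
               raise e

           # Simplified: let the ClientError propagate; the task log still captures it.
           response = eks_client.delete_cluster(name=name)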







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668264776



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.

Review comment:
       Accessing the pod name in the operator makes sense, but it doesn't make sense to strictly associate the configuration file with a specific pod name. One configuration file can be used to create multiple pods or perform other operations, such as creating a secret.
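
       For illustration, a sketch of the decoupling being suggested, assuming a hypothetical revision in which generate_config_file takes no pod name and returns the path it wrote (the signature shown is an assumption, not the final API):

           from kubernetes import client, config

           # One kubeconfig per cluster, not tied to any particular pod name
           # (hypothetical revised signature).
           config_path = generate_config_file(
               eks_cluster_name="my-cluster",
               eks_namespace_name="default",
               aws_profile=None,
           )

           # The same file can then back pod creation, secret creation, or any
           # other Kubernetes operation:
           config.load_kube_config(config_file=config_path)
           core_v1 = client.CoreV1Api()
           core_v1.create_namespaced_secret(
               namespace="default",
               body=client.V1Secret(
                   metadata=client.V1ObjectMeta(name="demo-secret"),
                   string_data={"key": "value"},
               ),
           )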







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659256591



##########
File path: airflow/providers/amazon/aws/example_dags/example_eks_pod_operation.py
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from datetime import datetime
+
+from airflow.models.dag import DAG
+from airflow.providers.amazon.aws.operators.eks import EKSPodOperator
+from airflow.utils.dates import days_ago
+
+####
+# NOTE: This example requires an existing EKS Cluster with a compute backend.
+# see: example_eks_create_cluster_with_nodegroup.py
+####
+
+CLUSTER_NAME = 'existing-cluster-with-nodegroup-ready-for-pod'
+BUCKET_SUFFIX = datetime.now().strftime("-%Y%b%d-%H%M").lower()
+ROLE_ARN = os.environ.get('ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+with DAG(
+    dag_id='eks_run_pod_dag',
+    schedule_interval=None,
+    start_date=days_ago(2),
+    max_active_runs=1,
+    tags=['example'],
+) as dag:
+
+    # [START howto_operator_eks_pod_operator]
+    start_pod = EKSPodOperator(
+        task_id="run_pod",
+        cluster_name=CLUSTER_NAME,
+        # Optional IAM Role to assume for credentials when signing the token.
+        cluster_role_arn=ROLE_ARN,
+        image="amazon/aws-cli:latest",
+        cmds=["sh", "-c", "aws s3 mb s3://hello-world" + BUCKET_SUFFIX],

Review comment:
       Can you try to avoid side-effects in tests? This way, the tests are safer for the CI environment.
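
       For illustration, one way to keep such examples free of side-effects is to point them at moto's in-memory backend instead of real AWS (this PR already raises the moto requirement to 2.0.10, which adds EKS support). A sketch, assuming moto's mock_eks decorator:

           import boto3
           from moto import mock_eks

           @mock_eks
           def test_create_cluster_has_no_aws_side_effects():
               client = boto3.client("eks", region_name="us-east-1")
               client.create_cluster(
                   name="test-cluster",
                   roleArn="arn:aws:iam::123456789012:role/role_name",
                   resourcesVpcConfig={},
               )
               # Everything above hits moto's in-memory backend; no real AWS
               # resources are created.
               assert client.list_clusters()["clusters"] == ["test-cluster"]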







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670800809



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,709 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"

Review comment:
       All AWS should use ``aws_default`` as the default connection ID. See: http://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html
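
       In sketch form (the revised operator module quoted earlier in this thread already adopts this convention):

           DEFAULT_CONN_ID = "aws_default"  # matches AwsBaseHook's default connection ID

           class EKSCreateClusterOperator(BaseOperator):
               def __init__(self, *, aws_conn_id: Optional[str] = DEFAULT_CONN_ID, **kwargs) -> None:
                   super().__init__(**kwargs)
                   self.aws_conn_id = aws_conn_id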







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670780104



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5

##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661849727



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       Based on the above discussion, we are likely to drop the AWS CLI requirement.  If we do, does that handle the concerns here or are they still an issue?
   
   See: https://github.com/apache/airflow/pull/16571#discussion_r659252727
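
   For context, the CLI-free direction visible in the newer hook revision quoted above (the RequestSigner import and the STS_TOKEN_EXPIRES_IN constant) is to presign an STS GetCallerIdentity call and wrap it as a bearer token, rather than shelling out to the aws binary. A rough sketch of that pattern, not the PR's exact code:

       import base64

       import boto3
       from botocore.signers import RequestSigner

       def fetch_eks_bearer_token(cluster_name: str, region: str) -> str:
           session = boto3.session.Session()
           client = session.client("sts", region_name=region)
           signer = RequestSigner(
               client.meta.service_model.service_id,
               region,
               "sts",
               "v4",
               session.get_credentials(),
               session.events,
           )
           # Presign GetCallerIdentity, binding the cluster name via a header.
           url = signer.generate_presigned_url(
               {
                   "method": "GET",
                   "url": f"https://sts.{region}.amazonaws.com/"
                   "?Action=GetCallerIdentity&Version=2011-06-15",
                   "body": {},
                   "headers": {"x-k8s-aws-id": cluster_name},
                   "context": {},
               },
               region_name=region,
               expires_in=60,
               operation_name="",
           )
           # EKS expects the presigned URL base64-encoded under this prefix.
           token = base64.urlsafe_b64encode(url.encode("utf-8")).decode("utf-8")
           return "k8s-aws-v1." + token.rstrip("=")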







[GitHub] [airflow] mik-laj commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870100581


   > do you want me to mark things resolved as they are corrected
   
   You can mark any discussion as resolved if you feel the problem has been resolved. I can always open a new discussion or continue with an old one.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660134640



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from boto3 import Session
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+REGION = Session().region_name

Review comment:
       Upon further digging, it looks like we can set the default here to None and let the base hook do its magic if the user does not provide one.  I will try that and see what happens.
   
   Note to self:  Whatever we do here, also apply the same solution to the Operators.
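
       In sketch form, the idea is to drop the import-time Session().region_name lookup and default the parameter to None, letting EKSHook/AwsBaseHook resolve the region from the Airflow connection or the standard boto3 chain (the class name below is illustrative):

           from typing import Optional

           from airflow.sensors.base import BaseSensorOperator

           class EKSClusterStateSensor(BaseSensorOperator):  # illustrative name
               def __init__(self, *, cluster_name: str, region: Optional[str] = None, **kwargs) -> None:
                   super().__init__(**kwargs)
                   self.cluster_name = cluster_name
                   # None defers to the hook, which falls back to the connection
                   # settings and then to boto3's normal resolution chain.
                   self.region = region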







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664096250



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, this operator can also create the supporting compute architecture:
+    if the 'compute' argument is provided with a value of 'nodegroup', it will also attempt to create an
+    Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."

Review comment:
       Teardown added in coming revision.
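
As a point of reference, here is a hedged sketch of how a teardown could slot into that timeout branch; this is an assumption about the coming revision, not its actual contents:

```python
# Hypothetical helper; assumes an EksHook-like object exposing the
# get_cluster_state and delete_cluster methods shown in the hunk above.
import logging
from time import sleep

log = logging.getLogger(__name__)

CHECK_INTERVAL_SECONDS = 15
TIMEOUT_SECONDS = 25 * 60


def wait_for_active_or_tear_down(eks_hook, cluster_name: str) -> None:
    countdown = TIMEOUT_SECONDS
    while eks_hook.get_cluster_state(clusterName=cluster_name) != "ACTIVE":
        if countdown >= CHECK_INTERVAL_SECONDS:
            countdown -= CHECK_INTERVAL_SECONDS
            sleep(CHECK_INTERVAL_SECONDS)
        else:
            # Delete the half-provisioned control plane before aborting so a
            # failed task does not leak a billable cluster.
            log.error("Cluster did not reach ACTIVE in time; tearing down and aborting.")
            eks_hook.delete_cluster(name=cluster_name)
            raise RuntimeError("Cluster is still inactive after the allocated time limit.")
```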

##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from boto3 import Session
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+REGION = Session().region_name

Review comment:
       Corrected in coming revision.

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, this operator can also create the supporting compute architecture:
+    if the 'compute' argument is provided with a value of 'nodegroup', it will also attempt to create an
+    Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str] = None,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       Looks like they are staying; corrected in the coming revision to return the entire list.
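
A hedged sketch of what "return the entire list" could look like: drain boto3's built-in paginator instead of surfacing maxResults/nextToken to the caller. The function name is illustrative; the list_clusters paginator itself is part of boto3's EKS client:

```python
# Hypothetical sketch: collect every page rather than returning one page
# plus a nextToken for the user to manage.
from typing import List

import boto3


def list_all_eks_clusters(region_name: str) -> List[str]:
    eks_client = boto3.client("eks", region_name=region_name)
    clusters: List[str] = []
    # The paginator follows nextToken internally until results are exhausted.
    for page in eks_client.get_paginator("list_clusters").paginate():
        clusters.extend(page["clusters"])
    return clusters
```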

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, this operator can also create the supporting compute architecture:
+    if the 'compute' argument is provided with a value of 'nodegroup', it will also attempt to create an
+    Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str] = None,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       If nobody has any strong opinions on the matter, I'll leave them in and make the other requested changes to them.

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       Moved into the operator class in a coming revision.
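
Roughly, that move could take the shape sketched below; the method name and the temp-file handling are assumptions, and the actual kubeconfig rendering is elided:

```python
# Hypothetical shape after folding the helper into the operator; the
# kubeconfig rendering itself is elided.
import os
import tempfile

from airflow.models import BaseOperator


class EKSPodOperatorSketch(BaseOperator):
    def __init__(self, cluster_name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.cluster_name = cluster_name

    def _generate_config_file(self) -> str:
        # A task-scoped temp file replaces the module-level HOME/.kube/config
        # default, so nothing touches the filesystem at import time.
        fd, config_path = tempfile.mkstemp(prefix='kube_config_')
        os.close(fd)
        # ... render and write the kubeconfig for self.cluster_name here ...
        return config_path

    def execute(self, context):
        # Downstream pod logic (e.g. KubernetesPodOperator) would read this.
        os.environ['KUBECONFIG'] = self._generate_config_file()
```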

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       Renamed in a coming revision.

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       Renamed in a coming revision.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253383



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow's Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with EKS clusters and their compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       ```suggestion
    - ``example_eks_create_cluster.py``
    - ``example_eks_create_cluster_with_nodegroup.py``
    - ``example_eks_create_nodegroup.py``
    - ``example_eks_pod_operator.py``
   ```
   To avoid spell-check errors.
   
   We do not try to teach specific examples in these guides, so we do not need to focus on them here. When a user is interested, each example includes a link that takes them to the source code.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668266640



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The API call to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            # Guard against a missing token as well as the "null" sentinel
+            # so the loop terminates once no further pages are returned.
+            while token is not None and token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.

Review comment:
       You can keep this argument if you want, but it should have a generic name like `tmpfile_prefix` and be optional. Most users don't pay much attention to the names of temporary files, so we shouldn't force them to pass a value for this argument.
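
    For illustration, a minimal sketch of the optional-prefix idea (the function name, fallback constant, and file contents below are placeholders, not code from this PR):

    ```
    import tempfile
    from typing import Optional

    KUBE_CONFIG_FILE_PREFIX = 'kube_config_'


    def write_config_file(tmpfile_prefix: Optional[str] = None) -> str:
        # Fall back to a generic prefix when the caller does not care
        # what the temporary file is called.
        prefix = tmpfile_prefix or KUBE_CONFIG_FILE_PREFIX
        with tempfile.NamedTemporaryFile(prefix=prefix, mode='w', delete=False) as config_file:
            config_file.write('# kubeconfig contents would go here\n')
            return config_file.name
    ```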






[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668106130



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       ACK.

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")

Review comment:
       This may be going back towards your previous request/point, but would a simpler alternative be to move the `generate_config_file` and `_get_bearer_token` methods into the `EKSHook` class itself? I think that was your initial suggestion and I missed the mark on it. If we do it that way, the Operator creates the hook as usual and calls `eks_hook.generate_config_file()`; the three lines you highlighted become just `eks_client = self.conn`, like any of the other hook methods, and inside `_get_bearer_token` we can use:

    ```
            session = self.get_session()
            client = self.conn
    ```

    and I _believe_ that solution would share the conn_id between them freely, instead of having to pass it around? Is that right?
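
    A rough sketch of that shape, purely to make the idea concrete (the method names are from this PR, but the bodies here are abbreviated and partly hypothetical):

    ```
    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook


    class EKSHook(AwsBaseHook):
        client_type = 'eks'

        def __init__(self, *args, **kwargs) -> None:
            kwargs['client_type'] = self.client_type
            super().__init__(*args, **kwargs)

        def generate_config_file(self, eks_cluster_name: str) -> str:
            # The hook already carries the connection, so no aws_profile
            # or aws_region arguments need to be threaded through.
            cluster = self.conn.describe_cluster(name=eks_cluster_name)['cluster']
            ...

        def _get_bearer_token(self, eks_cluster_name: str) -> str:
            # get_session() and self.conn resolve the same aws_conn_id,
            # so both helpers share credentials without passing anything around.
            session = self.get_session()
            sts_client = session.client('sts')
            ...
    ```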

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.

Review comment:
       Absolutely, that would be easy enough.   Is there no use case where the user would want to know the pod name?
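
    (Worth noting: since `generate_config_file` returns the path it wrote, a caller can still discover the generated file name even with a generic prefix. A hypothetical call site, assuming the current signature:)

    ```
    config_path = generate_config_file(
        eks_cluster_name='example-cluster',
        eks_namespace_name='default',
        pod_name='example-pod',
        aws_profile=None,
    )
    print(config_path)  # e.g. /tmp/kube_config_example-pod..., depending on the prefix logic
    ```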

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.

Review comment:
       Alright, sounds good.  I had thought there might be value in letting the user dictate the pod name and look the file up later, but if the file is deleted in the context manager, then you are correct and this is not useful.
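
[Editor's note: for reference, a minimal sketch of the context-manager lifecycle being discussed, assuming the kubeconfig contents are already rendered to a YAML string; the helper name is hypothetical.  The file gets a random collision-free name and is always removed on exit, so there is nothing to look up later.]

    import contextlib
    import os
    import tempfile

    @contextlib.contextmanager
    def temporary_kube_config(config_yaml: str):
        # Random filename using the same prefix as KUBE_CONFIG_FILE_PREFIX above.
        fd, path = tempfile.mkstemp(prefix='kube_config_', suffix='.yaml')
        try:
            with os.fdopen(fd, 'w') as config_file:
                config_file.write(config_yaml)
            yield path
        finally:
            # Deleted even if the pod launch raises, so the name never outlives the task.
            os.unlink(path)

A caller would wrap the pod launch in `with temporary_kube_config(rendered) as path:` and point the KUBECONFIG environment variable at `path` for the duration.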







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669908594



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name
+SUBNET_IDS: List[str] = ["subnet-12345ab", "subnet-67890cd"]
+TASK_ID = os.environ.get("TASK_ID", "test-eks-operator")
+
+
+AMI_TYPE_KEY: str = "amiType"
+AMI_TYPE_VALUE: str = "AL2_x86_64"
+
+CLIENT_REQUEST_TOKEN_KEY: str = "clientRequestToken"
+CLIENT_REQUEST_TOKEN_VALUE: str = "test_request_token"
+
+DISK_SIZE_KEY: str = "diskSize"
+DISK_SIZE_VALUE: int = 30
+
+ENCRYPTION_CONFIG_KEY: str = "encryptionConfig"
+ENCRYPTION_CONFIG_VALUE: List[Dict] = [{"resources": ["secrets"], "provider": {"keyArn": "arn:of:the:key"}}]
+
+INSTANCE_TYPES_KEY: str = "instanceTypes"
+INSTANCE_TYPES_VALUE: List[str] = ["t3.medium"]
+
+KUBERNETES_NETWORK_CONFIG_KEY: str = "kubernetesNetworkConfig"
+KUBERNETES_NETWORK_CONFIG_VALUE: Dict = {"serviceIpv4Cidr": "172.20.0.0/16"}
+
+LABELS_KEY: str = "labels"

Review comment:
       This variable has only one usage. Can we inline it to simplify this code?
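
[Editor's note: an illustrative sketch of the suggested inlining; the value and call site are hypothetical, since only the constant appears in this hunk.]

    from typing import Dict

    LABELS_VALUE: Dict = {"purpose": "example"}  # hypothetical value

    # Before: a named key constant with a single call site.
    LABELS_KEY: str = "labels"
    request_kwargs = {LABELS_KEY: LABELS_VALUE}

    # After: the string literal inlined at its one usage.
    request_kwargs = {"labels": LABELS_VALUE}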







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670140497



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()

Review comment:
       Removed in upcoming revision.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660994380



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       Here and below, can we compromise on `eks_test_{foo}.py`?  They don't contain tests, so I see what you are saying, but they are only used by the EKS tests and I think that should be reflected in the name.







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659252384



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Why do we need to use an external tool?
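
[Editor's note: the CLI dependency is avoidable.  An EKS bearer token can be produced in pure Python by presigning an STS GetCallerIdentity call, which is the same scheme `aws eks get-token` uses under the hood.  A minimal sketch using botocore's RequestSigner, with a 60-second expiry matching STS_TOKEN_EXPIRES_IN above; the function name is hypothetical.]

    import base64

    import boto3
    from botocore.signers import RequestSigner

    def fetch_eks_bearer_token(cluster_name: str, region: str) -> str:
        """Presign sts:GetCallerIdentity and wrap it in the EKS token format."""
        session = boto3.session.Session(region_name=region)
        sts = session.client('sts')
        signer = RequestSigner(
            sts.meta.service_model.service_id,
            region,
            'sts',
            'v4',
            session.get_credentials(),
            session.events,
        )
        request = {
            'method': 'GET',
            'url': f'https://sts.{region}.amazonaws.com/'
                   '?Action=GetCallerIdentity&Version=2011-06-15',
            'body': {},
            # This header binds the resulting token to the named cluster.
            'headers': {'x-k8s-aws-id': cluster_name},
            'context': {},
        }
        signed_url = signer.generate_presigned_url(
            request, region_name=region, expires_in=60, operation_name=''
        )
        # Kubernetes expects 'k8s-aws-v1.' + base64url(presigned URL), without padding.
        encoded = base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8')
        return 'k8s-aws-v1.' + encoded.rstrip('=')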







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r673574757



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:

Review comment:
       I think it is worth following the conventions used by boto to avoid any inaccuracies or mistakes.
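
[Editor's note: a sketch of what that convention buys.  Because the hook's signature mirrors `boto3.client('eks').create_cluster`, extra camelCase keyword arguments pass straight through via **kwargs; the ARN, subnet IDs, and version below are placeholders.]

    from airflow.providers.amazon.aws.hooks.eks import EKSHook

    hook = EKSHook(aws_conn_id='aws_default', region_name='us-east-1')
    hook.create_cluster(
        name='my-cluster',
        roleArn='arn:aws:iam::123456789012:role/eks-cluster-role',  # placeholder
        resourcesVpcConfig={'subnetIds': ['subnet-12345ab', 'subnet-67890cd']},
        version='1.20',  # extra boto3 parameter, forwarded unchanged via **kwargs
    )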







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253993



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Clusters in your AWS account with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(
+            maxResults=self.maxResults, nextToken=self.nextToken, verbose=self.verbose
+        )
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Nodegroups associated with the specified EKS Cluster with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param kube_config_file_path: Path to save the generated kube_config file to.
+    :type kube_config_file_path: str
+    :param in_cluster: If True, look for config inside the cluster; if False look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # A default path will be used if none is provided.
+        kube_config_file_path: Optional[str] = os.environ.get(KUBE_CONFIG_ENV_VAR, DEFAULT_KUBE_CONFIG_PATH),

Review comment:
       What happens if one worker is running multiple pods? Will these credentials be overwritten? It looks like a race condition bug.
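
[Editor's note: one way to sidestep the race, sketched under the assumption that execute() can choose the path it passes to generate_config_file; the helper below is hypothetical.  Each task run derives its own path instead of sharing DEFAULT_KUBE_CONFIG_PATH.]

    import os
    import tempfile
    import uuid

    def unique_kube_config_path(pod_name: str) -> str:
        # A shared default path gets clobbered when two pods launch on the
        # same worker concurrently; a per-run filename avoids the collision.
        filename = f"kube_config_{pod_name}_{uuid.uuid4().hex}"
        return os.path.join(tempfile.gettempdir(), filename)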







[GitHub] [airflow] ferruzzi closed pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi closed pull request #16571:
URL: https://github.com/apache/airflow/pull/16571


   





[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870097190


   > > In what way? Are you talking specifically about getting rid of the AWS CLI tool as you mentioned above, or did you have something else in mind? If that is what you are thinking, then please see the reply to your comment above and let's come up with something.
   > 
   > I couldn't give a negative review without leaving this comment. I didn't mean one issue, but I did have more comments.
   
   That makes sense.  My apologies, I misread it as another concern being raised and thought I had missed the context.  I'm not sure what the etiquette is here: do you want me to mark things resolved as they are corrected, or do you want to do that as you see them satisfied?





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660139268



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided which showcase these operators in action:
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       Does the spell check know to ignore anything inside double backticks?







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660433624



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       Generally the hook should handle the pagination/return a paginator (see the Athena and Glue hooks) and then the operator just iterates over it.
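
   A minimal sketch of that pattern, assuming a hypothetical ``get_nodegroup_pages`` helper on the hook (the name and signature here are illustrative, not the final API):

   ```python
   # In the hook: hand back a boto3 paginator instead of a single page.
   def get_nodegroup_pages(self, cluster_name: str):
       """Return an iterator over pages of nodegroup names for a cluster."""
       paginator = self.conn.get_paginator('list_nodegroups')
       return paginator.paginate(clusterName=cluster_name)

   # In the operator's execute(): just iterate, no nextToken bookkeeping.
   nodegroups = []
   for page in eks_hook.get_nodegroup_pages(cluster_name=self.clusterName):
       nodegroups.extend(page['nodegroups'])
   ```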

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Is this except useful? Given we are re-raising the error I would suggest not -- let this be handled by the caller and remove the try/except from all of these methods.
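
   i.e. each of these could collapse to just the happy path; a sketch of this method's body with the try/except dropped (docstring elided, using the cached ``conn`` property suggested below):

   ```python
   def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
       eks_client = self.conn

       response = eks_client.create_cluster(
           name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
       )

       self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
       return response
   ```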

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()

Review comment:
       ```suggestion
               eks_client = self.conn
   ```
   
   `get_conn()` is quasi-deprecated, and the cached `conn` property is preferred.

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:

Review comment:
       I don't think this will render correctly -- you'll have to express this in prose instead.
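
   For example (wording is just a sketch), the conditional requirement could be attached to the params themselves, so the docstring stays a flat field list:

   ```python
   """
   :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
       Required if ``compute`` is set to ``nodegroup``.
   :type nodegroup_name: str
   """
   ```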

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."

Review comment:
       Should we tear it down in this case?
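
   One way (a sketch, untested) would be to delete the half-provisioned cluster before raising, so a retry starts from a clean slate -- i.e. replace the existing ``else`` branch with:

   ```python
   else:
       message = "Cluster is still inactive after the allocated time limit.  Aborting."
       self.log.error(message)
       # Clean up the half-provisioned control plane before failing the task.
       eks_hook.delete_cluster(name=self.clusterName)
       raise RuntimeError(message)
   ```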

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:

Review comment:
       ```suggestion
               while eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups'):
   ```
   
   No need to check the length, just see if it's "truthy"

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):

Review comment:
       Can you think of a case where having this as an _operator_ to use in a DAG is actually useful?
   
   My gut says that _this_ one doesn't need to be an operator.
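
   If callers just want the data, a plain task calling the hook would cover it; a sketch assuming Airflow 2's ``@task`` decorator and that the hook's cached ``conn`` is the boto3 EKS client:

   ```python
   from airflow.decorators import task

   from airflow.providers.amazon.aws.hooks.eks import EksHook


   @task
   def describe_all_clusters():
       """Fetch details for every EKS cluster visible to this connection."""
       client = EksHook(aws_conn_id='aws_default').conn
       # First page only for brevity; real code should paginate.
       names = client.list_clusters()['clusters']
       return [client.describe_cluster(name=n)['cluster'] for n in names]
   ```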

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):

Review comment:
       Can you think of a case where having this as an _operator_ to use in a DAG is actually useful?
   
   My gut says that all these Describe* ones don't need to be operators.

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, it can also create the supporting compute architecture:
+    if the argument 'compute' is provided with a value of 'nodegroup', it will also attempt to create an
+    Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       Same here actually -- I can't think of a case where having this appear in a DAG is actually useful.
   
   Did you have a use case/workflow in mind when adding these?
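
   If there is a DAG-level use, it is probably limited to pushing the listing into XCom for downstream templating; a hypothetical sketch of that pattern (task ids invented, assumed to live inside a DAG context):

   ```python
   from airflow.operators.bash import BashOperator
   from airflow.providers.amazon.aws.operators.eks import EKSListClustersOperator

   list_clusters = EKSListClustersOperator(task_id="list_clusters")
   show = BashOperator(
       task_id="show",
       # Whatever execute() returns is pushed to XCom automatically.
       bash_command="echo {{ ti.xcom_pull(task_ids='list_clusters') }}",
   )
   list_clusters >> show
   ```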

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    with open(kube_config_file_location, "w") as config_file:
+        config_file.write(config_text)
+
+
+class UnmetDependency(BaseException):

Review comment:
       ```suggestion
   class UnmetDependency(Exception):
   ```
   
   You should almost never use BaseException.
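
   A quick illustration of the hazard; an `except Exception` handler (the idiomatic catch-all) never sees errors that derive from `BaseException` directly:

   ```python
   class UnmetDependency(BaseException):
       pass

   try:
       raise UnmetDependency("aws CLI not found")
   except Exception:
       # Never reached: UnmetDependency bypasses Exception-based handlers,
       # exactly the way KeyboardInterrupt and SystemExit do.
       print("handled")
   ```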

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "

Review comment:
       Your test doesn't check for v2 specifically -- the pip-installable v1 of awscli would also pass this check.
   
   Do we _need_ v2 here?
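
   If v2 really is required, the check could parse the version banner the CLI reports rather than only testing for its presence; a rough, untested sketch (v1 historically printed the banner to stderr while v2 prints it to stdout, hence reading both streams):

   ```python
   import subprocess

   def aws_cli_major_version() -> int:
       # The banner looks like "aws-cli/2.2.15 Python/3.8.8 ..." on both major versions.
       proc = subprocess.run(["aws", "--version"], capture_output=True, text=True)
       banner = (proc.stdout or "") + (proc.stderr or "")
       return int(banner.split("/")[1].split(".")[0])
   ```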
   
   

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)

Review comment:
       ```suggestion
   ```

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Is https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/eks.html#EKS.Client.generate_presigned_url of any help here?
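
   For reference, the widely-circulated recipe for producing the same token without the CLI presigns an STS `GetCallerIdentity` call via botocore's `RequestSigner`; a sketch of that approach only, not verified against this PR (boto3 session wiring assumed):

   ```python
   import base64

   import boto3
   from botocore.signers import RequestSigner

   def eks_bearer_token(cluster_name: str, region: str) -> str:
       session = boto3.session.Session()
       service_id = session.client("sts").meta.service_model.service_id
       signer = RequestSigner(service_id, region, "sts", "v4", session.get_credentials(), session.events)
       url = signer.generate_presigned_url(
           {
               "method": "GET",
               "url": f"https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
               "body": {},
               "headers": {"x-k8s-aws-id": cluster_name},
               "context": {},
           },
           region_name=region,
           expires_in=60,
           operation_name="",
       )
       # Same format `aws eks get-token` emits: prefixed, unpadded, URL-safe base64.
       return "k8s-aws-v1." + base64.urlsafe_b64encode(url.encode()).decode().rstrip("=")
   ```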

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       This appears to be used only in one place, EKSPodOperator, so this possibly doesn't need to be in a separate file.
   
   Perhaps this could be a method on the EksHook? (Given it connects to EKS and does a describe-cluster call, I think that is the right place for it.)
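
   Roughly what that could look like, as a hypothetical sketch only (the kubeconfig assembly itself would move over unchanged):

   ```python
   from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

   class EksHook(AwsBaseHook):
       def __init__(self, *args, **kwargs):
           kwargs["client_type"] = "eks"
           super().__init__(*args, **kwargs)

       def generate_config_file(self, eks_cluster_name: str, kube_config_file_location: str) -> None:
           # The hook already owns the boto3 session and credentials, so the
           # separate Session(profile_name=...) plumbing above goes away.
           cluster = self.get_conn().describe_cluster(name=eks_cluster_name)["cluster"]
           # ...build and write the same kubeconfig dict as in this diff...
   ```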

##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       ```suggestion
       'moto~=2.0,>=2.0.10',
   ```
   
   `~=2.0.10`  is the same as `>=2.0.10,<2.1` which is more restrictive than we want I think.
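
   The difference is easy to confirm with the `packaging` library (illustrative only):

   ```python
   from packaging.specifiers import SpecifierSet

   assert "2.0.12" in SpecifierSet("~=2.0.10")       # patch bumps allowed
   assert "2.1.0" not in SpecifierSet("~=2.0.10")    # minor bumps excluded
   assert "2.1.0" in SpecifierSet("~=2.0,>=2.0.10")  # the looser pin allows them
   ```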

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       ```suggestion
   from tests.providers.amazon.aws.utils.eks_constants import (
   ```
   
   If that file doesn't contain any tests we shouldn't prefix it with `test_`

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       Same here.

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       Given it's an acronym, keeping it as `EKSHook` is allowed, and just because we made the wrong decision in the past with AwsBaseHook doesn't mean we _have to_ stick by it here :) 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661030386



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       > Note: When using acronyms in CapWords, capitalize all the letters of the acronym. Thus HTTPServerError is better than HttpServerError.
   
   https://www.python.org/dev/peps/pep-0008/
   
   According to PEP-8, we should use `EKSHook`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661729298



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       Reverted in https://github.com/apache/airflow/commit/667401ff7d1037edf52d17e9e84695edaee52cb5




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660158988



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       I see.  I implemented the hooks to follow the API as closely as possible.  To make sure I am on the same page before I change anything: do you think it would be better for the hook to just return everything and not allow pagination at all?
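
   For context, if the hook were to return everything and hide pagination, boto3's built-in paginators already handle the token bookkeeping. A minimal sketch (assuming default credentials and region); illustrative only, not this PR's implementation:

       import boto3

       def list_all_clusters() -> list:
           """Collect every EKS cluster name, letting boto3 drive the pagination."""
           eks = boto3.client("eks")
           names = []
           # paginate() yields one page per API call and follows nextToken internally.
           for page in eks.get_paginator("list_clusters").paginate():
               names.extend(page["clusters"])
           return names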




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660154954



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+--------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster_with_nodegroup.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and underlying
+Amazon EKS node group in one command.  ``EKSDescribeClustersOperator`` and
+``EKSDescribeNodegroupsOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached

Review comment:
       Fixed in 616d249

##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       Fixed in 616d249




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669905302



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), token=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()

Review comment:
       Is it needed? Why can't you use a constant value?

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), token=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()

Review comment:
       Is it needed? Can you use a constant value?
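
   Concretely, the suggestion amounts to a deterministic module-level constant, e.g. a hypothetical addition to eks_test_constants:

       # Fixed fixture names keep test failures reproducible across runs.
       CLUSTER_NAME = "test-cluster"
       NODEGROUP_NAME = "test-nodegroup"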




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667279605



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,

Review comment:
       This token is only valid for an hour, but this operator can be used to run longer jobs. We should take care of rotating this token to ensure it stays valid.
   
   Unfortunately, it won't be trivial, as we have to write a new [client-go-credential-plugin](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins). To do this, you should write a new CLI tool that generates a new credential.
   
   Some tips on adding a new CLI tool to providers:
   - You cannot modify the Airflow core, so that this operator can be used on different Airflow versions without creating a dependency between the provider and the core. In particular, you cannot modify any file in the `airflow.cli` package.
   - You can create a new [executable module](https://docs.python.org/3/tutorial/modules.html#executing-modules-as-scripts). To run this module, it is best to use Python module resolution, i.e. [python -m](https://docs.python.org/3/using/cmdline.html#cmdoption-m). BTW, to run Airflow, you can use `airflow` or `python -m airflow`. See: https://github.com/apache/airflow/pull/7808
   - The `python` program may not refer to the current interpreter. You should use [`sys.executable`](https://docs.python.org/3/library/sys.html#sys.executable) instead.
   
   For example, see: [`airflow.providers.google.common.utils.id_token_credentials`](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/common/utils/id_token_credentials.py#L24)
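   
   For illustration, a minimal sketch of what such an executable module could look like (the module path and the `fetch_token` helper are hypothetical names used only for this example, not part of this PR):
   
   ```python
   # Hypothetical module: airflow.providers.amazon.aws.utils.eks_get_token
   # The kubeconfig would invoke it from the "users[].user.exec" section,
   # e.g. command=sys.executable, args=["-m", "<this module>", "<cluster-name>"].
   import json
   import sys
   from datetime import datetime, timedelta
   
   
   def fetch_token(cluster_name: str) -> str:
       """Hypothetical helper that returns a fresh EKS bearer token."""
       raise NotImplementedError("token generation elided for brevity")
   
   
   def main() -> None:
       cluster_name = sys.argv[1]
       expiration = (datetime.utcnow() + timedelta(minutes=14)).strftime('%Y-%m-%dT%H:%M:%SZ')
       # kubectl reads this ExecCredential object from stdout and re-invokes the
       # plugin once expirationTimestamp has passed, which gives us the rotation.
       print(
           json.dumps(
               {
                   "apiVersion": "client.authentication.k8s.io/v1alpha1",
                   "kind": "ExecCredential",
                   "status": {"expirationTimestamp": expiration, "token": fetch_token(cluster_name)},
               }
           )
       )
   
   
   if __name__ == '__main__':
       main()
   ```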




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879372031


   I see.  Thanks.  It is definitely a bigger nut to crack than I initially expected.
   
   There is always so much to learn. :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661038171



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       discussing [here](https://github.com/apache/airflow/pull/16571#discussion_r660994380)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r668260966



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")

Review comment:
       A good idea. Writing this function as a hook method will follow how GoogleBaseHook works. In `GoogleBaseHook`, we have the `provide_gcp_credential_file_as_context` method, so we can create a similar method here, e.g. `provide_eks_credential_file_as_context`.
   https://github.com/apache/airflow/blob/83cb237031dfe5b7cb5238cc1409ce71fd9507b7/airflow/providers/google/common/hooks/base_google.py#L448
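   
   A rough sketch of that shape for the EKS hook, assuming the cluster config dict is built the same way `generate_config_file` builds it (the stub class and the `_build_cluster_config` helper are hypothetical, named here only for illustration):
   
   ```python
   import tempfile
   from contextlib import contextmanager
   
   import yaml
   
   
   class EKSHook:  # stub standing in for the real hook, for illustration only
       def _build_cluster_config(self, eks_cluster_name: str) -> dict:
           """Hypothetical helper returning the same dict generate_config_file writes."""
           raise NotImplementedError
   
       @contextmanager
       def provide_eks_credential_file_as_context(self, eks_cluster_name: str):
           """Yield the path of a temporary kubeconfig that is cleaned up on exit."""
           cluster_config = self._build_cluster_config(eks_cluster_name)
           with tempfile.NamedTemporaryFile(prefix='kube_config_', mode='w', suffix='.yaml') as config_file:
               yaml.dump(cluster_config, config_file, default_flow_style=False)
               config_file.flush()
               # The caller can point the Kubernetes client at this path; the file
               # is deleted automatically when the with-block exits.
               yield config_file.name
   ```
   
   A caller could then do `with hook.provide_eks_credential_file_as_context(cluster) as path: ...` and the file disappears when the block exits.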




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670886177



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon

Review comment:
       Should be addressed with https://github.com/apache/airflow/pull/16571/commits/5743f3469c8257948b9b73e2a7b2bdafb2a1d832




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670125356



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Removed in upcoming revision.

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()

Review comment:
       Removed in upcoming revision.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660181057



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       > that is asking for a problem, later IMHO.
   
    I agree that this could bring unexpected problems in the future, but requiring users to install the AWS CLI, and integrating Airflow with the AWS CLI itself, can be very difficult in edge cases too. We do not have a perfect solution here. The authorization method rarely changes compared to the product API, so I believe it will be a sufficiently stable approach.
   
   Especially since it is described in the AWS documentation.
   > Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through [the AWS IAM authenticator for Kubernetes](https://github.com/kubernetes-sigs/aws-iam-authenticator). 
   
   
   or
   
   > This can be used as an alternative to the aws-iam-authenticator.
   
   https://docs.aws.amazon.com/cli/latest/reference/eks/get-token.html
   
   In the aws-iam-authenticator documentation, we have the code snippet I cited above.
   https://github.com/kubernetes-sigs/aws-iam-authenticator#api-authorization-from-outside-a-cluster
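   
   For reference, the token-generation part of that snippet condenses to roughly the following in Python (a sketch of the approach the aws-iam-authenticator README documents; the PR's `_get_bearer_token` does essentially the same thing with botocore's `RequestSigner`):
   
   ```python
   import base64
   
   import boto3
   from botocore.signers import RequestSigner
   
   
   def get_bearer_token(session: boto3.Session, cluster_id: str, region: str) -> str:
       client = session.client('sts', region_name=region)
       service_id = client.meta.service_model.service_id
       signer = RequestSigner(
           service_id, region, 'sts', 'v4', session.get_credentials(), session.events
       )
       request_params = {
           'method': 'GET',
           'url': f'https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
           'body': {},
           # The cluster name rides along in this header; EKS validates it server-side.
           'headers': {'x-k8s-aws-id': cluster_id},
           'context': {},
       }
       signed_url = signer.generate_presigned_url(
           request_params, region_name=region, expires_in=60, operation_name=''
       )
       # Kubernetes expects the presigned URL base64url-encoded, unpadded, and prefixed.
       return 'k8s-aws-v1.' + base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8').rstrip('=')
   ```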




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] subashcanapathy commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
subashcanapathy commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870281108


   > What do you think about adding system tests for these operators? We are constantly striving to improve test coverage and we have impressive results with Google integration, where the lack of testing is the exception, not the daily routine. See: #8280
   
    We do have integration test coverage as part of this check-in; the only difference is that they aren't system tests. We are not sure whether these qualify as system tests, since providers are built into separate packages these days. Please advise if I have misunderstood that assumption.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-882793231


   Would you please rebase to the latest main, @ferruzzi?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660147152



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       The code as submitted works in the Breeze test environment; I don't think it should matter what's hosting it.  The line you highlighted specifically is only there as a check to make sure the AWS CLI tool is installed, as that is (currently) a prerequisite.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660446226



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    with open(kube_config_file_location, "w") as config_file:
+        config_file.write(config_text)
+
+
+class UnmetDependency(BaseException):

Review comment:
       ```suggestion
   class UnmetDependency(Exception):
   ```
   
   You should almost never use BaseException.
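   
   A quick illustration of why (both exception classes below are made up for the demo): anything derived from `BaseException` sails past the conventional `except Exception` handlers that most calling code, Airflow included, relies on.
   
   ```python
   class UnmetDependencyBase(BaseException):  # what the PR currently does
       pass
   
   
   class UnmetDependency(Exception):  # the suggested fix
       pass
   
   
   for exc_type in (UnmetDependencyBase, UnmetDependency):
       try:
           raise exc_type("aws cli missing")
       except Exception as exc:
           print(f"caught by ordinary handling: {type(exc).__name__}")  # only the Exception subclass
       except BaseException as exc:
           print(f"escaped ordinary handling: {type(exc).__name__}")  # the BaseException subclass
   ```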




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-866095317


   Hi Ash, thanks for the quick look.  I'll hit your comments inline, but as far as this one:
   
   > Can you explain why you had to change the label-when-reviewed action in this PR?
   
   I actually never manually changed that file.  It kept getting flagged as a changed file, but git would never show me the diff or allow me to roll it back, so I thought those changes kept coming from upstream.  If that isn't the case (and it sounds like it wasn't, since you are asking), I'll pull that file from the commit.
   
   [EDIT]
   Did some digging and it looks like that submodule got tied to......... something.
   
   ```
   ferruzzi:~/workplace/airflow (eksP0)
   $ cat .gitmodules
   [submodule ".github/actions/get-workflow-origin"]
   	path = .github/actions/get-workflow-origin
   	url = https://github.com/potiuk/get-workflow-origin
   [submodule ".github/actions/checks-action"]
   	path = .github/actions/checks-action
   	url = https://github.com/LouisBrunner/checks-action
   [submodule ".github/actions/configure-aws-credentials"]
   	path = .github/actions/configure-aws-credentials
   	url = https://github.com/aws-actions/configure-aws-credentials
   [submodule ".github/actions/codecov-action"]
   	path = .github/actions/codecov-action
   	url = https://github.com/codecov/codecov-action
   [submodule ".github/actions/github-push-action"]
   	path = .github/actions/github-push-action
   	url = https://github.com/ad-m/github-push-action
   [submodule ".github/actions/label-when-approved-action"]
   	path = .github/actions/label-when-approved-action
   	url = https://github.com/TobKed/label-when-approved-action
   
   ferruzzi:~/workplace/airflow (eksP0)
   $ git submodule status
    9f02872da71b6f558c6a6f190f925dde5e4d8798 .github/actions/checks-action (v1.1.0)
    1fc7722ded4708880a5aea49f2bfafb9336f0c8d .github/actions/codecov-action (v1.1.1)
    e97d7fbc8e0e5af69631c13daa0f4b5a8d88165b .github/actions/configure-aws-credentials (v1.5.5)
    588cc14f9f1cdf1b8be3db816855e96422204fec .github/actions/get-workflow-origin (v1_3)
    40bf560936a8022e68a3c00e7d2abefaf01305a6 .github/actions/github-push-action (v0.6.0)
    4c5190fec5661e98d83f50bbd4ef9ebb48bd1194 .github/actions/label-when-approved-action (latest)
   
   $ git remote -v
   origin	https://github.com/TobKed/label-when-approved-action (fetch)
   origin	https://github.com/TobKed/label-when-approved-action (push)
    ```
   
   I pulled that submodule and it looks like that may have been the issue.
   
   ```
   Updating 4c5190f..0058d00
   Fast-forward
    .github/workflows/test.yml         |    1 +
    .pre-commit-config.yaml            |    2 +-
    README.md                          |   21 +-
    action.yml                         |    5 +-
    dist/index.js                      |   37 ++-
    package-lock.json                  | 3248 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
    package.json                       |    3 +-
    scripts/transpile_if_needed.sh     |   19 ++
    src/main.ts                        |   26 +-
    transpilation_state/main.ts.md5sum |    1 +
    tsconfig.json                      |    2 +-
    11 files changed, 2725 insertions(+), 640 deletions(-)
    create mode 100755 scripts/transpile_if_needed.sh
    create mode 100644 transpilation_state/main.ts.md5sum
    ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660151177



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       When defining a DAG, the user should define all tasks up front. Here, however, the number of tasks is not known until the first page has been fetched, so the tasks cannot be defined statically in Airflow.
   
   For the Google provider, we've always fetched all items from all pages. This way, the user can access every element without having to define dynamic DAGs.
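   
   As a rough illustration of that fetch-everything approach (a hypothetical helper using the standard boto3 paginator, not code from this PR):
   
   ```
   import boto3
   
   
   def list_all_clusters(region_name: str) -> list:
       """Collect every EKS cluster name across all result pages up front."""
       eks_client = boto3.client("eks", region_name=region_name)
       paginator = eks_client.get_paginator("list_clusters")
   
       cluster_names = []
       for page in paginator.paginate():
           # Each page is one list_clusters response; 'clusters' holds the names.
           cluster_names.extend(page.get("clusters", []))
       return cluster_names
   ```
   
   With every name in hand, the DAG can define one task per cluster instead of paging at runtime.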




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656033425



##########
File path: tests/providers/amazon/aws/hooks/test_eks.py
##########
@@ -0,0 +1,872 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+import json
+import unittest
+from copy import deepcopy
+from typing import Dict, List, Optional, Tuple, Type
+from unittest import mock
+from urllib.parse import ParseResult, urlparse
+
+import pytest
+from _pytest._code import ExceptionInfo
+from botocore.exceptions import ClientError
+from freezegun import freeze_time
+from moto.core import ACCOUNT_ID
+from moto.core.exceptions import AWSError
+from moto.eks.exceptions import (
+    InvalidParameterException,
+    InvalidRequestException,
+    ResourceInUseException,
+    ResourceNotFoundException,
+)
+from moto.eks.models import (
+    CLUSTER_EXISTS_MSG,
+    CLUSTER_IN_USE_MSG,
+    CLUSTER_NOT_FOUND_MSG,
+    CLUSTER_NOT_READY_MSG,
+    LAUNCH_TEMPLATE_WITH_DISK_SIZE_MSG,
+    LAUNCH_TEMPLATE_WITH_REMOTE_ACCESS_MSG,
+    NODEGROUP_EXISTS_MSG,
+    NODEGROUP_NOT_FOUND_MSG,
+)
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+
+from ..utils.test_eks_constants import (
+    CONN_ID,
+    DEFAULT_MAX_RESULTS,
+    DISK_SIZE,
+    FROZEN_TIME,
+    INSTANCE_TYPES,
+    LAUNCH_TEMPLATE,
+    PACKAGE_NOT_PRESENT_MSG,
+    PARTITIONS,
+    REGION,
+    REMOTE_ACCESS,
+    BatchCountSize,
+    ClusterAttributes,
+    ClusterInputs,
+    ErrorAttributes,
+    NodegroupAttributes,
+    NodegroupInputs,
+    PageCount,
+    PossibleTestResults,
+    RegExTemplates,
+    ResponseAttribute,
+)
+from ..utils.test_eks_utils import (
+    attributes_to_test,
+    generate_clusters,
+    generate_nodegroups,
+    random_names,
+    region_matches_partition,
+)
+
+try:
+    from moto import mock_eks
+except ImportError:
+    mock_eks = None
+
+
+@pytest.fixture(scope="function")
+def cluster_builder():
+    """A fixture to generate a batch of EKS Clusters on the mocked backend for testing."""
+
+    class ClusterTestDataFactory:
+        """A Factory class for building the Cluster objects."""
+
+        def __init__(self, count: int, minimal: bool) -> None:
+            # Generate 'count' number of random Cluster objects.
+            self.cluster_names: List[str] = generate_clusters(
+                eks_hook=eks_hook, num_clusters=count, minimal=minimal
+            )
+
+            # Get the name of the first generated Cluster.
+            first_name: str = self.cluster_names[0]
+
+            # Collect the output of describe_cluster() for the first Cluster.
+            self.cluster_describe_output: str = json.loads(eks_hook.describe_cluster(name=first_name))[
+                ResponseAttribute.CLUSTER
+            ]
+
+            # Pick a random Cluster name from the list and a name guaranteed not to be on the list.
+            self.existing_cluster_name, self.nonexistent_cluster_name = random_names(
+                name_list=self.cluster_names
+            )
+
+            # Generate a list of the Cluster attributes to be tested when validating results.
+            self.attributes_to_test: List[Tuple] = attributes_to_test(
+                inputs=ClusterInputs, cluster_name=self.existing_cluster_name
+            )
+
+    def _execute(
+        count: Optional[int] = 1, minimal: Optional[bool] = True
+    ) -> Tuple[EKSHook, ClusterTestDataFactory]:
+        return eks_hook, ClusterTestDataFactory(count=count, minimal=minimal)
+
+    mock_eks().start()
+    eks_hook = EKSHook(
+        aws_conn_id=CONN_ID,
+        region_name=REGION,
+    )
+    yield _execute
+    mock_eks().stop()
+
+
+@pytest.fixture(scope="function")
+def nodegroup_builder(cluster_builder):
+    """A fixture to generate a batch of EKSManaged Nodegroups on the mocked backend for testing."""
+
+    class NodegroupTestDataFactory:
+        """A Factory class for building the Cluster objects."""
+
+        def __init__(self, count: int, minimal: bool) -> None:
+            self.cluster_name: str = cluster.existing_cluster_name
+
+            # Generate 'count' number of random Nodegroup objects.
+            self.nodegroup_names: List[str] = generate_nodegroups(
+                eks_hook=eks_hook, cluster_name=self.cluster_name, num_nodegroups=count, minimal=minimal
+            )
+
+            # Get the name of the first generated Nodegroup.
+            self.first_name: str = self.nodegroup_names[0]
+
+            # Collect the output of describe_nodegroup() for the first Nodegroup.
+            self.nodegroup_describe_output: Dict = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.cluster_name, nodegroupName=self.first_name)
+            )[ResponseAttribute.NODEGROUP]
+
+            # Pick a random Nodegroup name from the list and a name guaranteed not to be on the list.
+            self.existing_nodegroup_name, self.nonexistent_nodegroup_name = random_names(
+                name_list=self.nodegroup_names
+            )
+            _, self.nonexistent_cluster_name = random_names(name_list=[self.cluster_name])
+
+            # Generate a list of the Nodegroup attributes to be tested when validating results.
+            self.attributes_to_test: List[Tuple] = attributes_to_test(
+                inputs=NodegroupInputs,
+                cluster_name=self.cluster_name,
+                nodegroup_name=self.existing_nodegroup_name,
+            )
+
+    def _execute(
+        count: Optional[int] = 1, minimal: Optional[bool] = True
+    ) -> Tuple[EKSHook, NodegroupTestDataFactory]:
+        return eks_hook, NodegroupTestDataFactory(count=count, minimal=minimal)
+
+    eks_hook, cluster = cluster_builder()
+    return _execute
+
+
+@unittest.skipIf(mock_eks is None, reason=PACKAGE_NOT_PRESENT_MSG)

Review comment:
       Since this is not a unittest.TestCase subclass, use the pytest marker instead:
   ```suggestion
   @pytest.mark.skipif(mock_eks is None, reason=PACKAGE_NOT_PRESENT_MSG)
   ```
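   
   To illustrate the distinction (test names here are made up): `unittest.skipIf` is meant for `unittest.TestCase` classes and methods, while plain pytest-style test functions take the marker form.
   
   ```
   import unittest
   
   import pytest
   
   mock_eks = None  # stands in for the optional moto import failing
   
   
   class TestWithUnittest(unittest.TestCase):
       # The unittest decorator belongs on TestCase classes and methods.
       @unittest.skipIf(mock_eks is None, "mock_eks package not present")
       def test_inside_testcase(self):
           ...
   
   
   # A plain pytest test function should use the pytest marker instead.
   @pytest.mark.skipif(mock_eks is None, reason="mock_eks package not present")
   def test_plain_function():
       ...
   ```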




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667273782



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:

Review comment:
       We should delete the credentials once they are no longer needed. To achieve this, we should make the following changes (a sketch follows below):
   - convert this method into a context manager, i.e. add the [`contextlib.contextmanager`](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager) decorator, flush the file contents, yield the new file name, and update all usages,
   - remove the `delete=False` argument from the `NamedTemporaryFile` call,
   - remove the `pod_name` parameter from this method.
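   
   A minimal sketch of that shape (simplified signature; the real function would build the full kubeconfig dict shown above, and `run_pod` is just a stand-in for the caller):
   
   ```
   import tempfile
   from contextlib import contextmanager
   
   import yaml
   
   
   @contextmanager
   def generate_config_file(cluster_config: dict):
       """Write the kubeconfig to a temp file, yield its path, then remove it."""
       with tempfile.NamedTemporaryFile(prefix="kube_config_", mode="w") as config_file:
           config_file.write(yaml.dump(cluster_config, default_flow_style=False))
           config_file.flush()  # ensure the content is on disk before it is read
           yield config_file.name
       # Exiting the 'with' block closes and deletes the file, so the
       # credentials do not linger on the worker.
   
   
   # Usage: the config file only exists for the duration of the block.
   # with generate_config_file(cluster_config) as config_path:
   #     run_pod(kube_config_path=config_path)
   ```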




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669058722



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str

Review comment:
       renamed in https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r684804950



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,420 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from contextlib import contextmanager
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import yaml
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.create_cluster(
+            name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+        )
+
+        self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+        # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+        # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+        # The 'shared' value allows more than one resource to use the subnet.
+        tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+        if "tags" in kwargs:
+            tags = {**tags, **kwargs["tags"]}
+            kwargs.pop("tags")
+
+        response = eks_client.create_nodegroup(
+            clusterName=clusterName,
+            nodegroupName=nodegroupName,
+            subnets=subnets,
+            nodeRole=nodeRole,
+            tags=tags,
+            **kwargs,
+        )
+
+        self.log.info(
+            "Created a managed nodegroup named %s in cluster %s",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_cluster(name=name)
+
+        self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+
+        self.log.info(
+            "Deleted nodegroup named %s from cluster %s.",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:

Review comment:
       Sorry for the delay, I'll make the change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667276322



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.

Review comment:
       The value of this parameter has no effect on behavior. This is just a temporary profile name. We can use a different name each time or use a constant string and the user won't notice the difference because we only have one context at a time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253993



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates am Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Clusters in your AWS account with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(
+            maxResults=self.maxResults, nextToken=self.nextToken, verbose=self.verbose
+        )
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Nodegroups associated with the specified EKS Cluster with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check..
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param kube_config_file_path: Path to save the generated kube_config file to.
+    :type kube_config_file_path: str
+    :param in_cluster: If True, look for config inside the cluster; if False look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :param aws_profile: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # A default path will be used if none is provided.
+        kube_config_file_path: Optional[str] = os.environ.get(KUBE_CONFIG_ENV_VAR, DEFAULT_KUBE_CONFIG_PATH),

Review comment:
       What happens if several workers are running multiple pods? Will these credentials be overwritten? It looks like a race condition bug.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656502071



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return:  A JSON serialized string of the API call results.

Review comment:
       Would you believe I had a Really Good Reason for it, but I can't remember why now?  It isn't how the other hooks are handled so I must have had a reason at the time, but I can't think of why.  I'll switch these to all return Dicts like the rest of the AWS/boto hooks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660137693



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       All fields created in ctor. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659250816



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       EKSHook or EksHook? Which one do you prefer? Which is recommended by PEP? I asking, because it don't match to base class style.

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    open(kube_config_file_location, "w").write(config_text)

Review comment:
       Please use a context manager. That way we can be sure the file is closed when it is no longer in use; see the example below.
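
    For example, the last line of the diff above could become the following (a minimal sketch using the names already in the diff):

    ```python
    with open(kube_config_file_location, "w") as config_file:
        config_file.write(config_text)
    ```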

##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Why do we need to use it?

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       How is the user supposed to pass this parameter?  Airflow does not support dynamic workflows, but the page count is dynamic.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-901268924






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-896187912


   Hey folks, sorry for the delay.  I wanted to get a better understanding of Jinja before I implemented/fixed it.  I should have another round of updates for you today or tomorrow.  
   
   [EDIT: Updates pushed.  I think they address all concerns from @ashb  and @mik-laj so far; I have another one coming to correct the misuse of `Optional` in a couple places which @zkan caught.]
   
   [EDIT 2:  I was over 100 commits behind main so I rebased and had to force push.]


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272411



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       Addressed in https://github.com/apache/airflow/pull/16571/commits/66679787a678bcce6563f9a600a2ccadd3a5d269 before I found out you don't want everything squashed; apologies for the confusion/complication.

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.test_eks_utils import convert_keys, random_names

Review comment:
       Addressed in https://github.com/apache/airflow/pull/16571/commits/66679787a678bcce6563f9a600a2ccadd3a5d269 before I found out you don't want everything squashed; apologies for the confusion/complication.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870978582


   Oh, what have I done.  I did a `git pull --rebase` to get up to date before pushing the changes, and it included all of them in this PR...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660455978



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       ```suggestion
   from tests.providers.amazon.aws.utils.eks_constants import (
   ```
   
    If that file doesn't contain any tests, we shouldn't prefix it with `test_`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667273782



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:

Review comment:
       We should delete the credentials when they are no longer in use. To achieve this, we should make the following changes:
   - convert this method into a context manager, i.e. add the [`contextlib.contextmanager`](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager) decorator,
   - yield the new file name (see the sketch below).
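
    For illustration only, a minimal sketch of that shape, with the signature simplified from the actual `generate_config_file` (the real parameters and the YAML-building logic are in the diff above):

    ```python
    import os
    import tempfile
    from contextlib import contextmanager

    @contextmanager
    def generate_config_file(config_text: str, filename_prefix: str = "kube_config_"):
        # Write the kubeconfig to a uniquely named temporary file.
        with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode="w", delete=False) as config_file:
            config_file.write(config_text)
        try:
            # Hand the file name to the caller for the duration of the block.
            yield config_file.name
        finally:
            # Delete the credentials once the caller is finished with them.
            os.remove(config_file.name)
    ```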
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r676641087



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)

Review comment:
       We don't need both the log call and the raise -- the message will bubble up with the raised error anyway, so logging here would just make it appear twice.
   
    I have also "inverted" the check so the main flow stays un-indented. (A _rough_ rule I follow is to avoid putting the main flow inside a block, the thinking being that the more deeply code is nested, the harder it is to follow.)
   
   ```suggestion
               if not nodegroup_role_arn:
                   raise ValueError("Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in.")
               self.nodegroup_role_arn = nodegroup_role_arn
   ```

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,

Review comment:
       This shouldn't be optional -- passing `None` isn't valid, is it?
   
   ```suggestion
           aws_conn_id: str = DEFAULT_CONN_ID,
   ```
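
   As a rough rule of thumb (a sketch; `ExampleOperator` is hypothetical), `Optional[...]` should be reserved for parameters where `None` is actually meaningful:

   ```python
   from typing import Optional


   class ExampleOperator:  # hypothetical, for illustration only
       def __init__(
           self,
           aws_conn_id: str = "aws_default",  # None is never valid -> plain str
           region: Optional[str] = None,  # None means "use the boto3 default" -> Optional
       ) -> None:
           self.aws_conn_id = aws_conn_id
           self.region = region
   ```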

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,

Review comment:
       (This applies to all operators added here)

##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       Operators should be higher level than just a single API call.
   
   Please remove all of the Describe/List operators as I cannot see them being all that useful.
   
   If we want to support a pattern such as Kamil has suggested then we should have a Sensor or two (EKSClusterSensor/EKSNodegroupSensor etc) with some higher-level filtering built in. Such a sensor should return the cluster/nodegroup ID (returning it makes it accessible via XCom) and error if more than a single value is returned, as I think returning more than one would be a mistake?
   
   (I'd vote for the sake of getting this merged that you remove the operators and add such sensors in a new PR, but I don't mind if you want to include those sensors in this PR)
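
   A rough, untested sketch of what such a sensor could look like, built on the `get_cluster_state` helper already in this PR (parameter names are illustrative):

   ```python
   from airflow.providers.amazon.aws.hooks.eks import EKSHook
   from airflow.sensors.base import BaseSensorOperator


   class EKSClusterStateSensor(BaseSensorOperator):
       """Hypothetical sketch: waits until the named cluster reaches a target state."""

       def __init__(
           self, *, cluster_name: str, target_state: str = "ACTIVE",
           aws_conn_id: str = "aws_default", **kwargs
       ) -> None:
           super().__init__(**kwargs)
           self.cluster_name = cluster_name
           self.target_state = target_state
           self.aws_conn_id = aws_conn_id

       def poke(self, context) -> bool:
           state = EKSHook(aws_conn_id=self.aws_conn_id).get_cluster_state(
               clusterName=self.cluster_name
           )
           self.log.info("Cluster %s is in state %s", self.cluster_name, state)
           # Returning True completes the sensor; the name could also be returned
           # from execute() to make it available via XCom.
           return state == self.target_state
   ```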




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669909198



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()

Review comment:
       Is it needed?  Can you use a fixed value here?
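
   For example, the partitions the tests care about could be pinned -- one possible fixed set, easy to extend if needed:

   ```python
   from typing import List

   # Pinned instead of calling Session().get_available_partitions(), so the
   # tests do not depend on the endpoint data shipped with the installed botocore.
   PARTITIONS: List[str] = ["aws", "aws-cn", "aws-us-gov"]
   ```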




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660436996



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Is this `except` useful? Given we are re-raising the error, I would suggest not -- let it be handled by the caller, and remove the try/except from all of these methods.
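
   i.e. each method body would shrink to something like this (a sketch of `create_cluster` with the try/except removed; behaviour is otherwise unchanged):

   ```python
   def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
       eks_client = self.get_conn()
       response = eks_client.create_cluster(
           name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
       )
       # A ClientError now propagates to the caller with its original traceback.
       self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
       return response
   ```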




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253799



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate a kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       How do we pass credentials to the AWS CLI when Airflow is installed on another cloud provider? In the case of ``GKEStartPodOperator``, we use [the ``GoogleBaseHook.provide_authorized_gcloud`` method](https://github.com/apache/airflow/blob/2625007c8aeca9ed98dea361ba13c2622482d71f/airflow/providers/google/common/hooks/base_google.py#L483).
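
   For what it's worth, the token can be produced without the AWS CLI at all. A sketch of one approach (the function name is hypothetical) -- presigning an STS `GetCallerIdentity` call with botocore, which is essentially what `aws eks get-token` does:

   ```python
   import base64

   import boto3
   from botocore.signers import RequestSigner


   def fetch_eks_bearer_token(session: boto3.session.Session, cluster_name: str, region: str) -> str:
       service_id = session.client("sts").meta.service_model.service_id
       signer = RequestSigner(
           service_id, region, "sts", "v4", session.get_credentials(), session.events
       )
       signed_url = signer.generate_presigned_url(
           request_dict={
               "method": "GET",
               "url": f"https://sts.{region}.amazonaws.com/"
                      "?Action=GetCallerIdentity&Version=2011-06-15",
               "body": {},
               "headers": {"x-k8s-aws-id": cluster_name},
               "context": {},
           },
           region_name=region,
           expires_in=60,
           operation_name="",
       )
       # Kubernetes expects the presigned URL base64-encoded with padding stripped.
       return "k8s-aws-v1." + base64.urlsafe_b64encode(
           signed_url.encode("utf-8")
       ).decode("utf-8").rstrip("=")
   ```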




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667276587



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate a kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str

Review comment:
       I don't understand the name of this parameter. Did you mean the Kubernetes (`k8s`) namespace, or does the EKS service have its own namespaces?
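
   (For context: in the generated kubeconfig this value lands in the context entry as the Kubernetes namespace -- a sketch of the structure, with hypothetical values:)

   ```python
   context_entry = {
       "context": {
           "cluster": "my-eks-cluster",  # hypothetical cluster name
           "namespace": "default",       # <- eks_namespace_name goes here
           "user": "aws",                # pod_username
       },
       "name": "aws",                    # pod_context
   }
   ```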




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670125459



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), nextToken=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()

Review comment:
       Removed in upcoming revision.

##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.delete_nodegroup_operator = EKSDeleteNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "delete_nodegroup")
+    def test_existing_nodegroup(self, mock_delete_nodegroup):
+        self.delete_nodegroup_operator.execute({})
+
+        mock_delete_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name
+        )
+
+
+class TestEKSDescribeAllClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.describe_all_clusters_operator = EKSDescribeAllClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_clusters_exist_returns_all_cluster_details(self, mock_describe_cluster, mock_list_clusters):
+        cluster_names: List[str] = NAME_LIST
+        response = dict(clusters=cluster_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_cluster.return_value = EMPTY_CLUSTER
+        mock_list_clusters.return_value = response
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        assert mock_describe_cluster.call_count == len(cluster_names)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_no_clusters_exist(self, mock_describe_cluster, mock_list_clusters):
+        mock_list_clusters.return_value = dict(clusters=list(), nextToken=DEFAULT_NEXT_TOKEN)
+
+        self.describe_all_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+        mock_describe_cluster.assert_not_called()
+
+
+class TestEKSDescribeAllNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_all_nodegroups_operator = EKSDescribeAllNodegroupsOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_nodegroups_exist_returns_all_nodegroup_details(
+        self, mock_describe_nodegroup, mock_list_nodegroups
+    ):
+        nodegroup_names: List[str] = NAME_LIST
+        cluster_name: str = random_names()
+        response = dict(cluster=cluster_name, nodegroups=nodegroup_names, nextToken=DEFAULT_NEXT_TOKEN)
+        mock_describe_nodegroup.return_value = EMPTY_NODEGROUP
+        mock_list_nodegroups.return_value = response
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        assert mock_describe_nodegroup.call_count == len(nodegroup_names)
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_no_nodegroups_exist(self, mock_describe_nodegroup, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list(), nextToken="")
+
+        self.describe_all_nodegroups_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_describe_nodegroup.assert_not_called()
+
+
+class TestEKSDescribeClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.describe_cluster_operator = EKSDescribeClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_cluster")
+    def test_describe_cluster(self, mock_describe_cluster):
+        mock_describe_cluster.return_value = DESCRIBE_CLUSTER_RESULT
+
+        self.describe_cluster_operator.execute({})
+
+        mock_describe_cluster.assert_called_once_with(name=self.cluster_name, verbose=False)
+
+
+class TestEKSDescribeNodegroupOperator(unittest.TestCase):
+    def setUp(self):
+        self.cluster_name = random_names()
+        self.nodegroup_name = random_names()
+
+        self.describe_nodegroup_operator = EKSDescribeNodegroupOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name, nodegroup_name=self.nodegroup_name
+        )
+
+    @mock.patch.object(EKSHook, "describe_nodegroup")
+    def test_describe_nodegroup(self, mock_describe_nodegroup):
+        mock_describe_nodegroup.return_value = DESCRIBE_NODEGROUP_RESULT
+
+        self.describe_nodegroup_operator.execute({})
+
+        mock_describe_nodegroup.assert_called_once_with(
+            clusterName=self.cluster_name, nodegroupName=self.nodegroup_name, verbose=False
+        )
+
+
+class TestEKSListClustersOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_names: List[str] = NAME_LIST
+
+        self.list_clusters_operator = EKSListClustersOperator(task_id=TASK_ID)
+
+    @mock.patch.object(EKSHook, "list_clusters")
+    def test_list_clusters(self, mock_list_clusters):
+        mock_list_clusters.return_value = self.cluster_names
+
+        self.list_clusters_operator.execute({})
+
+        mock_list_clusters.assert_called_once()
+
+
+class TestEKSListNodegroupsOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Removed in upcoming revision.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660888709



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation
+    for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
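+            # Poll the cluster state every CHECK_INTERVAL_SECONDS until it reports ACTIVE,
+            # raising an error once the TIMEOUT_SECONDS budget is exhausted.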
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster whose nodegroups should be described.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       For this and above, I didn't have a particular use case in mind, but they were API endpoints and I didn't feel it was my place to say nobody would find them useful.  If you want them gone, I can remove them.
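
       For context, a minimal sketch of how the list/describe operators under discussion might be wired
       into a DAG. The operator names and signatures are taken from the diff above; the dag_id, schedule,
       and task_ids are illustrative only:

           from datetime import datetime

           from airflow import DAG
           from airflow.providers.amazon.aws.operators.eks import (
               EKSDescribeAllClustersOperator,
               EKSListClustersOperator,
           )

           with DAG(
               dag_id="eks_inventory_example",  # illustrative name
               start_date=datetime(2021, 6, 1),
               schedule_interval=None,
               catchup=False,
           ) as dag:
               # List the names of all EKS Clusters visible to the AWS connection.
               list_clusters = EKSListClustersOperator(task_id="list_eks_clusters")

               # Fetch the full describe payload for every cluster that was found.
               describe_all = EKSDescribeAllClustersOperator(task_id="describe_all_eks_clusters")

               list_clusters >> describe_all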




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r686203932



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation
+    for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)

Review comment:
       Addressed in https://github.com/apache/airflow/pull/16571/commits/4d59785e903699d74d884a9137197baef3fe0fde
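
       For reference, the guard discussed above reduces to a fail-fast check at construction time. A
       self-contained sketch follows, using a hypothetical helper name (see the linked commit for the
       change that was actually merged):

           from typing import Optional

           def validate_nodegroup_args(compute: Optional[str], nodegroup_role_arn: Optional[str]) -> None:
               # Creating an EKS Managed Nodegroup requires an IAM role for its nodes, so
               # raise while the operator is being constructed rather than at execute() time.
               if compute == 'nodegroup' and not nodegroup_role_arn:
                   raise ValueError(
                       "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
                   )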




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656355483



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,646 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation
+    for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+        with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.eks_hook = EKSHook(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        self.eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while self.eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            self.eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.eks_hook = EKSHook(**kwargs)

Review comment:
       AHA.  Is that what those error messages in the failing CI tests were getting at?
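
       The underlying issue in this thread is that the hook was being created in __init__, which runs at
       DAG-parse time; provider convention is to defer hook creation, and with it the boto3 session, to
       execute(). A minimal sketch of that pattern, with a hypothetical operator name (EKSHook and
       describe_cluster are taken from this PR's diff):

           from airflow.models import BaseOperator
           from airflow.providers.amazon.aws.hooks.eks import EKSHook

           class ExampleEKSOperator(BaseOperator):  # hypothetical operator, for illustration only
               def __init__(self, cluster_name: str, **kwargs) -> None:
                   super().__init__(**kwargs)
                   # Store plain configuration only; no hook and no boto3 session yet.
                   self.cluster_name = cluster_name

               def execute(self, context):
                   # The hook, and the AWS connection behind it, is created only when the task runs.
                   eks_hook = EKSHook()
                   return eks_hook.describe_cluster(name=self.cluster_name)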




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-901726240


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-901726053


   Yup, image looks good now.
   
   Previously it was a 98B file, now it's 11KiB :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664171893



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EksHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.test_eks_constants import (

Review comment:
       Renamed in an upcoming revision.







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272550



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       We should use the Hook to create the session, because that provides a uniform way to manage credentials and lets this operator run in any environment, e.g. on-premises, on GCP, on Cloud Composer, and others. As written, this operator can only be used in an AWS environment, because it supports only the [Default Credential Provider Chain](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html)
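
       A minimal sketch of that suggestion, assuming the `EKSHook` from this PR and the `get_session()` helper inherited from `AwsBaseHook` (the function and parameter names below are illustrative, not the PR's):

       ```python
       # Hedged sketch: build the boto3 session through the hook instead of
       # boto3.Session(), so Airflow connection credentials are honored.
       from airflow.providers.amazon.aws.hooks.eks import EKSHook

       def get_eks_client(aws_conn_id: str, aws_region: str):
           hook = EKSHook(aws_conn_id=aws_conn_id, region_name=aws_region)
           # get_session() is inherited from AwsBaseHook and applies the
           # connection's credentials (profile, assumed role, etc.) rather
           # than relying only on the Default Credential Provider Chain.
           session = hook.get_session()
           return session.client("eks")
       ```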







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660888709



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to create
+    an Amazon EKS Managed Nodegroup for the cluster.  See the EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       For this and the ones above, I didn't have a particular use case in mind, but they were API endpoints and I didn't feel it was my place to say nobody would find them useful. If you want them gone, I can remove them, unless someone would rather keep them in.
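
       For context, here is a hypothetical DAG snippet using one of the operators in question; the DAG wiring and `task_id` are illustrative, and the `verbose` flag is assumed to match the sibling operators in this diff:

       ```python
       # Hypothetical usage sketch, not part of the PR.
       from datetime import datetime

       from airflow import DAG
       from airflow.providers.amazon.aws.operators.eks import EKSListClustersOperator

       with DAG(dag_id="eks_inventory", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
           # Pushes the list of EKS cluster names to XCom for downstream tasks.
           list_clusters = EKSListClustersOperator(task_id="list_eks_clusters", verbose=True)
       ```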







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667281246



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,

Review comment:
       For more info about kubectl exec auth, see: 
   https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins
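
       For illustration, a hedged sketch of what an exec-based `users` entry could look like in the config dict built above (the `aws eks get-token` command is the documented CLI entry point; placeholder values are marked):

       ```python
       # Sketch of exec-based auth per the linked docs: instead of embedding a
       # short-lived STS token, the kubeconfig delegates token generation to
       # the AWS CLI at request time.
       pod_username = "aws"                  # matches DEFAULT_POD_USERNAME above
       eks_cluster_name = "example-cluster"  # placeholder

       exec_user = {
           "name": pod_username,
           "user": {
               "exec": {
                   "apiVersion": "client.authentication.k8s.io/v1alpha1",
                   "command": "aws",
                   "args": ["eks", "get-token", "--cluster-name", eks_cluster_name],
               }
           },
       }
       ```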







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667276401



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token is not None:
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate a kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.

Review comment:
       Can we generate this pod name for each run?
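
    One way to make the name unique per run, sketched under the assumption that a
    timestamp plus a random suffix is acceptable (illustrative only, not necessarily
    the PR's eventual behavior):

    ```python
    # Derive a pod name that is unique for each task run.
    from datetime import datetime
    from uuid import uuid4

    def generate_pod_name(base: str = "pod") -> str:
        """Return a unique pod name such as 'pod-20210721143000-1a2b3c4d'."""
        return f"{base}-{datetime.utcnow():%Y%m%d%H%M%S}-{uuid4().hex[:8]}"
    ```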







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-890424892


   @andormarkus Hi Andor, can you please submit that as a new Issue?  This PR is pretty far along to be changing direction, but I don't see why we couldn't add that for nodegroups later on.





[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660448846



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate a kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    if which("aws") is None:
+        # The generated kubeconfig shells out to the AWS CLI for tokens, so it must be present.
+        raise RuntimeError(
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # Build the AWS CLI arguments for the kubeconfig's exec-based token command
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Is https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/eks.html#EKS.Client.generate_presigned_url of any help here?
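
    For reference, a presigned STS ``GetCallerIdentity`` call can replace the CLI
    dependency entirely; the newer revision's `_get_bearer_token` helper appears to
    go in this direction. A sketch of that approach (the function name and exact
    signature are assumptions):

    ```python
    import base64

    import boto3
    from botocore.signers import RequestSigner

    def get_eks_bearer_token(session: boto3.Session, cluster_id: str, aws_region: str) -> str:
        """Presign an STS GetCallerIdentity request and wrap it as an EKS bearer token."""
        service_id = session.client("sts").meta.service_model.service_id
        signer = RequestSigner(
            service_id, aws_region, "sts", "v4", session.get_credentials(), session.events
        )
        request_params = {
            "method": "GET",
            "url": f"https://sts.{aws_region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
            "body": {},
            "headers": {"x-k8s-aws-id": cluster_id},
            "context": {},
        }
        signed_url = signer.generate_presigned_url(
            request_params, region_name=aws_region, operation_name="", expires_in=60
        )
        # Kubernetes expects the presigned URL base64url-encoded, padding stripped,
        # behind the "k8s-aws-v1." prefix.
        encoded = base64.urlsafe_b64encode(signed_url.encode("utf-8")).decode("utf-8")
        return "k8s-aws-v1." + encoded.rstrip("=")
    ```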







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661039637



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided to showcase these operators in action.
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+--------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.

Review comment:
       Ideally, you should use the class name in the operator description. The class name should be usable as a link, so you must use a reference:
    ``:class:`~airflow.providers.amazon.aws.operators.eks.EKSDeleteNodegroupOperator` ``
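
    Applied to the sentence in question, that could read (illustrative phrasing):
    "In the following code we delete an Amazon EKS nodegroup using
    :class:`~airflow.providers.amazon.aws.operators.eks.EKSDeleteNodegroupOperator`."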







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660168966



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate a kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    if which("aws") is None:
+        # The generated kubeconfig shells out to the AWS CLI for tokens, so it must be present.
+        raise RuntimeError(
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # Build the AWS CLI arguments for the kubeconfig's exec-based token command
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       The best way to integrate is when the token generation code would also use the Airflow hook because then we would have the greatest certainty that all credentials were passed from Airflow correctly. ;-)
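
    Sketched out, that would mean sourcing the boto3 session from the hook so that
    credentials flow through the Airflow connection (``aws_default`` and the region
    below are placeholder values):

    ```python
    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

    # Build the session from the Airflow connection instead of a raw boto3 profile.
    hook = AwsBaseHook(aws_conn_id="aws_default", client_type="eks")
    session = hook.get_session(region_name="us-east-1")
    eks_client = session.client("eks")  # token generation would reuse this session
    ```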







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660987975



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):

Review comment:
       See [below](https://github.com/apache/airflow/pull/16571#discussion_r660444860)







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660439107



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()

Review comment:
       ```suggestion
               eks_client = self.conn
   ```
   
    `get_conn()` is quasi-deprecated, and the cached `conn` property is preferred.
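
    In practice the cached property means every access reuses a single client, so hook
    methods can refer to `self.conn` directly. A small sketch with placeholder values:

    ```python
    hook = EksHook(aws_conn_id="aws_default", region_name="us-east-1")
    assert hook.conn is hook.conn  # the cached property returns the same client object
    clusters = hook.conn.list_clusters()
    ```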







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661038418



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided to showcase these operators in action.
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+--------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster_with_nodegroup.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and underlying
+Amazon EKS node group in one command.  ``EKSDescribeClustersOperator`` and
+``EKSDescribeNodegroupsOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+ - ``ec2.amazonaws.com`` must be in the Trusted Relationships
+ - ``eks.amazonaws.com`` must be added to the Trusted Relationships
+ - ``AmazonEC2ContainerRegistryReadOnly`` IAM Policy must be attached
+ - ``AmazonEKSClusterPolicy`` IAM Policy must be attached
+ - ``AmazonEKSWorkerNodePolicy`` IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS cluster and node group, verify creation,
+then delete both resources.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster_with_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster_with_compute]
+    :end-before: [END howto_operator_eks_create_cluster_with_compute]
+
+
+.. _howto/operator:EKSPodOperator:
+
+Perform a Task on an Amazon EKS Cluster
+---------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_pod_operator.py`` demonstrates using
+``EKSStartPodOperator`` to perform a command on an Amazon EKS cluster.
+
+Prerequisites
+"""""""""""""
+
+  1. An Amazon EKS Cluster with underlying compute infrastructure.
+  2. The AWS CLI version 2 must be installed on the worker.
+
+  see: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html

Review comment:
       We should use descriptive links. See: https://developers.google.com/style/link-text
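
    For example, one possible rewording (phrasing is illustrative):

        See the `AWS CLI installation guide
        <https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html>`__.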







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-865390141


   Not done looking at all the test results yet, but looks like most/all were caused by either a single misnamed param that slipped through in an example dag, or whitespace issues in the docs.  Correcting.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660988704



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:

Review comment:
       How do you feel about this:
   
   `If compute is assigned the value of ``nodegroup``, the following are required:`







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272550



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       We should use the Hook to create a session, because this provides a uniform way to manage credentials and allows this operator to be used in any environment, e.g. on-premises, on GCP, on Cloud Composer, and others.  Currently, this operator can only be used in an AWS environment, because it supports only the [Default Credential Provider Chain](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html).
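
       A rough sketch of this approach (assumes an ``aws_conn_id`` parameter is threaded through to this function, and that ``AwsBaseHook.get_session`` is available to resolve credentials):

           # Sketch, not the PR's code: resolve credentials through the Airflow
           # connection (aws_conn_id) rather than the default provider chain.
           from airflow.providers.amazon.aws.hooks.eks import EKSHook

           hook = EKSHook(aws_conn_id=aws_conn_id, region_name=aws_region)
           session = hook.get_session()
           eks_client = session.client("eks")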







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660890114



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)

Review comment:
       I'm just gonna hang my head in shame on this one.   Sorry about that.  Will be corrected in the next revision.
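
       A possible fix, sketched (``which`` is already imported in this module):

           from airflow.exceptions import AirflowException

           # Fail loudly instead of printing when the AWS CLI is missing.
           if which("aws") is None:
               raise AirflowException(
                   "AWS CLI version 2 must be installed on the worker.  See: "
                   "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
               )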







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669907616



##########
File path: tests/providers/amazon/aws/hooks/test_eks.py
##########
@@ -0,0 +1,765 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+from copy import deepcopy
+from typing import Dict, List, Optional, Tuple, Type
+from unittest import mock
+from urllib.parse import ParseResult, urlparse
+
+import pytest
+from _pytest._code import ExceptionInfo
+from botocore.exceptions import ClientError
+from freezegun import freeze_time
+from moto.core import ACCOUNT_ID
+from moto.core.exceptions import AWSError
+from moto.eks.exceptions import (
+    InvalidParameterException,
+    InvalidRequestException,
+    ResourceInUseException,
+    ResourceNotFoundException,
+)
+from moto.eks.models import (
+    CLUSTER_EXISTS_MSG,
+    CLUSTER_IN_USE_MSG,
+    CLUSTER_NOT_FOUND_MSG,
+    CLUSTER_NOT_READY_MSG,
+    LAUNCH_TEMPLATE_WITH_DISK_SIZE_MSG,
+    LAUNCH_TEMPLATE_WITH_REMOTE_ACCESS_MSG,
+    NODEGROUP_EXISTS_MSG,
+    NODEGROUP_NOT_FOUND_MSG,
+)
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+
+from ..utils.eks_test_constants import (
+    CONN_ID,
+    DISK_SIZE,
+    FROZEN_TIME,
+    INSTANCE_TYPES,
+    LAUNCH_TEMPLATE,
+    PACKAGE_NOT_PRESENT_MSG,
+    PARTITIONS,
+    REGION,
+    REMOTE_ACCESS,
+    BatchCountSize,
+    ClusterAttributes,
+    ClusterInputs,
+    ErrorAttributes,
+    NodegroupAttributes,
+    NodegroupInputs,
+    PossibleTestResults,
+    RegExTemplates,
+    ResponseAttribute,
+)
+from ..utils.eks_test_utils import (
+    attributes_to_test,
+    generate_clusters,
+    generate_nodegroups,
+    iso_date,
+    random_names,
+    region_matches_partition,
+)
+
+try:
+    from moto import mock_eks
+except ImportError:
+    mock_eks = None
+
+
+@pytest.fixture(scope="function")
+def cluster_builder():
+    """A fixture to generate a batch of EKS Clusters on the mocked backend for testing."""
+
+    class ClusterTestDataFactory:
+        """A Factory class for building the Cluster objects."""
+
+        def __init__(self, count: int, minimal: bool) -> None:
+            # Generate 'count' number of random Cluster objects.
+            self.cluster_names: List[str] = generate_clusters(
+                eks_hook=eks_hook, num_clusters=count, minimal=minimal
+            )
+
+            # Get the name of the first generated Cluster.
+            first_name: str = self.cluster_names[0]
+
+            # Collect the output of describe_cluster() for the first Cluster.
+            self.cluster_describe_output: Dict = eks_hook.describe_cluster(name=first_name)[
+                ResponseAttribute.CLUSTER
+            ]
+
+            # Pick a random Cluster name from the list and a name guaranteed not to be on the list.
+            self.existing_cluster_name, self.nonexistent_cluster_name = random_names(
+                name_list=self.cluster_names
+            )
+
+            # Generate a list of the Cluster attributes to be tested when validating results.
+            self.attributes_to_test: List[Tuple] = attributes_to_test(
+                inputs=ClusterInputs, cluster_name=self.existing_cluster_name
+            )
+
+    def _execute(
+        count: Optional[int] = 1, minimal: Optional[bool] = True
+    ) -> Tuple[EKSHook, ClusterTestDataFactory]:
+        return eks_hook, ClusterTestDataFactory(count=count, minimal=minimal)
+
+    mock_eks().start()
+    eks_hook = EKSHook(
+        aws_conn_id=CONN_ID,
+        region_name=REGION,
+    )
+    yield _execute
+    mock_eks().stop()
+
+
+@pytest.fixture(scope="function")
+def nodegroup_builder(cluster_builder):
+    """A fixture to generate a batch of EKSManaged Nodegroups on the mocked backend for testing."""
+
+    class NodegroupTestDataFactory:
+        """A Factory class for building the Cluster objects."""
+
+        def __init__(self, count: int, minimal: bool) -> None:
+            self.cluster_name: str = cluster.existing_cluster_name
+
+            # Generate 'count' number of random Nodegroup objects.
+            self.nodegroup_names: List[str] = generate_nodegroups(
+                eks_hook=eks_hook, cluster_name=self.cluster_name, num_nodegroups=count, minimal=minimal
+            )
+
+            # Get the name of the first generated Nodegroup.
+            self.first_name: str = self.nodegroup_names[0]
+
+            # Collect the output of describe_nodegroup() for the first Nodegroup.
+            self.nodegroup_describe_output: Dict = eks_hook.describe_nodegroup(
+                clusterName=self.cluster_name, nodegroupName=self.first_name
+            )[ResponseAttribute.NODEGROUP]
+
+            # Pick a random Nodegroup name from the list and a name guaranteed not to be on the list.
+            self.existing_nodegroup_name, self.nonexistent_nodegroup_name = random_names(
+                name_list=self.nodegroup_names
+            )
+            _, self.nonexistent_cluster_name = random_names(name_list=[self.cluster_name])
+
+            # Generate a list of the Nodegroup attributes to be tested when validating results.
+            self.attributes_to_test: List[Tuple] = attributes_to_test(
+                inputs=NodegroupInputs,
+                cluster_name=self.cluster_name,
+                nodegroup_name=self.existing_nodegroup_name,
+            )
+
+    def _execute(
+        count: Optional[int] = 1, minimal: Optional[bool] = True
+    ) -> Tuple[EKSHook, NodegroupTestDataFactory]:
+        return eks_hook, NodegroupTestDataFactory(count=count, minimal=minimal)
+
+    eks_hook, cluster = cluster_builder()
+    return _execute
+
+
+@pytest.mark.skipif(mock_eks is None, reason=PACKAGE_NOT_PRESENT_MSG)
+class TestEKSHooks:
+    def test_hook(self, cluster_builder) -> None:
+        eks_hook, _ = cluster_builder()
+        assert eks_hook.get_conn() is not None
+        assert eks_hook.aws_conn_id == CONN_ID
+        assert eks_hook.region_name == REGION
+
+    ###
+    # This specific test does not use the fixture since
+    # it is intended to verify that there are no clusters
+    # in the list at initialization, which means the mock
+    # decorator must be used manually in this one case.
+    ###
+    @mock_eks
+    def test_list_clusters_returns_empty_by_default(self) -> None:
+        eks_hook: EKSHook = EKSHook(aws_conn_id=CONN_ID, region_name=REGION)
+
+        result: List = eks_hook.list_clusters()
+
+        assert isinstance(result, list)
+        assert len(result) == 0
+
+    def test_list_clusters_returns_sorted_cluster_names(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size)
+        expected_result: List = sorted(generated_test_data.cluster_names)
+
+        result: List = eks_hook.list_clusters()
+
+        assert_result_matches_expected_list(result, expected_result, initial_batch_size)
+
+    def test_list_clusters_returns_all_results(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.LARGE
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size)
+        expected_result: List = sorted(generated_test_data.cluster_names)
+
+        result: List = eks_hook.list_clusters()
+
+        assert_result_matches_expected_list(result, expected_result)
+
+    def test_create_cluster_throws_exception_when_cluster_exists(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size)
+        expected_exception: Type[AWSError] = ResourceInUseException
+        expected_msg: str = CLUSTER_EXISTS_MSG.format(
+            clusterName=generated_test_data.existing_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.create_cluster(
+                name=generated_test_data.existing_cluster_name, **dict(ClusterInputs.REQUIRED)
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+        # Verify no new cluster was created.
+        len_after_test: int = len(eks_hook.list_clusters())
+        assert len_after_test == initial_batch_size
+
+    def test_create_cluster_generates_valid_cluster_arn(self, cluster_builder) -> None:
+        _, generated_test_data = cluster_builder()
+        expected_arn_values: List = [
+            PARTITIONS,
+            REGION,
+            ACCOUNT_ID,
+            generated_test_data.cluster_names,
+        ]
+
+        assert_all_arn_values_are_valid(
+            expected_arn_values=expected_arn_values,
+            pattern=RegExTemplates.CLUSTER_ARN,
+            arn_under_test=generated_test_data.cluster_describe_output[ClusterAttributes.ARN],
+        )
+
+    @freeze_time(FROZEN_TIME)
+    def test_create_cluster_generates_valid_cluster_created_timestamp(self, cluster_builder) -> None:
+        _, generated_test_data = cluster_builder()
+
+        result_time: str = generated_test_data.cluster_describe_output[ClusterAttributes.CREATED_AT]
+
+        assert iso_date(result_time) == FROZEN_TIME
+
+    def test_create_cluster_generates_valid_cluster_endpoint(self, cluster_builder) -> None:
+        _, generated_test_data = cluster_builder()
+
+        result_endpoint: str = generated_test_data.cluster_describe_output[ClusterAttributes.ENDPOINT]
+
+        assert_is_valid_uri(result_endpoint)
+
+    def test_create_cluster_generates_valid_oidc_identity(self, cluster_builder) -> None:
+        _, generated_test_data = cluster_builder()
+
+        result_issuer: str = generated_test_data.cluster_describe_output[ClusterAttributes.IDENTITY][
+            ClusterAttributes.OIDC
+        ][ClusterAttributes.ISSUER]
+
+        assert_is_valid_uri(result_issuer)
+
+    def test_create_cluster_saves_provided_parameters(self, cluster_builder) -> None:
+        _, generated_test_data = cluster_builder(minimal=False)
+
+        for key, expected_value in generated_test_data.attributes_to_test:
+            assert generated_test_data.cluster_describe_output[key] == expected_value
+
+    def test_describe_cluster_throws_exception_when_cluster_not_found(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size)
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg = CLUSTER_NOT_FOUND_MSG.format(
+            clusterName=generated_test_data.nonexistent_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.describe_cluster(name=generated_test_data.nonexistent_cluster_name)
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+
+    def test_delete_cluster_returns_deleted_cluster(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size, minimal=False)
+
+        result: Dict = eks_hook.delete_cluster(name=generated_test_data.existing_cluster_name)[
+            ResponseAttribute.CLUSTER
+        ]
+
+        for key, expected_value in generated_test_data.attributes_to_test:
+            assert result[key] == expected_value
+
+    def test_delete_cluster_removes_deleted_cluster(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size, minimal=False)
+
+        eks_hook.delete_cluster(name=generated_test_data.existing_cluster_name)
+        result_cluster_list: List = eks_hook.list_clusters()
+
+        assert len(result_cluster_list) == (initial_batch_size - 1)
+        assert generated_test_data.existing_cluster_name not in result_cluster_list
+
+    def test_delete_cluster_throws_exception_when_cluster_not_found(
+        self, cluster_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = cluster_builder(count=initial_batch_size)
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg: str = CLUSTER_NOT_FOUND_MSG.format(
+            clusterName=generated_test_data.nonexistent_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.delete_cluster(name=generated_test_data.nonexistent_cluster_name)
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+        # Verify nothing was deleted.
+        cluster_count_after_test: int = len(eks_hook.list_clusters())
+        assert cluster_count_after_test == initial_batch_size
+
+    def test_list_nodegroups_returns_empty_by_default(self, cluster_builder) -> None:
+        eks_hook, generated_test_data = cluster_builder()
+
+        result: List = eks_hook.list_nodegroups(clusterName=generated_test_data.existing_cluster_name)
+
+        assert isinstance(result, list)
+        assert len(result) == 0
+
+    def test_list_nodegroups_returns_sorted_nodegroup_names(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
+        expected_result: List = sorted(generated_test_data.nodegroup_names)
+
+        result: List = eks_hook.list_nodegroups(clusterName=generated_test_data.cluster_name)
+
+        assert_result_matches_expected_list(result, expected_result, initial_batch_size)
+
+    def test_list_nodegroups_returns_all_results(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.LARGE
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
+        expected_result: List = sorted(generated_test_data.nodegroup_names)
+
+        result: List = eks_hook.list_nodegroups(clusterName=generated_test_data.cluster_name)
+
+        assert_result_matches_expected_list(result, expected_result)
+
+    @mock_eks
+    def test_create_nodegroup_throws_exception_when_cluster_not_found(self) -> None:
+        eks_hook: EKSHook = EKSHook(aws_conn_id=CONN_ID, region_name=REGION)
+        non_existent_cluster_name: str = random_names()
+        non_existent_nodegroup_name: str = random_names()
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg: str = CLUSTER_NOT_FOUND_MSG.format(
+            clusterName=non_existent_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.create_nodegroup(
+                clusterName=non_existent_cluster_name,
+                nodegroupName=non_existent_nodegroup_name,
+                **dict(NodegroupInputs.REQUIRED),
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+
+    def test_create_nodegroup_throws_exception_when_nodegroup_already_exists(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
+        expected_exception: Type[AWSError] = ResourceInUseException
+        expected_msg: str = NODEGROUP_EXISTS_MSG.format(
+            clusterName=generated_test_data.cluster_name,
+            nodegroupName=generated_test_data.existing_nodegroup_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.create_nodegroup(
+                clusterName=generated_test_data.cluster_name,
+                nodegroupName=generated_test_data.existing_nodegroup_name,
+                **dict(NodegroupInputs.REQUIRED),
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+        # Verify no new nodegroup was created.
+        nodegroup_count_after_test = len(
+            eks_hook.list_nodegroups(clusterName=generated_test_data.cluster_name)
+        )
+        assert nodegroup_count_after_test == initial_batch_size
+
+    def test_create_nodegroup_throws_exception_when_cluster_not_active(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
+        non_existent_nodegroup_name: str = random_names()
+        expected_exception: Type[AWSError] = InvalidRequestException
+        expected_msg: str = CLUSTER_NOT_READY_MSG.format(
+            clusterName=generated_test_data.cluster_name,
+        )
+
+        with mock.patch("moto.eks.models.Cluster.isActive", return_value=False):
+            with pytest.raises(ClientError) as raised_exception:
+                eks_hook.create_nodegroup(
+                    clusterName=generated_test_data.cluster_name,
+                    nodegroupName=non_existent_nodegroup_name,
+                    **dict(NodegroupInputs.REQUIRED),
+                )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+        # Verify no new nodegroup was created.
+        nodegroup_count_after_test = len(
+            eks_hook.list_nodegroups(clusterName=generated_test_data.cluster_name)
+        )
+        assert nodegroup_count_after_test == initial_batch_size
+
+    def test_create_nodegroup_generates_valid_nodegroup_arn(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder()
+        expected_arn_values: List = [
+            PARTITIONS,
+            REGION,
+            ACCOUNT_ID,
+            generated_test_data.cluster_name,
+            generated_test_data.nodegroup_names,
+            None,
+        ]
+
+        assert_all_arn_values_are_valid(
+            expected_arn_values=expected_arn_values,
+            pattern=RegExTemplates.NODEGROUP_ARN,
+            arn_under_test=generated_test_data.nodegroup_describe_output[NodegroupAttributes.ARN],
+        )
+
+    @freeze_time(FROZEN_TIME)
+    def test_create_nodegroup_generates_valid_nodegroup_created_timestamp(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder()
+
+        result_time: str = generated_test_data.nodegroup_describe_output[NodegroupAttributes.CREATED_AT]
+
+        assert iso_date(result_time) == FROZEN_TIME
+
+    @freeze_time(FROZEN_TIME)
+    def test_create_nodegroup_generates_valid_nodegroup_modified_timestamp(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder()
+
+        result_time: str = generated_test_data.nodegroup_describe_output[NodegroupAttributes.MODIFIED_AT]
+
+        assert iso_date(result_time) == FROZEN_TIME
+
+    def test_create_nodegroup_generates_valid_autoscaling_group_name(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder()
+        result_resources: Dict = generated_test_data.nodegroup_describe_output[NodegroupAttributes.RESOURCES]
+
+        result_asg_name: str = result_resources[NodegroupAttributes.AUTOSCALING_GROUPS][0][
+            NodegroupAttributes.NAME
+        ]
+
+        assert RegExTemplates.NODEGROUP_ASG_NAME_PATTERN.match(result_asg_name)
+
+    def test_create_nodegroup_generates_valid_security_group_name(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder()
+        result_resources: Dict = generated_test_data.nodegroup_describe_output[NodegroupAttributes.RESOURCES]
+
+        result_security_group: str = result_resources[NodegroupAttributes.REMOTE_ACCESS_SG]
+
+        assert RegExTemplates.NODEGROUP_SECURITY_GROUP_NAME_PATTERN.match(result_security_group)
+
+    def test_create_nodegroup_saves_provided_parameters(self, nodegroup_builder) -> None:
+        _, generated_test_data = nodegroup_builder(minimal=False)
+
+        for key, expected_value in generated_test_data.attributes_to_test:
+            assert generated_test_data.nodegroup_describe_output[key] == expected_value
+
+    def test_describe_nodegroup_throws_exception_when_cluster_not_found(self, nodegroup_builder) -> None:
+        eks_hook, generated_test_data = nodegroup_builder()
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg: str = CLUSTER_NOT_FOUND_MSG.format(
+            clusterName=generated_test_data.nonexistent_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.describe_nodegroup(
+                clusterName=generated_test_data.nonexistent_cluster_name,
+                nodegroupName=generated_test_data.existing_nodegroup_name,
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+
+    def test_describe_nodegroup_throws_exception_when_nodegroup_not_found(self, nodegroup_builder) -> None:
+        eks_hook, generated_test_data = nodegroup_builder()
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg: str = NODEGROUP_NOT_FOUND_MSG.format(
+            nodegroupName=generated_test_data.nonexistent_nodegroup_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.describe_nodegroup(
+                clusterName=generated_test_data.cluster_name,
+                nodegroupName=generated_test_data.nonexistent_nodegroup_name,
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+
+    def test_delete_cluster_throws_exception_when_nodegroups_exist(self, nodegroup_builder) -> None:
+        eks_hook, generated_test_data = nodegroup_builder()
+        expected_exception: Type[AWSError] = ResourceInUseException
+        expected_msg: str = CLUSTER_IN_USE_MSG
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.delete_cluster(name=generated_test_data.cluster_name)
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+        # Verify no clusters were deleted.
+        cluster_count_after_test: int = len(eks_hook.list_clusters())
+        assert cluster_count_after_test == BatchCountSize.SINGLE
+
+    def test_delete_nodegroup_removes_deleted_nodegroup(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
+
+        eks_hook.delete_nodegroup(
+            clusterName=generated_test_data.cluster_name,
+            nodegroupName=generated_test_data.existing_nodegroup_name,
+        )
+        result_nodegroup_list: List = eks_hook.list_nodegroups(clusterName=generated_test_data.cluster_name)
+
+        assert len(result_nodegroup_list) == (initial_batch_size - 1)
+        assert generated_test_data.existing_nodegroup_name not in result_nodegroup_list
+
+    def test_delete_nodegroup_returns_deleted_nodegroup(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size, minimal=False)
+
+        result: Dict = eks_hook.delete_nodegroup(
+            clusterName=generated_test_data.cluster_name,
+            nodegroupName=generated_test_data.existing_nodegroup_name,
+        )[ResponseAttribute.NODEGROUP]
+
+        for key, expected_value in generated_test_data.attributes_to_test:
+            assert result[key] == expected_value
+
+    def test_delete_nodegroup_throws_exception_when_cluster_not_found(self, nodegroup_builder) -> None:
+        eks_hook, generated_test_data = nodegroup_builder()
+        expected_exception: Type[AWSError] = ResourceNotFoundException
+        expected_msg: str = CLUSTER_NOT_FOUND_MSG.format(
+            clusterName=generated_test_data.nonexistent_cluster_name,
+        )
+
+        with pytest.raises(ClientError) as raised_exception:
+            eks_hook.delete_nodegroup(
+                clusterName=generated_test_data.nonexistent_cluster_name,
+                nodegroupName=generated_test_data.existing_nodegroup_name,
+            )
+
+        assert_client_error_exception_thrown(
+            expected_exception=expected_exception,
+            expected_msg=expected_msg,
+            raised_exception=raised_exception,
+        )
+
+    def test_delete_nodegroup_throws_exception_when_nodegroup_not_found(
+        self, nodegroup_builder, initial_batch_size: int = BatchCountSize.SMALL
+    ) -> None:
+        eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)

Review comment:
       Is it needed? Can you use a fixed value here?
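
       For example (a sketch):

           def test_delete_nodegroup_throws_exception_when_nodegroup_not_found(
               self, nodegroup_builder
           ) -> None:
               # Fixed value instead of a parameter with a default.
               initial_batch_size = BatchCountSize.SMALL
               eks_hook, generated_test_data = nodegroup_builder(count=initial_batch_size)
               ...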







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669060823



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # Filter out empty entries and extend the running list with this page's results.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
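
For anyone wiring the helper into a task: assuming `generate_config_file` returns the path of the kubeconfig it writes (the `-> str` annotation suggests as much), a minimal usage sketch could look like the following. Every argument value here is a placeholder, and the import path is assumed from this PR's layout.

```python
import os

# Assumed module path; in this PR the helper also appears under
# airflow/providers/amazon/aws/utils/eks_kube_config.py.
from airflow.providers.amazon.aws.hooks.eks import generate_config_file

config_path = generate_config_file(
    eks_cluster_name='my-cluster',   # placeholder cluster name
    eks_namespace_name='default',
    pod_name='my-pod',
    aws_profile=None,                # None falls back to the default credential chain
    aws_region='us-east-1',
)
# The operators in this PR point Kubernetes tooling at the file via KUBECONFIG.
os.environ['KUBECONFIG'] = config_path
```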

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # Filter out empty entries and extend the running list with this page's results.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
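
A side note on `_list_all` above: the hand-rolled `nextToken` loop (with its `"null"` sentinel) is one way to page through results, but boto3 also ships paginators for the EKS list APIs. A sketch of that alternative, shown purely for comparison and not part of this PR:

```python
import boto3

def list_all_clusters(region_name=None):
    # The paginator drives nextToken handling internally, so no sentinel is needed.
    eks = boto3.client('eks', region_name=region_name)
    names = []
    for page in eks.get_paginator('list_clusters').paginate():
        names.extend(page.get('clusters', []))
    return names
```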

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # Filter out empty entries and extend the running list with this page's results.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
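
The imports at the top of the hook (`base64`, `re`, `RequestSigner`, plus `STS_TOKEN_EXPIRES_IN = 60`) point at the usual EKS bearer-token scheme: presign an STS `GetCallerIdentity` request with the cluster name in an `x-k8s-aws-id` header, then base64url-encode the URL. A sketch of that standard pattern follows; it is an assumption about how the truncated part of this file works, not a quote of the PR's code:

```python
import base64

import boto3
from botocore.signers import RequestSigner

def fetch_eks_bearer_token(cluster_name: str, region: str) -> str:
    session = boto3.Session(region_name=region)
    sts = session.client('sts')
    signer = RequestSigner(
        service_id=sts.meta.service_model.service_id,
        region_name=region,
        signing_name='sts',
        signature_version='v4',
        credentials=session.get_credentials(),
        event_emitter=session.events,
    )
    url = signer.generate_presigned_url(
        request_dict={
            'method': 'GET',
            'url': f'https://sts.{region}.amazonaws.com/'
                   '?Action=GetCallerIdentity&Version=2011-06-15',
            'body': {},
            'headers': {'x-k8s-aws-id': cluster_name},
            'context': {},
        },
        region_name=region,
        operation_name='',
        expires_in=60,  # mirrors STS_TOKEN_EXPIRES_IN
    )
    # Kubernetes expects the token prefixed and base64url-encoded without padding.
    encoded = base64.urlsafe_b64encode(url.encode('utf-8')).decode('utf-8')
    return 'k8s-aws-v1.' + encoded.rstrip('=')
```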




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-865780807


   ❤️ 
   
   Can you explain why you had to change the label-when-reviewed action in this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667273451



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # Filter out empty entries and extend the running list with this page's results.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")

Review comment:
       ```suggestion
       hook = AwsBaseHook(aws_conn_id=aws_conn_id, client_type='eks')
       session, _ = hook._get_credentials()
       eks_client = hook.conn
   ```
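
The thrust of the suggestion is to let Airflow's hook machinery resolve credentials instead of constructing a `boto3.Session` from a named profile. A sketch of that pattern; the function name and default connection id below are illustrative:

```python
from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

def eks_session_and_client(aws_conn_id: str = 'aws_default'):
    # AwsBaseHook resolves credentials and region from the Airflow connection,
    # so callers no longer build boto3.Session(profile_name=...) themselves.
    hook = AwsBaseHook(aws_conn_id=aws_conn_id, client_type='eks')
    session, _ = hook._get_credentials()  # -> (boto3 session, endpoint_url)
    return session, hook.conn  # hook.conn is the cached EKS client
```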




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661850237



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still being deleted after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):

Review comment:
       I guess we could use this to fetch the newest cluster carrying a given tag (see the sketch below), which would cut down on hardcoded cluster names.
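
    A rough sketch of that idea, assuming describe_cluster keeps returning a JSON string whose 'cluster' payload includes 'tags' and 'createdAt' (the helper name is hypothetical):

    ```python
    import json

    def newest_cluster_with_tag(eks_hook, tag_key: str, tag_value: str):
        # Hypothetical helper: scan the listed clusters and return the most
        # recently created one that carries the requested tag.
        newest_name, newest_created = None, None
        for name in eks_hook.list_clusters().get('clusters', []):
            cluster = json.loads(eks_hook.describe_cluster(name=name)).get('cluster', {})
            if cluster.get('tags', {}).get(tag_key) != tag_value:
                continue
            created = cluster.get('createdAt', '')
            if newest_created is None or created > newest_created:
                newest_name, newest_created = name, created
        return newest_name
    ```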




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667052549



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster cannot be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still being deleted after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Clusters in your AWS account with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(
+            maxResults=self.maxResults, nextToken=self.nextToken, verbose=self.verbose
+        )
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Nodegroups associated with the specified EKS Cluster with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param kube_config_file_path: Path to save the generated kube_config file to.
+    :type kube_config_file_path: str
+    :param in_cluster: If True, look for config inside the cluster; if False, look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # A default path will be used if none is provided.
+        kube_config_file_path: Optional[str] = os.environ.get(KUBE_CONFIG_ENV_VAR, DEFAULT_KUBE_CONFIG_PATH),

Review comment:
       Switched to using a named temp file and tested with a DAG that executes six pods across two clusters concurrently. I know that's a small test, but it ran without any issues.
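
    For the record, roughly what that looks like; the generate_config_file parameter names below are illustrative, not the exact signature:

    ```python
    from tempfile import NamedTemporaryFile

    from airflow.providers.amazon.aws.utils.eks_kube_config import generate_config_file

    cluster_name = 'my-cluster'  # placeholder

    # Each task writes its kube config to its own temporary file, so pods
    # running concurrently against different clusters never race on a
    # shared KUBECONFIG path.
    with NamedTemporaryFile(mode='w', suffix='.yaml') as kube_config:
        generate_config_file(
            eks_cluster_name=cluster_name,               # illustrative names
            kube_config_file_location=kube_config.name,
        )
        # ...point the KubernetesPodOperator machinery at kube_config.name...
    ```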




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-869095399


   What do you think about adding system tests for these operators? We are constantly striving to improve test coverage, and we have impressive results with the Google integration, where a missing test is the exception rather than the rule. See: https://github.com/apache/airflow/issues/8280
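
   For these operators, such a test would presumably follow the provider's example-DAG pattern. A bare-bones sketch, with placeholder names and ARNs:

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.providers.amazon.aws.operators.eks import (
       EKSCreateClusterOperator,
       EKSDeleteClusterOperator,
   )

   # Placeholders only; a real system test would pull these from the environment.
   CLUSTER_NAME = 'eks-system-test'
   ROLE_ARN = 'arn:aws:iam::123456789012:role/eks-role'
   VPC_CONFIG = {'subnetIds': ['subnet-aaa', 'subnet-bbb']}

   with DAG('example_eks', schedule_interval=None, start_date=datetime(2021, 1, 1)) as dag:
       create_cluster = EKSCreateClusterOperator(
           task_id='create_cluster',
           cluster_name=CLUSTER_NAME,
           cluster_role_arn=ROLE_ARN,
           resources_vpc_config=VPC_CONFIG,
           nodegroup_role_arn=ROLE_ARN,
       )
       delete_cluster = EKSDeleteClusterOperator(
           task_id='delete_cluster',
           cluster_name=CLUSTER_NAME,
       )
       create_cluster >> delete_cluster
   ```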


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660442614



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:

Review comment:
       ```suggestion
               while eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups'):
       ```
   
   No need to check the length; just check whether the list is "truthy".







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r677803402



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster from activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       @ashb - Working on removing the redundant catch/log/except blocks.  How do you feel about this one?  Should I drop this log as well, or is that reasonable?
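
   For reference, a hedged sketch of what one of these hook methods could look like with the wrapper dropped (illustrative only; this is a method on EKSHook, and `self.conn` is the boto3 EKS client):

   ```python
   def delete_cluster(self, name: str) -> dict:
       """Deletes the Amazon EKS Cluster control plane."""
       # No try/except wrapper: a botocore ClientError already carries the error
       # message, and Airflow logs the traceback when the task fails.
       response = self.conn.delete_cluster(name=name)
       self.log.info("Deleted cluster with the name %s.", response['cluster']['name'])
       return response
   ```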







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-869088796


   @dimberman  @mik-laj  @vikramkoka  - All CI tests are currently passing, ready for a proper review at your convenience.





[GitHub] [airflow] mik-laj edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879367442


   > How common is it for a DAG to run for over an hour?
   
   In the case of KubernetesPodOperator, this is not so uncommon, as this operator is typically used to run heavy, resource-consuming tasks. We will deal with this later since it is a separate effort, but could you describe this limitation in the docs and create a ticket about it after merging?
   
   > What do we gain from rewriting that, other than another chunk of code to maintain and another point of failure to watch?
   
   There are two reasons:
   - Better support for credential management. Airflow has its own chain for establishing credentials, which is not compatible with [the Default Credential Provider Chain for the AWS CLI](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). In particular, we support [the secret backends for retrieving credentials](http://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html), [SPNEGO authentication](https://github.com/apache/airflow/blob/a3f5c93806258b5ad396a638ba0169eca7f9d065/airflow/providers/amazon/aws/hooks/base_aws.py#L256), and others. You can keep the CLI, but handling all of these cases will be difficult for as long as it is used for authentication. (A sketch of minting the token in-process follows after this list.)
   - Reducing system dependencies, which makes this operator easier to use.
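
   To make the first point concrete, a rough sketch of minting the EKS bearer token in-process: presign an STS GetCallerIdentity call against whatever botocore session Airflow's credential chain produced. This mirrors what `aws eks get-token` emits (the `k8s-aws-v1.` prefix is the token shape EKS expects); treat it as a sketch, not final code:

   ```python
   import base64
   import re

   import boto3
   from botocore.signers import RequestSigner

   STS_TOKEN_EXPIRES_IN = 60


   def get_eks_bearer_token(session: boto3.Session, cluster_name: str, region: str) -> str:
       client = session.client('sts', region_name=region)
       service_id = client.meta.service_model.service_id
       signer = RequestSigner(service_id, region, 'sts', 'v4', session.get_credentials(), session.events)
       # EKS validates the x-k8s-aws-id header against the cluster name.
       signed_url = signer.generate_presigned_url(
           request_dict={
               'method': 'GET',
               'url': f'https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
               'body': {},
               'headers': {'x-k8s-aws-id': cluster_name},
               'context': {},
           },
           region_name=region,
           operation_name='',
           expires_in=STS_TOKEN_EXPIRES_IN,
       )
       # Kubernetes expects 'k8s-aws-v1.' + the unpadded base64url of the signed URL.
       encoded = base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8')
       return 'k8s-aws-v1.' + re.sub(r'=*$', '', encoded)
   ```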
   





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659254457



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       Should we add any template fields?
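
   (For illustration, and with the field names chosen only as an example, that would look like:)

   ```python
   from airflow.models import BaseOperator


   class EKSCreateClusterOperator(BaseOperator):
       # Arguments listed here are rendered through Jinja at task runtime,
       # e.g. cluster_name="eks-{{ ds_nodash }}".
       template_fields = ("cluster_name", "cluster_role_arn", "region")
   ```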







[GitHub] [airflow] ashb merged pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #16571:
URL: https://github.com/apache/airflow/pull/16571


   





[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-896187912


   Hey folks, sorry for the delay.  I wanted to get a better understanding of Jinja before I implemented/fixed it.  I should have another round of updates for you today or tomorrow.  
   
   [EDIT: Updates pushed.  I think they address all concerns from @ashb  and @mik-laj so far; I have another one coming to correct the misuse of `Optional` in a couple places which @zkan caught.]





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660166531



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       The KubernetesEngine operator requires an update. It was added when I didn't know a better way to authenticate; it should use a Google access token for authentication: https://github.com/apache/airflow/pull/16571#discussion_r659255470
   
   If you want, you can keep a dependency on the AWS CLI (though I'd prefer not to), but you definitely should pass credentials from Airflow to the AWS CLI, and that's the main issue. As it stands, this operator cannot be used on Cloud Composer, GKE, Astronomer, or many other environments that do not have default AWS credentials configured for the AWS CLI.







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667273782



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:

Review comment:
       We should delete the credentials once they are no longer in use. To achieve this, we should make the following changes (see the sketch after this list):
   - convert this method into a context manager, i.e. add the [`contextlib.contextmanager`](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager) decorator,
   - yield the new file name,
   - remove the `delete` parameter from the `NamedTemporaryFile` invocation in this method,
   - remove the `pod_name` parameter from this method.
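
   A minimal sketch of those changes, assuming the existing YAML assembly stays as-is (`_build_config_text` below is a hypothetical stand-in for it):

   ```python
   import contextlib
   import tempfile
   from typing import Generator


   @contextlib.contextmanager
   def generate_config_file(eks_cluster_name: str, eks_namespace_name: str) -> Generator[str, None, None]:
       # Hypothetical helper standing in for the existing kubeconfig assembly.
       config_text = _build_config_text(eks_cluster_name, eks_namespace_name)

       # delete=True (the default) removes the file, and the credentials in it,
       # as soon as the caller leaves the `with` block.
       with tempfile.NamedTemporaryFile(prefix=KUBE_CONFIG_FILE_PREFIX, mode='w') as config_file:
           config_file.write(config_text)
           config_file.flush()
           yield config_file.name
   ```

   A caller would then do `with generate_config_file(...) as config_path:` and hand `config_path` to the pod operator for the duration of the block.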







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667282310



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:

Review comment:
       We should accept conn_id to ensure unify credential managementt.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272213



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       Addressed in https://github.com/apache/airflow/pull/16571/commits/2606556296bfeaccabb513397f05a3885ac44431




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-865390141


   Not done looking at all the test results yet, but looks like most/all were caused by either a single misnamed param that slipped through in an example dag, or whitespace issues in the docs.  Correcting.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660758563



##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       A couple of other equivalents that may be straightforward to understand.
   
   * `moto>=2.0.10,<3`
   * `moto>=2.0.10,==2.*`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660162472



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       No. Gcloud is not preinstalled. We have a guide that explains how to add `gcloud` to a Docker image. See: http://airflow.apache.org/docs/docker-stack/recipes.html#google-cloud-sdk-installation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661031604



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "

Review comment:
       Per Kamil's comments this may be moot.   I'll either correct or remove this depending on what comes of that convo.
   
   see:  https://github.com/apache/airflow/pull/16571#discussion_r659252384




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656352329



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:

Review comment:
       My thinking, for what it is worth, was that the Hooks should mirror the API as closely as possible.  That would improve the learning curve for those coming to Airflow with some boto experience, and also make it easier for those new to the boto API and looking at their docs to see what is going on since the names match up.  
   
   Existing AWS hooks are a mixed bag, but do lean towards snake_case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670779889



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/76c7ad635948191764a323a9be03fa965fc91ff5




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272982



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:

Review comment:
       ```suggestion
   def generate_config_file(
       eks_cluster_name: str,
       eks_namespace_name: str,
       pod_name: str,
       pod_username: Optional[str] = DEFAULT_POD_USERNAME,
       pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
       aws_conn_id: str = 'aws_default'
   ) -> str:
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272550



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)

Review comment:
       We should use the Hook to create a session because this will provide a uniform way to manage credentials and will allow us to use this operator in any environment i.e. on-premiss, on GCP, on Cloud Composer and others.  Now, this operator can only be used in AWS environment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669060343



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:
+        config_file.write(config_text)
+
+    return config_file.name
+
+
+def _get_bearer_token(session: boto3.Session, cluster_id: str, aws_region: str) -> str:

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c
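
       For context, a sketch of the usual EKS bearer-token scheme (the aws-iam-authenticator approach; not necessarily the exact code in the commit linked above):

       ```python
       import base64

       import boto3
       from botocore.signers import RequestSigner

       STS_TOKEN_EXPIRES_IN = 60


       def _get_bearer_token(session: boto3.Session, cluster_id: str, aws_region: str) -> str:
           sts_client = session.client("sts", region_name=aws_region)
           service_id = sts_client.meta.service_model.service_id
           signer = RequestSigner(
               service_id, aws_region, "sts", "v4", session.get_credentials(), session.events
           )
           request_params = {
               "method": "GET",
               "url": f"https://sts.{aws_region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
               "body": {},
               "headers": {"x-k8s-aws-id": cluster_id},
               "context": {},
           }
           # Presign a GetCallerIdentity call; EKS validates the signature server-side.
           signed_url = signer.generate_presigned_url(
               request_params,
               region_name=aws_region,
               expires_in=STS_TOKEN_EXPIRES_IN,
               operation_name="",
           )
           # The token is the presigned URL, base64-encoded with padding stripped.
           encoded = base64.urlsafe_b64encode(signed_url.encode("utf-8")).decode("utf-8")
           return "k8s-aws-v1." + encoded.rstrip("=")
       ```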







[GitHub] [airflow] mik-laj edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879367442


   > How common is it for a DAG to run for over an hour?
   
   In the case of KubernetesPodOperator, this is not so uncommon, as this operator is typically used to run heavy, resource-consuming tasks. We can deal with this later, as it is a separate effort, but could you describe this limitation in the docs and create a ticket about it after merging?
   
   > What do we gain from rewriting that, other than another chunk of code to maintain and another point of failure to watch?
   
   There are two reasons:
   - Better support for credential management. Airflow has a different chain for establishing credentials, which is not compatible with [the Default Credential Provider Chain for the AWS CLI](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). In particular, we support [a secrets backend for retrieving credentials](http://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html), [SPNEGO authentication](https://github.com/apache/airflow/blob/a3f5c93806258b5ad396a638ba0169eca7f9d065/airflow/providers/amazon/aws/hooks/base_aws.py#L256), and others. You can keep the CLI, but you will find it difficult to support all of these mechanisms if we continue to use the AWS CLI.
   - Reducing system dependencies, which makes the operator easier to use.
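
   For context, a minimal sketch of the secrets-backend point (the setting normally lives in airflow.cfg; the class shown is the Amazon provider's Secrets Manager backend):

   ```python
   # Equivalent to setting [secrets] backend in airflow.cfg; an environment
   # variable is used here purely for illustration.
   import os

   os.environ["AIRFLOW__SECRETS__BACKEND"] = (
       "airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend"
   )
   ```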
   





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660160251



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+from airflow.exceptions import AirflowException
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        raise AirflowException(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Is the gcloud tool that the GKE operator uses in that link preinstalled in the official Docker image?  If so, could the AWS tool be as well?  







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660134951



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       Sure. Which ones do you think would be valuable to add here?







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r688203241



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,420 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from contextlib import contextmanager
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import yaml
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.create_cluster(
+            name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+        )
+
+        self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+        # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+        # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+        # The 'shared' value allows more than one resource to use the subnet.
+        tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+        if "tags" in kwargs:
+            tags = {**tags, **kwargs["tags"]}
+            kwargs.pop("tags")
+
+        response = eks_client.create_nodegroup(
+            clusterName=clusterName,
+            nodegroupName=nodegroupName,
+            subnets=subnets,
+            nodeRole=nodeRole,
+            tags=tags,
+            **kwargs,
+        )
+
+        self.log.info(
+            "Created a managed nodegroup named %s in cluster %s",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_cluster(name=name)
+
+        self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+
+        self.log.info(
+            "Deleted nodegroup named %s from cluster %s.",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:

Review comment:
       @zkan - should be addressed in https://github.com/apache/airflow/pull/16571/commits/d63063932c4e42432a039fc69d9c592ce08b95c6







[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-883716102


   Pushed the system tests.  They will all skip because the logic for passing/parsing the credentials is not implemented in `provide_aws_context()`, but that feels like it would be better served by a new PR rather than tacking it onto this one.
   
   It's been a long conversation, but I think that covers everything I'm aware of so far except for two discussions:
   
   [1.](https://github.com/apache/airflow/pull/16571#discussion_r656029075)  The hooks use camelCase field/param names to match the API endpoints they call.  That is easy to change if you want it changed, but I did it to keep the hooks as close to a 1:1 map to the API calls as possible.
   
   [2.](https://github.com/apache/airflow/pull/16571#discussion_r660443571)  ashb had asked if we needed the `list` and `describe` operators.  mik-laj commented with a potential use case, though it may have been a bit of a stretch.  I can remove them if anyone wants them gone, but it feels like a shame since they are already implemented. ((sunk cost fallacy strikes again))
   
   
   In addition, it has already been discussed (here and on Slack), and we determined that token rotation will be improved after this is implemented.
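
   For example, with the `create_nodegroup` signature quoted earlier in this thread, a call keeps boto3's casing one-to-one (the values below are illustrative):

   ```python
   from airflow.providers.amazon.aws.hooks.eks import EKSHook

   hook = EKSHook(aws_conn_id="aws_default")
   # Parameter names mirror eks_client.create_nodegroup() exactly.
   hook.create_nodegroup(
       clusterName="my-cluster",
       nodegroupName="my-nodegroup",
       subnets=["subnet-0123456789abcdef0"],
       nodeRole="arn:aws:iam::123456789012:role/eks-node-role",
   )
   ```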





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669909907



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()

Review comment:
       Is it necessary to generate a random value?  Can you use a fixed value here?
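
       A minimal sketch of that suggestion, with fixed (hypothetical) values in place of random_names():

       ```python
       import unittest

       # Deterministic test constants make failures reproducible and logs readable.
       CLUSTER_NAME = "test-cluster"        # hypothetical fixed value
       NODEGROUP_NAME = "test-nodegroup"    # hypothetical fixed value


       class TestEKSDeleteNodegroupOperator(unittest.TestCase):
           def setUp(self) -> None:
               self.cluster_name: str = CLUSTER_NAME
               self.nodegroup_name: str = NODEGROUP_NAME
       ```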







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659253140



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Nodegroups
+---------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_stack.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and underlying
+Amazon EKS node group in one command.  ``EKSDescribeClustersOperator`` and
+``EKSDescribeNodegroupsOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached

Review comment:
       ```suggestion
     ``ec2.amazon.aws.com`` must be in the Trusted Relationships
     ``eks.amazonaws.com`` must be added to the Trusted Relationships
     ``AmazonEC2ContainerRegistryReadOnly`` IAM Policy must be attached
     ``AmazonEKSClusterPolicy`` IAM Policy must be attached
     ``AmazonEKSWorkerNodePolicy`` IAM Policy must be attached
   ```
   To avoid spelling check errors.







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660150440



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+from airflow.exceptions import AirflowException
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        raise AirflowException(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Using the AWS CLI was not our preference either, but we had to settle for this because other options did not work correctly.  The EKS service does not support providing the token directly at all.  Searching online turned up a few workarounds, but the lesser evil was assuming that someone using an AWS service would be able to install an AWS tool.  There is also precedent for it in [Google's Kubernetes Operators](https://github.com/apache/airflow/blob/2625007c8aeca9ed98dea361ba13c2622482d71f/airflow/providers/google/cloud/operators/kubernetes_engine.py#L319), which use the local gcloud script to accomplish the same thing.
   
   I'll test using your provided example to see if that works and report back.  Dropping that dependency would be great.

##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+Four example DAGs are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role with the following permissions:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Managed Nodegroups
+-----------------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster_with_nodegroup.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and its underlying
+Amazon EKS node group in one command.  ``EKSDescribeClusterOperator`` and
+``EKSDescribeNodegroupOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached

Review comment:
       Does the spell check know to ignore anything inside double backticks?
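
    If it doesn't, one possible fallback (an assumption on my part: that the docs build still runs sphinxcontrib-spelling against `docs/spelling_wordlist.txt`) would be to whitelist the new terms, e.g.:

    ```
    Nodegroup
    Nodegroups
    nodegroup
    kubeconfig
    ```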




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661032604



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."

Review comment:
       We could.   I don't suppose there is a use case where they would want it to stay, is there?  I can't think of one now that you point it out.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r677759718



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,

Review comment:
       Ah cool. I just assumed it used the connection named "aws_default", which was "empty".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667279605



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,

Review comment:
       This token is only valid for an hour, but this operator can be used to run a longer job. We should take care of rotating this token to ensure it stays valid.
   
   Unfortunately, it won't be trivial as we have to write a new [out-of-tree client
       authentication provider](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/kubectl-exec-plugins.md). To do this, you should write a new CLI tool that will generate a new credential.
   
   Some tips on adding a new CLI tool to providers:
    - You cannot modify the Airflow core, so that this operator can be used on different Airflow versions and does not create a dependency between the provider and the core. You cannot modify any file in the `airflow.cli` package.
    - You can create a new [executable module](https://docs.python.org/3/tutorial/modules.html#executing-modules-as-scripts). To run this module, it is best to use Python's module resolution, i.e. you should use [python -m](https://docs.python.org/3/using/cmdline.html#cmdoption-m). BTW, to run Airflow, you can use `airflow` or `python -m airflow`. See: https://github.com/apache/airflow/pull/7808
    - The `python`/`python3` commands may not refer to the current interpreter. You should use [`sys.executable`](https://docs.python.org/3/library/sys.html#sys.executable) instead.
   
   For example, see: [`airflow.providers.google.common.utils.id_token_credentials`](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/common/utils/id_token_credentials.py#L24])
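
    For completeness, a minimal sketch of what such an executable module could look like, reusing the `RequestSigner` presigning approach already used by this hook. The module name, CLI argument handling, and the 14-minute expiry are my assumptions, not code from this PR:

    ```python
    """Hypothetical module, e.g. airflow.providers.amazon.aws.utils.eks_get_token (name is an assumption)."""
    import base64
    import json
    import sys
    from datetime import datetime, timedelta

    import boto3
    from botocore.signers import RequestSigner

    URL_EXPIRES_IN = 60  # presigned URL lifetime, matching STS_TOKEN_EXPIRES_IN above


    def fetch_token(cluster_name: str, region: str) -> str:
        session = boto3.session.Session(region_name=region)
        service_id = session.client("sts").meta.service_model.service_id
        signer = RequestSigner(
            service_id, region, "sts", "v4", session.get_credentials(), session.events
        )
        # Presign an STS GetCallerIdentity call with the cluster name header;
        # this is the same scheme `aws eks get-token` uses.
        url = signer.generate_presigned_url(
            request_dict={
                "method": "GET",
                "url": f"https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15",
                "body": {},
                "headers": {"x-k8s-aws-id": cluster_name},
                "context": {},
            },
            region_name=region,
            expires_in=URL_EXPIRES_IN,
            operation_name="",
        )
        return "k8s-aws-v1." + base64.urlsafe_b64encode(url.encode("utf-8")).decode("utf-8").rstrip("=")


    if __name__ == "__main__":
        cluster_name, region = sys.argv[1], sys.argv[2]
        # Assumption: EKS is commonly reported to honor these tokens for ~14 minutes,
        # so we tell the Kubernetes client to refresh before that.
        expiration = (datetime.utcnow() + timedelta(minutes=14)).strftime("%Y-%m-%dT%H:%M:%SZ")
        print(
            json.dumps(
                {
                    "apiVersion": "client.authentication.k8s.io/v1alpha1",
                    "kind": "ExecCredential",
                    "status": {"expirationTimestamp": expiration, "token": fetch_token(cluster_name, region)},
                }
            )
        )
    ```

    The generated kubeconfig's `users` entry would then invoke this module via an `exec` block (using `sys.executable` as the command) instead of embedding a static token, so the client refreshes credentials automatically on long-running pods.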




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-865390141


   Not done looking at all the test results yet, but it looks like most/all were caused by either a single misnamed param that slipped through in an example dag, or whitespace issues in the docs.  Correcting.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661037536



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`

Review comment:
       Does it work? I think it should be as follows:
   ```suggestion
    - :mod:`airflow.providers.amazon.aws.operators.eks`
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670801679



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,129 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+TARGET_STATE = 'ACTIVE'
+
+
+class EKSClusterStateSensor(BaseSensorOperator):
+    """
+    Check the state of an Amazon EKS Cluster until the state of the Cluster equals the target state.
+
+    :param cluster_name: The name of the Cluster to watch.
+    :type cluster_name: str
+    :param target_state: Target state of the Cluster.
+    :type target_state: str
+    """
+
+    template_fields = ("target_state", "cluster_name")
+    ui_color = "#ff9900"
+    ui_fgcolor = "#232F3E"
+    valid_states = ["CREATING", "ACTIVE", "DELETING", "FAILED", "UPDATING"]
+
+    @apply_defaults
+    def __init__(
+        self,
+        *,
+        cluster_name: str,
+        target_state: Optional[str] = TARGET_STATE,
+        conn_id: Optional[str] = CONN_ID,

Review comment:
       ```suggestion
           aws_conn_id: Optional[str] = CONN_ID,
   ```
   All AWS operators use this name.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660758563



##########
File path: setup.py
##########
@@ -500,7 +500,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'jira',
     'jsondiff',
     'mongomock',
-    'moto~=2.0',
+    'moto~=2.0.10',

Review comment:
       A couple of other equivalents that may be more straightforward to understand.
   
   * `moto>=2.0.10,<3`
   * `moto>=2.0.10,==2.*`
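
    For reference, my reading of the PEP 440 compatible-release rule (worth double-checking): `moto~=2.0.10` itself expands to the stricter `moto>=2.0.10,==2.0.*`, so the two spellings above additionally allow the 2.1+ series:

    ```
    moto~=2.0.10    # per PEP 440: moto >= 2.0.10, == 2.0.*
    moto~=2.0       # per PEP 440: moto >= 2.0,    == 2.*
    ```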




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879367442


   > How common is it for a DAG to run for over an hour?
   
   In the case of KubernetesPodOperator, this is not so uncommon, as this operator is often used to run heavy and resource-consuming tasks. We will deal with this later as this is a separate effort, but could you describe this limitation in the docs and create a ticket about it after merging?
   
   > What do we gain from rewriting that, other than another chunk of code to maintain and another point of failure to watch?
   
   There are two reasons:
   - Better support for credential management. Airflow has a different chain for establishing credentials, which is not compatible with [the Default Credential Provider Chain for AWS CLI](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). In particular, we support [the secret backend for retrieving credentials](http://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html), [SPNEGO authentication](https://github.com/apache/airflow/blob/a3f5c93806258b5ad396a638ba0169eca7f9d065/airflow/providers/amazon/aws/hooks/base_aws.py#L256) and others. You can keep the CLI, but you will find it difficult to handle all of these mechanisms while we continue to use the CLI.
   - Reducing system dependencies, which makes it easier to use this operator. 
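
   To illustrate the first point (a hedged sketch, not code from this PR; `aws_default` and the region are placeholder values):

   ```python
   import boto3

   from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

   # Credentials resolved through Airflow's chain: the connection, secret
   # backends, role assumption, etc.
   hook = AwsBaseHook(aws_conn_id="aws_default", client_type="eks")
   session_via_airflow = hook.get_session(region_name="us-east-1")

   # Credentials resolved only through boto3's default provider chain
   # (env vars, config files, instance profile). Airflow connections are
   # invisible to a spawned `aws` CLI process, which uses this same chain.
   session_via_boto3 = boto3.Session(region_name="us-east-1")
   ```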
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660454001



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(

Review comment:
       This appears to be used only in one place, EKSPodOperator, so this possibly doesn't need to be in a separate file
   
   Perhaps this could be a method on the EksHook? (Given it connects to EKS and does describe cluster, I think that is the right place for it)
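
    Roughly like this (a sketch of the suggestion only, assuming the hook's existing `self.conn` EKS client; a real method would also need the contexts/users sections and the auth token):

    ```python
    import tempfile

    import yaml

    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook


    class EKSHook(AwsBaseHook):
        def __init__(self, *args, **kwargs) -> None:
            kwargs["client_type"] = "eks"
            super().__init__(*args, **kwargs)

        def generate_config_file(self, eks_cluster_name: str) -> str:
            # The hook already owns an authenticated EKS client, so no
            # separate boto3.Session is needed for describe_cluster.
            cluster = self.conn.describe_cluster(name=eks_cluster_name)["cluster"]
            config = {
                "apiVersion": "v1",
                "kind": "Config",
                "clusters": [
                    {
                        "name": eks_cluster_name,
                        "cluster": {
                            "server": cluster["endpoint"],
                            "certificate-authority-data": cluster["certificateAuthority"]["data"],
                        },
                    }
                ],
                # contexts and users omitted in this sketch.
            }
            with tempfile.NamedTemporaryFile(mode="w", prefix="kube_config_", delete=False) as config_file:
                yaml.dump(config, config_file)
            return config_file.name
    ```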




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660152770



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       >  the lesser evil was assuming that someone using an AWS service would be able to install an AWS tool.
   
    For me, this is very problematic for three reasons:
    - Passing credentials from Airflow to the AWS CLI.
    - The AWS CLI v2 is a native application whose latest version cannot be installed via pip, so in many cases it will be very difficult to use.
    - The official Docker image doesn't have the AWS CLI preinstalled.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660183381



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       If you are concerned about maintaining this code, why not add it to the AWS SDK for Python? It could be helpful for other people as well. Have you thought about creating an internal ticket on this topic?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-883716102


   Pushed the system tests.  They will all skip because the logic for passing/parsing the credentials is not implemented in `provide_aws_context()`, but that feels like it would be better served with a new PR rather than tacking it on to this one.
   
   It's been a long conversation but I think that covers everything I'm aware of so far except for two discussions:
   
    1. (https://github.com/apache/airflow/pull/16571#discussion_r656029075) The hooks use camelCase field/param names to match the API endpoints they call.  That is easy to change if you want it changed, but I did it to keep the hooks as close to a 1:1 map to the API calls as possible.
   
    2. (https://github.com/apache/airflow/pull/16571#discussion_r660443571) ashb had asked if we needed the `list` and `describe` operators.  mik-laj commented with a potential use case, though it may have been a bit of a stretch.  I can remove them if anyone wants them gone, but it feels like a shame since they are already implemented. ((sunk cost fallacy strikes again))
   
   
   In addition, it has already been discussed (here and on the Slack) and we determined that the token rotation will be improved after this is implemented.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669905985



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon
+EKS Cluster, ``EKSListClustersOperator`` and ``EKSDescribeClusterOperator`` to verify creation, then
+``EKSDeleteClusterOperator`` to delete the Cluster.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role configured as follows:
+
+  "eks.amazonaws.com" must be added to the Trusted Relationships
+  "AmazonEKSClusterPolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster]
+    :end-before: [END howto_operator_eks_create_cluster]
+
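+As a rough sketch, the task might look like the following (the role ARN and
+VPC values are placeholders, and passing ``compute=None`` skips the default
+nodegroup creation):
+
+.. code-block:: python
+
+    create_cluster = EKSCreateClusterOperator(
+        task_id="create_eks_cluster",
+        cluster_name="my-cluster",
+        cluster_role_arn="arn:aws:iam::123456789012:role/my-eks-cluster-role",
+        resources_vpc_config={"subnetIds": ["subnet-12345", "subnet-67890"]},
+        compute=None,
+    )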
+
+.. _howto/operator:EKSListClustersOperator:
+.. _howto/operator:EKSDescribeClusterOperator:
+
+
+Listing and Describing Amazon EKS Clusters
+-------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we list all Amazon EKS Clusters.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_clusters]
+    :end-before: [END howto_operator_eks_list_clusters]
+
+In the following code we retrieve details for a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_cluster]
+    :end-before: [END howto_operator_eks_describe_cluster]
+
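+A rough sketch of these tasks (the argument names here are assumptions, not
+taken from the operators' signatures):
+
+.. code-block:: python
+
+    list_clusters = EKSListClustersOperator(task_id="list_eks_clusters")
+
+    describe_cluster = EKSDescribeClusterOperator(
+        task_id="describe_eks_cluster",
+        cluster_name="my-cluster",
+    )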
+
+.. _howto/operator:EKSDeleteClusterOperator:
+
+Deleting Amazon EKS Clusters
+----------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete a given Amazon EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_cluster]
+    :end-before: [END howto_operator_eks_delete_cluster]
+
+
+.. _howto/operator:EKSCreateNodegroupOperator:
+
+Creating Amazon EKS Managed NodeGroups
+--------------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_nodegroup.py`` uses ``EKSCreateNodegroupOperator``
+to create an Amazon EKS Managed Nodegroup using an existing cluster, ``EKSListNodegroupsOperator``
+and ``EKSDescribeNodegroupOperator`` to verify creation, then ``EKSDeleteNodegroupOperator``
+to delete the nodegroup.
+
+Prerequisites
+"""""""""""""
+
+An AWS IAM role configured as follows:
+
+  "ec2.amazon.aws.com" must be in the Trusted Relationships
+  "AmazonEC2ContainerRegistryReadOnly" IAM Policy must be attached
+  "AmazonEKSWorkerNodePolicy" IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS Managed Nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_nodegroup]
+    :end-before: [END howto_operator_eks_create_nodegroup]
+
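+A rough sketch (the parameter names are inferred from the hook's
+``create_nodegroup`` signature and may not match the operator exactly):
+
+.. code-block:: python
+
+    create_nodegroup = EKSCreateNodegroupOperator(
+        task_id="create_eks_nodegroup",
+        cluster_name="my-cluster",
+        nodegroup_name="my-nodegroup",
+        nodegroup_subnets=["subnet-12345", "subnet-67890"],
+        nodegroup_role_arn="arn:aws:iam::123456789012:role/my-nodegroup-role",
+    )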
+
+.. _howto/operator:EKSListNodegroupsOperator:
+.. _howto/operator:EKSDescribeNodegroupOperator:
+
+Listing and Describing Amazon EKS Managed Nodegroups
+-----------------------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we retrieve details for a given Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_describe_nodegroup]
+    :end-before: [END howto_operator_eks_describe_nodegroup]
+
+
+In the following code we list all Amazon EKS Nodegroups in a given EKS Cluster.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_list_nodegroup]
+    :end-before: [END howto_operator_eks_list_nodegroup]
+
+
+.. _howto/operator:EKSDeleteNodegroupOperator:
+
+Deleting Amazon EKS Managed Nodegroups
+--------------------------------------
+
+Defining tasks
+""""""""""""""
+
+In the following code we delete an Amazon EKS nodegroup.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_delete_nodegroup]
+    :end-before: [END howto_operator_eks_delete_nodegroup]
+
+
+Creating Amazon EKS Clusters and Node Groups Together
+------------------------------------------------------
+
+Purpose
+"""""""
+
+This example DAG ``example_eks_create_cluster_with_nodegroup.py`` demonstrates using
+``EKSCreateClusterOperator`` to create an Amazon EKS cluster and its underlying
+Amazon EKS node group in one operator call.  ``EKSDescribeClusterOperator`` and
+``EKSDescribeNodegroupOperator`` verify creation, then ``EKSDeleteClusterOperator``
+deletes all created resources.
+
+Prerequisites
+"""""""""""""
+
+  ``ec2.amazonaws.com`` must be in the Trusted Relationships
+  ``eks.amazonaws.com`` must be added to the Trusted Relationships
+  ``AmazonEC2ContainerRegistryReadOnly`` IAM Policy must be attached
+  ``AmazonEKSClusterPolicy`` IAM Policy must be attached
+  ``AmazonEKSWorkerNodePolicy`` IAM Policy must be attached
+
+Defining tasks
+""""""""""""""
+
+In the following code we create a new Amazon EKS cluster and node group, verify creation,
+then delete both resources.
+
+.. exampleinclude:: /../../airflow/providers/amazon/aws/example_dags/example_eks_create_cluster_with_nodegroup.py
+    :language: python
+    :start-after: [START howto_operator_eks_create_cluster_with_compute]
+    :end-before: [END howto_operator_eks_create_cluster_with_compute]
+
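+A rough sketch (the ARNs and VPC values are placeholders; ``compute="nodegroup"``
+is the operator's default):
+
+.. code-block:: python
+
+    create_cluster_and_nodegroup = EKSCreateClusterOperator(
+        task_id="create_eks_cluster_and_nodegroup",
+        cluster_name="my-cluster",
+        cluster_role_arn="arn:aws:iam::123456789012:role/my-eks-cluster-role",
+        resources_vpc_config={"subnetIds": ["subnet-12345", "subnet-67890"]},
+        compute="nodegroup",
+        nodegroup_name="my-nodegroup",
+        nodegroup_role_arn="arn:aws:iam::123456789012:role/my-nodegroup-role",
+    )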
+
+.. _howto/operator:EKSPodOperator:
+
+Perform a Task on an Amazon EKS Cluster
+---------------------------------------
+
+Purpose
+"""""""
+
+This example DAG ``example_eks_pod_operator.py`` demonstrates using
+``EKSStartPodOperator`` to run a command on an Amazon EKS cluster.
+
+Prerequisites
+"""""""""""""
+
+  1. An Amazon EKS Cluster with underlying compute infrastructure.
+  2. The AWS CLI version 2 must be installed on the worker.
+
+  see: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html

Review comment:
       Link dropped, but will keep this in mind in the future.







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660158307



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+from airflow.exceptions import AirflowException
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to assume when generating the token.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        raise AirflowException(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       For more inspiration, you can also look at [aws-cli/awscli/customizations/eks/get_token.py](https://github.com/aws/aws-cli/blob/develop/awscli/customizations/eks/get_token.py), [kubergrunt eks token](https://github.com/gruntwork-io/kubergrunt), [kubernetes-sigs/aws-iam-authenticator](https://github.com/kubernetes-sigs/aws-iam-authenticator)







[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660441259



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, it can also create the supporting compute architecture:
+    if the 'compute' argument is 'nodegroup', it will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See the EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."

Review comment:
       Should we tear it down in this case?
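    
    For illustration, a minimal sketch of what tear-down-on-timeout could look like (hypothetical, not part of this PR; it reuses the `delete_cluster` hook call added elsewhere in this diff):
    
    ```python
    # Hypothetical: clean up the half-provisioned cluster before failing the task.
    message = "Cluster is still inactive after the allocated time limit.  Aborting."
    self.log.error(message)
    eks_hook.delete_cluster(name=self.clusterName)
    raise RuntimeError(message)
    ```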







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-869831727


   I can add system tests for sure; let's get the rest of the concerns out of the way first.





[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659252727



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+from airflow.exceptions import AirflowException
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to assume when generating the token.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        raise AirflowException(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       Is it impossible to generate this token using STS?
   ```python
   import base64
   import boto3
   import re
   from botocore.signers import RequestSigner
   
   STS_TOKEN_EXPIRES_IN = 60
   
   def get_bearer_token(session, cluster_id, region):
       client = session.client('sts', region_name=region)
       service_id = client.meta.service_model.service_id
   
       signer = RequestSigner(
           service_id,
           region,
           'sts',
           'v4',
           session.get_credentials(),
           session.events
       )
   
       params = {
           'method': 'GET',
           'url': 'https://sts.{}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'.format(region),
           'body': {},
           'headers': {
               'x-k8s-aws-id': cluster_id
           },
           'context': {}
       }
   
       signed_url = signer.generate_presigned_url(
           params,
           region_name=region,
           expires_in=STS_TOKEN_EXPIRES_IN,
           operation_name=''
       )
   
       base64_url = base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8')
   
       # remove any base64 encoding padding:
       return 'k8s-aws-v1.' + re.sub(r'=*', '', base64_url)
   
   # If making a HTTP request you would create the authorization headers as follows:
   
   session = boto3.session.Session()
   headers = {'Authorization': 'Bearer ' + get_bearer_token(session, 'my_cluster', 'us-east-1')}
   ```







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879328885


   Honest question:  How common is it for a DAG to run for over an hour?  In my limited experience, the DAGs I run generally finish way faster than that, but I'm obviously not doing enterprise-level workflows either. If we’re getting to where that is the last sticking point, would it be acceptable to get the pod operator up with that caveat and work towards extending the duration?  It shouldn’t be a breaking change since the actual “get token” logic is in an internal “private” method; whatever solution we come up with for token rotation would just get implemented in that method and the user wouldn't ever know, right?  Or perhaps I'm underestimating the scope of the change?
   
   To the suggestion of writing a new CLI tool to handle token rotation:  I'm really not convinced that replacing the official AWS CLI tool with our own is a net gain here.  There is an existing tool which is open source and maintained by the provider.  What do we gain from rewriting that, other than another chunk of code to maintain and another point of failure to watch?





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660154893



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       Fixed in 616d249







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670801679



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,129 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+TARGET_STATE = 'ACTIVE'
+
+
+class EKSClusterStateSensor(BaseSensorOperator):
+    """
+    Check the state of an Amazon EKS Cluster until it reaches the target state.
+
+    :param cluster_name: The name of the Cluster to watch.
+    :type cluster_name: str
+    :param target_state: Target state of the Cluster.
+    :type target_state: str
+    """
+
+    template_fields = ("target_state", "cluster_name")
+    ui_color = "#ff9900"
+    ui_fgcolor = "#232F3E"
+    valid_states = ["CREATING", "ACTIVE", "DELETING", "FAILED", "UPDATING"]
+
+    @apply_defaults
+    def __init__(
+        self,
+        *,
+        cluster_name: str,
+        target_state: Optional[str] = TARGET_STATE,
+        conn_id: Optional[str] = CONN_ID,

Review comment:
       ```suggestion
           aws_conn_id: Optional[str] = CONN_ID,
   ```
   All AWS operators use this name. For consistency, we should follow this convention.
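    
    For context, a hypothetical usage sketch of this sensor, with argument names taken from the constructor above:
    
    ```python
    wait_for_cluster = EKSClusterStateSensor(
        task_id="wait_for_cluster_active",
        cluster_name="my-cluster",
        target_state="ACTIVE",  # the default TARGET_STATE shown above
    )
    ```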







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667279605



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the EKS Cluster to generate the kubeconfig for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,

Review comment:
    This token is only valid for an hour, but this operator can be used to run a longer job. We should take care of rotating this token to ensure it stays valid.
    
    Unfortunately, it won't be trivial as we have to write a new [client-go credential plugin](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins). To do this, you should write a new CLI tool that will generate a new credential.
    
    Some tips on adding a new CLI tool to providers:
    - You cannot modify the Airflow core, so that this operator can be used on different Airflow versions and does not create a dependency between the provider and the core. You cannot modify any file in the `airflow.cli` package.
    - You can create a new [executable module](https://docs.python.org/3/tutorial/modules.html#executing-modules-as-scripts). To run this module, it is best to use Python module resolution, i.e. you should use [python -m](https://docs.python.org/3/using/cmdline.html#cmdoption-m). BTW, to run Airflow you can use `airflow` or `python -m airflow`. See: https://github.com/apache/airflow/pull/7808
    - The `python`/`python3` commands may not refer to the current interpreter. You should use [`sys.executable`](https://docs.python.org/3/library/sys.html#sys.executable) instead.
    
    For example, see: [`airflow.providers.google.common.utils.id_token_credentials`](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/common/utils/id_token_credentials.py#L24)
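    
    As a hedged illustration of the executable-module approach (the module path and CLI shape below are hypothetical; `_get_bearer_token` is the private helper this PR adds to the EKS hook):
    
    ```python
    # Sketch of a module a kubeconfig exec entry could invoke as:
    #   python -m <hypothetical_module> <cluster-name> <region>
    import json
    import sys
    
    import boto3
    
    from airflow.providers.amazon.aws.hooks.eks import _get_bearer_token
    
    
    def main() -> None:
        cluster_name, region = sys.argv[1], sys.argv[2]
        token = _get_bearer_token(
            session=boto3.Session(region_name=region),
            cluster_id=cluster_name,
            aws_region=region,
        )
        # Print the ExecCredential envelope that kubectl's exec plugin protocol expects.
        print(
            json.dumps(
                {
                    "apiVersion": "client.authentication.k8s.io/v1alpha1",
                    "kind": "ExecCredential",
                    "status": {"token": token},
                }
            )
        )
    
    
    if __name__ == "__main__":
        main()
    ```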







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r656370454



##########
File path: airflow/providers/amazon/aws/hooks/base_aws.py
##########
@@ -347,6 +347,7 @@ def __init__(
         client_type: Optional[str] = None,
         resource_type: Optional[str] = None,
         config: Optional[Config] = None,
+        **kwargs,

Review comment:
       That was being used to swallow the unused args when the Operators were creating the hook in their constructors.  I will remove it when I correct those.







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667275020



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The API call to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:
+        config_file.write(config_text)
+
+    return config_file.name
+
+
+def _get_bearer_token(session: boto3.Session, cluster_id: str, aws_region: str) -> str:

Review comment:
       Can you give the method a more descriptive name, or add documentation that explains what the purpose of this token is? It may be legible now, but it may not be very clear in the future. It is also worth adding a short note about the algorithm behind this token, e.g. its source (aws-iam-authenticator).
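
   For reference, a minimal sketch of the token scheme under discussion — the aws-iam-authenticator approach of presigning an STS GetCallerIdentity call bound to the cluster name — might look like the following. The helper name mirrors the one in the diff; the body is illustrative, not the PR's implementation.

   ```python
   import base64

   import boto3
   from botocore.signers import RequestSigner


   def _get_bearer_token(session: boto3.Session, cluster_id: str, aws_region: str) -> str:
       """Build a Kubernetes bearer token the way aws-iam-authenticator does:
       presign an STS GetCallerIdentity URL, bind it to the cluster via the
       x-k8s-aws-id header, and base64-encode it with the 'k8s-aws-v1.' prefix."""
       sts_client = session.client('sts', region_name=aws_region)
       signer = RequestSigner(
           sts_client.meta.service_model.service_id,
           aws_region,
           'sts',
           'v4',
           session.get_credentials(),
           session.events,
       )
       request_params = {
           'method': 'GET',
           'url': f'https://sts.{aws_region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
           'body': {},
           'headers': {'x-k8s-aws-id': cluster_id},
           'context': {},
       }
       signed_url = signer.generate_presigned_url(
           request_params, region_name=aws_region, expires_in=60, operation_name=''
       )
       # Kubernetes rejects padded tokens, so strip the base64 '=' padding.
       encoded_url = base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8')
       return 'k8s-aws-v1.' + encoded_url.rstrip('=')
   ```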







[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660155686



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '/.kube/', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to generate the kubeconfig file for.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")

Review comment:
       This is important because you probably used the default credentials from the CLI in Breeze, as you were logged in there before.
   
       In a production environment, these credentials will be provided by Airflow, e.g. via the secrets backend, so the AWS CLI must be able to share the same credentials. Some Airflow users have many AWS credentials for different accounts, and we should be able to use them correctly in this operator as well. The user should not need to be aware in any way that a CLI is being used to get the token.
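
   As a sketch of what that implies in code — assuming the operator already holds a boto3 session whose credentials came from Airflow's secrets backend, and using a hypothetical helper name — the CLI subprocess can be handed those exact credentials through environment variables instead of relying on a pre-configured profile:

   ```python
   import os
   import subprocess

   import boto3


   def run_aws_cli_with_session_credentials(session: boto3.Session, cli_args: list) -> subprocess.CompletedProcess:
       """Run the AWS CLI with the same credentials as the given boto3 session,
       so the CLI sees whatever Airflow's secrets backend provided."""
       frozen = session.get_credentials().get_frozen_credentials()
       env = os.environ.copy()
       env['AWS_ACCESS_KEY_ID'] = frozen.access_key
       env['AWS_SECRET_ACCESS_KEY'] = frozen.secret_key
       if frozen.token:
           # Only present for temporary credentials (e.g. an assumed role).
           env['AWS_SESSION_TOKEN'] = frozen.token
       return subprocess.run(['aws'] + cli_args, env=env, check=True, capture_output=True)
   ```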







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669059739



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,709 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_POD_USERNAME,
+    EKSHook,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster for activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.clusterName)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, aws_conn_id: Optional[str] = CONN_ID, region: Optional[str] = None, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups'):
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used.  If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(verbose=self.verbose)
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(clusterName=self.clusterName, verbose=self.verbose)
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose is True:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists all Amazon EKS Clusters in your AWS account.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(verbose=self.verbose)
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        aws_conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = aws_conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(clusterName=self.clusterName, verbose=self.verbose)
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param in_cluster: If True, look for config inside the cluster; if False look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The security context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # Setting in_cluster to False tells the pod that the config
+        # file is stored locally in the worker and not in the cluster.
+        in_cluster: Optional[bool] = False,
+        namespace: Optional[str] = DEFAULT_NAMESPACE_NAME,
+        pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+        pod_name: Optional[str] = DEFAULT_POD_NAME,
+        pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+        aws_profile: Optional[str] = None,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(
+            in_cluster=in_cluster,
+            namespace=namespace,
+            name=pod_name,
+            **kwargs,
+        )
+        self.aws_profile = os.getenv('AWS_PROFILE', None) if aws_profile is None else aws_profile
+        self.roleArn = cluster_role_arn
+        self.clusterName = cluster_name
+        self.pod_name = pod_name
+        self.pod_username = pod_username
+        self.pod_context = pod_context
+        self.region = region
+
+    def execute(self, context):
+        self.config_file = generate_config_file(

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/8687b736994d1911011dd8bd2b2903d82ad30a8c







[GitHub] [airflow] andormarkus commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
andormarkus commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-890095899


   Hi @ferruzzi,
   
   I would like to describe a problem we are having when creating EKS managed nodegroups with boto3.
   
   We have 3 environments on 3 separate AWS accounts. By AWS design, AZs are randomly assigned: if an instance type is available in account A in AZ 1, it might not be available in account B in AZ 1. We are running into this issue:
   ```bash
   Your requested instance type (m5ad.4xlarge) is not supported in your requested Availability Zone (eu-central-1c). Please retry your request by not specifying an Availability Zone or choosing eu-central-1a, eu-central-1b.
   ```
   
   In this case, node group creation fails with a `CREATE_FAILED` error, but the node group remains present on EKS. When the Airflow retry comes, the second attempt fails with the following error:
   ```bash
   botocore.errorfactory.ResourceInUseException: An error occurred (ResourceInUseException) when calling the CreateNodegroup operation: NodeGroup already exists with name [my_node] and cluster name [my_cluster]
   ```
   
   It would be great if this integration had an option to delete the failed node group whenever creation fails with `CREATE_FAILED`.
   
   I hope I was clear; if not, feel free to ask any questions.
   
   Thanks,
   Andor
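
   A cleanup along those lines could look roughly like the sketch below (illustrative only; it leans on boto3's 'nodegroup_deleted' EKS waiter, and the cluster/nodegroup names are taken from the log above):

   ```python
   import boto3


   def delete_nodegroup_if_create_failed(cluster_name: str, nodegroup_name: str, region: str) -> None:
       """Remove a nodegroup left behind by a failed create so that a retry can succeed."""
       eks = boto3.client('eks', region_name=region)
       status = eks.describe_nodegroup(
           clusterName=cluster_name, nodegroupName=nodegroup_name
       )['nodegroup']['status']
       if status == 'CREATE_FAILED':
           eks.delete_nodegroup(clusterName=cluster_name, nodegroupName=nodegroup_name)
           # Block until the failed nodegroup is fully gone before retrying the create.
           eks.get_waiter('nodegroup_deleted').wait(
               clusterName=cluster_name, nodegroupName=nodegroup_name
           )


   if __name__ == '__main__':
       delete_nodegroup_if_create_failed('my_cluster', 'my_node', 'eu-central-1')
   ```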
   
   





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272143



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_clusters(
+            verbose=self.verbose, maxResults=self.maxResults, nextToken=self.nextToken
+        )
+        cluster_list = response.get('clusters')
+        next_token = response.get('nextToken')
+
+        result = []
+        for cluster in cluster_list:
+            full_describe = json.loads(eks_hook.describe_cluster(name=cluster))
+            cluster_details = json.dumps(full_describe.get('cluster'))
+            result.append(cluster_details)
+
+        if self.verbose:
+            self.log.info("\n\t".join(["Cluster Details:"] + result))
+
+        return {'nextToken': next_token, 'clusters': result}
+
+
+class EKSDescribeAllNodegroupsOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Nodegroups associated with the specified EKS Cluster.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+        nodegroup_list = response.get('nodegroups')
+        next_token = response.get('nextToken')
+
+        result = []
+        for nodegroup in nodegroup_list:
+            full_describe = json.loads(
+                eks_hook.describe_nodegroup(clusterName=self.clusterName, nodegroupName=nodegroup)
+            )
+            nodegroup_details = json.dumps(full_describe.get('nodegroup'))
+            result.append(nodegroup_details)
+
+        if self.verbose:
+            self.log.info("\n\t".join(["Nodegroup Details:"] + result))
+
+        return {'nextToken': next_token, 'nodegroups': result}
+
+
+class EKSDescribeClusterOperator(BaseOperator):
+    """
+    Returns descriptive information about an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to describe.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_cluster(name=self.clusterName, verbose=self.verbose)
+        response_json = json.loads(response)
+        # Extract the cluster data, drop the request metadata
+        cluster_data = response_json.get('cluster')
+        return json.dumps(cluster_data)
+
+
+class EKSDescribeNodegroupOperator(BaseOperator):
+    """
+    Returns descriptive information about the Amazon EKS Nodegroup.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDescribeNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster associated with the nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the Amazon EKS Nodegroup to describe.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        response = eks_hook.describe_nodegroup(
+            clusterName=self.clusterName, nodegroupName=self.nodegroupName, verbose=self.verbose
+        )
+        response_json = json.loads(response)
+        # Extract the nodegroup data, drop the request metadata
+        nodegroup_data = response_json.get('nodegroup')
+        return json.dumps(nodegroup_data)
+
+
+class EKSListClustersOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Clusters in your AWS account with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListClustersOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_clusters(
+            maxResults=self.maxResults, nextToken=self.nextToken, verbose=self.verbose
+        )
+
+
+class EKSListNodegroupsOperator(BaseOperator):
+    """
+    Lists the Amazon EKS Nodegroups associated with the specified EKS Cluster with optional pagination.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSListNodegroupsOperator`
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.
+    :type next_token: str
+    :param cluster_name: The name of the Amazon EKS Cluster to check.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param verbose: Provides additional logging if set to True.  Defaults to False.
+    :type verbose: bool
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        max_results: Optional[int] = DEFAULT_RESULTS_PER_PAGE,
+        next_token: Optional[str] = DEFAULT_PAGINATION_TOKEN,
+        verbose: Optional[bool] = False,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.maxResults = max_results
+        self.nextToken = next_token
+        self.verbose = verbose
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.list_nodegroups(
+            clusterName=self.clusterName,
+            verbose=self.verbose,
+            maxResults=self.maxResults,
+            nextToken=self.nextToken,
+        )
+
+
+class EKSPodOperator(KubernetesPodOperator):
+    """
+    Executes a task in a Kubernetes pod on the specified Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSPodOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to execute the task on.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+       for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param kube_config_file_path: Path to save the generated kube_config file to.
+    :type kube_config_file_path: str
+    :param in_cluster: If True, look for config inside the cluster; if False, look for a local file path.
+    :type in_cluster: bool
+    :param namespace: The namespace in which to execute the pod.
+    :type namespace: str
+    :param pod_context: The name of the kubeconfig context to use while executing the pod.
+    :type pod_context: str
+    :param pod_name: The unique name to give the pod.
+    :type pod_name: str
+    :param pod_username: The username to use while executing the pod.
+    :type pod_username: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(  # pylint: disable=too-many-arguments,too-many-locals
+        self,
+        cluster_name: str,
+        cluster_role_arn: Optional[str] = None,
+        # A default path will be used if none is provided.
+        kube_config_file_path: Optional[str] = os.environ.get(KUBE_CONFIG_ENV_VAR, DEFAULT_KUBE_CONFIG_PATH),

Review comment:
       This should be addressed in https://github.com/apache/airflow/pull/16571/commits/1c9c54328c30794c824282860c7c5b11dfceca22
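       A side note on this signature while it is being reworked (general Python
       behaviour, not specific to this PR): a default argument is evaluated once,
       when the module is imported, so the environment variable is read at import
       time rather than when the operator is instantiated.  A minimal sketch of
       the difference, using a placeholder variable name:

           import os

           def f(path=os.environ.get("SOME_ENV_VAR", "fallback")):
               # Default captured once, at definition time.
               return path

           def g(path=None):
               # Default resolved on every call.
               return path if path is not None else os.environ.get("SOME_ENV_VAR", "fallback")

           os.environ["SOME_ENV_VAR"] = "set-later"
           print(f())  # "fallback" -- the env var was not set when f was defined
           print(g())  # "set-later"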




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r659944005



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within Kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    open(kube_config_file_location, "w").write(config_text)

Review comment:
       ACK.  Good catch, thanks.
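       For anyone reading along, the usual shape of the fix (an assumption on my
       part -- the corrected commit is not quoted here) is a context manager, so
       the file handle is flushed and closed even if the write raises:

           config_text = yaml.dump(cluster_config, default_flow_style=False)
           with open(kube_config_file_location, "w") as config_file:
               config_file.write(config_text)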




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-865390141


   Not done looking at all the test results yet, but it looks like most, if not all, of the failures were caused by either a single misnamed param that slipped through in an example DAG or whitespace issues in the docs.  Correcting.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-882797437


   Will do.  I have another round of fixes coming today; I'll rebase before I do that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r672566540



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Removing in the coming revision.
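       For reference, with the wrapper removed the method body would reduce to
       something like this (a sketch of the intent, not the actual commit --
       ClientError then propagates with boto3's own error message):

           def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
               eks_client = self.get_conn()
               response = eks_client.create_cluster(
                   name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
               )
               self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
               return response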




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r672618235



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/3d8cc9379ab9e7179f837f45bacda64fa6f2a44e




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670910290



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,709 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"

Review comment:
       addressed in https://github.com/apache/airflow/pull/16571/commits/08c2a3480609d8c0cd902837d1ccb420cc93b4cf
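       (For anyone following the thread: the revision quoted later in this archive
       replaces the bespoke constant with

           DEFAULT_CONN_ID = "aws_default"

       so the operators fall back to Airflow's standard AWS connection rather than
       a separate "eks" connection id.)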




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667272072



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = str(os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME))
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within Kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       This should be addressed in https://github.com/apache/airflow/pull/16571/commits/a6ee2c369227212e12b8d5320c91fd63a6dc769d
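       For context, the exec-based user entry built here serializes to YAML along
       these lines (the region, cluster name, and role ARN below are illustrative
       placeholders, not values from a real cluster):

           import yaml

           users = [
               {
                   "name": "aws",
                   "user": {
                       "exec": {
                           "apiVersion": "client.authentication.k8s.io/v1alpha1",
                           "command": "aws",
                           "args": [
                               "--region", "us-east-1",
                               "eks", "get-token",
                               "--cluster-name", "demo-cluster",
                               "--role-arn", "arn:aws:iam::123456789012:role/demo-role",
                           ],
                       }
                   },
               }
           ]
           print(yaml.dump({"users": users}, default_flow_style=False))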




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r677804559



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster from activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       @ashb - Working on removing the redundant catch/log/re-raise blocks. How do you feel about this one? Should I drop this log as well, or is that reasonable? (Here, and in a very similar use below, on/around L300.)
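       Not an answer to the logging question as such, but one way to sidestep the
       hand-rolled polling (and most of its log lines) would be boto3's built-in
       EKS waiters, assuming we are happy to depend on them here:

           eks_client = eks_hook.get_conn()
           waiter = eks_client.get_waiter('cluster_active')
           # Polls describe_cluster until the cluster reports ACTIVE, raising
           # botocore.exceptions.WaiterError once the attempts are exhausted.
           waiter.wait(name=self.cluster_name, WaiterConfig={'Delay': 15, 'MaxAttempts': 100})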




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669909257



##########
File path: tests/providers/amazon/aws/utils/eks_test_constants.py
##########
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+"""
+This file should only contain constants used for the EKS tests.
+"""
+import os
+import re
+from enum import Enum
+from typing import Dict, List, Pattern, Tuple
+
+from boto3 import Session
+
+CONN_ID = "eks"
+DEFAULT_MAX_RESULTS = 100
+FROZEN_TIME = "2013-11-27T01:42:00Z"
+PACKAGE_NOT_PRESENT_MSG = "mock_eks package not present"
+PARTITIONS: List[str] = Session().get_available_partitions()
+REGION: str = Session().region_name

Review comment:
       Is it needed?  Can you use a fixed value here?
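       For what it is worth, a fixed value would also keep the tests deterministic
       when no region is configured in the environment, e.g.:

           REGION: str = "us-east-1"  # any valid region string works for the moto-backed tests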




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667273782



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+
+    # Set the filename to something which can be found later if needed.
+    filename_prefix = KUBE_CONFIG_FILE_PREFIX + pod_name
+    with tempfile.NamedTemporaryFile(prefix=filename_prefix, mode='w', delete=False) as config_file:

Review comment:
       We should delete the credentials when they are no longer used. To achieve this, we should make the following changes (a sketch follows this list):
   - convert this method into a context manager, i.e. add the [`contextlib.contextmanager`](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager) decorator,
   - yield the new file name (flushing the content first),
   - remove the `delete` parameter from the `NamedTemporaryFile` invocation in this method,
   - remove the `pod_name` parameter from this method.
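   A minimal sketch of that shape, assuming the `cluster_config` dict is built exactly as the current function builds it (the helper name `eks_config_file` and the dict-argument signature are illustrative, not the final API):

   ```python
   import contextlib
   import tempfile

   import yaml

   KUBE_CONFIG_FILE_PREFIX = 'kube_config_'


   @contextlib.contextmanager
   def eks_config_file(cluster_config: dict):
       """Write the kubeconfig to a temp file and remove it when the context exits."""
       config_text = yaml.dump(cluster_config, default_flow_style=False)
       # No delete=False: NamedTemporaryFile unlinks the file when it is closed,
       # which happens automatically when the with-block below exits.
       with tempfile.NamedTemporaryFile(prefix=KUBE_CONFIG_FILE_PREFIX, mode='w') as config_file:
           config_file.write(config_text)
           config_file.flush()  # flush before yielding so readers see the full file
           yield config_file.name

   # Usage: with eks_config_file(cluster_config) as config_path: ...
   ```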




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-879367442


   > How common is it for a DAG to run for over an hour?
   
   In the case of KubernetesPodOperator, this is not so uncommon, as this operator is typically used to run heavy, resource-consuming tasks. We can deal with this later as this is a separate effort, but could you describe this limitation in the docs and create a ticket about it after merging?
   
   > What do we gain from rewriting that, other than another chunk of code to maintain and another point of failure to watch?
   
   There are two reasons:
   - Better support for credential management. Airflow has a different chain for establishing credentials, which is not compatible with [the Default Credential Provider Chain for AWS CLI](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). In particular, we support [the secret backend for retrieving credentials](http://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html), [SPNEGO authentication](https://github.com/apache/airflow/blob/a3f5c93806258b5ad396a638ba0169eca7f9d065/airflow/providers/amazon/aws/hooks/base_aws.py#L256) and others. You can keep the CLI, but you will find it difficult to handle all of these features while continuing to shell out to the AWS CLI. (A sketch of signing the token in-process follows below.)
   - Reducing system dependencies, which makes this operator easier to use.
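   For reference, a minimal sketch of minting the EKS bearer token in-process with botocore's `RequestSigner`, so no `aws eks get-token` subprocess is needed. The `session` would come from the hook's credential chain; the function name is illustrative:

   ```python
   import base64

   import boto3
   from botocore.signers import RequestSigner

   STS_TOKEN_EXPIRES_IN = 60


   def fetch_eks_bearer_token(session: boto3.Session, cluster_name: str, region: str) -> str:
       """Presign an STS GetCallerIdentity call and wrap it as a k8s-aws-v1 token."""
       sts_client = session.client('sts', region_name=region)
       signer = RequestSigner(
           sts_client.meta.service_model.service_id,
           region,
           'sts',
           'v4',
           session.get_credentials(),
           session.events,
       )
       request_params = {
           'method': 'GET',
           'url': f'https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
           'body': {},
           'headers': {'x-k8s-aws-id': cluster_name},
           'context': {},
       }
       signed_url = signer.generate_presigned_url(
           request_params, region_name=region, expires_in=STS_TOKEN_EXPIRES_IN, operation_name=''
       )
       # Kubernetes expects unpadded base64url, prefixed with the k8s-aws-v1 scheme.
       return 'k8s-aws-v1.' + base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8').rstrip('=')
   ```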
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667275472



##########
File path: airflow/providers/amazon/aws/example_dags/example_eks_pod_operation.py
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from datetime import datetime
+
+from airflow.models.dag import DAG
+from airflow.providers.amazon.aws.operators.eks import EKSPodOperator
+from airflow.utils.dates import days_ago
+
+####
+# NOTE: This example requires an existing EKS Cluster with a compute backend.
+# see: example_eks_create_cluster_with_nodegroup.py
+####
+
+CLUSTER_NAME = 'existing-cluster-with-nodegroup-ready-for-pod'
+BUCKET_SUFFIX = datetime.now().strftime("-%Y%b%d-%H%M").lower()
+ROLE_ARN = os.environ.get('ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+with DAG(
+    dag_id='eks_run_pod_dag',
+    schedule_interval=None,
+    start_date=days_ago(2),
+    max_active_runs=1,
+    tags=['example'],
+) as dag:
+
+    # [START howto_operator_eks_pod_operator]
+    start_pod = EKSPodOperator(
+        task_id="run_pod",
+        cluster_name=CLUSTER_NAME,
+        # Optional IAM Role to assume for credentials when signing the token.
+        cluster_role_arn=ROLE_ARN,
+        image="amazon/aws-cli:latest",
+        cmds=["sh", "-c", "aws s3 mb s3://hello-world" + BUCKET_SUFFIX],

Review comment:
       ```suggestion
           cmds=["sh", "-c", "aws sts get-caller-identity"],
       ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r667279605



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,461 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import boto3
+import yaml
+from botocore.exceptions import ClientError
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+KUBE_CONFIG_FILE_PREFIX = 'kube_config_'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+            # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+            # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+            # The 'shared' value allows more than one resource to use the subnet.
+            tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+            if "tags" in kwargs:
+                tags = {**tags, **kwargs["tags"]}
+                kwargs.pop("tags")
+
+            response = eks_client.create_nodegroup(
+                clusterName=clusterName,
+                nodegroupName=nodegroupName,
+                subnets=subnets,
+                nodeRole=nodeRole,
+                tags=tags,
+                **kwargs,
+            )
+
+            self.log.info(
+                "Created a managed nodegroup named %s in cluster %s",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_cluster(name=name)
+            self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Deleted nodegroup named %s from cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Cluster.
+
+        :param name: The name of the cluster to describe.
+        :type name: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_cluster(name=name)
+            self.log.info("Retrieved details for cluster named %s.", response.get('cluster').get('name'))
+            if verbose:
+                cluster_data = response.get('cluster')
+                self.log.info("Cluster Details: %s", json.dumps(cluster_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def describe_nodegroup(
+        self, clusterName: str, nodegroupName: str, verbose: Optional[bool] = False
+    ) -> Dict:
+        """
+        Returns descriptive information about an Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to describe.
+        :type nodegroupName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: Returns descriptive information about a specific EKS Nodegroup.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.conn
+
+            response = eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            self.log.info(
+                "Retrieved details for nodegroup named %s in cluster %s.",
+                response.get('nodegroup').get('nodegroupName'),
+                response.get('nodegroup').get('clusterName'),
+            )
+            if verbose:
+                nodegroup_data = response.get('nodegroup')
+                self.log.info("Nodegroup Details: %s", json.dumps(nodegroup_data, cls=AirflowJsonEncoder))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+
+    def get_cluster_state(self, clusterName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Cluster.
+
+        :param clusterName: The name of the cluster to check.
+        :type clusterName: str
+
+        :return: Returns the current status of a given Amazon EKS Cluster.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return eks_client.describe_cluster(name=clusterName).get('cluster').get('status')
+
+    def get_nodegroup_state(self, clusterName: str, nodegroupName: str) -> str:
+        """
+        Returns the current status of a given Amazon EKS Nodegroup.
+
+        :param clusterName: The name of the Amazon EKS Cluster associated with the nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to check.
+        :type nodegroupName: str
+
+        :return: Returns the current status of a given Amazon EKS Nodegroup.
+        :rtype: str
+        """
+        eks_client = self.conn
+
+        return (
+            eks_client.describe_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+            .get('nodegroup')
+            .get('status')
+        )
+
+    def list_clusters(
+        self,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Clusters in your AWS account.
+
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List containing the cluster names.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_clusters)
+
+        return self._list_all(api_call=api_call, response_key="clusters", verbose=verbose)
+
+    def list_nodegroups(
+        self,
+        clusterName: str,
+        verbose: Optional[bool] = False,
+    ) -> List:
+        """
+        Lists all Amazon EKS Nodegroups associated with the specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster containing nodegroups to list.
+        :type clusterName: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of nodegroup names within the given cluster.
+        :rtype: List
+        """
+        eks_client = self.conn
+        api_call = partial(eks_client.list_nodegroups, clusterName=clusterName)
+
+        return self._list_all(api_call=api_call, response_key="nodegroups", verbose=verbose)
+
+    def _list_all(self, api_call: Callable, response_key: str, verbose: bool) -> List:
+        """
+        Repeatedly calls a provided boto3 API Callable and collates the responses into a List.
+
+        :param api_call: The api command to execute.
+        :type api_call: Callable
+        :param response_key: Which dict key to collect into the final list.
+        :type response_key: str
+        :param verbose: Provides additional logging if set to True.  Defaults to False.
+        :type verbose: bool
+
+        :return: A List of the combined results of the provided API call.
+        :rtype: List
+        """
+        name_collection = []
+        token = DEFAULT_PAGINATION_TOKEN
+
+        try:
+            while token != "null":
+                response = api_call(nextToken=token)
+                # If response list is not empty, append it to the running list.
+                name_collection += filter(None, response.get(response_key))
+                token = response.get("nextToken")
+
+            self.log.info("Retrieved list of %s %s.", len(name_collection), response_key)
+            if verbose:
+                self.log.info("%s found: %s", response_key.title(), name_collection)
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e
+        return name_collection
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    pod_name: str,
+    aws_profile: Optional[str],
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    aws_region: Optional[str] = None,
+) -> str:
+    """
+    Writes the kubeconfig file given an EKS Cluster.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param pod_name: The unique name to give the pod.  Used as an identifier in the config filename.
+    :type pod_name: str
+    :param aws_profile: The named profile containing the credentials to use.
+    :type aws_profile: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # Get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    token = _get_bearer_token(session=session, cluster_id=eks_cluster_name, aws_region=aws_region)
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "token": token,

Review comment:
       This token is only valid for an hour, but this operator can be used to run longer jobs. We should take care of rotating this token to ensure it stays valid.
   
   Unfortunately, it won't be trivial, as we have to write a new [out-of-tree client authentication provider](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/kubectl-exec-plugins.md). To do this, you should write a new CLI tool that generates a fresh credential on each invocation (a sketch follows below).
   
   Some tips on adding a new CLI tool to providers:
   - You cannot modify the Airflow core, so that this operator can be used on different Airflow versions without creating a dependency between the provider and the core. You cannot modify any file in the `airflow.cli` package.
   - You can create a new [executable module](https://docs.python.org/3/tutorial/modules.html#executing-modules-as-scripts). To run this module, it is best to rely on Python module resolution, i.e. you should use [python -m](https://docs.python.org/3/using/cmdline.html#cmdoption-m). By the way, to run Airflow you can use `airflow` or `python -m airflow`. See: https://github.com/apache/airflow/pull/7808
   - The `python`/`python3` command may not refer to the current interpreter; you should use [`sys.executable`](https://docs.python.org/3/library/sys.html#sys.executable) instead.
   
   For example, see: [`airflow.providers.google.common.utils.id_token_credentials`](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/common/utils/id_token_credentials.py#L24)
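   For illustration, a minimal sketch of what such an executable module could print. The module path and argument handling are hypothetical; the kubeconfig `exec` stanza would invoke it via `sys.executable -m <module> <cluster> <region>`, and the Kubernetes client re-runs it whenever the reported expiration passes:

   ```python
   import json
   import sys
   from datetime import datetime, timedelta, timezone

   import boto3


   def main() -> int:
       cluster_name, region = sys.argv[1], sys.argv[2]
       session = boto3.Session(region_name=region)
       # fetch_eks_bearer_token is the RequestSigner helper sketched earlier in this thread.
       token = fetch_eks_bearer_token(session, cluster_name, region)
       expiration = datetime.now(timezone.utc) + timedelta(seconds=60)
       # ExecCredential is the envelope defined by the client-go credential plugin protocol.
       print(json.dumps({
           'apiVersion': 'client.authentication.k8s.io/v1alpha1',
           'kind': 'ExecCredential',
           'status': {
               'token': token,
               'expirationTimestamp': expiration.strftime('%Y-%m-%dT%H:%M:%SZ'),
           },
       }))
       return 0


   if __name__ == '__main__':
       sys.exit(main())
   ```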




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670855251



##########
File path: airflow/providers/amazon/aws/sensors/eks.py
##########
@@ -0,0 +1,129 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+"""Tracking the state of EKS Clusters and Nodegroups."""
+
+from typing import Optional
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.sensors.base import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+CONN_ID = "eks"
+TARGET_STATE = 'ACTIVE'
+
+
+class EKSClusterStateSensor(BaseSensorOperator):
+    """
+    Check the state of an Amazon EKS Cluster until the state of the Cluster equals the target state.
+
+    :param cluster_name: The name of the Cluster to watch.
+    :type cluster_name: str
+    :param target_state: Target state of the Cluster.
+    :type target_state: str
+    """
+
+    template_fields = ("target_state", "cluster_name")
+    ui_color = "#ff9900"
+    ui_fgcolor = "#232F3E"
+    valid_states = ["CREATING", "ACTIVE", "DELETING", "FAILED", "UPDATING"]
+
+    @apply_defaults
+    def __init__(
+        self,
+        *,
+        cluster_name: str,
+        target_state: Optional[str] = TARGET_STATE,
+        conn_id: Optional[str] = CONN_ID,

Review comment:
       gah.  I caught/corrected that in the Operators but missed it in the Sensors, thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-866095317


   Hi Ash, thanks for the quick look.  I'll hit your comments inline, but as far as this one:
   
   > Can you explain why you had to change the label-when-reviewed action in this PR?
   
   I actually never manually changed that file.  It kept getting flagged as a changed file, but git would never show me the diff or allow me to roll it back, so I thought those changes kept coming from upstream.  If that isn't the case, and it sounds like it wasn't since you are asking, I'll pull that file from the commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661048927



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - ``example_eks_create_cluster.py``
+ - ``example_eks_create_cluster_with_nodegroup.py``
+ - ``example_eks_create_nodegroup.py``
+ - ``example_eks_pod_operator.py``
+
+
+.. _howto/operator:EKSCreateClusterOperator:
+
+Creating Amazon EKS Clusters
+----------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_eks_create_cluster.py`` uses ``EKSCreateClusterOperator`` to create an Amazon

Review comment:
       I once wrote a document on how to write documentation for operators, but it has since been deleted and I don't think I have a copy.
   
   You can take inspiration from the guides below:
   http://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/natural_language.html
   http://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/datacatalog.html
   http://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/mlengine.html
   
   If you want to add something to the guides, feel free to describe it, but I'm not sure the filenames alone need to be added.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r678602421



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster for activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       Oh, one more thing: is `delete_cluster` sync or async? If it's async (and we just wait for the request), then we don't need the log.
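   (For what it's worth, boto3 ships EKS waiters that could replace hand-rolled polling around these calls entirely. A sketch, assuming the hook's client and hypothetical timing values:)

   ```python
   # delete_cluster itself only submits the request; the waiter below polls
   # describe_cluster until the cluster is actually gone.
   eks_client = eks_hook.conn
   waiter = eks_client.get_waiter('cluster_deleted')
   waiter.wait(
       name=cluster_name,
       WaiterConfig={'Delay': 30, 'MaxAttempts': 40},  # up to ~20 minutes
   )
   ```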




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r678377898



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,
+        region: Optional[str] = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.cluster_name = cluster_name
+        self.cluster_role_arn = cluster_role_arn
+        self.resources_vpc_config = resources_vpc_config
+        self.compute = compute
+        self.aws_conn_id = aws_conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroup_name = nodegroup_name or self.cluster_name + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroup_role_arn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.aws_conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.cluster_name,
+            roleArn=self.cluster_role_arn,
+            resourcesVpcConfig=self.resources_vpc_config,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.cluster_name) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = (
+                        "Cluster is still inactive after the allocated time limit.  "
+                        "Failed cluster will be torn down."
+                    )
+                    self.log.error(message)
+                    # If there is something preventing the cluster for activating, tear it down and abort.
+                    eks_hook.delete_cluster(name=self.cluster_name)
+                    raise RuntimeError(message)

Review comment:
       How long does deleting the cluster take? If it's more than a few seconds, it's probably useful to keep both here, but on L300 the log and the exception are right next to each other, so we should remove the log from there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r676955747



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,797 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Amazon EKS operators."""
+import json
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_CONTEXT_NAME, DEFAULT_POD_USERNAME, EKSHook
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_CONN_ID = "aws_default"
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+    :param region: Which AWS region the connection should use.
+        If this is None or empty then the default boto3 behaviour is used.
+    :type region: str
+
+    If compute is assigned the value of ``nodegroup``, the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    template_fields = (
+        "cluster_name",
+        "cluster_role_arn",
+        "resources_vpc_config",
+        "nodegroup_name",
+        "nodegroup_role_arn",
+        "compute",
+        "aws_conn_id",
+        "region",
+    )
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        aws_conn_id: Optional[str] = DEFAULT_CONN_ID,

Review comment:
       Connection ID is optional for all AWS operators. See: https://github.com/apache/airflow/pull/8534




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660139853



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If argument 'compute' is provided with a value of 'nodegroup', will also attempt to create an Amazon
+    EKS Managed Nodegroup for the cluster.  See EKSCreateNodegroupOperator documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for the remaining %s nodegroups to delete.  Checking again in %d seconds.",
+                        nodegroup_count,
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       I'm not sure I follow the question.   These are passed in from the DAG like any other parameter.
   
   Many of the EKS APIs have a max result value of 100 entries so there should be a way to get the next paginated set.  They could get the nextToken through a prior task's XCOM or whatever, but I felt it should be available.
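   
   For illustration, a rough sketch of what chaining pages through XCom could look like in a DAG. Treat this as a sketch only: it assumes `next_token` is a templated parameter and that the first task pushes the returned token to XCom under the key `next_token`, neither of which this PR guarantees.
   
   ```python
   from airflow.providers.amazon.aws.operators.eks import EKSDescribeAllClustersOperator
   
   # First page: up to 100 clusters; assumed to push the pagination token
   # to XCom under the (hypothetical) key 'next_token'.
   first_page = EKSDescribeAllClustersOperator(
       task_id='describe_first_page',
       max_results=100,
   )
   
   # Second page: pulls the token from the previous task's XCom.
   second_page = EKSDescribeAllClustersOperator(
       task_id='describe_second_page',
       max_results=100,
       next_token="{{ ti.xcom_pull(task_ids='describe_first_page', key='next_token') }}",
   )
   
   first_page >> second_page
   ```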




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660136512



##########
File path: airflow/providers/amazon/aws/example_dags/example_eks_pod_operation.py
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from datetime import datetime
+
+from airflow.models.dag import DAG
+from airflow.providers.amazon.aws.operators.eks import EKSPodOperator
+from airflow.utils.dates import days_ago
+
+####
+# NOTE: This example requires an existing EKS Cluster with a compute backend.
+# see: example_eks_create_cluster_with_nodegroup.py
+####
+
+CLUSTER_NAME = 'existing-cluster-with-nodegroup-ready-for-pod'
+BUCKET_SUFFIX = datetime.now().strftime("-%Y%b%d-%H%M").lower()
+ROLE_ARN = os.environ.get('ROLE_ARN', 'arn:aws:iam::123456789012:role/role_name')
+
+with DAG(
+    dag_id='eks_run_pod_dag',
+    schedule_interval=None,
+    start_date=days_ago(2),
+    max_active_runs=1,
+    tags=['example'],
+) as dag:
+
+    # [START howto_operator_eks_pod_operator]
+    start_pod = EKSPodOperator(
+        task_id="run_pod",
+        cluster_name=CLUSTER_NAME,
+        # Optional IAM Role to assume for credentials when signing the token.
+        cluster_role_arn=ROLE_ARN,
+        image="amazon/aws-cli:latest",
+        cmds=["sh", "-c", "aws s3 mb s3://hello-world" + BUCKET_SUFFIX],

Review comment:
       ACK.  I'll change this to just echo to the terminal or something.
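   
   For example, something harmless along these lines (hypothetical command):
   
   ```python
   cmds=["sh", "-c", "echo Hello from the EKS pod && date"],
   ```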




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660173658



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])

Review comment:
       I still agree that it would be best to drop the dependency on the AWS CLI, I was just trying to understand the difference between the two cases... and that answers it. :)
   
   My main concern with most of the alternatives/workarounds that I found online was that if something changes in the way STS vends the tokens, then we're stuck with a broken product and trying to catch up.  We can (safely???) assume that the AWS CLI would always have the correct current means to get the token.  The AWS CLI code is open source, so it would be trivial to copy/paste the relevant portions here, but that is asking for problems later, IMHO.
   
   I'm going to try to implement your suggested `get_bearer_token` code and see where that takes me.  I appreciate your patience explaining the issue.
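   
   For reference, the widely-circulated `RequestSigner` approach looks roughly like this. This is a sketch rather than the final code for this PR; it assumes a plain `boto3.Session` is available for credentials, and the helper name is mine:
   
   ```python
   import base64
   
   import boto3
   from botocore.signers import RequestSigner
   
   STS_TOKEN_EXPIRES_IN = 60
   
   
   def get_bearer_token(session: boto3.Session, cluster_name: str, region: str) -> str:
       """Sketch: presign an STS GetCallerIdentity call and wrap it as an EKS bearer token."""
       client = session.client('sts', region_name=region)
       service_id = client.meta.service_model.service_id
       signer = RequestSigner(
           service_id, region, 'sts', 'v4', session.get_credentials(), session.events
       )
       presigned_url = signer.generate_presigned_url(
           request_dict={
               'method': 'GET',
               'url': f'https://sts.{region}.amazonaws.com/'
                      '?Action=GetCallerIdentity&Version=2011-06-15',
               'body': {},
               'headers': {'x-k8s-aws-id': cluster_name},
               'context': {},
           },
           region_name=region,
           expires_in=STS_TOKEN_EXPIRES_IN,
           operation_name='',
       )
       # Kubernetes expects 'k8s-aws-v1.' + base64url(presigned URL), padding stripped.
       return 'k8s-aws-v1.' + base64.urlsafe_b64encode(
           presigned_url.encode('utf-8')
       ).decode('utf-8').rstrip('=')
   ```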




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r670885704



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       Should be addressed by https://github.com/apache/airflow/pull/16571/commits/4b29a39e6a9800bf69f3ebeb4cadc4a1559956ff




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r683872000



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,420 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Interact with Amazon EKS, using the boto3 library."""
+import base64
+import json
+import re
+import tempfile
+from contextlib import contextmanager
+from functools import partial
+from typing import Callable, Dict, List, Optional
+
+import yaml
+from botocore.signers import RequestSigner
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_PAGINATION_TOKEN = ''
+DEFAULT_POD_USERNAME = 'aws'
+STS_TOKEN_EXPIRES_IN = 60
+
+
+class EKSHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    client_type = 'eks'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.create_cluster(
+            name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+        )
+
+        self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def create_nodegroup(
+        self, clusterName: str, nodegroupName: str, subnets: List[str], nodeRole: str, **kwargs
+    ) -> Dict:
+        """
+        Creates an Amazon EKS Managed Nodegroup for an EKS Cluster.
+
+        :param clusterName: The name of the cluster to create the EKS Managed Nodegroup in.
+        :type clusterName: str
+        :param nodegroupName: The unique name to give your managed nodegroup.
+        :type nodegroupName: str
+        :param subnets: The subnets to use for the Auto Scaling group that is created for your nodegroup.
+        :type subnets: List[str]
+        :param nodeRole: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+        :type nodeRole: str
+
+        :return: Returns descriptive information about the created EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+        # The below tag is mandatory and must have a value of either 'owned' or 'shared'
+        # A value of 'owned' denotes that the subnets are exclusive to the nodegroup.
+        # The 'shared' value allows more than one resource to use the subnet.
+        tags = {'kubernetes.io/cluster/' + clusterName: 'owned'}
+        if "tags" in kwargs:
+            tags = {**tags, **kwargs["tags"]}
+            kwargs.pop("tags")
+
+        response = eks_client.create_nodegroup(
+            clusterName=clusterName,
+            nodegroupName=nodegroupName,
+            subnets=subnets,
+            nodeRole=nodeRole,
+            tags=tags,
+            **kwargs,
+        )
+
+        self.log.info(
+            "Created a managed nodegroup named %s in cluster %s",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def delete_cluster(self, name: str) -> Dict:
+        """
+        Deletes the Amazon EKS Cluster control plane.
+
+        :param name: The name of the cluster to delete.
+        :type name: str
+
+        :return: Returns descriptive information about the deleted EKS Cluster.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_cluster(name=name)
+
+        self.log.info("Deleted cluster with the name %s.", response.get('cluster').get('name'))
+        return response
+
+    def delete_nodegroup(self, clusterName: str, nodegroupName: str) -> Dict:
+        """
+        Deletes an Amazon EKS Nodegroup from a specified cluster.
+
+        :param clusterName: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+        :type clusterName: str
+        :param nodegroupName: The name of the nodegroup to delete.
+        :type nodegroupName: str
+
+        :return: Returns descriptive information about the deleted EKS Managed Nodegroup.
+        :rtype: Dict
+        """
+        eks_client = self.conn
+
+        response = eks_client.delete_nodegroup(clusterName=clusterName, nodegroupName=nodegroupName)
+
+        self.log.info(
+            "Deleted nodegroup named %s from cluster %s.",
+            response.get('nodegroup').get('nodegroupName'),
+            response.get('nodegroup').get('clusterName'),
+        )
+        return response
+
+    def describe_cluster(self, name: str, verbose: Optional[bool] = False) -> Dict:

Review comment:
       > You can use the Optional type modifier to define a type variant that allows None, such as Optional[int] (Optional[X] is the preferred shorthand for Union[X, None]):
   
   https://mypy.readthedocs.io/en/latest/kinds_of_types.html#optional-types-and-the-none-type
   
   I agree. We should use `bool` type here.
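   
   i.e. a plain `bool` with a default, since `None` is never a meaningful value here (sketch):
   
   ```python
   def describe_cluster(self, name: str, verbose: bool = False) -> Dict:
       ...
   ```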




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660159041



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    open(kube_config_file_location, "w").write(config_text)

Review comment:
       Fixed in 616d249




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] subashcanapathy commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
subashcanapathy commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870281108


   > What do you think about adding system tests for these operators? We are constantly striving to improve test coverage and we have impressive results with Google integration, where the lack of testing is the exception, not the daily routine. See: #8280
   
   We do have integration test coverage as part of this check-in. The only difference is that they aren't system tests. We are not sure whether these qualify as system tests, given that providers are built into separate packages these days. Please advise if I have misunderstood that assumption.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660153091



##########
File path: docs/apache-airflow-providers-amazon/operators/eks.rst
##########
@@ -0,0 +1,265 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Amazon Elastic Kubernetes Service (EKS) Operators
+=================================================
+
+`Amazon Elastic Kubernetes Service (Amazon EKS) <https://aws.amazon.com/eks/>`__  is a managed service
+that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own
+Kubernetes control plane. Kubernetes is an open-source system for automating the deployment, scaling,
+and management of containerized applications.
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+------------------
+
+.. include:: _partials/prerequisite_tasks.rst
+
+Overview
+--------
+
+Airflow to Amazon Elastic Kubernetes Service (EKS) integration provides Operators to create and
+interact with the EKS clusters and compute infrastructure.
+
+ - :class:`~airflow.providers.amazon.aws.operators.eks`
+
+4 example_dags are provided which showcase these operators in action.
+
+ - example_eks_create_cluster.py
+ - example_eks_create_cluster_with_nodegroup.py
+ - example_eks_create_nodegroup.py
+ - example_eks_pod_operator.py

Review comment:
       Yes. It ignores any code literal or code block.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r660889168



##########
File path: airflow/providers/amazon/aws/utils/eks_kube_config.py
##########
@@ -0,0 +1,133 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import os
+from shutil import which
+from typing import Optional
+
+import boto3
+import yaml
+
+HOME = os.environ.get('HOME', '/tmp')
+DEFAULT_KUBE_CONFIG_FILENAME = 'config'
+DEFAULT_KUBE_CONFIG_PATH = os.path.join(HOME, '.kube', DEFAULT_KUBE_CONFIG_FILENAME)
+DEFAULT_CONTEXT_NAME = 'aws'
+DEFAULT_NAMESPACE_NAME = 'default'
+DEFAULT_POD_USERNAME = 'aws'
+
+
+def generate_config_file(
+    eks_cluster_name: str,
+    eks_namespace_name: str,
+    aws_profile: Optional[str],
+    kube_config_file_location: Optional[str] = DEFAULT_KUBE_CONFIG_PATH,
+    pod_username: Optional[str] = DEFAULT_POD_USERNAME,
+    pod_context: Optional[str] = DEFAULT_CONTEXT_NAME,
+    role_arn: Optional[str] = None,
+    aws_region: Optional[str] = None,
+) -> None:
+    """
+    Writes the kubeconfig file given an EKS Cluster name, AWS region, and file path.
+
+    :param eks_cluster_name: The name of the cluster to create the EKS Managed Nodegroup in.
+    :type eks_cluster_name: str
+    :param eks_namespace_name: The namespace to run within kubernetes.
+    :type eks_namespace_name: str
+    :param aws_profile: The named profile containing the credentials for the AWS CLI tool to use.
+    :type aws_profile: str
+    :param kube_config_file_location: Path to save the generated kube_config file to.
+    :type kube_config_file_location: str
+    :param pod_username: The username under which to execute the pod.
+    :type pod_username: str
+    :param pod_context: The name of the context access parameters to use.
+    :type pod_context: str
+    :param role_arn: The Amazon Resource Name (ARN) of the IAM role to associate with your nodegroup.
+    :type role_arn: str
+    :param aws_region: The name of the AWS Region the EKS Cluster resides in.
+    :type aws_region: str
+    """
+    installed = which("aws")
+    if installed is None:
+        message = (
+            "AWS CLI version 2 must be installed on the worker.  See: "
+            "https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html"
+        )
+        print(message)
+        raise UnmetDependency(message)
+
+    # Set up the client
+    session = boto3.Session(region_name=aws_region, profile_name=aws_profile)
+    eks_client = session.client("eks")
+
+    # get cluster details
+    cluster = eks_client.describe_cluster(name=eks_cluster_name)
+    cluster_cert = cluster["cluster"]["certificateAuthority"]["data"]
+    cluster_ep = cluster["cluster"]["endpoint"]
+
+    # build the cluster config hash
+    cli_args = [
+        "--region",
+        aws_region,
+        "eks",
+        "get-token",
+        "--cluster-name",
+        eks_cluster_name,
+    ]
+    if role_arn:
+        cli_args.extend(["--role-arn", role_arn])
+
+    cluster_config = {
+        "apiVersion": "v1",
+        "kind": "Config",
+        "clusters": [
+            {
+                "cluster": {"server": cluster_ep, "certificate-authority-data": cluster_cert},
+                "name": eks_cluster_name,
+            }
+        ],
+        "contexts": [
+            {
+                "context": {
+                    "cluster": eks_cluster_name,
+                    "namespace": eks_namespace_name,
+                    "user": pod_username,
+                },
+                "name": pod_context,
+            }
+        ],
+        "current-context": pod_context,
+        "preferences": {},
+        "users": [
+            {
+                "name": pod_username,
+                "user": {
+                    "exec": {
+                        "apiVersion": "client.authentication.k8s.io/v1alpha1",
+                        "args": cli_args,
+                        "command": "aws",
+                    }
+                },
+            }
+        ],
+    }
+
+    config_text = yaml.dump(cluster_config, default_flow_style=False)
+    with open(kube_config_file_location, "w") as config_file:
+        config_file.write(config_text)
+
+
+class UnmetDependency(BaseException):

Review comment:
       ACK, will be in the next revision.  Thanks.
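   
   (Presumably the change is deriving from `Exception` rather than `BaseException`, e.g. something like:
   
   ```python
   class UnmetDependency(Exception):
       """Raised when a required external dependency, e.g. the AWS CLI, is missing."""
   ```
   )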




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r672566540



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Removing in coming revision.
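   
   i.e. letting the `ClientError` propagate instead of logging and re-raising; a sketch of the resulting method:
   
   ```python
   def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
       eks_client = self.get_conn()
   
       response = eks_client.create_cluster(
           name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
       )
   
       self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
       return response
   ```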

##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EksHook(AwsBaseHook):
+    """
+    Interact with Amazon EKS, using the boto3 library.
+
+    Additional arguments (such as ``aws_conn_id``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    conn_type = 'eks'
+    conn_name = 'eks'
+    client_type = 'eks'
+    hook_name = 'EKS'
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = self.client_type
+        super().__init__(*args, **kwargs)
+
+    def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> Dict:
+        """
+        Creates an Amazon EKS control plane.
+
+        :param name: The unique name to give to your Amazon EKS Cluster.
+        :type name: str
+        :param roleArn: The Amazon Resource Name (ARN) of the IAM role that provides permissions
+          for the Kubernetes control plane to make calls to AWS API operations on your behalf.
+        :type roleArn: str
+        :param resourcesVpcConfig: The VPC configuration used by the cluster control plane.
+        :type resourcesVpcConfig: Dict
+
+        :return: Returns descriptive information about the created EKS Cluster.
+        :rtype: Dict
+        """
+        try:
+            eks_client = self.get_conn()
+
+            response = eks_client.create_cluster(
+                name=name, roleArn=roleArn, resourcesVpcConfig=resourcesVpcConfig, **kwargs
+            )
+
+            self.log.info("Created cluster with the name %s.", response.get('cluster').get('name'))
+            return response
+
+        except ClientError as e:
+            self.log.error(e.response["Error"]["Message"])
+            raise e

Review comment:
       Corrected in https://github.com/apache/airflow/pull/16571/commits/3d8cc9379ab9e7179f837f45bacda64fa6f2a44e




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r669909559



##########
File path: tests/providers/amazon/aws/operators/test_eks.py
##########
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from typing import List
+from unittest import mock
+
+from moto.eks.responses import DEFAULT_NEXT_TOKEN
+
+from airflow.providers.amazon.aws.hooks.eks import EKSHook
+from airflow.providers.amazon.aws.operators.eks import (
+    EKSCreateClusterOperator,
+    EKSCreateNodegroupOperator,
+    EKSDeleteClusterOperator,
+    EKSDeleteNodegroupOperator,
+    EKSDescribeAllClustersOperator,
+    EKSDescribeAllNodegroupsOperator,
+    EKSDescribeClusterOperator,
+    EKSDescribeNodegroupOperator,
+    EKSListClustersOperator,
+    EKSListNodegroupsOperator,
+)
+from tests.providers.amazon.aws.utils.eks_test_constants import (
+    NODEROLE_ARN_VALUE,
+    RESOURCES_VPC_CONFIG_VALUE,
+    ROLE_ARN_VALUE,
+    STATUS_VALUE,
+    SUBNETS_VALUE,
+    TASK_ID,
+)
+from tests.providers.amazon.aws.utils.eks_test_utils import convert_keys, random_names
+
+DESCRIBE_CLUSTER_RESULT = f'{{"cluster": "{random_names()}"}}'
+DESCRIBE_NODEGROUP_RESULT = f'{{"nodegroup": "{random_names()}"}}'
+EMPTY_CLUSTER = '{"cluster": {}}'
+EMPTY_NODEGROUP = '{"nodegroup": {}}'
+NAME_LIST = ["foo", "bar", "baz", "qux"]
+
+
+class TestEKSCreateClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_cluster_params = dict(
+            cluster_name=self.cluster_name,
+            cluster_role_arn=ROLE_ARN_VALUE,
+            resources_vpc_config=RESOURCES_VPC_CONFIG_VALUE,
+        )
+        # These two are added when creating both the cluster and nodegroup together.
+        self.base_nodegroup_params = dict(
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        # This one is used in the tests to validate method calls.
+        self.create_nodegroup_params = dict(
+            **self.base_nodegroup_params,
+            cluster_name=self.cluster_name,
+            subnets=SUBNETS_VALUE,
+        )
+
+        self.create_cluster_operator = EKSCreateClusterOperator(
+            task_id=TASK_ID, **self.create_cluster_params, compute=None
+        )
+
+        self.create_cluster_operator_with_nodegroup = EKSCreateClusterOperator(
+            task_id=TASK_ID,
+            **self.create_cluster_params,
+            **self.base_nodegroup_params,
+        )
+
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_create_cluster(self, mock_create_nodegroup, mock_create_cluster):
+        self.create_cluster_operator.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_not_called()
+
+    @mock.patch.object(EKSHook, "get_cluster_state")
+    @mock.patch.object(EKSHook, "create_cluster")
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_called_with_nodegroup_creates_both(
+        self, mock_create_nodegroup, mock_create_cluster, mock_cluster_state
+    ):
+        mock_cluster_state.return_value = STATUS_VALUE
+
+        self.create_cluster_operator_with_nodegroup.execute({})
+
+        mock_create_cluster.assert_called_once_with(**convert_keys(self.create_cluster_params))
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSCreateNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()
+
+        self.create_nodegroup_params = dict(
+            cluster_name=self.cluster_name,
+            nodegroup_name=self.nodegroup_name,
+            nodegroup_subnets=SUBNETS_VALUE,
+            nodegroup_role_arn=NODEROLE_ARN_VALUE,
+        )
+
+        self.create_nodegroup_operator = EKSCreateNodegroupOperator(
+            task_id=TASK_ID, **self.create_nodegroup_params
+        )
+
+    @mock.patch.object(EKSHook, "create_nodegroup")
+    def test_execute_when_nodegroup_does_not_already_exist(self, mock_create_nodegroup):
+        self.create_nodegroup_operator.execute({})
+
+        mock_create_nodegroup.assert_called_once_with(**convert_keys(self.create_nodegroup_params))
+
+
+class TestEKSDeleteClusterOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+
+        self.delete_cluster_operator = EKSDeleteClusterOperator(
+            task_id=TASK_ID, cluster_name=self.cluster_name
+        )
+
+    @mock.patch.object(EKSHook, "list_nodegroups")
+    @mock.patch.object(EKSHook, "delete_cluster")
+    def test_existing_cluster_not_in_use(self, mock_delete_cluster, mock_list_nodegroups):
+        mock_list_nodegroups.return_value = dict(nodegroups=list())
+
+        self.delete_cluster_operator.execute({})
+
+        mock_list_nodegroups.assert_called_once()
+        mock_delete_cluster.assert_called_once_with(name=self.cluster_name)
+
+
+class TestEKSDeleteNodegroupOperator(unittest.TestCase):
+    def setUp(self) -> None:
+        self.cluster_name: str = random_names()
+        self.nodegroup_name: str = random_names()

Review comment:
       Is it needed?  Can you use a fixed value here?
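       For illustration, a fixed-value setUp could look like the sketch below (CLUSTER_NAME and NODEGROUP_NAME are hypothetical module-level constants, not names taken from this PR):
   
           import unittest
   
           CLUSTER_NAME = 'fixed-test-cluster'
           NODEGROUP_NAME = 'fixed-test-nodegroup'
   
           class TestEKSDeleteNodegroupOperator(unittest.TestCase):
               def setUp(self) -> None:
                   # Fixed names make a failing run reproducible; random names are
                   # only needed when tests share external state.
                   self.cluster_name: str = CLUSTER_NAME
                   self.nodegroup_name: str = NODEGROUP_NAME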







[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661848515



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EKSHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See the EKSCreateNodegroupOperator
+    documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        if self.compute == 'nodegroup':
+            eks_hook.create_nodegroup(
+                clusterName=self.clusterName,
+                nodegroupName=self.nodegroupName,
+                subnets=self.resourcesVpcConfig.get('subnetIds'),
+                nodeRole=self.nodegroupRoleArn,
+            )
+
+
+class EKSCreateNodegroupOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Managed Nodegroup for an existing Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to create the managed nodegroup in.
+    :type cluster_name: str
+    :param nodegroup_name: The unique name to give your managed nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_subnets:
+        The subnets to use for the Auto Scaling group that is created for the managed nodegroup.
+    :type nodegroup_subnets: List[str]
+    :param nodegroup_role_arn:
+        The Amazon Resource Name (ARN) of the IAM role to associate with the managed nodegroup.
+    :type nodegroup_role_arn: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_subnets: List[str],
+        nodegroup_role_arn: str,
+        nodegroup_name: Optional[str],
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupSubnets = nodegroup_subnets
+        self.nodegroupRoleArn = nodegroup_role_arn
+        self.nodegroupName = nodegroup_name or cluster_name + datetime.now().strftime("%Y%m%d_%H%M%S")
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.create_nodegroup(
+            clusterName=self.clusterName,
+            nodegroupName=self.nodegroupName,
+            subnets=self.nodegroupSubnets,
+            nodeRole=self.nodegroupRoleArn,
+        )
+
+
+class EKSDeleteClusterOperator(BaseOperator):
+    """
+    Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteClusterOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster to delete.
+    :type cluster_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self, cluster_name: str, conn_id: Optional[str] = CONN_ID, region: Optional[str] = REGION, **kwargs
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        nodegroups = eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')
+        nodegroup_count = len(nodegroups)
+        if nodegroup_count > 0:
+            self.log.info(
+                "A cluster can not be deleted with attached nodegroups.  Deleting %d nodegroups.",
+                nodegroup_count,
+            )
+            for group in nodegroups:
+                eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=group)
+
+            # Scaling up the timeout based on the number of nodegroups that are being processed.
+            additional_seconds = 5 * 60
+            countdown = TIMEOUT_SECONDS + (nodegroup_count * additional_seconds)
+            while len(eks_hook.list_nodegroups(clusterName=self.clusterName).get('nodegroups')) > 0:
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    sleep(CHECK_INTERVAL_SECONDS)
+                    self.log.info(
+                        "Waiting for remaining nodegroups to delete.  Checking again in %d seconds.",
+                        CHECK_INTERVAL_SECONDS,
+                    )
+                else:
+                    message = "Nodegroups are still inactive after the allocated time limit.  Aborting."
+                    self.log.error(message)
+                    raise RuntimeError(message)
+
+        self.log.info("No nodegroups remain, deleting cluster.")
+        return eks_hook.delete_cluster(name=self.clusterName)
+
+
+class EKSDeleteNodegroupOperator(BaseOperator):
+    """
+    Deletes an Amazon EKS Nodegroup from an Amazon EKS Cluster.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSDeleteNodegroupOperator`
+
+    :param cluster_name: The name of the Amazon EKS Cluster that is associated with your nodegroup.
+    :type cluster_name: str
+    :param nodegroup_name: The name of the nodegroup to delete.
+    :type nodegroup_name: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        nodegroup_name: str,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.nodegroupName = nodegroup_name
+        self.conn_id = conn_id
+        self.region = region
+
+    def execute(self, context):
+        eks_hook = EKSHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        return eks_hook.delete_nodegroup(clusterName=self.clusterName, nodegroupName=self.nodegroupName)
+
+
+class EKSDescribeAllClustersOperator(BaseOperator):
+    """
+    Describes all Amazon EKS Clusters in your AWS account.
+
+    :param max_results: The maximum number of results to return.
+    :type max_results: int
+    :param next_token: The nextToken value returned from a previous paginated execution.

Review comment:
       I think I understand what you are saying, thanks.  A comment below suggests that the List operators should be dropped, so I will either modify this or drop it, depending on how that discussion goes.  Thanks for the suggestions; I'll keep them in mind in the future either way.
   
   See: https://github.com/apache/airflow/pull/16571#discussion_r660444860
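   
   If the List operators do stay, one option from that discussion would be to lean on boto3's built-in paginator rather than threading nextToken/maxResults through by hand.  A rough sketch only, not the PR's implementation:
   
       import boto3
   
       client = boto3.client('eks')
       # The paginator does the nextToken bookkeeping internally and yields
       # one page of results per iteration.
       for page in client.get_paginator('list_clusters').paginate():
           for cluster_name in page['clusters']:
               print(cluster_name)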







[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-882797437


   Will do.  I have another round of fixes coming today; I'll rebase before I do that.





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r661033517



##########
File path: airflow/providers/amazon/aws/hooks/eks.py
##########
@@ -0,0 +1,346 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""Interact with Amazon EKS, using the boto3 library."""
+
+import json
+from typing import Dict, List, Optional
+
+from botocore.exceptions import ClientError
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+from airflow.utils.json import AirflowJsonEncoder
+
+DEFAULT_RESULTS_PER_PAGE = 100
+DEFAULT_PAGINATION_TOKEN = ''
+
+
+class EKSHook(AwsBaseHook):

Review comment:
       ACK.  Will revert this back to EKSHook and have more spine in the future :P







[GitHub] [airflow] ferruzzi edited a comment on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi edited a comment on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-896187912


   Hey folks, sorry for the delay.  I wanted to get a better understanding of Jinja before I implemented it.  I should have another round of updates for you today or tomorrow.  
   
   [EDIT: Updates pushed.  I think they address all concerns from @ashb  and @mik-laj so far; I have another one coming to correct the misuse of `Optional` in a couple places which @zkan caught.]
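   
   For anyone following along, declaring template fields is mostly a one-liner per operator.  A minimal sketch, with an illustrative field name rather than the PR's final list:
   
       from airflow.models import BaseOperator
   
       class EKSDeleteClusterOperator(BaseOperator):
           # Attributes named in template_fields are rendered through Jinja before
           # execute() runs, so a DAG can pass e.g. cluster_name="{{ ds }}-cluster".
           template_fields = ("cluster_name",)
   
           def __init__(self, cluster_name: str, **kwargs) -> None:
               super().__init__(**kwargs)
               self.cluster_name = cluster_name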





[GitHub] [airflow] ferruzzi commented on a change in pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#discussion_r664096250



##########
File path: airflow/providers/amazon/aws/operators/eks.py
##########
@@ -0,0 +1,737 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=invalid-name
+"""This module contains Amazon EKS operators."""
+import json
+import os
+from datetime import datetime
+from time import sleep
+from typing import Dict, List, Optional
+
+from boto3 import Session
+
+from airflow.models import BaseOperator
+from airflow.providers.amazon.aws.hooks.eks import DEFAULT_PAGINATION_TOKEN, DEFAULT_RESULTS_PER_PAGE, EksHook
+from airflow.providers.amazon.aws.utils.eks_kube_config import (
+    DEFAULT_CONTEXT_NAME,
+    DEFAULT_KUBE_CONFIG_PATH,
+    DEFAULT_NAMESPACE_NAME,
+    DEFAULT_POD_USERNAME,
+    generate_config_file,
+)
+from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
+
+CHECK_INTERVAL_SECONDS = 15
+TIMEOUT_SECONDS = 25 * 60
+CONN_ID = "eks"
+REGION = Session().region_name
+DEFAULT_COMPUTE_TYPE = 'nodegroup'
+DEFAULT_NODEGROUP_NAME_SUFFIX = '-nodegroup'
+DEFAULT_POD_NAME = 'pod'
+KUBE_CONFIG_ENV_VAR = 'KUBECONFIG'
+
+
+class EKSCreateClusterOperator(BaseOperator):
+    """
+    Creates an Amazon EKS Cluster control plane.
+
+    Optionally, can also create the supporting compute architecture:
+    If the argument 'compute' is provided with a value of 'nodegroup', the operator will also attempt to
+    create an Amazon EKS Managed Nodegroup for the cluster.  See the EKSCreateNodegroupOperator
+    documentation for requirements.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:EKSCreateClusterOperator`
+
+    :param cluster_name: The unique name to give to your Amazon EKS Cluster.
+    :type cluster_name: str
+    :param cluster_role_arn: The Amazon Resource Name (ARN) of the IAM role that provides permissions for the
+       Kubernetes control plane to make calls to AWS API operations on your behalf.
+    :type cluster_role_arn: str
+    :param resources_vpc_config: The VPC configuration used by the cluster control plane.
+    :type resources_vpc_config: Dict
+    :param compute: The type of compute architecture to generate along with the cluster.
+        Defaults to 'nodegroup' to generate an EKS Managed Nodegroup.
+    :type compute: str
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+         If this is None or empty then the default boto3 behaviour is used. If
+         running Airflow in a distributed manner and aws_conn_id is None or
+         empty, then the default boto3 configuration would be used (and must be
+         maintained on each worker node).
+    :type aws_conn_id: str
+
+    If 'compute' is 'nodegroup', the following are required:
+
+    :param nodegroup_name: The unique name to give your EKS Managed Nodegroup.
+    :type nodegroup_name: str
+    :param nodegroup_role_arn: The Amazon Resource Name (ARN) of the IAM role to associate
+         with the EKS Managed Nodegroup.
+    :type nodegroup_role_arn: str
+
+    """
+
+    def __init__(
+        self,
+        cluster_name: str,
+        cluster_role_arn: str,
+        resources_vpc_config: Dict,
+        nodegroup_name: Optional[str] = None,
+        nodegroup_role_arn: Optional[str] = None,
+        compute: Optional[str] = DEFAULT_COMPUTE_TYPE,
+        conn_id: Optional[str] = CONN_ID,
+        region: Optional[str] = REGION,
+        **kwargs,
+    ) -> None:
+        super().__init__(**kwargs)
+        self.clusterName = cluster_name
+        self.clusterRoleArn = cluster_role_arn
+        self.resourcesVpcConfig = resources_vpc_config
+        self.compute = compute
+        self.conn_id = conn_id
+        self.region = region
+
+        if self.compute == 'nodegroup':
+            self.nodegroupName = nodegroup_name or self.clusterName + DEFAULT_NODEGROUP_NAME_SUFFIX
+            if nodegroup_role_arn:
+                self.nodegroupRoleArn = nodegroup_role_arn
+            else:
+                message = "Creating an EKS Managed Nodegroup requires nodegroup_role_arn to be passed in."
+                self.log.error(message)
+                raise AttributeError(message)
+
+    def execute(self, context):
+        eks_hook = EksHook(
+            aws_conn_id=self.conn_id,
+            region_name=self.region,
+        )
+
+        eks_hook.create_cluster(
+            name=self.clusterName,
+            roleArn=self.clusterRoleArn,
+            resourcesVpcConfig=self.resourcesVpcConfig,
+        )
+
+        if self.compute is not None:
+            self.log.info("Waiting for EKS Cluster to provision.  This will take some time.")
+
+            countdown = TIMEOUT_SECONDS
+            while eks_hook.get_cluster_state(clusterName=self.clusterName) != "ACTIVE":
+                if countdown >= CHECK_INTERVAL_SECONDS:
+                    countdown -= CHECK_INTERVAL_SECONDS
+                    self.log.info(
+                        "Waiting for cluster to start.  Checking again in %d seconds", CHECK_INTERVAL_SECONDS
+                    )
+                    sleep(CHECK_INTERVAL_SECONDS)
+                else:
+                    message = "Cluster is still inactive after the allocated time limit.  Aborting."

Review comment:
       Teardown added in coming revision.
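   
       For reference, the timeout branch above could tear down roughly like this (a sketch that reuses the hook's existing delete_cluster call; not necessarily the exact code in the coming revision):
   
                   else:
                       message = "Cluster is still inactive after the allocated time limit.  Aborting."
                       self.log.error(message)
                       # Delete the half-provisioned control plane so a failed task
                       # does not leak a billable EKS cluster.
                       eks_hook.delete_cluster(name=self.clusterName)
                       raise RuntimeError(message)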







[GitHub] [airflow] mik-laj commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-870089112


   > In what way? Are you talking specifically about getting rid of the AWS CLI tool as you mentioned above, or did you have something else in mind? If that is what you are thinking, then please see the reply to your comment above and let's come up with something.
   
   I couldn't submit a negative review without leaving a comment. I didn't mean just that one issue; I had more comments overall.





[GitHub] [airflow] ferruzzi commented on pull request #16571: Implemented Basic EKS Integration

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on pull request #16571:
URL: https://github.com/apache/airflow/pull/16571#issuecomment-880135534


   After a discussion with mik-laj on Slack, we agreed to move forward with the 1-hour limit for now, clearly called out in the docs, and start work immediately on a fix for extending that timeframe.
   
   I've gone through the comments above and it looks like there are three remaining fixes required:
   
   1. Add template fields to all operators
   2. Add system tests
   3. Adjust the eks.rst doc file to standard
   
   If anyone sees or remembers something I've missed, please remind me.  It's been a very long discussion, and I may have missed something.

